WORKSHOP ON COMPLEX NETWORKS AND THEIR APPLICATIONS TSRB AUDITORIUM, 85 Fifth Street, GEORGIA TECH COLLEGE OF COMPUTING Monday January 22nd 8:30-8:50 Coffee 8:50-9:00 Welcome from Dana Randall, Fan Chung, Ashish Goel, Milena Mihail and Chris Wiggins 9:05-10:00 Main Plenary Talk by John Doyle, Caltech Title: THE ARCHITECTURE OF ROBUSTNESS Abstract: This talk focuses on architectural and organizational principles of networked systems, building on insights about the fundamental nature of complex biological and technological networks drawn from three converging research themes. 1) With molecular biologys detailed description of components and growing attention to systems biology the organizational principles of biological networks are becoming increasingly apparent. 2) Advanced technologys complexity is now approaching biologys. While the components differ, there is striking convergence at the network level of architecture and the role of layering, protocols, and feedback control in structuring complex multiscale modularity. New theories of the Internet and related networking technologies have led to test and deployment of new protocols for high performance networking (www.hot.caltech.edu, netlab.caltech.edu). 3) A new mathematical framework for the study of complex networks suggests that this apparent network-level evolutionary convergence within/between biology/technology is not accidental, but follows necessarily from the universal system requirements to be efficient, adaptive, evolvable, and robust to perturbations in their environment and component parts. (www.cds.caltech.edu/sostools, www.cds.caltech.edu/~doyle) 10:05-11:00 David Bader, Georgia Tech Title: SOLVING MASSIVE GRAPH PROBLEMS USING PETASCALE COMPUTING Abstract: Graph theoretic problems are representative of fundamental kernels in traditional and emerging computational sciences such as chemistry, biology, and medicine, as well as applications in national security. Yet they pose serious challenges for parallel machines due to non-contiguous, concurrent accesses to global data structures with low degrees of locality. Few parallel graph algorithms outperform their best sequential implementation due to long memory latencies and high synchronization costs. In this talk, we consider several graph theoretic kernels for connectivity and centrality and discuss how the features of petascale architectures will affect algorithm development, ease of programming performance, and scalability. 11:05-12:00 Ellen Zegura, Georgia Tech Title: THE GENI NSF INITIATIVE 12:-2:00 Lunch Break 2:00-2:25 Santosh Vempala, MIT & Georgia Tech Title: CORE-DENSE GRAPHS AND HYPERGRAPHS Abstract: Core-dense graphs are a common generalization of dense graphs and metrics. We discuss their motivation, definition and extension to hypergraphs. The class of maximum constraint satisfaction problems with r literals per constraint (MAX-r-CSP) corresponding to core-dense hypergraphs has a polynomial-time approximation scheme. This unifies and extends classes of CSP's known to have PTAS's. Our main tools are a method for low-rank tensor approximation and a simple scaling. The algorithm also applies to problems with a constant number of additional global constraints, e.g., maximum weighted bisection. This is joint work with W. Fernandez de la Vega, R. Kannan and M. Karpinski. 2:30-2:55 Amin Saberi, Stanford University Title: TOWARDS TOPOLOGY AWARE NETWORKS Abstract: Overlay networks are the main design strategy by which we evolve the Internet. We focus on questions of maintaining good topology connectivity and topology awareness by decentralized protocols in unstructured P2P networks. 3:00-3:25 Elliot Anshelevich, RPI Title: STRATEGIC NETWORK FORMATION THROUGH PEERING AND SERVICE AGREEMENTS Abstract: We introduce a game theoretic model of network formation in an effort to understand the complex system of business relationships between various Internet entities (e.g., Autonomous Systems, enterprises, residential customers). This system is at the heart of Internet connectivity. Connection contracts are local between two entities and may either be of peer-peer or a customer-provider variety. Entities bid (or demand payment) for the formation of these contracts, and in the resulting network some traffic is routed and other traffic dropped. As often occurs in practice, we also include a mechanism that penalizes providers if they drop traffic emanating from one of their customers. By incorporating some of the qualities of Internet business relationships, we hope that our model will have predictive value. For a natural objective function, we prove that the price of stability is at most 2. With respect to social welfare, however, the prices of anarchy and stability can both be unbounded, leading us to consider how much we must perturb the system to obtain good stable solutions. We thus focus on the quality of Nash equilibria achievable through centralized incentives: solutions created by an ``altruistic entity" ( e.g., the government) able to increase individual payouts for successfully routing a particular demand. Finally, we give a characterization of Nash equilibria as flows of utility with certain constraints, which helps to visualize the structure of stable solutions and provides us with useful proof techniques. This is joint work with Bruce Shepherd and Gordon Wilfong 3:30-3:55 Michael Mahoney, Yahoo Research Title: SCALABLE ALGORITHMS FOR VECTOR SPACE COMPUTATIONS IN COMPLEX DATA ENVIRONMENTS Abstract: In numerous internet applications, data with complex interrelationships are modeled by matrices and analyzed with vector space or convex analysis methods. Consider, e.g., term-document matrices or the adjacency of a large graph. A difficulty applying these techniques in large or complex data environments is that traditional algorithms require at least quadratic or cubic time. Recall, e.g., overconstrained least squares approximation (or more generally overconstrained Lp regression) and the related low-rank matrix approximation. Several randomized algorithms for these ubiquitous problems that run qualitatively faster than traditional algorithms are described; applications to learning in internet data contexts are discussed; and several limitations and extensions of these ideas are described. 4:00-4:15 Coffee Break 4:15-4:40 David Kempe, USC Title: OPTIMIZATION PROBLEMS IN SOCIAL NETWORKS A social network - the graph of relationships and interactions within a group of individuals - plays a fundamental role as a medium for the spread of information, ideas, influence, or diseases among its members. An idea or innovation will appear, and it can either die out quickly or make significant inroads into the population. Similarly, an infectious disease may either affect a large share of the population, or be confined to a small fraction. The collective behavior of individuals and the spread of diseases in a social network have a long history of study in sociology and epidemiology. In this talk, we will investigate graph-theoretic optimization problems relating to the spread of information or diseases. Specifically, we will focus on two types of questions: influence maximization, wherein we seek to identify influential individuals to start a cascade of an innovation to maximize the expected number of eventual adopters; and infection minimization, wherein we seek to remove nodes so as to keep a given infected component small. We will present constant factor and bicriteria algorithms for versions of these problems, and also touch on open problems and issues regarding competition among multiple innovators. (This talk represents past joint work with Jon Kleinberg, Eva Tardos, Ara Hayrapetyan, Martin Pal, and Zoya Svitkina, as well as ongoing work with Shishir Bharathi and Mahyar Salek.) 4:45-5:10 Kevin Lang, Yahoo Research Title: PARTITIONING REAL-WORLD "POWER-LAW" GRAPHS We discuss some properties of real-world "power-law" graphs that make them challenging inputs for graph partitioning, especially the balance problem, namely that the best quotient cuts and normalized cuts are highly unbalanced. We exhibit a number of graphs with this problem, and discuss some experiments and measurements that might shed more light on its nature. For example, the graph might have some kind of fractal structure; or it might contain respectable clusters that happen to have a wide range of sizes; or there could be a very large number of tiny nearly disconnected pieces. We also discuss how the main theoretical and practical graph partitioning algorithms behave on these graphs, and present some ideas for improving them. Algorithms will include Metis, Leighton-Rao, Spectral and SDP-based methods, plus KRV and some KRV-like algorithms that are more practical. 5:15-5:40 Ravi Kumar, Yahoo Research Title: STRUCTURE AND EVOLUTION OF ONLINE SOCIAL NETWORKS Abstract: Online social networks have become a major phenomenon on the web. We analyze the structure and evolution of the LiveJournal blogspace and the Flickr social network. We then formulate a simple model of social networks that can explain the success of Milgram's famous experiment that gave rise to `six-degrees of separation'. 5:45-6:15 Lincoln Lu, University of South Carolina Title: USING LOVASZ LOCAL LEMMA IN THE SPACE OF RANDOM MATCHING Abstract: The Lovasz Local Lemma is known to have an extension for cases where independence is missing but negative dependencies are under control. We show that this is often the case for random matchings, and we provide easy-to-check conditions for the non-trivial task of verifying a negative dependency graph for random matchings. This gives an alternative probabilistic tool for random permutations, random injections, random regular graphs, and the configuration model for random graphs of given fixed degree sequence. We will show some applications in these different areas, including existence results for hypergraph packing and Tur\'an type extremal problems. --------------------------------------------------------------- Tuesday January 23rd 8:45-9:00 Coffee 9:05-10:00 Plenary Talk, Mark Newman, University of Michigan Title: COMPLEX STRUCTURES IN COMPLEX NETWORKS Abstract: A typical approach for investigating the structure of complex networks is to focus on some structural property or feature of interest and construct a measure for detecting and quantifying that particular feature. This has been a successful approach; many types of structure in networks have been discovered in this way. But it is possible for large networks to contain far more complex structural features than those we can easily think of and describe, including patterns and correlations on many different scales. This talk will present some new methods for revealing structural features of networks in an automated fashion that do not require us to know in advance the particular type of structure we are looking for. These methods allow us to perform "exploratory" data analysis, in which the data themselves lead to new hypotheses about network structure. 10:05-11:00 Plenary Talk, Andrew Tomkins, Yahoo Research Title: WEB SEARCH AND ONLINE COMMUNITIES Abstract: In this talk, I'll cover some emerging trends in web search, with a digression into some non-traditional search problems. I'll also cover some recent results on the dynamics of online community formation. Finally, I'll describe ongoing work at Yahoo at the intersection of web search and community, and will discuss how community and interpersonal relationships will color the way we think about search in the future. 11:05-12:00 Elchanan Mossel, U.C. Berkeley Title: STOCHASTIC MODELS ON NETWORKS, GAMES AND RECONSTRUCTION Abstract: I will survey some recent results in network reconstruction and games on network. Special emphasis will be given to to discussion of the interplay between hard combinatorial constraints and the stochastic nature of the models and the data. 12:00-1:30 Lunch Break 1:30-1:55 Christopher Moore, University of new mexico and Santa Fe Institute Title: INFERRING HIERARCHICAL STRUCTURE FROM NETWORK DATA AND PREDICTING MISSING CONNECTIONS Abstract: Many real-world networks appear to have a hierarchical structure, containing communities and subcommunities at many different scales. These could correspond to interest groups on the Web, functional units in a metabolism, or ecological communities in a food web. We present a mathematically principled way to infer these structures from real-world network data; specifically, we use a Markov chain Monte Carlo algorithm to sample hierarchical structures with probability proportional to the likelihood with which they would generate the observed connections. Our method does much more than simply clustering the network: instead, it automatically detects both assortative and disassortative behavior at all scales of organization. Moreover, it allows us to rate how "surprising" a given connection is, and it performs far better than chance if we use it to predict connections that have not yet been observed. It may therefore be useful to experimenters for whom observing connections is costly. While this work is primarily experimental, it presents a number of interesting questions for theorists: how quickly does this Markov chain mix on different kinds of graphs? What can we say about network structure in a model where connections must be queried, rather than being observed all at once? Finally, what is it about real-world networks that makes these structures so important? This is joint work with Aaron Clauset (UNM/SFI) and Mark Newman (Michigan). 2:00-2:25 Dimitris Achlioptas, U.C. Santa Cruz Title: MOVING AWAY FROM G(n,p). Abstract: The most popular model of random networks, G(n,p), was introduced in 1959 by Erdos and Renyi: given a set of n vertices, include each possible edge independently with probability p. The main attraction of this model is its extreme technical simplicity, stemming from the perfect independence between edges. Unfortunately, this simplicity is tantamount to a lack of geometry: for example, the shortest path metric of G(n,p) is often used to provide dimension lower bounds for embedding into normed spaces. Moreover, the growth of many real networks, rather than being monotone in ``time", reflects the unfolding of an optimization tradeoff. We would like to build models of random networks that reflect these issues. And we want to do so without ``building-in" emergent properties, e.g., by specifying the degree distribution. So, we propose studying random graphs whose evolution is influenced by simple, but globally-acting constraints. We will present (mostly experimental) results for two concrete such notions: i) a miniscule twist on G(n,p), motivated by the evolution of chromosomes, which seems to make a big difference and, more ambitiously, ii) random graphs with constrained total wire (edge) length. Joint work with Ricardo Menchaca-Mendez. 2:30-2:55 Raissa D'Souza, U.C. Davis Title: THE OPTIMIZATION ORIGINS OF PREFERENTIAL ATTACHMENT Abstract: We show how the mechanism of preferential attachment can emerge from an underlying network optimization framework. The preferential attachment (PA) model so obtained has two novel features, saturation and viability, which have natural interpretations in the underlying network. Like PA, saturation has previously been assumed at an axiomatic level. The combination of PA and saturation leads to power-law degree distributions with exponential cutoff which give excellent fits to a broad range of empirical observations of networks. Here we show how a simple underlying optimization framework can give rise to both known mechanisms and likewise to a new concept of viability, and suggest that such models form a good starting point for the analysis of many networks. In addition we discuss the fit provided to a broad range of data, including previously unexplained data on the Internet obtained from "whois" tables. This is joint work with Jennifer Chayes, Christian Borgs, Noam Berger and Bobby Kleinberg. 3:00-3:25 Josh Cooper, University of South Carolina Title: WHERE DO POWER LAWS COME FROM: A MODEL-FREE ETIOLOGY Abstract: We offer a much sought-after answer to the question, "Where do power laws come from?" which makes no reference to any particular model. Many random graph models have been studied in the past few years that give plausible growth processes for real-world massive graphs, and which provably yield power law degree sequences with high probability. However, there are so many different processes that produce power law degree distributions "in nature" that it is hard to see how even a large collection of parametric models could account for all the observed behaviors. Furthermore, many of the currently proposed models are not robust, in the sense that small perturbations of the growth rules cause the power law to disappear. We offer an alternative explanation that is inherently "model-free." In particular, we give a natural, rigorous definition of the oft-used term "scale-free" and show that it actually characterizes power law degree distributions. Since it is easy to see that massive graphs are generally scale-free in this sense, the resulting ubiquity of power laws is a consequence. We also find as a consequence a multitude of open questions ripe for study: scale-free hypergraphs and estimators for distributional exponents, for example. Joint work with Lincoln Lu. 3:30-3:45 Coffee Break 3:45-4:10 Aric Hagberg, Los Alamos Title: DESIGNING THRESHOLD NETWORKS WITH GIVEN STRUCTURAL AND DYNAMICAL PROPERTIES Abstract: Threshold networks contain a highly-compressible layered structure and allow fast computation of many network properties such as degree distribution, clustering, and degree correlation. We show how to construct arbitrarily large, sparse, threshold networks with (approximately) any prescribed degree distribution or Laplacian spectrum. Control of the spectrum allows careful study of synchronization properties of threshold networks including the relationship between heterogeneous degrees and resistance to synchrony. 4:15-4:40 Juan Vera, Georgia Tech Title: A GEOMETRICAL PREFERENTIAL ATTACHMENT MODEL OF NETWORKS. Abstract: We study a random graph $G$ that combines certain aspects of geometric random graphs and preferential attachment graphs. The vertices of $G$ are $n$ sequentially generated points x1,x2, ...,xn chosen uniformly at random from the unit sphere. After generating xt, we randomly connect it to m points chosen among x1, x2,...,x{t-1} with probability proportional to degree and some function of the distance. We show that G has a power law degree sequence and small diameter. Unlike the preferential attachment graph, this geometric preferential attachment graph has small separators, similar to some experimental observations. Join work with Abraham Flaxman and Alan Frieze 4:45-5:10 Joel Friedman, University of British Columbia Title: TROUBLE WITH WEB MATRICES AND PAGERANK Abstract: We shall explain the basic PageRank algorithm. In our experience, the less "damping" used, the more "unstable" the PageRank vector is. Attempting to understand this points out a lot of perplexing properties of web matrices. First, this instability is not present in all Markov chains. Secondly, it is not always easy to link this instability to power laws or other properties of web matrices that we know. Specifically, if A is a Markov matrix and v is a probability vector (e.g., the "teleportation"), as i increases, A^i v seems to have erratic properties, including the fact that websites with a lot of A^i v weight for some i values may be insignificant for larger or smaller values of i. Also, it is not clear how the "instability" is affected by the choice of normalization. Studying these problems suggests new notions in numerical analysis, such as a "cross condition matrix" for eigenvalue stability. This is joint work with Chen Greif. 5:15-5:40 Frank McSherry, Microsoft Research Title: FULL WEB PAGERANKING ON A LAPTOP Abstract: While the enormous scale of the world wide web makes it a vastly interesting and informative artifact, its size quickly becomes a burden, imposing substantially on the types of analyses that can be performed. Even simple analyses like PageRank require dedicated compute clusters to run effectively. In this talk, I will describe some recent experience designing and building a system that supports full-web PageRank analyses on a commodity laptop. The work brings together algorithmic improvements, structural observations about the web graph, as well as some old fashioned engineering. I will report on a set of PageRanking measurements taken across a 4B page crawl, as performed on my laptop over the course of a day or two. The techniques generalize to full web SVD, SALSA, and loopy belief propagation. 5:45-6:10 Reid Andersen, U.C. San Diego Title: LOCAL GRAPH PARTITIONING USING PAGERANK VECTORS Abstract: A local graph partitioning algorithm finds a cut near a specified starting vertex, with a running time that depends largely on the size of the small side of the cut, rather than the size of the input graph. We give a local partitioning algorithm using a variation of PageRank with a specified starting distribution. We derive a mixing result for PageRank vectors similar to that for random walks, and show that the ordering of the vertices produced by a PageRank vector reveals a cut with small conductance. In particular, we show that for any set $C$ with conductance $\Phi$ and volume $k$, a PageRank vector with a certain starting distribution can be used to produce a set with conductance $O(\sqrt{\Phi \log k})$. We present an improved algorithm for computing approximate PageRank vectors, which allows us to find such a set in time proportional to its size. By combining small sets found by this local partitioning algorithm, we obtain a cut with conductance $\phi$ and approximately optimal balance in time $O(m \log^4 m/\phi^2)$. Joint work with Fan Chung and kevin Lang --------------------------------------------------------------------- Wednesday January 24th 8:45-9:00 Coffee 9:05-10:00 Plenary Talk, Brendan Frey, University of Toronto Title: RECURRING MATHEMATICAL AND COMPUTATIONAL PROBLEMS IN BIOLOGY Abstract: In my work on several scientific and engineering projects involving the computational modeling of molecular biology data, I have observed a smallnumber of recurring mathematical and computational problems. These problems often involve exponentially large search spaces, probability, machine learning and graph-based methods, so computer scientists, engineers and mathematicians should be able to help solve them. In this talk, I will describe these recurring problems, explain why they are important in biology, and offer some insights into how to proceed in finding solutions. 10:05-11:00 Chris Wiggins, Columbia University Title: COMPLEX NETWORKS IN BIOLOGY 11:05-11:30 Quaid Morris, University of Toronto Title: UNTANGLING BIOLOGICAL NETWORKS USING MAXIMUM ENTROPY PRIORS ON GRAPHS Abstract: I will talk about the problem of untangling real biological interaction networks from noisy measurements of these networks. In these networks, nodes represent genes and edges represent a particular type of interaction between a pair of genes. This problem has become increasingly important in the last few years as more and more experimental techniques are being developed to measure different types of interaction networks on a large-scale, and existing techniques are being applied to different organisms. In this talk, I will concentrate on protein-protein interaction (PPI) networks which are undirected networks in which the edges connect genes whose proteins physically interact in the cell. I will present an energy-based model which uses maximum entropy priors on both the true interaction (signal) network and the network of false positive (noise) interactions. Our structural priors score networks in terms of their degree distributions and clustering coefficients. Exact inference in our model is intractable; I will describe an efficient approximate inference algorithm to compute edge appearance posteriors. I will also describe the evaluation of our model on a real-world problem and some surprising results generated out of that evaluation. 11:35-12:00 Joel Bader, Johns Hopkins Title: NETWORK INFERENCE AND ANALYSIS FOR SYSTEMS IN BIOLOGY Abstract: Genome sequencing has become easy, but understanding how genes and proteins assemble into a network remains hard. I will discuss our group's work on this problem. We have been developing data-mining algorithms for inferring gene modules from high-throughput genetic screens. The algorithms work on generic multi-partite graph structures and should be applicable to a wide range of problems. In a second approach, we have been developing a computational scheme to compute gene regulatory network wiring diagrams directly from DNA and protein sequence using all-atom simulations. Finally, I will describe an extension of capture-recapture statistics that we have developed to estimate false-positive and false-negative rates for network edges (biological interactions) revealed by noisy high-throughput experiments. 12:00-2:00 Lunch break 2:00-2:25 Olga Troyanskaya, Princeton Title: MODELING BIOLOGICAL SYSTEMS FROM HETEROGENEOUS GENOMICS DATA Abstract: Broad availability of diverse functional genomic data should enable fast and accurate prediction of protein function, interactions, and regulation. However, these data remain vastly underutilized and have not caused a truly substantial change in our understanding of biology. I will discuss some reasons for such gap between data and understanding in genomics, including computational and biological diversity of the data, high noise levels, and need to involve biology researchers in the analysis. I will also present our resent work in addressing this problem, including context-sensitive prediction of biological networks, integration of gene expression datasets, and development of a functional genomics evaluation framework. 2:30-2:55 Guillermo Cecci, IBM Title: DEPLETION OF FEEDBACK LOOPS IN LARGE SCALE BIOLOGICAL NETWORKS Abstract: The local structure of biological and man-made complex systems can be studied by enumerating network motifs. Since searching for large size arbitrary motifs is cumbersome, we limit our focus on cycles and utilize a parallelized program to search for large-size cycles and characterize their statistical properties in different networks. Studying biological complex systems and man-made complex systems abstracted to directed networks, we will show that the statistics of directional correlations (i.e. links coming in and out of each node) can be captured by a simple physical model, a Pott's spin system. Most of the networks display strong negative correlations in the directions of links along cycles, and are locally described as ``anti-ferromagnets'', i.e. nodes tend to be either sources of sinks of directional links. However, this negative correlation is determined not only by local properties, but also by global ones, as architectural randomizations tend to be more positively correlated than the original networks. We will also present an evolutive algorithm that recreates most of these novel topological properties, along with traditional ones, and emphasizes the interplay of local and global interactions in the determination of directionality. Finally, we will show that this particular topological feature has a dramatic impact on the dynamics of the networks, as it tends to minimize dynamical interference of signals reverberating across nodes, and in particular significantly reduces the number of feedback loops. 3:00-3:25 Meredith Betterton, University of Colorado, Boulder Title: ACTIVATING INTERACTIONS AND THE DYNAMICS OF BIOLOGICAL NETWORKS Abstract: Many studies of biological networks have analyzed network `wiring diagrams' (topological features). Such work has suggested that specific types of network structure may increase network robustness and therefore confer a selective advantage. However, knowledge of network topological features does not allow one to predict network dynamical behavior---for example, whether deleting a protein from a signaling network would maintain steady-state behavior, or induce oscillations or chaos. In this talk I show that the balance between activating and inhibiting connections is important in determining whether a network reaches steady state or oscillates. Using a simplified mathematical model that captures the essential behavior of a network of interacting genes or proteins, we studied randomly sampled networks, networks subjected to selection for specific properties, and topologies of real biological networks. In all cases, the ratio of activating to inhibiting connections affects whether the network reaches steady state or oscillates. Therefore, the fraction of activating connections may be a control parameter that cells use to predispose a network to oscillate or reach steady state. 3:30-3:55 Alexander Hatremink, Duke University Title: PROCTOR: AN ALGORITHM FOR RECONSTRUCTING THE INTERNAL INTERACTION TOPOLOGY OF PROTEIN COMPLEXES Abstract: Recent advances in high-throughput experimental techniques have given us a wealth of protein interaction data, rich in quantity and variety. While the quantity and variety of data present special challenges for modeling, they also present unique opportunities for gaining insight into the proteome by allowing us to leverage multiple perspectives. We present PROCTOR (PROtein Complex TOpology Reconstruction), an algorithm for inferring the internal interaction topology of protein complexes from both direct protein- protein interaction data (e.g., yeast two-hybrid: Y2H) and from protein co-complex data (e.g., affinity purification-mass spectrometry: AP-MS). While other methods have attempted to use both these kinds of data, they usually require that co-complex data first be transformed into pairwise protein-protein interaction data under a ^Ñspoke^Ò or ^Ñclique^Ò model, a transformation we do not require because we learn the hidden internal topology. We evaluate PROCTOR by combining all available data from eight high- throughput proteomic datasets, resulting in coverage of over 90% of the yeast proteome. PROCTOR exhibits significantly better sensitivity and specificity than other algorithms for predicting domain-domain interactions from Y2H and AP-MS data. Additionally, it provides us with an improved view of the global protein-protein interaction network in yeast by reducing both the false positive and false negative error rates of the observed data. Finally, PROCTOR elucidates the topology of AP-MS purifications, both recovering known complexes (e.g., Arp2/3 and RNA polymerase II) and suggesting new complexes. In doing so, PROCTOR provides us with a global view of how proteins interact with each other via their domains to form macromolecular complexes. 4:00-4:25 Alexei Vasqez, Simons Center for Systems Biology, Institute of Advanced Study, Title: DEGREE CORRELATIONS IN REAL AND MODEL NETWORKS: MEASURES, ORIGINS AND CONSEQUENCES Abstract: Recent studies have shown that several graphs representing real systems are characterized by a power law degree distribution, resulting in significant changes in other graph properties such as the average distance between nodes, the size of the giant component and spreading processes. In this work I show that once the degree distribution is specified, the degree correlations between adjacent vertices strongly modulates those graph properties, in some cases resulting in striking differences depending on the type and magnitude of these correlations. To support this claim I review some recent work done in collaboration with other authors about the characterization of degree correlations and their impact on the size of the giant component, percolation problems, spreading processes and computational complexity. 4:30-4:55 Ed Coffman, Columbia University Title: SELF-ASSEMBLY NETWORKS Abstract: We consider a file distributed over a network in which each node, except the root node, has at most some small segment of the file; the root is responsible for keeping the entire file. When a client at some node v requests the entire file, the file segment at v begins downloading immediately to the client, and, at the same time, the client's request is forwarded to all of v's neighbors, which in turn send their segments down to v and recursively continue the process throughout the network. When set up correctly, segments making up the remainder of the file will arrive seamlessly at the client's node. We discuss the interesting modeling and algorithmic questions posed by the design of such systems. Our model entails a biomolecular metaphor. The analysis focuses on tree networks.