This website was developed to support our publication:
Neil R Clark, Ruth Dannenfelser, Christopher M Tan, Michael E Komosinski and Avi Ma'ayan
Sets2Networks: network inference from repeated observations of sets
BMC Systems Biology 6, 89 (2012) PMID: 22824380.
Please cite our paper if you use our algorithm and/or tool.
Background
The skeleton of complex systems can be represented as networks where vertices represent entities, and edges represent the relations between these entities. Often it is impossible, or expensive, to determine the network structure by experimental validation of the binary interactions between every vertex pair. It is usually more practical to infer the network from surrogate observations. Network inference is the process by which an underlying network of relations between entities is determined from indirect evidence. While many algorithms have been developed to infer networks from quantitative data, less attention has been paid to methods which infer networks from repeated observations of related sets. This type of data is ubiquitous in the field of systems biology and in other areas of complex systems research, hence such methods would be of great utility and value.
Results
Here we present a general method for network inference from repeated observations of sets of related entities. Given experimental observations of such sets, we infer the underlying network connecting these entities by generating an ensemble of networks consistent with the data. The frequency of occurrence of a given link throughout this ensemble is interpreted as the probability that the link is present in the underlying real network conditioned on the data. Exponential random graphs are used to generate and sample the ensemble of consistent networks, and we take an algorithmic approach to numerically executing the inference method. The effectiveness of the method is demonstrated on synthetic data before applying this inference approach to problems in systems biology and systems pharmacology, as well as to the construction of a co-authorship collaboration network. We predict direct protein-protein interactions from high-throughput mass-spectrometry proteomics; build networks that connect pluripotency regulators based on ChIP-seq and loss-of-function/gain-of-function followed by expression data; extract a network that connects 53 cancer drugs to each other and to 34 severe adverse events by mining the FDA's Adverse Event Reporting System (AERS); and construct a co-authorship network that connects Mount Sinai School of Medicine investigators. The predicted networks and online software to create networks from entity-set libraries are provided online at http://www.maayanlab.net/S2N.
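The ensemble idea can be illustrated with a small toy sketch. This is not the Sets2Networks implementation: it replaces the exponential random graph sampling with naive rejection sampling, and it assumes, purely for illustration, that a network is "consistent" with the data when every observed set forms a connected subgraph. The names `is_connected`, `link_probabilities`, `edge_prob` and `n_samples` are hypothetical.

```python
import itertools
import random


def is_connected(nodes, edges):
    """Return True if `nodes` are connected using only edges among them."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adjacency[u].add(v)
            adjacency[v].add(u)
    # Depth-first search from an arbitrary node of the set.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == nodes


def link_probabilities(observed_sets, edge_prob=0.5, n_samples=20000, seed=0):
    """Estimate link probabilities as link frequencies over consistent networks.

    Assumption (illustrative only): a sampled network is consistent with the
    data if every observed set induces a connected subgraph.
    """
    rng = random.Random(seed)
    entities = sorted(set().union(*observed_sets))
    candidate_links = list(itertools.combinations(entities, 2))
    counts = dict.fromkeys(candidate_links, 0)
    accepted = 0
    for _ in range(n_samples):
        # Draw a random network: each candidate link is present with edge_prob.
        edges = [link for link in candidate_links if rng.random() < edge_prob]
        # Keep the network only if it is consistent with every observed set.
        if all(is_connected(s, edges) for s in observed_sets):
            accepted += 1
            for link in edges:
                counts[link] += 1
    if accepted == 0:
        raise ValueError("no sampled network was consistent with the data")
    return {link: counts[link] / accepted for link in candidate_links}


if __name__ == "__main__":
    probs = link_probabilities([{"A", "B"}, {"B", "C"}])
    for link, p in probs.items():
        print(link, round(p, 2))
```

With the two observed sets {A, B} and {B, C}, every accepted network must contain the links A-B and B-C, so their estimated probability is 1.0, while the unconstrained link A-C stays close to the prior `edge_prob`. Rejection sampling becomes impractical beyond a handful of entities, which is one reason a more efficient sampling scheme is needed in practice.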
Conclusions
As empirical data about sets of related entities accrues, there are more constraints on possible network realizations that can fit the data; in the language of statistical mechanics, the size of the microstate ensemble shrinks, until the underlying network resolves. The network inference method presented here can be applied to resolve different types of networks in current systems biology and systems pharmacology as well as in other fields of research.
PowerPoint slides describing the project, presented by Dr. Neil R. Clark at the SBBQ International Conference in Iguassu, Brazil, on 5/23/2012
PowerPoint slides describing the project, presented by Professor Avi Ma'ayan at the National Systems Biology Centers' Annual Meeting in Chicago, USA, on 7/20/2012
Poster describing the project, presented by Professor Avi Ma'ayan at the National Systems Biology Centers' Annual Meeting in Chicago, USA, on 7/20/2012
The synthetic network is first converted into gene sets by following a series of random walks. Running Sets2Networks on the resulting gene set file then yields an inferred network that closely resembles the synthetic network.
Download the original synthetic network or the gene sets.
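As a rough illustration of how random walks can turn a network into sets, the sketch below records the vertices visited by each walk as one set. The walk length, the number of walks, the uniform choice of start vertex, and the function name `random_walk_sets` are all assumptions made for this example; the settings used to generate the downloadable gene sets are not stated here.

```python
import random


def random_walk_sets(adjacency, n_sets=100, walk_length=5, seed=0):
    """Generate sets of vertices by running random walks on a network.

    `adjacency` maps each vertex to an iterable of its neighbors.
    Each walk starts at a uniformly chosen vertex and takes up to
    `walk_length` steps; the vertices it visits form one set.
    """
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    sets = []
    for _ in range(n_sets):
        current = rng.choice(nodes)
        visited = {current}
        for _ in range(walk_length):
            neighbors = sorted(adjacency[current])
            if not neighbors:
                break  # isolated vertex: the walk cannot continue
            current = rng.choice(neighbors)
            visited.add(current)
        sets.append(visited)
    return sets


if __name__ == "__main__":
    # Hypothetical four-vertex path network: G1 - G2 - G3 - G4.
    network = {
        "G1": ["G2"],
        "G2": ["G1", "G3"],
        "G3": ["G2", "G4"],
        "G4": ["G3"],
    }
    for gene_set in random_walk_sets(network, n_sets=3, walk_length=2):
        print(sorted(gene_set))
```

Because each step moves along an existing edge, every generated set is connected in the original network, which is the kind of indirect evidence the inference step can use.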
We applied the S2N algorithm to predict new protein-protein interactions for 50 CORUM complexes. In the left heatmap, the higher the confidence of a prediction, the lighter the color. The heatmap on the right shows known protein-protein interactions in the background.