Description

The goal of this project is to build a high-quality gene set library from the Gene Expression Omnibus (GEO) by identifying disease-associated experiments in mammalian systems (cells or tissue) in which gene expression profiling was used to compare normal and diseased tissue. The recommended method for extracting gene sets from these studies is GEO2Enrichr, a new tool developed by the Ma'ayan Lab. GEO2Enrichr is a Chrome extension that adds functionality to the GEO website to make gene set extraction and downstream analysis easy. Your task is to identify such studies and then use GEO2Enrichr to create signatures and submit them through this website form. You will receive 1 point for each unique entry in the database. If we find a mistake in any of your entries, we will subtract 30 points from your overall score for each such mistake.
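
As a rough illustration of the scoring rule above, the following sketch computes a score from a list of entries and a count of mistakes. The function name, the use of an (accession, disease) pair as the uniqueness key, and the placeholder identifiers are assumptions made for this example only; the project does not specify how uniqueness is determined.

```python
def project_score(entries, mistakes, points_per_entry=1, penalty_per_mistake=30):
    """Return points for unique entries minus the penalty for each mistake.

    `entries` is a list of hashable records; duplicates are counted once.
    Treating a record as an (accession, disease) pair is an assumption.
    """
    unique_entries = set(entries)
    return points_per_entry * len(unique_entries) - penalty_per_mistake * mistakes


# Placeholder identifiers, not real GEO accessions.
entries = [
    ("GSE_EXAMPLE_1", "asthma"),
    ("GSE_EXAMPLE_2", "colitis"),
    ("GSE_EXAMPLE_1", "asthma"),  # duplicate, counted once
]

print(project_score(entries, mistakes=0))  # 2
print(project_score(entries, mistakes=1))  # -28
```

Under this rule a single mistake outweighs thirty correct entries, so checking each entry before submission matters more than volume.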

The hashtag for this project is #DISEASES_BD2K_LINCS_DCIC_COURSERA.

Useful links for finding disease ids:
Disease Ontology
MalaCards
Orphanet

GEO series predicted to contain disease signatures, based on the textual content of each GEO series.

We used the collected disease signatures as positive training samples and other signatures as negative samples to train a gradient boosting classifier that identifies GEO series studying disease versus control states from the textual description of each series. The classifier was then applied to all human and mouse microarray studies on GEO to prioritize series that study diseases. The results of the classifier are listed below.
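
The approach described above can be sketched in a few lines. This is a toy, standard-library-only illustration and not the classifier actually used for this project: the feature representation (word presence), the single-word decision stumps, the hyperparameters, and every example description below are assumptions invented for the sketch. A real run would more likely use a library implementation trained on the full GEO text.

```python
import math
import re


def tokenize(text):
    """Lowercase a description and return its set of word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_stump(docs, residuals, vocab):
    """Find the word whose presence/absence split best fits the residuals
    in the least-squares sense. Returns (word, value_present, value_absent)."""
    best = None
    for word in vocab:
        present = [r for d, r in zip(docs, residuals) if word in d]
        absent = [r for d, r in zip(docs, residuals) if word not in d]
        if not present or not absent:
            continue  # word does not split the data
        mean_p = sum(present) / len(present)
        mean_a = sum(absent) / len(absent)
        sse = (sum((r - mean_p) ** 2 for r in present)
               + sum((r - mean_a) ** 2 for r in absent))
        if best is None or sse < best[0]:
            best = (sse, word, mean_p, mean_a)
    return best[1:] if best else None


def train(texts, labels, rounds=20, learning_rate=0.5):
    """Gradient boosting on log-loss: each round fits a stump to the
    residuals (label minus predicted probability) and adds it to the score."""
    docs = [tokenize(t) for t in texts]
    vocab = sorted(set().union(*docs))
    pos_rate = sum(labels) / len(labels)  # must be strictly between 0 and 1
    base = math.log(pos_rate / (1.0 - pos_rate))
    scores = [base] * len(docs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - sigmoid(s) for y, s in zip(labels, scores)]
        stump = fit_stump(docs, residuals, vocab)
        if stump is None:
            break
        word, val_p, val_a = stump
        stumps.append(stump)
        scores = [s + learning_rate * (val_p if word in d else val_a)
                  for s, d in zip(scores, docs)]
    return base, learning_rate, stumps


def predict_proba(model, text):
    """Estimated probability that a description is a disease-vs-control study."""
    base, learning_rate, stumps = model
    doc = tokenize(text)
    score = base + sum(learning_rate * (vp if w in doc else va)
                       for w, vp, va in stumps)
    return sigmoid(score)


# Invented descriptions: 1 = disease vs. control study, 0 = other.
TEXTS = [
    "Gene expression in tumor tissue from patients compared with normal tissue",
    "Diabetic mouse liver versus healthy control liver",
    "Asthma patients airway epithelium compared to healthy controls",
    "Time course of stem cell differentiation",
    "Effect of drug treatment on cultured cell line",
    "Transcription factor knockdown in cell line",
]
LABELS = [1, 1, 1, 0, 0, 0]

model = train(TEXTS, LABELS)

# Rank unseen (also invented) descriptions by predicted probability.
candidates = [
    "Overexpression of kinase in cell line",
    "Colitis patients colon biopsies versus healthy controls",
]
ranked = sorted(candidates, key=lambda t: predict_proba(model, t), reverse=True)
for text in ranked:
    print(round(predict_proba(model, text), 3), text)
```

With only six training descriptions the model latches onto a single word, so the ranking here shows the mechanics of prioritization rather than a meaningful prediction.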

Submission form:
Sign in is required for submission.