Our proprietary NGS-QC concept has been applied to establish the first QC indicator database, covering the largest collection of ChIP-seq and enrichment-related datasets retrieved from the public domain. The NGS-QC database currently hosts quality descriptors for more than 81546 publicly available datasets, and our team is working to cover all ChIP-seq and enrichment-related NGS profiling assays available in the public domain.
The current certified database comprises ChIP-seq datasets for various transcription factors as well as relevant histone modification marks. Enrichment-related datasets such as FAIRE-seq, MeDIP-seq, MNase-seq, MRE-seq and even RNA-seq assays are also part of this collection. While most of the current certified entries come from the public repository GEO, we have also certified datasets produced by large consortium efforts such as ENCODE. In the coming months, other dataset repositories might also be included, with the aim of presenting the largest collection of quality-certified datasets.
The following guidelines will help you make the best use of the database:
1. Interactive, dynamic, user-friendly and precise query search
A multiselect, autosuggestive query builder guides the user to more precise results. For instance, one can search for all ChIP-seq experiments targeting 'H3K4me3' with 'AAA' quality descriptors performed on 'Mouse' samples.
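The logic of such a multi-criteria query can be sketched in a few lines of Python. This is only an illustration of how the criteria combine, not the database's actual implementation or API: the `Entry` record, its field names, the `search` helper and every dataset listed (including the 'BBB' descriptor) are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical record; the field names are assumptions for this sketch."""
    dataset_id: str
    assay: str
    target: str
    organism: str
    grade: str  # quality descriptor, e.g. 'AAA'


def search(entries, **criteria):
    """Return the entries that match every criterion.

    A criterion value is either one string or a collection of accepted
    strings, which mimics a multiselect field.
    """
    def accepts(value, wanted):
        options = {wanted} if isinstance(wanted, str) else set(wanted)
        return value in options

    return [
        entry for entry in entries
        if all(accepts(getattr(entry, field), wanted)
               for field, wanted in criteria.items())
    ]


# Made-up entries, for illustration only.
entries = [
    Entry("DS1", "ChIP-seq", "H3K4me3", "Mouse", "AAA"),
    Entry("DS2", "ChIP-seq", "H3K4me3", "Human", "AAA"),
    Entry("DS3", "ChIP-seq", "H3K4me3", "Mouse", "BBB"),
    Entry("DS4", "RNA-seq", "mRNA", "Mouse", "AAA"),
]

hits = search(entries, assay="ChIP-seq", target="H3K4me3",
              grade="AAA", organism="Mouse")
print([entry.dataset_id for entry in hits])  # -> ['DS1']
```

Passing a collection, for example `organism=("Mouse", "Human")`, widens a single field in the same way that selecting several values in a multiselect box would.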
2. Scatter plot
An interactive, dynamic scatter plot gives the end user a summary and the exact range of the quality descriptors of all experiments matching the search query. Different experiments can share the same quality descriptors yet occupy different positions in terms of their underlying quality values. The scatter plot helps to identify exactly where the quality of a given experiment falls.
Furthermore, the scatter plot can be used to restrict or refine results to a certain quality range after an initial search. One can zoom in to the scatter plot (using the mouse scroll wheel) or select a particular region (using the 'zoom into selection area' button at the top-right corner of the plot) to restrict the results to that region. Click the 'update results' button below the plot to show the experiments within the selected region.
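Conceptually, restricting results to a selected region amounts to keeping the points that fall inside a rectangle. The sketch below illustrates that idea under stated assumptions: the `x` and `y` keys stand in for whatever two quality values the plot axes represent, and the `restrict_to_region` helper and all data points are hypothetical.

```python
def restrict_to_region(points, x_range, y_range):
    """Keep the points lying inside the selected rectangle (bounds inclusive)."""
    x_lo, x_hi = sorted(x_range)
    y_lo, y_hi = sorted(y_range)
    return [
        point for point in points
        if x_lo <= point["x"] <= x_hi and y_lo <= point["y"] <= y_hi
    ]


# Made-up experiments: DS1 and DS2 share a descriptor but sit at
# different positions on the plot.
points = [
    {"id": "DS1", "grade": "AAA", "x": 0.5, "y": 1.0},
    {"id": "DS2", "grade": "AAA", "x": 2.5, "y": 3.0},
    {"id": "DS3", "grade": "BBB", "x": 2.0, "y": 2.5},
    {"id": "DS4", "grade": "BBB", "x": 4.0, "y": 0.5},
]

selected = restrict_to_region(points, x_range=(1.5, 3.0), y_range=(2.0, 3.5))
print([point["id"] for point in selected])  # -> ['DS2', 'DS3']
```

Here DS1 and DS2 carry the same descriptor, yet only DS2 falls inside the selection, which is the kind of distinction the plot makes visible.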
3. Refinement panel
Apart from quality-based refining, the refinement panel on the left side can be used to refine the results by other important fields such as antibody-related fields (e.g. vendor, reference), experiment type (e.g. ChIP-seq, RNA-seq), TMRs (Total Mapped Reads), submission date, Study ID (e.g. GSE ID from GEO), etc.
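Unlike an exact-match query, several of these fields are naturally refined by threshold or range (TMRs, submission date). A minimal sketch of combining both kinds of refinement follows; the `refine` helper, its parameter names, the row keys and all values are assumptions made for the example and do not describe the panel's real behaviour.

```python
from datetime import date


def refine(rows, vendor=None, experiment=None, min_tmrs=None,
           submitted_after=None):
    """Keep the rows satisfying every refinement that is not None."""
    kept = []
    for row in rows:
        if vendor is not None and row["vendor"] != vendor:
            continue
        if experiment is not None and row["experiment"] != experiment:
            continue
        if min_tmrs is not None and row["tmrs"] < min_tmrs:
            continue
        if submitted_after is not None and row["submitted"] <= submitted_after:
            continue
        kept.append(row)
    return kept


# Made-up rows; vendors, read counts and dates are placeholders.
rows = [
    {"id": "DS1", "vendor": "VendorA", "experiment": "ChIP-seq",
     "tmrs": 25_000_000, "submitted": date(2012, 5, 1)},
    {"id": "DS2", "vendor": "VendorA", "experiment": "ChIP-seq",
     "tmrs": 4_000_000, "submitted": date(2013, 2, 10)},
    {"id": "DS3", "vendor": "VendorB", "experiment": "RNA-seq",
     "tmrs": 40_000_000, "submitted": date(2013, 8, 20)},
]

refined = refine(rows, vendor="VendorA", min_tmrs=10_000_000)
print([row["id"] for row in refined])  # -> ['DS1']
```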
4. Violin plot
Violin plots are a modification of box plots that add plots of the estimated kernel density to the summary statistics displayed by box plots. They can be used to summarize the quality distribution of individual target molecules that are present in the database.
Violin plots on the main database page are interactive and can be clicked to access the results for the selected target molecule.
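To make the two ingredients of a violin plot concrete, the sketch below computes the box-plot quartiles and a Gaussian kernel density estimate for one set of values, using only the Python standard library. It is an illustration under stated assumptions: the sample values are made up, the bandwidth uses a common normal-reference rule of thumb, and nothing here is claimed to match how the database draws its plots.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def violin_summary(samples, grid_points=50):
    """Box-plot quartiles plus the density curve that shapes the violin."""
    q1, median, q3 = statistics.quantiles(samples, n=4)
    # Normal-reference rule of thumb for the bandwidth (an assumption).
    bandwidth = 1.06 * statistics.stdev(samples) * len(samples) ** (-1 / 5)
    density = gaussian_kde(samples, bandwidth)
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "grid": grid,
        "density": [density(x) for x in grid],
    }


# Made-up quality values for a single hypothetical target molecule.
values = [1.2, 1.9, 2.1, 2.4, 2.5, 2.7, 3.0, 3.3, 4.1]
summary = violin_summary(values)
print(summary["q1"], summary["median"], summary["q3"])
```

Plotting `density` against `grid`, mirrored around a central axis, gives the violin outline; the quartiles supply the box drawn inside it.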