Are the initial sequence comparisons made using amino acid or nucleotide sequences?
The initial hidden Markov models are generated using amino acid sequences.
What is the source of the gene expression data set?
EvoCor currently uses the human component of the Tissue Specific Pattern of mRNA Expression set (GSE1133) as well as the GNF Mouse GeneAtlas V3 (GSE10246). Both data sets can be accessed through the NCBI Gene Expression Omnibus (GEO) under the accession numbers given.
What is used to determine the Pearson correlation coefficient? Is it simply the detection of the gene in the arrays or differences in levels between an experiment and a control?
The Pearson correlation coefficient represents the correlation between the expression levels of the query gene and the input gene across all tissue samples in the data set, rather than simple detection or an experiment-versus-control comparison. The GNF Mouse GeneAtlas V3 is a collection of tissue samples from naive male C57BL/6 mice. The human Tissue Specific Pattern of mRNA Expression set is a collection of tissue samples from 79 different human tissues.
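As a rough illustration of the correlation described above, the following minimal Python sketch computes a Pearson correlation coefficient between two expression profiles measured over the same tissues. The function name `pearson` and the expression values are hypothetical and chosen only for illustration; this is not EvoCor's own code.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same tissue samples")
    n = len(x)
    if n < 2:
        raise ValueError("at least two tissue samples are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviations of each tissue's value from the gene's mean expression
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    covariance = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical expression values for two genes across the same five tissues
input_gene = [10.0, 20.0, 30.0, 40.0, 50.0]
query_gene = [12.0, 24.0, 33.0, 41.0, 55.0]
r = pearson(input_gene, query_gene)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the two genes rise and fall together across tissues, a value near -1 means they vary in opposite directions, and a value near 0 means there is no linear relationship.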
Can other datasets, not embedded in EvoCor, be used to look for genes that co-evolve and are co-expressed?
At this point, only the datasets originally included in EvoCor are usable via the web interface. If you would like to extend EvoCor to use different datasets, please contact us.
How is the Hamming distance calculated?
The Hamming distance is calculated by creating two binary vectors, each of length 182: one vector for the input gene and one vector for the query gene. Each position in a vector holds a '1' or a '0' depending on whether a sequence homolog can be found using HMMER3. The Hamming distance is the number of positions at which the two vectors differ.
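The calculation above can be sketched in a few lines of Python. The vector contents below are made up for illustration, and the function name `hamming_distance` is a hypothetical helper, not part of EvoCor. Only the vector length of 182 comes from the description above.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


VECTOR_LENGTH = 182  # length stated in the answer above

# Hypothetical presence/absence vectors: 1 = homolog found, 0 = not found
input_vector = [1] * VECTOR_LENGTH
query_vector = [1] * VECTOR_LENGTH
for position in (0, 5, 100):
    query_vector[position] = 0  # homolog missing at these positions only

distance = hamming_distance(input_vector, query_vector)
print(distance)  # 3
```

A smaller distance means the two genes share a more similar pattern of homolog presence and absence.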
How are the amino acid sequence Hidden Markov Models constructed?
We constructed the profile hidden Markov models using the standalone version of HMMER (Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011 Oct;7(10):e1002195).
How frequently is EvoCor updated to incorporate new gene sequence information?
We will update amino acid sequence information from NCBI semi-annually.
What type of gene expression analysis is being compared: microarray or RNA-seq?
EvoCor currently uses microarray expression data. We are in the process of updating the gene expression dataset with RNA deep sequencing data in place of microarray data. Stay tuned.
Can I use my own expression data sets to look for genes that co-evolve and are co-expressed?
This is not currently a feature of EvoCor, but may be included in a future update.
Why is there no list generated for my gene of interest?