To infer gene functions across species, we have developed a highly accurate and computationally efficient method for large-scale, high-throughput orthology prediction. The method consists of three major steps: 1) an all-against-all comparison of every pair of genes, 2) pairwise ortholog prediction for every pair of genomes, and 3) generation of clusters of orthologous genes across multiple genomes. Our database currently contains predicted orthologs for more than 900 bacterial genomes.
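The three steps can be illustrated with a small sketch. It is not the QuartetS algorithm itself. It assumes a toy identity score in place of a real sequence alignment, reciprocal best hits as a stand-in for the pairwise prediction step, and single-linkage grouping (union-find) as a stand-in for the clustering step. All names and data here (`GENOMES`, `similarity`, `min_score`, and so on) are hypothetical.

```python
from itertools import combinations

# Hypothetical toy input: genome name -> {gene id: protein sequence}.
GENOMES = {
    "genomeA": {"a1": "MKTAYIAKQR", "a2": "GGSLLPWENN"},
    "genomeB": {"b1": "MKTAYIAKQK", "b2": "GGSLLPWDNN"},
    "genomeC": {"c1": "MKTAYLAKQR"},
}


def similarity(s, t):
    """Toy stand-in for an alignment score: fraction of identical positions."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 0.0
    return sum(1 for x, y in zip(s, t) if x == y) / longest


def all_against_all(genomes):
    """Step 1: score every pair of genes drawn from two different genomes."""
    scores = {}
    for ga, gb in combinations(sorted(genomes), 2):
        for a, seq_a in genomes[ga].items():
            for b, seq_b in genomes[gb].items():
                scores[(a, b)] = scores[(b, a)] = similarity(seq_a, seq_b)
    return scores


def pairwise_orthologs(genomes, scores, min_score=0.5):
    """Step 2 (assumed rule): keep reciprocal best hits for each genome pair."""
    pairs = []
    for ga, gb in combinations(sorted(genomes), 2):
        best_ab = {a: max(genomes[gb], key=lambda b: scores[(a, b)])
                   for a in genomes[ga]}
        best_ba = {b: max(genomes[ga], key=lambda a: scores[(a, b)])
                   for b in genomes[gb]}
        for a, b in best_ab.items():
            if best_ba[b] == a and scores[(a, b)] >= min_score:
                pairs.append((a, b))
    return pairs


def cluster(pairs):
    """Step 3 (assumed rule): merge linked genes into groups with union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for gene in list(parent):
        groups.setdefault(find(gene), set()).add(gene)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    scores = all_against_all(GENOMES)
    pairs = pairwise_orthologs(GENOMES, scores)
    print(cluster(pairs))
```

On the toy data, the two gene families end up in separate clusters. The all-against-all step grows quadratically with the total number of genes, which is why step 1 dominates the cost at the scale of hundreds of genomes.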
Yu, C., V. Desai, L. Cheng, and J. Reifman. QuartetS-DB: a large-scale orthology database for prokaryotes and eukaryotes inferred by evolutionary evidence. BMC Bioinformatics. 2012 June 22; 13:143. PMID: 22726705.
Yu, C., N. Zavaljevski, V. Desai, and J. Reifman. QuartetS: a fast and accurate algorithm for large-scale orthology detection. Nucleic Acids Research. 2011 May 13; 39(13):e88.
Yu, C., V. Desai, N. Zavaljevski, and J. Reifman. Large-scale orthology predictions for inferring gene functions across multiple species. Proceedings of the HPCMP Users Group Conference. Schaumburg, IL. 2010 June 14-17; 269-272.
Yu, C., and P. A. Wilson. A tool for creating and parallelizing bioinformatics pipelines. Proceedings of the HPCMP Users Group Conference. Pittsburgh, PA. 2007 June 18-22; 417-420.