Sequencing and Annotation

The annotation of genomes from high-throughput sequencing platforms needs to be rapid, high-throughput, automated, and fully integrated with downstream analysis. We have deployed software systems on DoD high-performance computing assets for protein function annotation (PIPA), enzyme classification (CatFam), and strain identification (SNIT). We have also developed a complete storage and annotation system for microbial genome sequences (AGeS), with multiple functionalities.



Yu, C., H. J. Woo, X. Yu, T. Oyama, A. Wallqvist, and J. Reifman. A strategy for evaluating pathway analysis methods. BMC Bioinformatics. 2017 October 13; 18:453. [PDF, PubMed]

Woo, H. J., C. Yu, and J. Reifman. Collective genetic interaction effects and the role of antigen-presenting cells in autoimmune diseases. PLOS ONE. 2017 January 12; 12(1):e0169918. [PDF, PubMed]

Woo, H. J., C. Yu, K. Kumar, B. Gold, and J. Reifman. Genotype distribution-based inference of collective effects in genome-wide association studies: insights to age-related macular degeneration disease mechanism. BMC Genomics. 2016 August 30; 17:695. [PDF, PubMed]

Hang, J., V. Desai, N. Zavaljevski, Y. Yang, X. Lin, R. V. Satya, L. J. Martinez, J. M. Blaylock, R. G. Jarman, S. J. Thomas, and R. A. Kuschner. 16S rRNA gene pyrosequencing of reference and clinical samples and investigation of the temperature stability of microbiome profiles. Microbiome. 2014 September 16; 2:31. [PDF, PubMed]

Vijaya Satya, R., N. Zavaljevski, and J. Reifman. A new strategy to reduce allelic bias in RNA-Seq readmapping. Nucleic Acids Research. 2012 September 1; 40(16):e127. [PDF, PubMed]

Vijaya Satya, R., N. Zavaljevski, and J. Reifman. SNIT: SNP identification for strain typing. Source Code for Biology and Medicine. 2011 September 8; 6:14. [PDF, PubMed]

Kumar, K., V. Desai, L. Cheng, M. Khitrov, D. Grover, R. V. Satya, C. Yu, N. Zavaljevski, and J. Reifman. AGeS: a software system for microbial genome sequence annotation. PLOS ONE. 2011 March 7; 6(3):e17469. [PDF, PubMed]

Yu, C., N. Zavaljevski, V. Desai, and J. Reifman. Genome-wide enzyme annotation with precision control: catalytic families (CatFam) databases. Proteins. 2009 February 1; 74:449-460. [PDF, PubMed]

Yu, C., N. Zavaljevski, V. Desai, S. Johnson, F. J. Stevens, and J. Reifman. The development of PIPA: an integrated and automated pipeline for genome-wide protein function annotation. BMC Bioinformatics. 2008 January 29; 9:52. [PDF, PubMed]

Yu, C., and P. A. Wilson. A tool for creating and parallelizing bioinformatics pipelines. Proceedings of the HPCMP Users Group Conference. Pittsburgh, PA. 2007 June 18-22; 417-420. [PDF, DTIC]

Yu, C., N. Zavaljevski, F. J. Stevens, K. Yackovich, and J. Reifman. Classifying noisy protein sequence data: a case study of immunoglobulin light chains. Bioinformatics. 2005 June; 21(Suppl 1):i495-501. [PDF, PubMed]

Chen, D., D. Hua, X. Cheng, and J. Reifman. Gene selection for multiclass prediction of microarray data. Proceedings of the IEEE Computer Society Bioinformatics Conference. Stanford, CA. 2003 August 11-14; 492-495. [PDF, DOI]

Zavaljevski, N., F. J. Stevens, and J. Reifman. Support vector machines with selective kernel scaling for protein classification and identification of key amino acid positions. Bioinformatics. 2002 May; 18(5):689-696. [PDF, PubMed]

Reifman, J., N. Zavaljevski, and F. J. Stevens. Support vector machines for protein functional classification. Proceedings of the International Conference on Bioinformatics. Bangkok, Thailand. 2002 February 6-8; O-BH-02. [PDF, DTIC]