Research Interests

Current Research Topics:
Next-generation Sequencing data analysis, Genomics, URA automated mass annotation of proteins, Mass Spectra Proteomics Informatics, Semantic Web, Description Logics, Ontology Engineering, Temporal Database, Network Security, High Performance Computing, Bioinformatics, Data Integration

Past Research Topics:
Non-destructive Integrity Evaluation in Micro-piles, Petroleum Geology, Structural Geology, Geophysics


Current Projects

  • Bioinformatics Optimization for Recombinant Protein Expression
    The goal of this project is to develop bioinformatics approaches to improve recombinant protein expression in the plant-based transient expression system.

  • Peptide Match
    Finds exact matches for a single peptide sequence or a batch of peptide sequences in the UniProtKB database. The search can be performed against all UniProtKB sequences or against the subset of sequences belonging to an organism from the UniProt complete proteomes. Batch match results can be downloaded for further analysis.

  • UniProt Representative Proteomes
    Representative Proteomes (RPs) are proteomes selected from Representative Proteome Groups (RPGs), which contain similar proteomes grouped by co-membership in UniRef50 clusters. A Representative Proteome is the proteome that best represents all the proteomes in its group in terms of the majority of the sequence space and information. RPs at 75%, 55%, 35% and 15% co-membership thresholds are provided to allow users to decrease or increase the granularity of the sequence space based on their requirements (Chen et al., 2011). Representative genomes (RGs) are also constructed based on the corresponding RPs.

  • Unified Rule Application (URA)
    The URA automated annotation project uses UniRules, which are conditional annotation templates that define the annotation that can be generated for (selected) predicted features. The goal of this research is to build a system that can automatically annotate millions of TrEMBL protein sequence entries.

  • Omics Data Integration
    To develop systematic statistical and computational methods that integrate omics data in support of meta-analysis of multiple high-throughput biological experiments.
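
The exact-match search described under Peptide Match can be illustrated with a minimal Python sketch. It assumes the sequences are held in an in-memory dictionary keyed by accession; the function name `find_exact_matches` and the toy sequences are hypothetical and do not represent the actual Peptide Match implementation or real UniProtKB entries.

```python
def find_exact_matches(peptides, proteins):
    """Return {peptide: [(accession, 0-based start), ...]} for exact substring hits.

    `proteins` maps an accession to its amino-acid sequence (assumed input shape).
    """
    results = {}
    for peptide in peptides:
        hits = []
        for accession, sequence in proteins.items():
            # Report every occurrence, including overlapping ones.
            start = sequence.find(peptide)
            while start != -1:
                hits.append((accession, start))
                start = sequence.find(peptide, start + 1)
        results[peptide] = hits
    return results


# Hypothetical toy data, for illustration only.
proteins = {
    "ACC1": "MKTAYIAKQRQISFVKSHFSRQ",
    "ACC2": "MAKQRQAKQRQ",
}
matches = find_exact_matches(["AKQRQ", "WWW"], proteins)
```

A linear scan like this is only practical for small sequence sets; searching a full protein database would call for an index over the sequences instead.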

Completed Projects

  • Transcriptome Analysis of Adipose Tissues in Fat and Lean Line Chickens by RNA-Seq
    The aim of the study is to greatly expand our knowledge of the chicken transcriptome by deep massively parallel RNA sequencing (RNA-Seq) of adipose tissues to identify differentially expressed genes (DEGs) in fully fed and fasted fat line (FL) and lean line (LL) chickens at 7 weeks of age. To the best of our knowledge, this is the first comprehensive description of the chicken adipose transcriptome using RNA-Seq technology.

  • ISCU Myopathy Microarray Project
    Performing microarray analysis (R and Bioconductor) on ISCU myopathy biopsies with a two-pronged goal of identifying known and novel genes that might contribute to mitochondrial overload, and identifying transcription factors that mediate the transcriptional response to mis-regulated mitochondrial iron homeostasis.

  • Trimming and de novo Assembly of Illumina High-throughput Short Reads
    Studying the trimming of NGS data (quality trimming, adapter trimming, ambiguity trimming, length trimming, etc.) and assessing its impact on the quality of de novo assembly.

  • NECC Cyber-Enable Research Project
    Supporting next generation sequencing data analysis and functional interpretation for dogfish shark genome and workforce development.

  • Bacterial Pathogen Diagnostics
    Given a set of nucleotide sequences, a sequence analysis approach is used to detect the presence of any Category A bacterial pathogen proteins and to identify the bacterial species of the host DNA.

  • aCGH analysis of CNVs in chickens
    Detecting DNA copy number variation (CNV) in chickens divergently selected for fatness or leanness and for high or low growth, using the array comparative genomic hybridization (aCGH) method on Roche NimbleGen 385K tiling arrays.

  • Literature Mining of Pathogenesis-Related Proteins in Human Pathogens for Database Annotation
    To develop an automated text mining system to facilitate literature-based curation of pathogenesis-related proteins in pathogens of military relevance.
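
The four trimming steps named in the trimming and de novo assembly project above (adapter, quality, ambiguity and length trimming) can be sketched for a single read as follows. This is a simplified illustration under stated assumptions, not the pipeline used in the project: the function name `trim_read`, the thresholds, the exact-match adapter search and the example read are all hypothetical.

```python
def trim_read(seq, quals, adapter=None, min_quality=20, min_length=20):
    """Apply adapter, quality, ambiguity and length trimming to one read.

    `quals` is a list of per-base quality scores, one per base in `seq`.
    Returns (sequence, qualities), or None if the read ends up too short.
    """
    # Adapter trimming: cut at the first exact occurrence of the adapter
    # (assumption: no mismatches or partial adapters are handled).
    if adapter:
        pos = seq.find(adapter)
        if pos != -1:
            seq, quals = seq[:pos], quals[:pos]

    # Quality trimming: drop low-quality bases from the 3' end.
    end = len(seq)
    while end > 0 and quals[end - 1] < min_quality:
        end -= 1
    seq, quals = seq[:end], quals[:end]

    # Ambiguity trimming: strip leading and trailing N bases.
    left = len(seq) - len(seq.lstrip("N"))
    right = len(seq.rstrip("N"))
    seq, quals = seq[left:right], quals[left:right]

    # Length trimming: discard reads shorter than the minimum length.
    if len(seq) < min_length:
        return None
    return seq, quals


# Hypothetical read: leading N, two low-quality bases, then an adapter.
read = "NACGTACGTAGATCGG"
scores = [30] * 7 + [10, 10] + [30] * 7
kept = trim_read(read, scores, adapter="AGATCGG", min_length=5)
dropped = trim_read(read, scores, adapter="AGATCGG", min_length=10)
```

The order of the steps is itself a design choice: removing the adapter first keeps its high-quality bases from shielding low-quality bases from the 3' quality trim.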


Previous Research Experience

January 2007 - December 2008
Department of Computer Science and Engineering, University of South Carolina, Columbia SC, USA
  • Worked on various reasoning and management issues related to the design, maintenance and evolution of very large Description Logics based ontologies
  • Designed and developed a scalable and efficient management framework for engineering very large Description Logics based ontologies
  • Extended the Description Logic SHIQ with temporal logic to reason about the evolution of ontologies, and developed a related tableau-based reasoning algorithm
  • Developed and implemented a suite of metrics for evaluating the semantic implications of changes in evolving ontologies
  • Categorized the types of changes in OWL ontologies and their semantic implications

January 2006 - December 2006
Department of Biostatistics, Bioinformatics and Epidemiology, Medical University of South Carolina, Charleston SC, USA
  • Studied whether cooperating regions of the brain activate together and whether such activity correlates with deception
  • Developed a parallel Bayesian probabilistic component analysis algorithm for fMRI deception data analysis

May 2004 - December 2005
Department of Biostatistics, Bioinformatics and Epidemiology, NHLBI Proteomics Center, Medical University of South Carolina, Charleston SC, USA
  • Funded by NHLBI MUSC Proteomics Center (N01-HV-28181)
  • Developed and maintained web-based infrastructure for proteomics data management and analysis
  • Designed and developed a universal LIMS (Laboratory Information Management System) (http://www.s3db.org)
  • Developed a tool for visualizing 2D gel electrophoresis data (http://www.agml.org)

June 2003 - May 2004
Department of Computer Science and Engineering, University of South Carolina, Columbia SC, USA
  • Designed a dual authentication protocol for IEEE 802.11 wireless LANs

August 2000 - August 2001
Department of Civil Engineering, University of South Carolina, Columbia SC, USA
  • Studied non-destructive integrity evaluation in micro-piles
  • Designed and developed a multi-channel non-destructive integrity evaluation program