Proteogenomics: Proteomics for Genome Annotation | Proteo Annotator

Genome sequencing projects have revolutionized biology, producing vast catalogs of predicted genes across model and non-model organisms. These predictions still require validation, because computational gene models alone cannot always capture alternative splicing, novel open reading frames, or previously unannotated coding regions. Proteogenomics addresses this gap: by integrating mass spectrometry (MS)-based proteomics with genomic and transcriptomic data, it provides direct evidence of protein expression that can be used to refine genome annotation.

The significance of proteogenomics starts with, but extends beyond, improving genome annotation:

Refinement of gene models:

Provides peptide-level validation of predicted coding regions, reducing false positives in genome annotations.

Discovery of novel proteins:

Detects uncharacterized proteoforms and the protein products of alternative transcripts, expanding the known protein repertoire.

Cross-omics validation:

Links transcriptomic data to protein-level evidence, helping confirm which predicted transcripts are actually translated.

Biomedical impact:

In human health research, proteogenomics improves the understanding of cancer, rare diseases, and microbial pathogenesis by uncovering candidate therapeutic targets and biomarkers.
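
As a toy illustration of the peptide-level validation mentioned above, the Python sketch below translates a DNA sequence in all six reading frames and reports where a set of peptides maps. The function names (`six_frame_translations`, `map_peptides`), the example sequence, and the peptides are hypothetical and chosen only for illustration. The sketch assumes the standard genetic code and exact string matching; real proteogenomic workflows instead search MS/MS spectra against customized protein databases with a dedicated search engine and statistical control of false matches, which this does not attempt.

```python
from typing import Dict, Iterator, List, Tuple

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
_CODONS = [a + b + c for a in _BASES for b in _BASES for c in _BASES]
CODON_TABLE: Dict[str, str] = dict(zip(_CODONS, _AMINO))

# (strand, frame offset 0-2, 0-based start on the forward strand)
Hit = Tuple[str, int, int]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(genome: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (strand, frame, protein) for the three forward and three reverse frames."""
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            yield strand, frame, translate(seq[frame:])


def map_peptides(genome: str, peptides: List[str]) -> Dict[str, List[Hit]]:
    """Map each peptide to every exact match in the six-frame translation."""
    genome = genome.upper()
    hits: Dict[str, List[Hit]] = {pep: [] for pep in peptides}
    for strand, frame, protein in six_frame_translations(genome):
        for pep in peptides:
            idx = protein.find(pep)
            while idx != -1:
                nt_start = frame + 3 * idx
                if strand == "-":
                    # Convert reverse-strand coordinates back to the forward strand.
                    nt_start = len(genome) - (nt_start + 3 * len(pep))
                hits[pep].append((strand, frame, nt_start))
                idx = protein.find(pep, idx + 1)
    return hits


# Hypothetical example: a short ORF encoding MKTAYIAK, offset by two bases.
genome = "CC" + "ATGAAAACCGCGTATATTGCGAAA" + "TAAGG"
for peptide, locations in map_peptides(genome, ["TAYIAK", "WWWWWW"]).items():
    print(peptide, locations)
```

A peptide that maps inside an annotated coding region supports that gene model, while a peptide that maps only outside annotated regions is a candidate for a novel or revised open reading frame.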