Medical College of Wisconsin

Improved rat genome gene prediction by integration of ESTs with RNA-Seq information. Bioinformatics 2015 Jan 01;31(1):25-32

Date

09/14/2014

Pubmed ID

25217576

Pubmed Central ID

PMC4296142

DOI

10.1093/bioinformatics/btu608

Scopus ID

2-s2.0-84922366358 (3 citations)

Abstract

MOTIVATION: RNA-Seq (also called whole-transcriptome sequencing) is an emerging technology that uses the capabilities of next-generation sequencing to detect and quantify entire transcripts. One of its important applications is the improvement of existing genome annotations. RNA-Seq provides rapid, comprehensive and cost-effective tools for the discovery of novel genes and transcripts compared with expressed sequence tag (EST), which is instrumental in gene discovery and gene sequence determination. The rat is widely used as a laboratory disease model, but has a less well-annotated genome as compared with humans and mice. In this study, we incorporated deep RNA-Seq data from three rat tissues (bone marrow, brain and kidney) with EST data to improve the annotation of the rat genome.

RESULTS: Our analysis identified 32 197 transcripts, including 13 461 known transcripts, 13 934 novel isoforms and 4802 new genes, which almost doubled the numbers of transcripts in the current public rat genome database (rn5). Comparisons of our predicted protein-coding gene sets with those in public datasets suggest that RNA-Seq significantly improves genome annotation and identifies novel genes and isoforms in the rat. Importantly, the large majority of novel genes and isoforms are supported by direct evidence of RNA-Seq experiments. These predicted genes were integrated into the Rat Genome Database (RGD) and can serve as an important resource for functional studies in the research community.

AVAILABILITY AND IMPLEMENTATION: The predicted genes are available at http://rgd.mcw.edu.

Author List

Li L, Chen E, Yang C, Zhu J, Jayaraman P, De Pons J, Kaczorowski CC, Jacob HJ, Greene AS, Hodges MR, Cowley AW Jr, Liang M, Xu H, Liu P, Lu Y

Authors

Allen W. Cowley Jr, PhD, Professor in the Physiology department at Medical College of Wisconsin
Matthew R. Hodges, PhD, Associate Professor in the Physiology department at Medical College of Wisconsin
Mingyu Liang, PhD, Center Director, Professor in the Physiology department at Medical College of Wisconsin
Pengyuan Liu, PhD, Adjunct Professor in the Physiology department at Medical College of Wisconsin

MeSH terms used to index this publication

Animals
Expressed Sequence Tags
Genetic Variation
Genome
High-Throughput Nucleotide Sequencing
Mice
Molecular Sequence Annotation
RNA
Rats
Transcriptome