Analysis of factors influencing transcript quantification from paired-end RNA-Seq experiments
Factors that may influence transcript quantification
- Multiple splice forms
- Polymorphisms
- Intron signal (intronic sequence)
- Sequencing errors
- Alignment errors
- Annotation errors
- Differential GC content across exons
- Random exon priming
- Positional bias (degradation 5'->3')
- Fragment length?
- Read length
Multiple splice forms
- Quantifying a gene with a single isoform is trivial.
- Concurrent expression of multiple isoforms increases the complexity and uncertainty of read assignment, particularly when the isoforms are expressed at different levels.
- False negatives may occur for splice-regulatory variants, as the error is proportional to the number of isoforms and to the uniformity of their expression levels.
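To make the read-assignment problem concrete, here is a minimal sketch of the expectation-maximization approach commonly used to split ambiguous reads among isoforms. It is an illustration under simplifying assumptions, not the method of any particular tool: the function name `em_read_fractions`, the toy read counts, and the equal effective lengths are all hypothetical, and sequencing errors, fragment length, and positional bias are ignored.

```python
def em_read_fractions(compat, eff_len, n_iter=200):
    """Estimate the fraction of reads originating from each isoform.

    compat  : list of sets; each set holds the isoform indices a read is
              compatible with (hypothetical input format).
    eff_len : effective length of each isoform.
    """
    k = len(eff_len)
    theta = [1.0 / k] * k  # start from equal read fractions
    for _ in range(n_iter):
        expected = [0.0] * k
        # E-step: split each read among its compatible isoforms in
        # proportion to theta[i] / eff_len[i].
        for isoforms in compat:
            weights = {i: theta[i] / eff_len[i] for i in isoforms}
            total = sum(weights.values())
            if total == 0.0:
                continue  # read cannot be assigned; skip it
            for i, w in weights.items():
                expected[i] += w / total
        # M-step: the new read fractions are the normalized expected counts.
        assigned = sum(expected)
        theta = [e / assigned for e in expected]
    return theta


# Toy gene with two isoforms: 60 reads fall in shared exons (ambiguous),
# 30 reads are unique to isoform 0, and 10 are unique to isoform 1.
reads = [{0, 1}] * 60 + [{0}] * 30 + [{1}] * 10
theta = em_read_fractions(reads, eff_len=[1000.0, 1000.0])
```

In this toy case the ambiguous reads are divided according to the evidence from the uniquely assignable reads, so the estimate settles at roughly three quarters of the reads for isoform 0. With more isoforms, or fewer uniquely assignable reads, the estimate rests on less evidence, which is the uncertainty the bullets above describe.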
Read length
- Increasing read length decreases the number of ambiguously mapped reads, which increases the accuracy of quantification.
- Transcript identification accuracy depends jointly on coverage and read length. For coverage > 10 million reads, read length becomes less important, as there are enough reads to identify transcripts unambiguously.
- Transcript quantification accuracy may increase with read length because more splice junctions are detected correctly. For read lengths > 50 bp, the identification and quantification accuracy of junctions does not improve significantly.
http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-015-0695-9
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4531809/
http://bioinformatics.oxfordjournals.org/content/31/24/3938.full
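As a rough illustration of why longer reads detect more splice junctions, the sketch below computes the fraction of read start positions that place a read across a given junction. It assumes reads start uniformly along a single transcript and that a read must overlap each side of the junction by a minimum number of bases to count; the function name `junction_spanning_fraction`, the 8-base anchor, and the transcript dimensions are hypothetical choices. It does not model alignment or reproduce the > 50 bp plateau reported in the sources above.

```python
def junction_spanning_fraction(transcript_len, junction_pos, read_len, anchor=8):
    """Fraction of read start positions whose read spans the junction.

    junction_pos : number of transcript bases upstream of the junction.
    anchor       : minimum bases the read must cover on each side
                   (hypothetical threshold).
    """
    if read_len > transcript_len or read_len < 2 * anchor:
        return 0.0
    # A read starting at s covers bases s .. s + read_len - 1 (0-based).
    first = max(junction_pos + anchor - read_len, 0)
    last = min(junction_pos - anchor, transcript_len - read_len)
    spanning = max(0, last - first + 1)
    return spanning / (transcript_len - read_len + 1)


# Toy transcript of 2000 bases with one junction in the middle.
fractions = {
    read_len: junction_spanning_fraction(2000, 1000, read_len)
    for read_len in (25, 50, 100)
}
```

Under these assumptions the spanning fraction grows with read length, so the same sequencing depth yields more junction-supporting reads when reads are longer.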
By acoutoal