
# Scores

$\frac{\big(\frac{Counts_{CDS}}{Counts_{UTR}}\big)_{Ribo}}{\big(\frac{Counts_{CDS}}{Counts_{UTR}}\big)_{RNA}}$

A score to determine whether protein translation is complete. It is defined as the ratio of reads in the coding region to reads in the 3’UTR region in ribosome profiling data, normalized by the corresponding ratio in mRNA data. The normalization allows for incorrect 3’UTR annotations: the CDS to 3’UTR ratio in mRNA is not 1 and reflects differences in protein-producing potential.
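The ratio above can be sketched in a few lines of Python. This is a minimal illustration, not riboraptor's API: the function name `cds_utr_ratio_score` and its arguments are hypothetical, and the choice to reject zero counts instead of adding pseudocounts is an assumption.

```python
def cds_utr_ratio_score(ribo_cds, ribo_utr3, rna_cds, rna_utr3):
    """CDS/3'UTR read ratio in Ribo-seq divided by the same ratio in RNA-seq.

    All four arguments are read counts for one transcript (hypothetical inputs).
    """
    # Assumption: refuse to divide by zero; a real pipeline might add pseudocounts.
    if ribo_utr3 <= 0 or rna_cds <= 0 or rna_utr3 <= 0:
        raise ValueError("ribo_utr3, rna_cds and rna_utr3 must be positive")
    ribo_ratio = ribo_cds / ribo_utr3
    rna_ratio = rna_cds / rna_utr3
    return ribo_ratio / rna_ratio


# Ribo-seq: 900 CDS reads vs 10 in the 3'UTR; RNA-seq: 300 vs 100.
score = cds_utr_ratio_score(900, 10, 300, 100)
```

A large value indicates that ribosome footprints drop off after the stop codon far more sharply than mRNA coverage does.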

• ORFScore [Bazzini2014] : Compares the number of RPFs in each frame to a uniform distribution using a chi-squared statistic to identify actively translated ORFs.
$\log_2\big(1 + \sum_{i=1}^3 \frac{(F_i-\bar{F})^2}{\bar{F}} \big) \times \begin{cases} -1 & (F_1 < F_2) \lor (F_1 < F_3),\\ 1 & \text{otherwise} \end{cases}$

where $F_i$ represents the number of reads in frame $i$ and $\bar{F}$ represents $\mathrm{mean}(F_1, F_2, F_3)$.
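As a rough illustration of the ORFScore, the sketch below computes the chi-squared term against the mean frame count and applies the sign rule. The function name `orf_score` is hypothetical, and returning 0 for an ORF with no reads is an assumption made here to avoid dividing by zero.

```python
import math


def orf_score(f1, f2, f3):
    """Signed log2 chi-squared score for read counts in frames 1, 2 and 3."""
    counts = (f1, f2, f3)
    mean = sum(counts) / 3
    if mean == 0:
        # Assumption: an ORF with no reads carries no frame signal.
        return 0.0
    # Chi-squared statistic against a uniform distribution over the frames.
    chi_sq = sum((f - mean) ** 2 / mean for f in counts)
    # Negative when frame 1 is not the dominant frame.
    sign = -1.0 if (f1 < f2 or f1 < f3) else 1.0
    return sign * math.log2(1 + chi_sq)


in_frame = orf_score(90, 30, 30)   # reads enriched in frame 1: positive score
off_frame = orf_score(30, 90, 30)  # reads enriched in frame 2: negative score
```

The sign makes it possible to separate ORFs whose periodicity supports the annotated frame from those whose reads pile up in another frame.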

$0.5 \times \sum_{l=26}^{34} \left| f(l) - f_{ref}(l) \right|$

where $f(l)$ is the fraction of reads of length $l$ in the transcript of interest and $f_{ref}$ is constructed by counting the number of fragments of each read length over annotated protein-coding genes only. The cutoff is determined by identifying outliers using Tukey’s method.
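The read-length score above can be sketched as follows. This is an illustration under stated assumptions, not riboraptor's implementation: the function name `floss_score` is hypothetical, both histograms are assumed to be dictionaries mapping read length to count, each is normalized to fractions within the 26 to 34 nt range, and the absolute value of each difference is taken so that the score lies between 0 and 1.

```python
def floss_score(transcript_hist, reference_hist, min_len=26, max_len=34):
    """Half the summed absolute difference between two read-length distributions.

    Both arguments map read length -> fragment count (hypothetical input format).
    """
    lengths = range(min_len, max_len + 1)
    t_total = sum(transcript_hist.get(l, 0) for l in lengths)
    r_total = sum(reference_hist.get(l, 0) for l in lengths)
    if t_total == 0 or r_total == 0:
        raise ValueError("both histograms need reads in the length range")
    # Assumption: f(l) and f_ref(l) are fractions within the chosen range.
    return 0.5 * sum(
        abs(transcript_hist.get(l, 0) / t_total - reference_hist.get(l, 0) / r_total)
        for l in lengths
    )


reference = {28: 100, 29: 100}
same_shape = floss_score({28: 5, 29: 5}, reference)  # identical distributions
disjoint = floss_score({32: 10}, reference)          # no overlap in read lengths
```

A score of 0 means the transcript's length distribution matches the reference, and 1 means the two distributions do not overlap at all.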

The FLOSS cutoff, calculated as a function of the total number of reads in the transcript histogram, was established by considering a rolling window of individual annotated genes and then computing the upper extreme outlier cutoff for each window using Tukey’s method (Q3 + 3*IQR, where Q3 is the 3rd quartile and IQR is the interquartile range).
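The rolling-window cutoff can be sketched like this. The names `tukey_upper_extreme` and `rolling_cutoffs` are hypothetical, and several details are assumptions since the text does not specify them: genes are given as `(total_reads, score)` pairs, the window slides one gene at a time over genes sorted by read count, each cutoff is reported at the read count of the window's middle gene, and quartiles use the "inclusive" interpolation of `statistics.quantiles`.

```python
from statistics import quantiles


def tukey_upper_extreme(values):
    """Upper extreme outlier cutoff: Q3 + 3 * IQR."""
    # Assumption: quartiles computed with linear interpolation ("inclusive").
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 + 3 * (q3 - q1)


def rolling_cutoffs(genes, window=200):
    """Return (total_reads, cutoff) pairs from a window sliding over genes.

    genes: iterable of (total_reads, score) for annotated genes.
    window: number of genes per window (the default is an arbitrary choice).
    """
    ordered = sorted(genes, key=lambda g: g[0])
    cutoffs = []
    for start in range(len(ordered) - window + 1):
        chunk = ordered[start:start + window]
        mid_reads = chunk[window // 2][0]  # read count of the middle gene
        cutoffs.append((mid_reads, tukey_upper_extreme([s for _, s in chunk])))
    return cutoffs


genes = [(10, 1), (20, 2), (30, 3), (40, 4), (50, 5), (60, 6)]
cutoffs = rolling_cutoffs(genes, window=5)
```

A transcript would then be flagged when its score exceeds the cutoff at a comparable total read count.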