Table 2 Accuracy assessment of MapReduce-Inchworm compared to the original Inchworm using three simulated read datasets for mouse RNA-Seq

From: K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity

    

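| Statistic | Method | 100 M pair-end reads | 150 M pair-end reads | 200 M pair-end reads |
|---|---|---|---|---|
| REF-EVAL, Contig, Recall | Original | 0.3098 | 0.3122 | 0.3126 |
| REF-EVAL, Contig, Recall | MapReduce | 0.3274 | 0.3308 | 0.3314 |
| REF-EVAL, Contig, Precision | Original | 0.3258 | 0.3283 | 0.3280 |
| REF-EVAL, Contig, Precision | MapReduce | 0.3389 | 0.3419 | 0.3422 |
| REF-EVAL, Nucleotide, Recall | Original | 0.9752 | 0.9764 | 0.9779 |
| REF-EVAL, Nucleotide, Recall | MapReduce | 0.9763 | 0.9783 | 0.9793 |
| REF-EVAL, Nucleotide, Precision | Original | 0.9847 | 0.9845 | 0.9845 |
| REF-EVAL, Nucleotide, Precision | MapReduce | 0.9870 | 0.9869 | 0.9869 |
| N1 | Original | 32,712 | 39,273 | 43,862 |
| N1 | MapReduce | 33,687 | 40,452 | 45,344 |
| N2 | Original | 16 | 20 | 26 |
| N2 | MapReduce | 4 | 12 | 8 |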

  1. Statistics from the REF-EVAL component of DETONATE [41] for three simulated read datasets. Recall is the fraction of reference elements that are correctly recovered by an assembly. Precision is the fraction of assembly elements that correctly recover a reference element. At the Contig level, a 99% alignment cutoff was used to identify a recovered transcript (left-hand bars in Fig. 3). Original refers to the results of Trinity run with the original version of Inchworm. MapReduce refers to the results of Trinity run with the MapReduce-Inchworm method presented here. Also shown are the N1 and N2 statistics, as given by the script FL_trans_analysis_pipeline.pl. N1 is the total number of assembled transcripts that give full-length matches to the reference. N2 is the number of fused transcripts. For comparison, there are 80,867 reference transcripts.
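The definitions in the footnote can be made concrete with a short Python sketch. It is illustrative only and is not the REF-EVAL or FL_trans_analysis_pipeline.pl implementation: the function names `recall`, `precision`, and `relative_change` are hypothetical helpers, and the only data used are the N1 counts and the reference transcript count transcribed from the table. Because N1 counts assembled transcripts rather than distinct reference transcripts, the ratio N1 / 80,867 is only a rough point of comparison, not a recall value.

```python
# Illustrative sketch; helper names are assumptions, not part of DETONATE or Trinity.

REFERENCE_TRANSCRIPTS = 80_867  # stated in the table footnote
READ_SETS = ["100 M", "150 M", "200 M"]

# N1 values transcribed from Table 2.
n1 = {
    "Original": [32_712, 39_273, 43_862],
    "MapReduce": [33_687, 40_452, 45_344],
}


def recall(recovered_reference: int, total_reference: int) -> float:
    """Fraction of reference elements that are correctly recovered."""
    return recovered_reference / total_reference


def precision(correct_assembly: int, total_assembly: int) -> float:
    """Fraction of assembly elements that correctly recover a reference element."""
    return correct_assembly / total_assembly


def relative_change(new: float, old: float) -> float:
    """Change of `new` relative to `old`, as a fraction of `old`."""
    return (new - old) / old


# Rough comparison only: N1 divided by the number of reference transcripts.
n1_share = {
    method: [count / REFERENCE_TRANSCRIPTS for count in counts]
    for method, counts in n1.items()
}

for i, label in enumerate(READ_SETS):
    change = relative_change(n1["MapReduce"][i], n1["Original"][i])
    print(
        f"{label}: N1/reference Original={n1_share['Original'][i]:.4f}, "
        f"MapReduce={n1_share['MapReduce'][i]:.4f}, "
        f"relative change in N1={change:+.2%}"
    )
```

The same `relative_change` helper can be applied to any pair of Original and MapReduce values in the table, for example the Contig recall figures.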