Table 4 Jobs in the GeneTree pipeline for Ensembl release 49

From: eHive: An Artificial Intelligence workflow system for genomic analysis

| Analysis | Number of jobs | Failed jobs | Granularity |
|---|---|---|---|
| GenomeDumpFasta | 39 | - | 1 per genome |
| GenomeLoadMembers | 39 | - | 1 per genome |
| GenomeSubmitPep | 39 | - | 1 per genome |
| CreateBlastRules | 39 | - | 1 per genome |
| SubmitPep_* | 682412 | - | 1 per peptide |
| blast_* | 26614068 | - | All vs all peptides |
| UpdatePAFids | 1 | - | 1 per pipeline |
| PAFCluster | 1 | - | 1 per pipeline |
| Muscle | 26484 | 7 | 1 per genetree |
| BreakPAFCluster | 95 | - | As many as required |
| TreeBeST | 26477 | 9 | 1 per genetree |
| OrthoTree | 26468 | - | 1 per genetree |
| CreateHomology_dNdSJob | 1 | - | 1 per pipeline |
| Homology_dNdS | 3646340 | 1364 | 1 per orthologous gene pair |
| Threshold_on_dS | 1 | - | 1 per pipeline |
| TOTAL | 31022503 | 1380 | |
This table shows the final number of jobs run for each analysis during the execution of the GeneTree pipeline for 39 species, together with the number of jobs that failed. For simplicity, all the SubmitPep_xxxxx jobs are grouped in a single row (SubmitPep_*), as are all the blast_yyyyy jobs (blast_*). Failed Muscle and TreeBeST jobs were recovered using the BreakPAFCluster analysis, which breaks the cluster and creates new Muscle jobs.
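To put the failure counts in proportion, the following minimal sketch computes the failure rate for the three analyses that report failed jobs. The counts are copied from the table; the dictionary layout and the `failure_rate` helper are illustrative assumptions and are not part of eHive.

```python
# Job counts and failed-job counts copied from the table.
# Only analyses that report failures are listed (hypothetical layout).
jobs = {
    "Muscle": (26484, 7),
    "TreeBeST": (26477, 9),
    "Homology_dNdS": (3646340, 1364),
}


def failure_rate(total, failed):
    """Return the fraction of jobs that failed (illustrative helper)."""
    return failed / total


for name, (total, failed) in jobs.items():
    rate = failure_rate(total, failed)
    print(f"{name}: {failed} of {total} jobs failed ({rate:.4%})")

# Sum of the failed-job counts across the listed analyses.
total_failed = sum(failed for _, failed in jobs.values())
print("Total failed jobs:", total_failed)
```

The counts are also consistent from one step to the next: the number of Muscle jobs minus the Muscle failures equals the number of TreeBeST jobs, and the same holds between TreeBeST and OrthoTree.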