Feature | UniGene | TIGR | CLOBB |
---|---|---|---|
Underlying Clustering Method | megaBLAST | WU-BLAST & CAP3 | NCBI BLAST |
Stringency | Dependent on stage of clustering | Very high: >= 95% identity over > 40 bp | High: >= 95% identity over 30 bp |
Overlap allowed | N/A | < 20 bp | < 10% of sequence length; overlaps of > 10% of sequence length are allowed if they contain > 10% unassigned bases |
Clusters are always contiguous? | No | Yes | Yes |
Dealing with potential chimeric clusters | Initial clustering is performed with gene sequences; merging of these initially distinct clusters is rejected | CAP3 does not include identified chimeric sequences | Definition of type III matches and 'superclusters' prevents chimeric sequences from merging unsuitable clusters |
Continuity (addition of new sequences) | New builds are compared with previous builds | Post processing | Incremental within algorithm |
Historical information | Availability of previous builds | Notes showing retirement of clusters | 'superclusters' and merge events can be tagged |
Portability and adaptability | Low | Low | High |
Ease of retention of manual curation | Medium | Medium | High |
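
The stringency row can be made concrete with a small filter. The following is a minimal sketch, not code from any of the three systems: the names `Threshold`, `THRESHOLDS`, and `passes_stringency` are hypothetical, and it assumes that "over 30 bp" for CLOBB means an aligned length of at least 30 bp, while "> 40 bp" for TIGR is a strict bound. UniGene is omitted because its stringency depends on the stage of clustering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    min_identity: float   # minimum percent identity of the match
    min_length: int       # aligned length bound, in bp
    strict_length: bool   # True: length must exceed the bound; False: may equal it


# Values taken from the "Stringency" row of the table.
# The inclusive bound for CLOBB is an assumption (see lead-in).
THRESHOLDS = {
    "TIGR": Threshold(min_identity=95.0, min_length=40, strict_length=True),
    "CLOBB": Threshold(min_identity=95.0, min_length=30, strict_length=False),
}


def passes_stringency(method: str, percent_identity: float, aligned_length: int) -> bool:
    """Return True if a pairwise match meets the method's stringency threshold."""
    t = THRESHOLDS[method]
    if t.strict_length:
        long_enough = aligned_length > t.min_length
    else:
        long_enough = aligned_length >= t.min_length
    return percent_identity >= t.min_identity and long_enough


if __name__ == "__main__":
    # A 35 bp match at 96% identity passes the CLOBB bound but not the TIGR one.
    print(passes_stringency("CLOBB", 96.0, 35))
    print(passes_stringency("TIGR", 96.0, 35))
```

The sketch covers only the identity and length criteria; the overlap rules in the following row would need additional fields (such as overhang length and the fraction of unassigned bases) that are not modelled here.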