Table 8 Domain types for Genomics data in BoaG

From: Shared data science infrastructure for genomics data

| Type | Attributes | Details |
|---|---|---|
| Genome | taxid | Taxonomy ID of each species |
| Genome | refseq | RefSeq ID of the GFF file |
| Genome | Sequence | List of sequence reads in each GFF file [26]. |
| Genome | AssemblerRoot | List of assembly programs associated with this genome |
| Genome | accession | Accession number |
| Sequence | header | Header of Sequence |
| Sequence | FeatureRoot | List of features including exon, gene, mRNA, and CDS associated with this sequence |
| Sequence | seq | Actual DNA sequences from FASTA files |
| FeatureRoot | refseq | This field shows the key ID |
| FeatureRoot | feature | This field is the list of features associated with this ID |
| Feature | accession | Accession code of the Sequence |
| Feature | seqid | Sequence ID |
| Feature | source | A text qualifier that describes the algorithm or procedure that generated this feature. |
| Feature | ftype | Type of the feature |
| Feature | start | Starting point of the feature |
| Feature | end | End point of the feature |
| Feature | score | Score of the feature. This is a floating point number. |
| Feature | strand | + and - for positive and negative strand respectively |
| Feature | phase | Phase of the feature. The phase is one of the integers 0, 1, or 2 |
| Feature | Attribute | List of attributes for each feature |
| Feature | parent | Shows the parent of the attribute |
| Attribute | id | Attribute ID |
| Attribute | tag | Attribute tag including gbkey etc. |
| Attribute | value | Value of the tag |
| AssemblerRoot | Assembler | List of assembly programs |
| AssemblerRoot | total-length | Total length or genome size (base pair) |
| AssemblerRoot | total-gap-length | Total gap length after genome assembly |
| AssemblerRoot | scaffold-N50 | Scaffold N50 metric |
| AssemblerRoot | scaffold-count | Scaffold count metric |
| AssemblerRoot | contig-N50 | Contig N50 metric |
| AssemblerRoot | contig-count | Contig count metric |
| Assembler | name | Assembly program used to assemble the genome |
| Assembler | desc | Program attributes: program name, program version, etc. |
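
The Feature and Attribute types in the table follow the nine tab-separated columns of a GFF3 record. As an illustration only, the sketch below models those two types as Python dataclasses and fills them from one GFF3 line. This is not BoaG code or the authors' schema definition: the Python types, the `parse_gff3_line` helper, the use of the GFF3 `ID` tag as the Attribute `id`, and the use of the `Parent` tag for the `parent` field are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    id: str     # Attribute ID (assumed here: the ID tag of the owning feature)
    tag: str    # attribute tag, e.g. "gbkey"
    value: str  # value of the tag


@dataclass
class Feature:
    accession: str          # accession code of the Sequence
    seqid: str              # sequence ID
    source: str             # algorithm or procedure that generated the feature
    ftype: str              # type of the feature (gene, mRNA, exon, CDS, ...)
    start: int              # starting point of the feature
    end: int                # end point of the feature
    score: Optional[float]  # None when the GFF3 column is "."
    strand: str             # "+" or "-"
    phase: Optional[int]    # 0, 1, or 2; None when the GFF3 column is "."
    attributes: List[Attribute] = field(default_factory=list)
    parent: Optional[str] = None  # assumed: value of the GFF3 "Parent" tag


def parse_gff3_line(line: str, accession: str) -> Feature:
    """Hypothetical helper: turn one GFF3 record into a Feature."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, raw_attrs = cols

    # Column 9 is a ";"-separated list of tag=value pairs.
    pairs = [p.split("=", 1) for p in raw_attrs.split(";") if "=" in p]
    tags = dict(pairs)
    feature_id = tags.get("ID", "")

    return Feature(
        accession=accession,
        seqid=seqid,
        source=source,
        ftype=ftype,
        start=int(start),
        end=int(end),
        score=None if score == "." else float(score),
        strand=strand,
        phase=None if phase == "." else int(phase),
        attributes=[Attribute(feature_id, tag, value) for tag, value in pairs],
        parent=tags.get("Parent"),
    )


# Made-up sample record, not taken from any real genome.
sample = "seq1\texample\tCDS\t100\t200\t.\t+\t0\tID=cds1;Parent=gene1;gbkey=CDS"
feature = parse_gff3_line(sample, accession="ACC0001")
print(feature.ftype, feature.start, feature.end, feature.parent)
```

The hyphenated AssemblerRoot attributes (for example total-length) are not valid Python identifiers, so a similar sketch of that type would need renamed fields such as `total_length`.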