Fig. 1 | BMC Bioinformatics

Fig. 1

From: A big data approach to metagenomics for all-food-sequencing

Workflow: (a) Partitioning: reference sequences are divided into the sets G1 and G2. Each reference is further partitioned into slightly overlapping windows wi. (b) Database construction: the s smallest k-mers of each window are computed and inserted into the database. (c) Classification: the database is queried with the s smallest k-mers of a read. The returned hits are used to count the number of hits within each window. Target reference genomes are identified by high scores in the window count statistics. If there are several partitions, the top hits from querying each database need to be merged in order to assign a read to a reference genome. After all reads have been processed, a coverage check and quantification are performed.
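
The three steps in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration: "smallest" is taken to mean smallest under a hash of the k-mer string (BLAKE2b here), consecutive windows overlap by k-1 characters, per-partition hits are merged by simply adding window counts, and a read is assigned to the reference with the highest total count. All function names are hypothetical, and the coverage check and quantification steps are not shown.

```python
from collections import Counter, defaultdict
import hashlib


def kmer_hash(kmer):
    # Assumption: k-mers are ordered by a stable 64-bit hash of their text.
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k, s):
    """Return the s smallest distinct k-mer hashes of seq."""
    hashes = {kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)}
    return sorted(hashes)[:s]


def windows(seq, window_len, k):
    """Yield (index, window); neighbours overlap by k-1 so no k-mer is lost."""
    stride = window_len - k + 1
    last_start = max(len(seq) - k + 1, 1)
    for idx, start in enumerate(range(0, last_start, stride)):
        yield idx, seq[start:start + window_len]


def build_database(references, window_len, k, s):
    """(b) Map each sketched k-mer hash to the (reference, window) pairs it came from."""
    db = defaultdict(set)
    for name, seq in references.items():
        for idx, win in windows(seq, window_len, k):
            for h in sketch(win, k, s):
                db[h].add((name, idx))
    return db


def classify(read, databases, k, s):
    """(c) Query every partition's database and merge the window hit counts."""
    window_hits = Counter()
    for db in databases:
        for h in sketch(read, k, s):
            for location in db.get(h, ()):
                window_hits[location] += 1
    if not window_hits:
        return None, window_hits
    per_reference = Counter()
    for (name, _idx), count in window_hits.items():
        per_reference[name] += count
    return per_reference.most_common(1)[0][0], window_hits


# (a) Two toy partitions G1 and G2, each holding one made-up reference.
g1 = {"A": "ACCACAACCCACAAACCACCCAACACACCCAAACCAACAC"}
g2 = {"B": "GTTGTGGTTTGTGGGTTGTTTGGTGTGTTTGGGTTGGTGT"}
dbs = [build_database(g, window_len=16, k=4, s=16) for g in (g1, g2)]
best, hits = classify(g1["A"][10:30], dbs, k=4, s=16)
print(best, dict(hits))
```

With s at least as large as the number of k-mers per window, as in this toy setup, the sketch degenerates to an exact k-mer index; choosing a smaller s trades sensitivity for a smaller database.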
