Figure 5 | BMC Bioinformatics

Figure 5

From: N-gram analysis of 970 microbial organisms reveals presence of biological language models

Comparative Zipf-like analysis for 4-grams. Top 40 most frequently used 4-grams in (A) Bartonella tribocorum CIP 105476, (B) Aliivibrio salmonicida LFI1238, (C) Mycobacterium tuberculosis H37Ra, (D) Borrelia duttonii Ly. Line colors as in Figure 2. For this larger n, organisms begin to show signature n-grams that occur frequently within their own proteome but rarely in other organisms.
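
As a rough illustration of the ranking behind a Zipf-like plot of this kind, the sketch below counts overlapping 4-grams across a set of protein sequences and returns the most frequent ones in rank order. It is a minimal sketch, not the authors' pipeline: the function name `top_ngrams`, the toy sequences, and the choice to count overlapping windows are all assumptions made for illustration.

```python
from collections import Counter


def top_ngrams(proteins, n=4, k=40):
    """Return the k most frequent overlapping n-grams as (ngram, count) pairs.

    `proteins` is an iterable of amino-acid strings (hypothetical input
    format); n-grams are counted within each sequence and never span two
    sequences.
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of width n over the sequence, one residue at a time.
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    # most_common sorts by descending count, giving the rank order
    # used on the x-axis of a Zipf-like plot.
    return counts.most_common(k)


# Toy two-sequence "proteome", invented purely for illustration.
toy_proteome = ["MKKLLAAKKLL", "AAKKLLMK"]
ranked = top_ngrams(toy_proteome, n=4, k=3)
for rank, (ngram, count) in enumerate(ranked, start=1):
    print(rank, ngram, count)
```

Running the same ranking on several proteomes and comparing, for each organism's top-ranked n-grams, the counts observed in the other organisms is one way to look for the signature n-grams described in the caption.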
