
Table 2 Ordered list of discriminative words for experiment 1.

From: Word correlation matrices for protein sequence analysis and remote homology detection

#    Score   Word     Count
1    7.066   CCSGSC   3
2    6.930   CCSRKC   2
3    6.419   CRSGKC   4
4    5.451   CCRSCN   2
5    5.354   GRSGKC   1
6    5.215   CSRKCN   2
7    5.142   GRGSRC   1
8    4.979   CSGRGS   1
9    4.812   CCTGSC   4
10   4.789   SYNCCR   2

List of the 10 most discriminative words for the positive training sequences of experiment 1, defined by SCOP superfamily 7.3.5, using word length K = 6. Words are sorted by word score. The first and second columns give the rank and score of a word, respectively. The third column contains the word as an amino acid sequence in IUPAC one-letter code. The fourth column shows the number of occurrences of the word in the positive training sequences.
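
As a rough illustration of how the Count column could be obtained, the sketch below tallies every overlapping word of length K = 6 across a set of sequences. It assumes that "occurrences" means overlapping matches summed over all sequences, which the table does not state explicitly. The function name `count_words` and the toy sequences are hypothetical and are not taken from the paper's training set. The word score itself is not reproduced here, since its definition is not given in this table.

```python
from collections import Counter


def count_words(sequences, k=6):
    """Count every overlapping length-k word across all sequences.

    Assumption: occurrences are overlapping matches summed over all
    sequences; sequences shorter than k contribute nothing.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


# Hypothetical toy sequences in IUPAC one-letter code (not from the paper).
toy_sequences = ["ACCSGSCA", "CCSGSCCCSRKC"]
counts = count_words(toy_sequences, k=6)

# Look up the counts of two words that also appear in the table.
for word in ("CCSGSC", "CCSRKC"):
    print(word, counts[word])
```

A `Counter` returns 0 for words that never occur, so looking up any candidate word is safe without a prior membership check.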