Fig. 7From: SeqPredNN: a neural network that generates protein sequences that fold into specified tertiary structuresSeqPredNN performance across CATH domains. a Plot of the median estimated SeqPredNN error for CATH architectures with different frequencies in the training data. The medians are indicated by dots, and the area of the dots represent the number of single-domain proteins for each architecture in the test dataset. The vertical lines indicate the first and third quartile for each architecture. The least-squares regression line is fitted to all the individual protein domains datapoints (not presented here) with the standard error in the shaded region. b The distribution of sequence recovery rates for each CATH architecture represented in the test dataset as box-and-whisker plots were superimposed on the density of sequence recovery values. Outliers are presented as black dots. c The distribution of sequence recovery rates for each CATH classBack to article page