Table 1 Annotating Genes with Positive Samples (AGPS)

From: Gene function prediction using labeled and unlabeled data

Input:

   - positive training data P1

   - validation set P2

   - unlabeled data Ku

   - unknown gene Ug

Output:

   - Prediction results

Stage 1: Learning

      U = Ku + P2;

      Stage 1.1: Initial negative set generation

         - Construct classifier f1 based on P1 and U with one-class SVMs;

         - Classify U using f1. The predicted negative set N1 is used as the initial negative training set in Stage 1.2;

         - U = U - N1.

      Stage 1.2: Negative set expansion

         - Classifier set FC = [ ], negative set NS = [ ], i = 1.

         - repeat

            - i = i + 1;

            - Construct classifier f_i based on P1 and N1 with two-class SVMs;

            - FC(i - 1) = f_i, NS(i - 1) = N1;

            - Classify U by f_i; N2 is the predicted negative set, where |N2| ≤ k|P1|;

            - N1 = [N2; N_SV], where N_SV denotes the negative support vectors (SVs) of f_i from the previous step;

            - U = U - N2.

         - until |U| < k|P1|

      Stage 1.3: Classifier and negative set selection

         - Classify U with classifiers from FC, and select the classifier FC(i) with the best prediction accuracy;

         - Return negative set TN = NS(i).

Stage 2: Classification

      Train a classifier on P and TN, where P = P1 + P2, and use it to classify Ug.
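
The control flow of the table can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the SVMs are replaced by simple centroid-distance stand-ins (`one_class_stub`, `two_class_stub`) so that the example runs with the standard library alone, the "negative SVs" are approximated by the negatives closest to the decision boundary, and "best prediction accuracy" in Stage 1.3 is taken to mean the fraction of P2 classified as positive. The function names, the `slack` and `n_sv` parameters, the early exit when no new negatives are found, and the toy data are all hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def one_class_stub(pos, slack=3.0):
    """Stand-in for the one-class SVM (uses P1 only): score < 0 means 'negative'.
    Assumption: points farther than slack * radius from the centroid are negative."""
    c = centroid(pos)
    radius = max(math.dist(p, c) for p in pos)
    return lambda x: slack * radius - math.dist(x, c)

def two_class_stub(pos, neg, n_sv=2):
    """Stand-in for a two-class SVM: returns (score function, pseudo N_SV)."""
    cp, cn = centroid(pos), centroid(neg)
    score = lambda x: math.dist(x, cn) - math.dist(x, cp)  # > 0 means positive
    # Assumption: the negatives nearest the boundary play the role of N_SV.
    n_support = sorted(neg, key=lambda x: abs(score(x)))[:n_sv]
    return score, n_support

def agps_learn(P1, P2, Ku, k=1, one_class=one_class_stub, two_class=two_class_stub):
    """Stage 1: return the selected negative set TN."""
    U = list(Ku) + list(P2)                       # U = Ku + P2
    # Stage 1.1: initial negative set generation
    f1 = one_class(P1)
    N1 = [x for x in U if f1(x) < 0]
    if not N1:
        raise ValueError("no initial negatives found")
    U = [x for x in U if f1(x) >= 0]              # U = U - N1
    # Stage 1.2: negative set expansion
    FC, NS = [], []
    cap = k * len(P1)
    while True:
        f, N_sv = two_class(P1, N1)
        FC.append(f)
        NS.append(list(N1))
        # Most confidently negative points first, at most k|P1| of them.
        N2 = sorted((x for x in U if f(x) < 0), key=f)[:cap]
        N1 = N2 + N_sv
        taken = set(N2)
        U = [x for x in U if x not in taken]      # U = U - N2
        if len(U) < cap or not N2:                # 'not N2' is an added guard
            break
    # Stage 1.3: pick the classifier that keeps the most of P2 positive (assumption).
    best = max(range(len(FC)), key=lambda i: sum(FC[i](x) > 0 for x in P2))
    return NS[best]

def agps_classify(Ug, P1, P2, TN, two_class=two_class_stub):
    """Stage 2: train on P = P1 + P2 versus TN, then label Ug (True = positive)."""
    f, _ = two_class(list(P1) + list(P2), TN)
    return [f(x) > 0 for x in Ug]

# Toy one-dimensional data (hypothetical); samples are tuples so they are hashable.
P1 = [(0.0,), (1.0,), (2.0,)]
P2 = [(0.5,), (1.5,)]
Ku = [(0.8,), (1.2,), (2.8,), (3.5,), (3.8,), (4.5,), (5.0,), (5.5,)]
TN = agps_learn(P1, P2, Ku, k=1)
labels = agps_classify([(1.1,), (6.0,)], P1, P2, TN)
```

The trainers are passed in as arguments so that real one-class and two-class SVM implementations could be substituted for the stand-ins without changing the loop that mirrors Stages 1.1 to 1.3.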