From: Gene function prediction using labeled and unlabeled data
Input: |
---|
- positive training data P1 |
- validation set P2 |
- unlabeled data Ku |
- unknown gene Ug |
Output: |
- Prediction results |
Stage 1: Learning |
U = Ku + P2; |
Stage 1.1: Initial negative set generation |
- Construct classifier f1 based on P1 and U with one-class SVMs; |
- Classify U using f1. The predicted negative set N1 is used as the initial negative training set in Stage 1.2; |
- U = U - N1. |
Stage 1.2: Negative set expansion |
- Classifier set FC = [ ], negative set NS = [ ], i = 1. |
- repeat |
- i = i + 1; |
- Construct classifier f_i based on P1 and N1 with two-class SVMs; |
- FC(i - 1) = f_i, NS(i - 1) = N1; |
- Classify U by f_i; N2 is the predicted negative set, where |N2| ≤ k|P1|; |
- N1 = [N2; N_SV], where N_SV is the set of negative support vectors (SVs) of f_i from the previous step; |
- U = U - N2. |
- until |U| < k|P1| |
Stage 1.3: Classifier and negative set selection |
- Classify U with classifiers from FC, and select the classifier FC(i) with the best prediction accuracy; |
- Return negative set TN ← NS(i). |
Stage 2: Classification |
Train a classifier on P and TN, where P = P1 + P2, and use it to classify Ug.
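
The control flow of Stages 1.2 and 1.3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the SVM is replaced by a caller-supplied `train` function, and the toy `midpoint_trainer` on 1-D data is a hypothetical stand-in that only mimics the interface (a scorer plus a set playing the role of N_SV). The function names, the `not n2` loop guard, the use of recall on P2 as the accuracy measure, and the tie-breaking rule are all assumptions added for the example.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[float], float]   # score < 0 means "predicted negative"
Trainer = Callable[[List[float], List[float]], Tuple[Scorer, List[float]]]


def expand_negatives(p1: List[float], n1: List[float], u: List[float],
                     train: Trainer, k: float = 1.0):
    """Stage 1.2: grow the negative set until fewer than k*|P1| samples remain in U."""
    fc: List[Scorer] = []            # FC in the algorithm
    ns: List[List[float]] = []       # NS in the algorithm
    n1, u = list(n1), list(u)
    cap = max(1, int(k * len(p1)))   # upper bound on |N2|
    while True:
        f_i, n_sv = train(p1, n1)    # stand-in for the two-class SVM
        fc.append(f_i)
        ns.append(list(n1))
        # Take the most confidently negative samples, at most `cap` of them.
        order = sorted(range(len(u)), key=lambda j: f_i(u[j]))
        picked = [j for j in order if f_i(u[j]) < 0][:cap]
        n2 = [u[j] for j in picked]
        n1 = n2 + n_sv               # N1 = [N2; N_SV]
        chosen = set(picked)
        u = [x for j, x in enumerate(u) if j not in chosen]   # U = U - N2
        # `not n2` is an added guard (not in the source) against an endless loop.
        if len(u) < k * len(p1) or not n2:
            return fc, ns, u


def select_negatives(fc: List[Scorer], ns: List[List[float]], p2: List[float]):
    """Stage 1.3, assuming accuracy means recall on the validation positives P2."""
    recall = [sum(f(x) >= 0 for x in p2) / len(p2) for f in fc]
    # Assumption: ties go to the later classifier, which was trained on more negatives.
    best = max(range(len(fc)), key=lambda i: (recall[i], i))
    return best, ns[best]


def midpoint_trainer(pos: List[float], neg: List[float]):
    """Toy 1-D stand-in for an SVM: threshold halfway between the class means."""
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    mid = (mean_pos + mean_neg) / 2
    sign = 1.0 if mean_pos >= mean_neg else -1.0

    def scorer(x: float) -> float:
        return sign * (x - mid)

    boundary = [min(neg, key=lambda x: abs(x - mid))]   # plays the role of N_SV
    return scorer, boundary


p1 = [9.0, 10.0]                          # positive training data P1
p2 = [6.5, 9.5]                           # validation positives P2, hidden in U
u = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0] + p2   # U = Ku + P2
n1 = [0.0]                                # pretend Stage 1.1 produced this seed
u = [x for x in u if x not in n1]         # U = U - N1
fc, ns, remaining = expand_negatives(p1, n1, u, midpoint_trainer, k=1.0)
best, tn = select_negatives(fc, ns, p2)
print(best, tn, remaining)
```

In this toy run the fourth classifier pulls the validation positive 6.5 into the negative set, so its recall on P2 drops and an earlier classifier's negative set is returned as TN. This is the kind of over-expansion that the selection step in Stage 1.3 is there to catch.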