Figure 1
From: Defining functional distances over Gene Ontology

Scheme of the method used to obtain the Metric Model based on Gene Ontology annotations. (1) Profile vectors are built by retrieving the Molecular Function Gene Ontology annotations (MF-GO terms) of Interpro domains from the file interpro2go. (2) From the profiles, a co-occurrence matrix is calculated by counting how many times two MF-GO terms occur in the same set of Interpro domains. (3) The co-occurrence vectors are feature vectors that describe the functional links of each MF-GO term. The similarity between MF-GO terms is calculated from the cosine distance between their vectors. (4) The similarity values are arranged in a matrix S, which is treated as the adjacency matrix of a weighted graph G. The terms can then be clustered by partitioning the graph. To obtain the best partition of G, a Spectral Clustering algorithm is applied; it projects the terms into a K-dimensional space, where they can be clustered with standard clustering techniques. (5) The GO terms are grouped in a hierarchical tree representing the Functional Distance D_f, which satisfies the mathematical properties of a metric space.
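
Steps (2) to (4) of the caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the domain and term identifiers are hypothetical placeholders rather than real interpro2go entries, the similarity is assumed to be the cosine of the angle between co-occurrence vectors, and the spectral projection is assumed to use the symmetric normalized graph Laplacian, which is one common choice among several. Step (5), the hierarchical tree, is not covered.

```python
import numpy as np

# Hypothetical toy profiles: domain -> MF-GO terms annotated to it.
# These identifiers are placeholders, not real interpro2go entries.
profiles = {
    "DOMAIN_A": ["TERM_1", "TERM_2"],
    "DOMAIN_B": ["TERM_1", "TERM_2"],
    "DOMAIN_C": ["TERM_3", "TERM_4"],
    "DOMAIN_D": ["TERM_3", "TERM_4"],
}
terms = ["TERM_1", "TERM_2", "TERM_3", "TERM_4"]


def cooccurrence(profiles, terms):
    """Count how often each pair of terms annotates the same domain."""
    index = {t: i for i, t in enumerate(terms)}
    C = np.zeros((len(terms), len(terms)))
    for annotated in profiles.values():
        ids = [index[t] for t in set(annotated)]
        for i in ids:
            for j in ids:
                C[i, j] += 1  # the diagonal holds each term's own frequency
    return C


def cosine_similarity(C):
    """Cosine of the angle between every pair of rows of C."""
    norms = np.linalg.norm(C, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zero vectors
    U = C / norms
    return U @ U.T


def spectral_embedding(S, k):
    """Project terms into k dimensions using the eigenvectors of the
    symmetric normalized Laplacian with the k smallest eigenvalues
    (an assumed variant of spectral clustering)."""
    d = S.sum(axis=1)
    d[d == 0] = 1.0  # guard against isolated terms
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigvecs[:, :k]


C = cooccurrence(profiles, terms)   # step (2)
S = cosine_similarity(C)            # step (3), arranged as matrix S
Y = spectral_embedding(S, k=2)      # step (4), K = 2 here
```

In this toy case the two groups of terms never share a domain, so terms in the same group receive identical coordinates in `Y` and the two groups are separated. Any standard clustering technique, such as k-means, could then be run on the rows of `Y`.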