Table 1 Training and testing datasets used for variant prioritization in iMEGES

From: iMEGES: integrated mental-disorder GEnome score by deep neural network for prioritizing the susceptibility genes for mental disorders in personal genomes

| Dataset | Positive | Negative | Description |
| --- | ---: | ---: | --- |
| Training dataset 1 | 574 | 27,735 | The most likely causal dsQTL SNPs were downloaded from deltaSVM [30] |
| Training dataset 2 | 1614 | 161,400 | Regulatory-associated mutations were downloaded from HGMD (2012); random SNVs with allele frequency ≥ 1% in the 1000 Genomes Project were used as negatives |
| Training dataset 3 | 31,118 | 36,540 | eQTL SNPs were collected from 11 studies on 7 tissues/cell lines |
| Training dataset 4 | 78,613 | 593,335 | Non-coding eQTLs from GRASP were considered to be associated, while SNPs from the 1000 Genomes Project were considered not to be associated |
| Testing dataset 1 | 3439 | 66,916 | Based on P-values of imputed SNPs from the Psychiatric Genomics Consortium (PGC) schizophrenia GWAS |
| Testing dataset 2 | 8002 | 19,322 | Based on P-values of imputed SNPs from the Psychiatric Genomics Consortium (PGC) autism spectrum disorder (ASD) GWAS |
| Testing dataset 3 | 76 | 156 | Manually curated regulatory SNPs with experimental validation |
| Testing dataset 4 | 75 | 402 | The synonymous variants compiled by [72] |