Neural sentence embedding models for semantic similarity estimation in the biomedical domain

BMC Bioinformatics

Table 3 Average estimated cosine similarities, per model, for sentence pairs in the negation and antonym subsets and in a reference set of highly similar sentences. Lower values indicate lower estimated semantic similarity; higher values indicate higher estimated semantic similarity

Sentence set	Sent2vec	Skip-thoughts	PV-DM	PV-DBOW	fastText CBOW	fastText skip-gram
Subset of highly similar sentences (n = 11)	0.706	0.899	0.652	0.568	0.938	0.971
Negation subset (n = 13)	0.967	0.999	0.930	0.936	0.945	0.979
Antonym subset (n = 7)	0.983	0.999	0.968	0.960	0.976	0.989

PV-DM: Paragraph Vector Distributed Memory; PV-DBOW: Paragraph Vector Distributed Bag of Words; CBOW: Continuous Bag of Words
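
Each cell in the table is a mean of cosine similarities over the sentence pairs in a subset. As a minimal sketch of that arithmetic, assuming each sentence has already been mapped to a fixed-length embedding vector, the Python below averages the cosine similarity over a list of pairs. The function names `cosine_similarity` and `mean_pairwise_similarity` and the toy vectors are hypothetical illustrations; they are not the authors' code or data.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(pairs):
    """Average cosine similarity over (embedding_a, embedding_b) pairs."""
    if not pairs:
        raise ValueError("need at least one sentence pair")
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Toy 3-dimensional vectors standing in for sentence embeddings
# (hypothetical values, not taken from the paper).
toy_pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),  # same direction, similarity 1.0
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),  # orthogonal, similarity 0.0
]
print(mean_pairwise_similarity(toy_pairs))  # prints 0.5
```

Because cosine similarity depends only on the angle between vectors, two sentences whose embeddings point in nearly the same direction score close to 1 regardless of vector length. That matches the pattern in the table, where the negation and antonym pairs, which differ by only a word or two, score at least as high as the reference set for most models.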

ISSN: 1471-2105