Information extraction from scientific texts in Russian

Article's language: Russian
Abstract

This article describes methods for automatic term extraction and for linking the extracted terms to Wikidata. The advantage of the proposed methods is that they can potentially be applied to any field of knowledge for which only unannotated texts and small term dictionaries are available. For the experiments, RuSERRC, a corpus of scientific texts, was collected and annotated. The corpus and models are published on GitHub and may be useful to other research teams.

DOI: 10.31144/si.2307-6410.2021.n19p57-70
UDK: 004.912
Issue: # 19
Pages: 57-70
File: batura2021.pdf (501.9 KB)