Выделение именованных сущностей из текстов распорядительных документов с помощью глубоких нейронных сетей

Named entity recognition in texts of administrative documents with deep neural networks

Article's language
Russian
Abstract
Named entity recognition (NER) is the task of extracting spans of text that belong to predefined categories, such as organization names, place names, and people's names. This work presents an approach to the additional training of deep neural networks with an attention mechanism (the BERT architecture). It is shown that preliminary training of the language model on two tasks, recovering a masked word and determining the semantic relatedness of two sentences, can significantly improve NER quality. The approach achieves one of the best results on the named entity extraction task of the RuREBus dataset. A key feature of the described solution is that the problem formulation is close to real business problems and that the extracted entities are not general-purpose ones but are specific to the economic domain.
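To make the masked-word recovery task concrete, the following is a minimal sketch of how training examples for it can be built. It is not the authors' code: the function name mask_tokens, the toy sentence, and the toy vocabulary are hypothetical, and the 15% selection rate with the 80/10/10 replacement split is assumed from the original BERT recipe, since the abstract does not state the settings used.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Build one masked-word training example (hypothetical helper).

    Returns (corrupted, targets). targets[i] holds the original token at
    positions the model must recover and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    targets = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        targets[i] = tok
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK               # assumed: 80% become [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # assumed: 10% become a random token
        # assumed: remaining 10% keep the original token
    return corrupted, targets


# Toy input; a real pipeline would operate on subword tokens.
tokens = "the regional budget for road repair was increased this year".split()
vocab = sorted(set(tokens))
corrupted, targets = mask_tokens(tokens, vocab, mask_prob=0.5, seed=1)
for original, shown, target in zip(tokens, corrupted, targets):
    print(f"{original:10} {shown:10} {target}")
```

The loss for this task is computed only at positions where targets is not None, so the model has to reconstruct the selected words from the surrounding context.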
DOI
10.31144/si.2307-6410.2020.n16.p137-148
Pages
137-148