System Informatics, No. 23

TRIZ-oriented model for program assistant

The article presents some of the author's developments in TRIZ, the theory of inventive problem solving. These developments are intended to become the basis of an assistant program aimed at a wide range of users and a wide range of tasks. A model for stating the problem is proposed in the form of an extended scenario analysis; the difficulty that hinders the implementation of the scenario is then identified, and ways to resolve the resulting contradictions are proposed. In the main, the proposed methodology for stating and solving a problem corresponds to the theory created and developed by G.S. Altshuller and his students. What is new is a model that unites different “branches” of TRIZ theory and practice.

Automation of the construction of the terminological core of ontology in computer linguistics based on a corpus of texts

The paper proposes an approach to the automatic construction of the terminological core of an ontology for computational linguistics. It considers the creation of a top-level ontology that defines the possible classes of terms for their subsequent search and systematization. An algorithm for generating and initially populating a subject dictionary is proposed. It includes two main stages. In the first stage, a system of lexical-semantic classes is built on the basis of the ontology classes. In the second stage, the dictionary is filled with terms, which are assigned to dictionary classes using the available resources: a universal ontology of scientific knowledge, a thesaurus, and a portal on computational linguistics. For the experiments, a corpus of analytical articles on computational linguistics was collected from the Habr website. In addition, datasets with term annotations were created, comprising 1065 sentences in Russian. Experiments were carried out on two tasks: term detection and term classification by ontology class. For the first task, three neural network models were considered: xlm-roberta-base, roberta-base-russian-v0 and ruRoberta-large. The best result, an F-measure of 0.91, was obtained with the last of these. An analysis of the classifier's errors showed that the most frequent error was incomplete selection of a term, where only part of a multi-word term is marked. For the second task, the ruRoberta-large model was chosen because of its results on the first task. The average F-measure over the 12 ontology classes used was 0.89. A general architecture of a system for creating and populating ontologies is proposed, integrating linguistic approaches and machine learning methods.
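To make the reported metrics concrete, the following is a minimal sketch of how an F-measure for term detection might be computed. The abstract does not say how the authors scored their models, so everything here is an assumption: the BIO tagging scheme, exact-match span scoring, and the names `bio_to_spans`, `f_measure`, `macro_average`, `GOLD_TAGS` and `PRED_TAGS` are all hypothetical. The toy tags are illustrative only and are not data from the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a BIO tag sequence ("B", "I", "O") into a set of term spans."""
    spans: Set[Span] = set()
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:      # close the previous term
                spans.add((start, i))
            start = i
        elif tag == "I":
            if start is None:          # tolerate an "I" without a preceding "B"
                start = i
        else:                          # "O" closes any open term
            if start is not None:
                spans.add((start, i))
                start = None
    if start is not None:              # term running to the end of the sentence
        spans.add((start, len(tags)))
    return spans


def f_measure(gold: Set[Span], pred: Set[Span]) -> float:
    """Exact-match F-measure: a predicted term counts only if both borders match."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(scores: List[float]) -> float:
    """Unweighted mean of per-class scores (one assumed reading of 'average F-measure')."""
    return sum(scores) / len(scores)


# Toy example (assumed): a two-token term at positions 1-2 and a one-token term at 4.
GOLD_TAGS = ["O", "B", "I", "O", "B", "O", "O"]
# The prediction marks only the second token of the first term: an
# "incomplete selection" error in the sense described in the abstract.
PRED_TAGS = ["O", "O", "B", "O", "B", "O", "O"]

score = f_measure(bio_to_spans(GOLD_TAGS), bio_to_spans(PRED_TAGS))
print(f"exact-match F-measure: {score:.2f}")
```

Under exact-match scoring, a partially selected term is penalized twice: it counts as a missed gold term and as a spurious prediction. This is one reason why incomplete selection can dominate an error analysis even when most term tokens are found.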