Current issue (No. 24, 2024)
Download the full issue (1.08 MB)
On the Creation of the First Computers in the USSR
The paper presents a response to S.P. Prokhorov's paper "The Fundamental Contribution of the USSR Academy of Sciences to the Development of Computer Sciences and Computer Technologies", published in the Bulletin of the Russian Academy of Sciences, 2023, vol. 93, no. 10, pp. 980–988.
SoRuCom-23 – VI International Conference on the History of Computer Science
The paper presents the results of SoRuCom-23, the VI International Conference on the History of Computer Science "Development of Computer Technology in Russia, the Countries of the Former USSR and CMEA", held on September 25–27, 2023 at the Nizhny Novgorod campus of the National Research University "Higher School of Economics".
A comparison of equivalences of dense-time Petri nets with respect to memory policies
Dense-Time Petri nets (TPNs) are an extension of Petri nets where each transition has its own clock and time interval. This model is considered in the context of a weak semantics, in which time elapsing does not force transitions to fire. For TPNs, equivalences are investigated in the dichotomies of "interleaving — partial order" and "linear — branching time". The relationships between equivalences are analyzed with respect to intermediate and persistent atomic memory policies, which determine the rules for resetting the clocks of transitions.
This paper presents the object model and the language of domain-specific transition systems, a new formalism designed for the specification and approbation of formal methods that ensure software reliability.
Many variants of the communication sliding window protocol have been specified and verified using various techniques such as theorem proving, model checking, and their combinations. In this paper we consider a specification of the sliding window protocol as a multi-agent affine model. Temporal and epistemic properties of the protocol are expressed in the Logic of Knowledge and Time.
This paper presents an approach to creating computational platforms based on the UGENE Workflow Designer. The platform is built around an internal programming language. The paper demonstrates how to keep such a system consistent and easy to maintain and develop by following the concepts of the platform's internal model and of the programming language syntax.
UGENE Workflow Designer is designed to solve bioinformatics tasks, and the UWL language is a DSL for bioinformatics. The paper describes how the basics of the platform can be reused in other application areas.
During the semicentennial history of computer science and information technologies, several thousand computer languages have been created. The computer language universe includes languages for different purposes (programming, specification, modeling, etc.). In each of these branches of computer languages it is possible to track several approaches (imperative, declarative, object-oriented, etc.), disciplines of processing (sequential, non-deterministic, distributed, etc.), and formalized models, such as Turing machines or logical inference machines. These arguments justify the importance of an adequate classification of computer languages. Computer language paradigms are the basis for such a classification: they rest on shared attributes that allow us to differentiate branches in the computer language universe. We present our computer-aided approach to the problem of computer language classification and paradigm identification. The basic idea consists in the development of a specialized knowledge portal for automatic search and updating, providing free access to information about computer languages. The primary aims of our project are the study of the ontology of computer languages and assistance in the search for appropriate languages for computer system designers and developers. The paper presents our vision of the classification problem, the basic ideas of our approach, the current state and challenges of the project, and the design of the query language.
Maximal employment of formal and sound methods is one of the distinctive features of the C-light project. This concerns not only the fundamental verification bases, such as Plotkin's operational semantics or Hoare logic, but also implementation aspects. The localization of possible errors is one of them. In the majority of known verification systems such localization is simply coded by developers as part of the system functionality without any formal basis. The recent reinforcement of the C-light project with the semantic labeling technique and the choice of LLVM/Clang for the system input block allowed us to obtain some new results, which are surveyed in this paper.
The article presents a strip-method of linear pre-distortion of signals and images using several types of transformation matrices. Matrices that are good enough to minimize the amplitude of the noise were determined. The dependence of the quality of data recovery on the location and size of the noise was investigated. We also explored the possibility of using the strip-method in combination with image compression and developed an algorithm that implements this approach.
The authors discuss intelligent access to information resources. An analysis of information systems that support textual or ontology-based data representation is provided. The authors propose an approach to developing an information system that would support both of the aforementioned ways to represent knowledge. A possible architecture and database schema for such a system are described.
Java bytecode decompilation is a process of reverse translation that restores Java source code from the corresponding bytecode. Java bytecode is an intermediate representation based on an abstract stack machine. It may have an arbitrary control flow graph, whereas the Java language contains control structures that always form a strict hierarchy. Decompilation aims at restoring all control structures, including Java exception handling blocks. In the Excelsior RVM (a Java Virtual Machine with a static compiler), bytecode is decompiled into a structural intermediate representation for further optimization. When building exception handling blocks, the Excelsior RVM's compiler assumes that the bytecode was emitted by the standard Java source-to-bytecode compiler and uses a few heuristics to perform the reverse transformation. However, this is not always possible if the bytecode was produced by other tools. This paper presents a decompilation algorithm that produces exception handling blocks for any correct bytecode. The algorithm has been implemented, integrated into the Excelsior RVM, and tested on real-world applications.
This paper presents some popular classical methods used for time series analysis and prediction. First, relatively simple averaging and smoothing models are described, then autoregressive and moving-average models, and a "mixed" autoregressive moving-average model obtained by combining the latter two. Finally, an autoregressive integrated moving average model is described.
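For orientation only (a textbook formulation, not taken from the paper itself), the autoregressive moving-average model ARMA(p, q) mentioned above can be written as
\[ X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t , \]
where \(\varphi_i\) are the autoregressive coefficients, \(\theta_j\) the moving-average coefficients, and \(\varepsilon_t\) white noise; the integrated model ARIMA(p, d, q) applies the same equation to the series differenced d times, \((1-B)^d X_t\), with \(B\) the backshift operator.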
Before RNA transcription starts, regulatory proteins called transcription factors form a complex on special segments of DNA. This complex allows RNA polymerase to bind to DNA and start reading RNA. Searching for binding sites on DNA is a difficult problem because many factors influence the binding. In particular, other sites in the vicinity of a given site may influence it. To reveal those dependencies the authors introduce histograms of the density distribution of binding sites, called genomic profiles.
The software package developed in the scope of this work allows building genomic profiles using weight-matrix prediction of binding sites, with implementations for multicore CPUs and for NVIDIA GPUs using CUDA. In addition, the software allows clustering genomic profiles using K-means and hierarchical clustering. A sampling algorithm builds random transcription factor hierarchies based on the existing experimental structural classification in order to estimate how well the constructed genomic profiles correspond to that classification. This sampling algorithm was used to analyze the correspondence between genomic profiles and the biological classification of transcription factors.
The contribution of the paper is to clarify connections between real-time models of concurrency. In particular, we define a category of timed causal trees and investigate how it relates to other categories of timed models. Moreover, using a larger model called timed event trees, we construct an adjunction from the category of timed causal trees to the category of timed event structures. Thereby we show that timed causal trees are simpler than timed event structures because they reflect only one aspect of true concurrency, causality, and they capture causality without a notion of event. On the other hand, the former model is more expressive than the latter in that the possible runs of a timed causal tree can be defined in terms of a tree without restrictions, whereas the set of possible runs of any event structure must be closed under the shuffling of concurrent transitions.
This paper describes how the SPIN model checker can be applied to solving puzzles, such as riverIQGame (an advanced “wolf, goat and cabbage” puzzle) or “Irregular IQ Cube” (also known as ‘Square-1’).
This article examines the problem of storing digital archives of newspapers. A technology covering the scanning, preparation, and publication cycle is proposed; the key challenges addressed are the online presentation of newspaper issues and keyword search for articles.
The paper considers the problems of information collection for thematic intelligent scientific internet resources that provide the systematization and integration of scientific knowledge and information resources related to a certain area of knowledge, as well as methods of intelligent processing of the data contained in them and content-based access to them. An approach to the automation of information collection is proposed that combines metasearch and knowledge extraction methods based on the ontology and thesaurus of the modeled area of knowledge.
The language and implementation of string objects are defined. Strings are coded via arrays of chars in the style of the C and C++ languages. List constructs are accessible for strings; the list language has been revised to make it suitable for strings. The paper presents a part of the program library for strings.
Automata-based software engineering is aimed at the development of simple, reliable, and efficient programs for reactive systems. An automata-based program implements a finite state machine in the form of a hypergraph of control states. A production rule language used to declare use case requirements is proposed as the specification language for automata-based programs. Methods of program development are presented as a collection of golden rules defining the proper balance in integrating automata-based, predicate, and object-oriented programming. The methods are illustrated with example programs.
This paper compares the Discovery software system with the Microsoft Association Rules, Decision Trees, and Neural Network algorithms embedded in Microsoft SQL Server Analysis Services. It is shown that Discovery, firstly, has theoretical advantages over these algorithms; secondly, performs better in practice on data where these advantages manifest themselves explicitly; and thirdly, behaves well on data taken from the well-known UCI repository.
These results demonstrate the advantages of Discovery over the methods embedded in Microsoft SQL Server Analysis Services.
This paper addresses the problem of scalability of parallel graph-marking algorithms in the context of garbage collection systems. The authors give a set of constraints inherent to any graph-marking approach and attempt to construct a scalable marking algorithm satisfying these constraints. The resulting algorithm has been implemented in a Java Virtual Machine and evaluated on real-world Java applications. The obtained results indicate a substantial acceleration of marking in most cases.
A new formalism for describing ontologies of systems and their changes, conceptual transition systems, is presented. The basic definitions of the theory of conceptual transition systems are given. These systems are shown to allow specifying both typical and new kinds of ontological elements constituting ontologies. The classification of ontological elements based on such systems is described.
In automata-based software engineering, the requirements for a human-in-the-loop control system are specified. A quad-rotor control system is considered as an example of such a control system.
The CTSL language for the specification of conceptual transition systems, a formalism for describing dynamic discrete systems on the basis of their conceptual structure, is proposed. The basic kinds of conceptual transition systems are considered. The basic predefined elements and operations of the CTSL language are defined.
The paper presents the materials of the Summer School of Young Programmers held in 2014. The School was started in 1976 on the initiative of Andrei Ershov, an outstanding advocate of programming and computer science development. Over the years of its existence, the Summer School has varied the forms and methodical specifics of conducting classes. The number of participants differed from year to year, as did the equipment and technology employed. Originally, classes were held for large groups of beginner and "advanced" schoolchildren; consultants from Novosibirsk Academgorodok were involved. The next step was workshops. It took 40 years to go from a few minutes of practice on the BESM-6 to many hours of programming on personal computers. However, the most important aspect of the project has been carefully preserved: the atmosphere of cooperation aimed at creating a sustainable commitment to self-education, assistance in vocational guidance based on practical experience, providing a glimpse of the chosen profession, a conscious choice of way of life, creativity, and socialization. The participants of the Summer School of Young Programmers share not only their profession but also their way of thinking.
There is a gap between programming language theoreticians and practical programmers: the former are strong in the science of abstract algebra and formal logic, the latter in the craft of software development. The paper sketches a very simple approach to the algebraic and logical foundations of a classification of formal program semantics, an approach that should suit practical programmers with only rudimentary experience in logic and algebra. In particular, the paper presents operational, denotational, axiomatic, and second-order semantics for a toy language (one that merely looks like a programming language).
The paper describes a mathematical model of submitted documents. The model includes a description of different types of segments, which are defined in the text using markers. An algorithm for segmenting the document text is given for this model.
We discuss the possibility of studying processes based on their representation as development spirals. The geometric concepts related to the spiral are introduced. This allows modeling development by projections of process trajectories onto different planes of the factor space.
Subdefinite models are one of the methods for solving the constraint satisfaction problem. The basic notion of the method is subdefiniteness, which defines the set representation of inexact values. The paper contains formal definitions of different kinds of subdefiniteness, their comparison, and descriptions of their properties.
An approach is presented for analyzing the compatibility of real-time multi-task applications with various combinations of scheduling modes and protocols of access to shared resources when run on multi-core platforms. The approach is based on the recently introduced notion of application density, derived from the estimation of application feasibility for various values of processor performance. The software architecture of a relatively simple simulation tool for estimating task response times (and therefore application feasibility) is described; it provides more exact data than the known analytical methods when they are applicable. Results of running this tool on a number of benchmarks, including balanced Liu-Layland configurations, are presented along with their analysis and interpretation. The suggested approach allows identifying an optimal combination of scheduling mode and access protocol for a given application structure.
This article presents a method for the analysis and verification of Use Case Maps (UCM) models with scenario control structures – protected components and failure handling constructs. UCM models are analyzed and verified with the help of coloured Petri nets (CPN) and the SPIN model checker. An algorithm for translating UCM scenario control structures into CPN is described. The presented algorithm and the verification process are illustrated by the case study of a network protocol.
Extended Finite State Machines (EFSMs) are widely used when deriving tests for checking whether a software implementation meets functional requirements. These tests are usually derived with appropriate test purposes in mind, such as covering paths, variables, etc. of the specification EFSM. However, it is well known that such tests do not detect many functional faults in an EFSM implementation. In this paper, we propose an approach for increasing the fault coverage of test suites initially derived against the specification EFSM. To this end, the behavior of the specification EFSM is implemented in Java using a template that is very close to the EFSM description. At the next step, the fault coverage of an initial test suite derived against the specification EFSM is calculated with respect to faults generated by the µJava tool. Since the EFSM software implementation is template based, each undetected fault can be easily mapped into a mutant EFSM of the specification machine. Thus, a distinguishing sequence can be derived not for two programs, which is very complex, but for two machines, and there are efficient methods for deriving such a distinguishing sequence for Finite State Machine (FSM) abstractions of EFSMs. As an FSM abstraction, an l-equivalent of an EFSM can be considered, which is, in fact, a subtree of the successor tree of height l that describes the EFSM behavior under input sequences of length up to l. Such l-equivalents are classical FSMs, and if l is not large, a distinguishing sequence can be derived easily enough. The initial test suite augmented with such distinguishing sequences detects many more functional faults in software implementations of a system described by the specification EFSM.
Finite state transducers over semigroups can be regarded as a formal model of sequential reactive programs. In some cases, verification of such programs can be reduced to minimization and equivalence checking problems for this model of computation. To solve these problems efficiently, certain requirements are imposed on the semigroup the transducers operate on. Minimization of a transducer over a semigroup is performed in three stages: first, the greatest common left divisors are computed for all states of the transducer; next, the transducer is brought to a reduced form by pulling all such divisors "upstream"; finally, a minimization algorithm for finite state automata is applied to the reduced transducer. As a byproduct of this minimization technique we obtain an equivalence checking procedure for transducers operating on certain classes of semigroups.
This paper presents three approaches to improving sentiment classification for dynamically updated text collections. The three methods differ essentially from each other; both supervised and unsupervised machine learning were applied for sentiment classification. The results of the methods, along with the cases in which each method is most applicable, are shown in the paper. All experiments were set up, and the results obtained, on sufficiently representative text collections.
A closed information system is an information system that is not changed by its environment, while there is an information transfer from the system to its environment and from the environment to the system. In this paper, two formalisms (information query systems and conceptual configuration systems) are proposed for abstract unified modelling of the artifacts (concept sketches and models) of the conceptual design of closed information systems, an early phase of the information systems design process. Information query systems define the abstract unified information model for the artifacts, based on such general concepts as state, information query, and answer. Conceptual configuration systems are a formalism for conceptual modelling of information query systems; they define the abstract unified conceptual model for the artifacts. The basic definitions of the theory of conceptual configuration systems are given. These systems are shown to allow modelling both typical and new kinds of ontological elements. The classification of ontological elements based on such systems is described. A language of conceptual configuration systems is defined.
In the paper we describe a verification method for families of distributed systems generated by a context-sensitive network grammar of a special kind. The method is based on the model checking technique and abstraction. A representative model depends on the specification grammar for the family of systems. This model simulates the behavior of the systems in such a way that properties which hold for the representative model are satisfied by all these systems. We demonstrate the use of this method for the verification of some properties of a multiagent system for resolving context-dependent ambiguities in ontology population.
Today the common practice of industrial automation is characterized by the following: the testing of control algorithms generally starts only when the software is run on the new facility. As a result, the testing of the algorithm is postponed until commissioning begins, and the readiness of the algorithm up to this moment is unknown. In the article, the concept of a virtual plant (VP) is put forward to reduce these risks. To ensure that the control algorithm satisfies the necessary requirements, the model checking verification method is used.
When developing temporal requirements for distributed program systems, an engineer has to determine and systematize the event sequences caused by the interleaving of system processes. The number of such sequences grows exponentially, which makes the requirements development procedure nontrivial.
This is why engineers prefer not to construct formal requirements at all, or to construct only elementary ones. As a result, powerful formal verification methods become unavailable, or some important properties of distributed systems are left unexpressed, although it is well known that the development of formal requirements, even without verification, improves the quality of a distributed system's structure and functions.
In this paper we suggest an easy-to-use method for developing formal temporal requirements. The method is based on scalable patterns of linear temporal logic formulas.
Using this method, we developed formal temporal requirements for a practical program control system (a vehicle power supply control system).
Verifying the requirements with the model checking method, we found three critical errors that had been missed by the developers of the vehicle power supply control system during design and testing.
In this paper we consider a procedure for constructing the parallel composition of Timed Finite State Machines (TFSMs) using BALM-II and suggest different ways of obtaining the linear functions that describe a set of output delays. Our research consists of three steps: at the first step we consider the composition of TFSMs whose output delays are natural numbers or zero; at the second step we add transitions under timeouts; at the third step we consider the composition of TFSMs in the general case, when output delays are described by sets of linear functions. This paper is devoted only to the first step of the research.
The paper is dedicated to the specification of the structure and behavior of software libraries. It describes the existing problems of library specification. A brief overview of the research field concerned with formalizing the specification of libraries and library functions is presented. The requirements imposed on the designed formalism are established; the formalism based on these requirements allows specifying all the properties of libraries needed for automating several classes of problems: detection of defects in software, migration of applications to a new environment, and generation of software documentation. The conclusion defines potential directions for further research.
The class of information systems considered in this paper is defined as follows: a system belongs to the class if its change can be caused both by its environment and by factors inside the system, and there is an information transfer from the system to its environment and from the environment to the system. Two formalisms (information transition systems and conceptual transition systems) are proposed for abstract unified modelling of the artifacts (concept sketches and models) of the conceptual design of information systems of this class, an early phase of the information systems design process. Information transition systems define the abstract unified information model for the artifacts, based on such general concepts as state, information query, answer, and transition. Conceptual transition systems are a formalism for conceptual modelling of information transition systems; they define the abstract unified conceptual model for the artifacts. The basic definitions of the theory of conceptual transition systems are given. A language of conceptual transition systems is defined.
The relations of compatibility, consistency, and identity are used to describe the semantics of the predicate programming language P. Recursive types are defined via the least fixed point. A generic type is described by a set of constraints. For duck-typed expressions, rules for determining variable types are introduced. Algorithms for checking recursion correctness, determining the types of language constructs, and checking the semantic correctness of constructs are developed.
The AVL-tree operators are usually presented in functional languages in a compact and elegant style. However, functional programs for the insertion and removal operators are inefficient: they construct a new tree instead of modifying the source tree. The paper describes predicate programs for two AVL-tree insertion algorithms. The predicate programs can be automatically transformed into efficient imperative programs.
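For contrast with the functional, copy-constructing style, an imperative in-place rebalancing step of the kind an efficient AVL insertion relies on can be sketched in C as follows (an illustrative sketch only, not the predicate or generated code from the paper; all names are ours):

#include <stddef.h>

typedef struct Node {
    int key;
    int height;
    struct Node *left, *right;
} Node;

static int height(const Node *n) { return n ? n->height : 0; }
static int imax(int a, int b) { return a > b ? a : b; }

/* In-place left rotation around *root: the right child becomes the new
   subtree root; only a few pointers and heights are updated, and no
   nodes are copied, unlike in a purely functional rebuild. */
static void rotate_left(Node **root) {
    Node *x = *root;
    Node *y = x->right;              /* assumed non-NULL when this rotation is needed */
    x->right = y->left;
    y->left = x;
    x->height = 1 + imax(height(x->left), height(x->right));
    y->height = 1 + imax(height(y->left), height(y->right));
    *root = y;                       /* the source tree is modified, not rebuilt */
}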
A review is presented of the works that form the theoretical foundations of computation on Sleptsov nets and describe the peculiarities of drawing, compiling, and linking programs in the Sleptsov net language, as well as massively parallel computing memory architectures for the implementation of Sleptsov net processors. The Petri net runs exponentially slower and represents a special case of the Sleptsov net. A universal Sleptsov net containing 13 places and 26 transitions is considered, which is a prototype of the Sleptsov net processor. Examples of programs in the Sleptsov net language are shown for efficient multiplication, RSA encryption/decryption, calculation of a fuzzy logic function, and solution of the Laplace equation. The advantages of computations on Sleptsov nets are: a visual graphical language, preservation of the natural parallelism of the domain, fine granulation of parallel computations, formal methods for parallel program verification, and fast massively parallel architectures that implement the computation model.
Optimizing transformations of list and tree operations in the predicate programming compiler are described. A set of rules for replacing an original operation with its image in the target imperative language has been developed. The translated imperative program is as efficient as if it had been written manually.
The paper presents an approach to the ontology-controlled organization of the process of information extraction from the texts of clinical trial protocols. Individual components of the knowledge model, such as the semantic vocabulary, the text genre model, and the clinical trials ontology, are considered. Typical situations to be extracted are described and exemplified.
Ontological Approach to Organizing Specification Patterns in the Framework of Support System for Formal Verification of Distributed Program Systems
The article describes the structure of an ontology of specification patterns derived from the texts of technical documentation. This ontology combines patterns from known classifications of requirements with new patterns. The ontology allows recording Boolean combinations of patterns of the following types: qualitative; real and branching time; with combined events; with quantitative characteristics of events and patterns; and simple statements about data. Examples of requirement patterns for the real vacuum control system of the Large Solar Vacuum Telescope are given. The scheme of an intelligent support system for formal verification of distributed program systems is outlined.
In the paper, the notion of the conceptual model of a programming language is proposed. This formalism represents the types of the programming language, the values, exceptions, states, and executable constructs of the abstract machine of the language, and the constraints on these entities at the conceptual level. A new definition of conceptual transition systems oriented to the specification of conceptual models of programming languages is presented, the language CTSL of the redefined conceptual transition systems is described, and a technique for using CTSL as a domain-specific language for the specification of conceptual models of programming languages is proposed. Conceptual models for a family of sample programming languages illustrate this technique.
In the paper, the notion of the conceptual operational semantics of a programming language is proposed. This formalism represents the operational semantics of a programming language in terms of its conceptual model based on conceptual transition systems. A special kind of conceptual transition systems, operational conceptual transition systems, oriented to the specification of conceptual operational semantics of programming languages, is defined; the extension of the conceptual transition system language CTSL for operational conceptual transition systems is described; and a technique for using the extended CTSL as a domain-specific language for the specification of conceptual operational semantics of programming languages is proposed. The conceptual operational semantics for a family of sample programming languages illustrate this technique.
The paper is devoted to studying ('gedanken') experiments with input/output automata. We propose how to derive proper input sequences for identifying the final (current) state of the machine under experiment, namely synchronizing and homing sequences. The machine is non-initialized, and its alphabet of actions is divided into disjoint sets of inputs and outputs. In this paper, we consider a specific class of such machines for which at each state only transitions under inputs or only transitions under outputs are defined, and the machine transition diagram does not contain cycles labeled by outputs, i.e., the language of the machine does not contain traces with an infinite postfix of outputs. Moreover, for each state where transitions under inputs are defined, the machine has a loop under a special quiescence output. For this class of input/output automata, we define the preset synchronizing and homing experiments, establish necessary and sufficient conditions for their existence, and propose techniques for their derivation. The procedures for deriving the corresponding ('gedanken') experiments for input/output automata are based on the well-studied solutions to these problems for Finite State Machines.
The paper concerns recent advances toward the goal of industrial operating system (OS) verification. By an industrial OS we mean a system actively used in some industrial domain, elaborated and maintained for a significant time, not a proof-of-concept OS developed mostly with research intentions. We consider the decomposition of this goal into tasks related to various functional components of an OS and various properties under verification, and the application of different verification methods to those tasks. This is an attempt to explicate and summarize the experience of several projects on the verification of various OS components and features conducted at ISP RAS.
Store-and-forward buffering of packets is traditionally used in modern network devices such as switches and routers. However, it can be a significant obstacle to improving the quality of service, because the minimal packet delivery time is bounded from below by the product of the number of intermediate nodes and the packet transmission time in the channel. Cut-through transmission of packets removes this limitation, because it uses only the head of the packet, which contains the destination address, for the forwarding decision. Thus, the cut-through technology of packet transmission offers considerable opportunities for improving the quality of service. Models for a computing grid with cut-through forwarding have been developed in the form of colored Petri nets. The model is composed of packet switching nodes and traffic generators; it can be supplied with malefactor models in the form of traffic guns disguised as regular multimedia traffic. The present work is a further development of methods for the analysis of rectangular communication grids whose nodes perform cut-through switching. The methods are intended for application in the design of computing grids, in the development of new telecommunication devices, and in intelligent defense systems. Preliminary estimations show that the cut-through technology inherits some of the negative effects associated with traditional store-and-forward delivery of packets. A series of simulations revealed conditions under which a grid is blocked by its regular traffic. The results are applicable to intelligent intrusion detection and countermeasure planning.
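As a rough textbook illustration of the delivery-time bound mentioned above (not a formula from the paper; propagation and queueing delays are ignored), sending a packet of length \(L\) with header length \(H\) over a path with \(h\) intermediate nodes on channels of rate \(R\) takes approximately
\[ T_{\text{store-and-forward}} \approx (h+1)\,\frac{L}{R}, \qquad T_{\text{cut-through}} \approx \frac{L}{R} + h\,\frac{H}{R}, \]
so cut-through forwarding pays the full transmission time only once, and the per-hop penalty shrinks to the header transmission time.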
The paper covers the design and development of software for a water purification plant: its architecture and the automaton diagrams of water preparation and normalization derived from customer specifications and requirements. The components of the system, the layers of abstraction, verification points, and construction issues are discussed. A way of developing high-quality systems of this kind based on specifications is given.
Null pointer dereferencing remains one of the major issues in modern object-oriented languages. An obvious addition of keywords to distinguish between never-null and possibly-null references appears to be insufficient during object initialization, when some fields declared as never null may be temporarily null before the initialization completes. The proposed solution avoids explicit encoding of these intermediate states in program texts in favor of statically checked validity rules that do not depend on special conditionally non-null types. Object initialization examples suggested earlier are reviewed and new ones are presented to compare the applicability of different approaches. Usability of the proposed scheme is assessed on open-source libraries with a million lines of code that were converted to satisfy the rules.
This work represents a further development of the method for definite iteration verification. It extends the mixed axiomatic semantics method suggested for C-light program verification. The extension includes a verification method for definite iteration over unchangeable arrays with a loop exit in C-light programs. The method includes an inference rule for such iterations without invariants, which uses a special function that expresses the loop body. This rule has been implemented in the verification condition generator, which is part of our C-light verification system. Proving the generated verification conditions requires induction, which is a challenge for SMT solvers. At the proof stage, the SMT solver Z3 is used in our verification system. To overcome the mentioned difficulty, a rewriting strategy for verification conditions is suggested; it allows verifying definite iterations automatically using Z3. The paper also describes the application of the theorem prover PVS to the automatic proof of such verification conditions. An example illustrating the application of these methods is considered.
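Purely for illustration (this fragment is not from the paper), a definite iteration over an unchangeable array with a loop exit, of the kind targeted by the inference rule described above, might look as follows in C:

/* Linear search: a definite iteration over the unchangeable array a[0..n-1]
   with an early loop exit; the loop does not modify the array. */
int find(const int *a, int n, int key) {
    for (int i = 0; i < n; i++) {   /* definite iteration over all elements */
        if (a[i] == key)
            return i;               /* loop exit before the iteration completes */
    }
    return -1;                      /* key is not present */
}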
The purpose of the study is to demonstrate the feasibility of automated code migration to a new set of programming libraries. Code migration is a common task in modern software projects; for example, it may arise when a project has to be ported to a new library or to a new platform. The developed method and tool are based on the formalism for describing library semantics previously created by the authors. The formalism specifies library behavior using a system of extended finite state machines (EFSM). These EFSMs are the foundation of the code migration method.
This paper outlines the metamodel designed to specify library descriptions and proposes an easy-to-use domain-specific language (DSL) that can be used to define models for particular libraries. The metamodel directly underlies the code migration method, which is also described in the paper. The migration process is split into five steps, and an algorithm was developed for each step.
The models and algorithms were implemented in a prototype of an automated code migration tool. The prototype was tested both on artificial code examples and on several real-world open-source projects. The results of the experiment indicate that code migration can be successfully automated, with the developed tool acting as a proof of concept. The designed models and methods form a basis for more powerful migration methods and full-featured automated code migration tools.
Static verification of source code correctness is a major milestone towards software reliability. The dynamic type system of the Jolie programming language currently allows avoidable run-time errors. A static type system for the language has been exhaustively and formally defined on paper, but it still lacks an implementation. In this paper, we describe our steps toward a prototypical implementation of a static type checker for Jolie, which employs a technique based on an SMT solver.
In recent years, both scientific and commercial organizations have generated a significant amount of structured data. On the basis of this data, numerous applications are being developed, which makes it necessary for modern IT specialists to get acquainted with this trend. This paper discusses the important aspects of the Semantic Web evolution and the experience of teaching the methods and tools of the Semantic Web.
The following optimizing transformations are described: variable gluing, replacement of tail recursion by a loop, inline substitution, and reduction. An efficient imperative program is the result of applying these transformations. Dataflow analysis of a program is applied to perform the transformations.
The article presents the results of the 4th International Conference "The Development of Computer Technology in Russia and in the Former USSR: History and Prospects" (SoRuCom-2017), held in Zelenograd, Moscow, on 3-5 October 2017. Our Institute was among the most active organizers of the conference. The conference was supported by the Russian Foundation for Basic Research (Grant 17-07-20538), the Computer History Museum (USA, CA), and Russian citizen Vladimir Kurlyandchik.
This work presents a logical-probabilistic method for adaptive control of modular systems based on the use of the modules' functional similarity and a logical-probabilistic algorithm of guided rule search. The proposed method is based on joint learning of the control modules, starting with finding control rules common to all modules and finishing with their subsequent specification in accordance with the ideas of probabilistic inference. Using an interactive 3D simulator, a number of successful experiments were carried out to train four virtual robot models. The experimental studies have shown that the proposed approach is quite effective and can be used to control modular systems with many degrees of freedom.
This paper presents a method of information retrieval based on an ontology of scientific activity. The method makes use of general-purpose search engines to retrieve links to relevant Internet resources using search queries generated on the basis of the ontology concepts and thesaurus. Search results that do not contain information about scientific activity are filtered out using the ontology.
The article considers the twenty-year history and experience of creating automated systems for testing knowledge and skills in programming at Novosibirsk State University. An analysis of the effectiveness, or ineffectiveness, of the methods and means used in the knowledge testing systems is given. The current state of the architecture of the NSUts system is described.
The article is devoted to the problem of factorization of language definitions and programming systems. Semantic decomposition within the framework of programming paradigm analysis is chosen as the main factorization principle. This choice makes it possible to distinguish well-developed typical components of programming systems. Typical components should be adaptable to the design of various information systems. In addition, their existence makes it possible to formulate a training methodology for the development of information system components. The distance in conceptual complexity between programming and the development of programming systems is shown.
The paper presents an approach to the development and implementation of a particular kind of ontology design patterns, namely content patterns. Using content patterns for building ontologies of scientific subject domains allows us both to provide a uniform and consistent representation of all the entities of the ontology under development and to save human resources and avoid typical errors of ontological modeling.
In the predicate programming paradigm, the development of a parallel predicate program for solving linear equations by the Gauss-Jordan method is described. A predicate program is parallel by construction. Optimizing transformations of the predicate program for solving linear equations produce an efficient imperative parallel program.
Deductive verification is simpler and faster to perform for predicate programs than for the analogous imperative programs. For each C program, it is possible to construct an equivalent predicate program and optimize it so that the resulting program coincides with the source C program. This method is illustrated for the C library function strcat. The construction, deductive verification, and optimizing transformation of the predicate program strcat are described. A new method of string coding via two pointers has been developed.
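For reference, the behavior of the library function strcat that the predicate program mirrors can be sketched by the following textbook C implementation (an illustration only, not the verified program from the paper):

/* Append the string src (including its terminating '\0') to the end of dest.
   dest must point to a buffer large enough to hold the concatenation. */
char *strcat(char *dest, const char *src) {
    char *p = dest;
    while (*p != '\0')                   /* find the end of dest */
        p++;
    while ((*p++ = *src++) != '\0')      /* copy src, including the terminator */
        ;
    return dest;
}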
This paper introduces a conceptual framework for complex control algorithms in the form of a hyperprocess model. To demonstrate the practical value of the model, we describe the grammar and translational semantics of a process-oriented language known as Reflex, or "C with processes". The expressive properties of the presented notation are shown on an example of control algorithm design for a hand dryer device. Finally, we give a short report on the practical application of the language and the results obtained during its usage.
The universe (virtual world) of computer languages includes thousands of languages of different classes: programming, specification, simulation, etc. A research project targeting the development of a classification of this universe was carried out from 2008 to 2013 at the A.P. Ershov Institute of Informatics Systems. This paper summarizes the main theoretical results of this research on the classification of computer languages (which, in our opinion, are still valid and promising) and discusses new approaches to the development of computer-aided support for this classification (based on machine learning and natural language processing).
The importance of further development of the classification project stems from the need for a better understanding of the universe of computer languages and for a more objective approach to choosing languages for implementing software projects.
The article presents the PolarDB library of structured data manipulation tools, created at the Institute of Informatics Systems of the SB RAS. It is designed for constructing systems for structuring, storing, and processing data, including data of large volume. The library is built on the previously developed recursive typing system and covers a number of essential tasks, such as data structuring, serialization, data mapping to byte streams, index construction, block implementation of dynamic byte streams, data distribution and processing, and backup and recovery. The PolarDB library allows users to create effective solutions for specialized databases in different paradigms: sequences, relational tables, key-value storage, and graph structures.
The paper is devoted to problems of extracting language constructs, including numerical and symbolic data, that are significant for a given domain. An approach to describing natural language constructs through lexico-semantic templates is presented, and a template construction language based on YAML is considered. A lexico-semantic template is a structural pattern of the required language construct with a specified structure and lexico-semantic properties. When a template successfully matches a piece of text, a lexical object is formed, to which formal (positional) and semantic (class and properties) characteristics are attributed. The paper presents the architecture of a web editor for the development and testing of lexico-semantic templates and describes the creation of two specialized dictionaries: 1) a dictionary of names of institutions and positions and their abbreviations, and 2) a dictionary of numerical/temporal constructions. The designed technology supports template-based lexico-semantic analysis of text and can be used independently for information extraction from small pieces of text, as well as a part of other information extraction systems. The proposed method is efficient for recognizing parametric constructions that contain estimates of parameter values (entities or events) in a domain.
The paper is devoted to the problems of automatic construction of the terminological system of the subject domain. A method for extracting domain terms from electronic encyclopedic data sources is proposed. The peculiarity of the proposed approach is a thorough analysis of the term structure, recognition of errors based on their linguistic classification, automatic generation of lexical-syntactic patterns representing multi-component terms, and the use of a set of heuristic methods for processing "special" terms. By analyzing encyclopedic dictionaries, a reference list of concept names is automatically formed, which is used to assess the quality of the dictionaries being developed.
Reflex is a process-oriented language that provides for the design of easy-to-maintain control software. The language has been successfully used in several safety-critical cyber-physical systems, e.g., control software for a silicon single-crystal growth furnace. Now, the main goal of the Reflex language project is the development of support for computer-aided software engineering targeted at safety-critical applications. This paper presents a formal operational semantics of the Reflex language as a basis for applying formal methods to the verification of Reflex programs.
During the climb flight of large passenger planes, the pilot directly adjusts the pitch elevator, and the plane reacts by changing its pitch angle. However, if the pitch angle becomes too large, the plane is in danger of an airflow disruption on the wings, which can cause the plane to crash. In order to prevent this, modern planes use control software to limit the pitch angle. However, if the software is poorly designed and the system designers have forgotten that sensors might yield wrong data, the software might cause the pitch angle to become negative, so that the plane loses height and can eventually crash. In this paper, we investigate, on a model of a Boeing passenger plane, what the control software could look like. Based on our model described in MatLab/Simulink, it is easy to see from simulation that the plane loses height when the sensor for the pitch angle provides wrong data. For the opposite case of a correctly functioning sensor, our simulation does not indicate any problems. This simulation, however, is no guarantee that the control is indeed safe. For this reason, we translated the MatLab/Simulink model of the controller into a hybrid program in order to make the system amenable to formal verification using the theorem prover KeYmaera.
User-friendly formal specification and verification of concurrent systems from various subject domains are active research topics due to their practical significance. In this paper, we present a method for the development of verification-oriented domain-specific process ontologies, which are used to describe concurrent systems of subject domains. One of the advantages of such ontologies is their formal semantics, which makes formal verification of the described systems possible. Our method is based on the verification-oriented process ontology. To construct a domain-specific process ontology, our method uses techniques of semantic markup and pattern matching to associate domain-specific concepts with classes of the process ontology. We give detailed ontological specifications of these techniques. Our method is illustrated by the example of developing a domain-specific ontology for typical elements of automatic control systems.
The most recent developments in the C-light verification system are MetaVCG, semantic labels suitable for verification condition (VC) explanation, and a symbolic method of definite iterations. MetaVCG takes a C-light program together with some Hoare logic and produces on the fly a VC generator (VCG), which in turn processes the input program. A Hoare logic for definite iterations is a good choice if we try to get rid of loop invariants. Finally, if a theorem prover is unable to validate some VCs, we can follow two paths. Obviously, we can revise or enrich the specifications and/or the underlying proof theory to prove the truth of the VCs. Alternatively, we can concentrate on establishing falsity, which would mean there are errors in the annotated program. This is where semantic labels play a crucial role, providing natural-language comments about a wrong VC as well as a back-trace to the error location. The newly developed ACL2 heuristics for proving VC falsity are the main theme of this paper.
For many years, automotive embedded systems have been validated only by testing. In the near future, Advanced Driver Assistance Systems (ADAS) will take a greater part in the design and development of car software. Furthermore, their increasing criticality may lead authorities to require certification for those systems. We think that bringing formal proof into their development can help establish safety properties and achieve an efficient certification process. Other industries (e.g., aerospace, railway, nuclear) that produce critical systems requiring certification have also taken the path of formal verification techniques. One of these techniques is deductive proof. It can give a higher level of confidence in proving critical safety properties and can even avoid unit testing.
In this paper, we chose a production use case: a function calculating a square root by linear interpolation. We use deductive proof to prove its correctness and show the limitations we encountered with off-the-shelf tools. We propose approaches to overcome some limitations of these tools and to complete the proof. These approaches can be applied to similar problems, which are frequent in automotive embedded software.
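As a purely illustrative sketch of the kind of function discussed (not the production code from the paper; the table contents and granularity are assumptions of ours), a square root computed by linear interpolation over a precomputed table might look like this in C:

#include <stddef.h>

/* Precomputed square roots at the nodes x = 0, 1, 2, ..., 10. */
static const double SQRT_TABLE[] = {
    0.0, 1.0, 1.41421356, 1.73205081, 2.0, 2.23606798,
    2.44948975, 2.64575131, 2.82842712, 3.0, 3.16227766
};
#define NODES (sizeof(SQRT_TABLE) / sizeof(SQRT_TABLE[0]))

/* Approximate sqrt(x) for 0 <= x <= 10 by linear interpolation
   between the two nearest table nodes. */
double sqrt_interp(double x) {
    if (x <= 0.0) return 0.0;
    if (x >= (double)(NODES - 1)) return SQRT_TABLE[NODES - 1];
    size_t i = (size_t)x;                /* left node index */
    double frac = x - (double)i;         /* position between the nodes */
    return SQRT_TABLE[i] + frac * (SQRT_TABLE[i + 1] - SQRT_TABLE[i]);
}

Deductive tools would then be asked to prove, for instance, a bound on the interpolation error over the table's domain.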
Finite State Machines (FSMs) are widely used for analysis and synthesis of digital components of control systems. In order to take into account time aspects, timed FSMs are considered. In this paper, we address the problem of deriving a parallel composition of FSMs with timed guards and output delays (output timeouts). When the parallel composition is considered, component FSMs work in the dialog mode and the composition produces an external output when the interaction between components is terminated. We formally define the parallel composition operator for FSMs with timed guards (TFSMs) and show that unlike classical FSMs, a "slow environment" is not sufficient for describing the behavior of a composition of deterministic TFSMs by a deterministic FSM with a single clock. Although the set of deterministic FSMs with timed guards is not closed under the parallel composition operator, some classes of deterministic TFSMs are still closed under this operator and the paper contains some examples of such classes.
Russell is a logical framework for the specification and implementation of deductive systems. It is a high-level language with respect to the Metamath language, so it inherently uses Metamath foundations, i.e., it does not rely on any particular formal calculus but rather is a pure logical framework. The main difference from Metamath lies in the proof language and the approach to syntax: proofs have a declarative form, i.e., they consist of the actual expressions used in proofs, while syntactic grammar rules are separated from the meaningful rules of inference.
Russell is implemented in C++14 and is distributed under the GPL v3 license. The repository contains translators from Metamath to Russell and back. The original Metamath theorem base (almost 30,000 theorems) can be translated to Russell, verified, translated back to Metamath, and verified with the original Metamath verifier. Russell can be downloaded from the repository: https://github.com/dmitry-vlasov/russell.
In this paper, we describe an approach to formal verification of parallel and distributed programs in C#. We use the Microsoft Roslyn technology to obtain syntactic and semantic information about constructions of interest in the real source code and to generate corresponding code in the Promela language, designed to model actor-based interoperating systems; in other words, we perform a program-to-model transformation. Then, we verify the usual problems of parallel and distributed code by checking predefined LTL formulas on the model program. We are able to check for data races, improper locking usage, and possible deadlocks in distributed service interoperations using the model checking approach. This method can be used to construct a static analyzer for the .NET platform.
The development and deductive verification of a predicate program of binary search, which is identical to the bsearch.c program in the Linux kernel library, are described. New constructs of the predicate programming language for passing arbitrary types as program parameters are introduced. For an object of an arbitrary type, a transformation coding the object via a pointer is introduced.
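For orientation, the shape of such a generic binary search over an opaque element array with a user-supplied comparator can be sketched in C as follows (a textbook sketch under our own naming, not the kernel's bsearch.c source or the verified predicate program):

#include <stddef.h>

/* Generic binary search over a sorted array of `num` elements of `size` bytes each.
   cmp(key, elt) returns <0, 0 or >0, as for qsort-style comparators.
   Returns a pointer to a matching element, or NULL if none is found. */
void *binary_search(const void *key, const void *base, size_t num, size_t size,
                    int (*cmp)(const void *key, const void *elt)) {
    size_t lo = 0, hi = num;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const char *elt = (const char *)base + mid * size;
        int result = cmp(key, elt);
        if (result == 0)
            return (void *)elt;      /* found */
        if (result < 0)
            hi = mid;                /* continue in the left half */
        else
            lo = mid + 1;            /* continue in the right half */
    }
    return NULL;                     /* key is not present */
}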
This paper presents the results of examining changes that occur in thematic clusters constructed on a text set of conference proceedings from the research field "Argument mining".
Identification of terms, analysis of their relations, and thematic clustering are performed with the use of third-party software that allows extracting terms in the form of noun phrases and clustering them according to a modularity-based algorithm. The quality of the resulting clusters is estimated using three distinct criteria. The temporal transformation of the terminological content of clusters is analyzed through directed graphs constructed with an underlying criterion that enables recognition of the most significant changes. The terminological lexicon of the identified thematic clusters characterizes distinct directions in which studies are conducted, while the transformation of their terminological content demonstrates shifts in researchers' interests.
The purpose of the article is to describe a method for comparing programming languages that is convenient for assessing the expressive power of languages and the complexity of programming systems. The method is adapted to substantiate practical, objective criteria of program decomposition, which can be considered an approach to solving the problem of factorizing very complicated definitions of programming languages and their support systems. In addition, the article presents the results of an analysis of the most well-known programming paradigms and outlines an approach to navigation in the expanding space of modern programming languages, based on a classification of paradigms according to the peculiarities of problem statements and the semantic characteristics of programming languages and systems, with an emphasis on criteria for program quality and priorities in decision-making during their implementation. The concept of a "programming paradigm" is treated as a way of thinking in the programming process. The author thanks the organizers and participants of the conferences "Scientific Service in the Internet Environment" (http://agora.guru.ru/display.php?conf=abrau2020&page=subjects&PHPSESSID=qbn3kbhgnk8b6a9g21qi1nkkq2 ), discussions with whom made it possible to work out the main provisions of this article.
This paper considers the problems of realizing adaptivity in intelligent e-learning tutoring systems based on adaptive hypermedia technologies and individual learning style models.
Keywords: intelligent e-learning tutoring systems, adaptive systems, intelligent tutoring systems architecture, adaptive educational hypermedia systems, adaptive hypermedia, adaptive presentation, adaptive navigation support, individual learning styles, Kolb's experiential learning model, Honey and Mumford's learning styles model, Felder-Silverman learning style model, cognitive trait model, Bloom's taxonomy
The evolution of the concept of a "knowledge graph" from its inception to the present day is considered. The paper also discusses how the evolution of systems that position themselves as knowledge graphs has affected the definition and life cycle of knowledge graphs.
Deductive verification of J. Williams' classical heapsort algorithm for objects of an arbitrary type was conducted. To simplify verification, non-trivial transformations replacing pointer arithmetic operators with array element accesses were applied. The program was translated into the predicate programming language. Deductive verification of the program in the Why3 and Coq tools proved to be complicated and time-consuming.
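For reference, Williams' heapsort builds a max-heap in place and repeatedly moves the maximum to the end of the array; a minimal index-based Python sketch of the classical algorithm (our illustration, not the verified predicate program) follows.

    def heapsort(a, less):
        """In-place heapsort (J. Williams' algorithm) over an arbitrary type:
        `less(x, y)` is the ordering predicate. The index-based sift_down below
        imitates the array-element accesses used instead of pointer arithmetic.
        """
        n = len(a)

        def sift_down(start, end):
            root = start
            while 2 * root + 1 < end:
                child = 2 * root + 1
                if child + 1 < end and less(a[child], a[child + 1]):
                    child += 1                      # pick the larger child
                if less(a[root], a[child]):
                    a[root], a[child] = a[child], a[root]
                    root = child
                else:
                    return

        for start in range(n // 2 - 1, -1, -1):      # build a max-heap
            sift_down(start, n)
        for end in range(n - 1, 0, -1):              # repeatedly move the max to the end
            a[0], a[end] = a[end], a[0]
            sift_down(0, end)

    data = [5, 1, 4, 2, 3]
    heapsort(data, lambda x, y: x < y)
    print(data)  # -> [1, 2, 3, 4, 5]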
Transformations eliminating pointers in the memweight function from the Linux OS kernel library are described. Next, the function is translated into the predicate programming language P. For the obtained predicate program, deductive verification was performed in the Why3 tool. To simplify verification, a model of the program's inner state was constructed.
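The kernel's memweight returns the number of set bits in a memory region; the following pointer-free Python restatement (a sketch of ours, not the transformed C code) shows the function being computed.

    def memweight(buf: bytes) -> int:
        """Count the total number of set bits in a buffer. A pointer-free
        restatement of what memweight computes; the actual C code walks the
        region with pointers and word-sized reads.
        """
        return sum(bin(b).count("1") for b in buf)

    print(memweight(b"\xff\x0f\x00"))  # -> 12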
Named entity recognition (NER) is the task of extracting, from text data, entities that belong to predefined categories, such as organization names, place names, people's names, etc. Within the framework of the presented work, an approach was developed for the additional training of deep neural networks with the attention mechanism (the BERT architecture). It is shown that pre-training the language model on the tasks of recovering a masked word and determining the semantic relatedness of two sentences can significantly improve the quality of NER. One of the best results has been achieved on the task of extracting named entities from the RuREBus dataset. A key feature of the described solution is the closeness of the problem formulation to real business problems and the selection of entities that are not of a general nature but are specific to the economic domain.
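A minimal sketch of such a fine-tuning setup, using the Hugging Face transformers API for token classification; the checkpoint name, label set, and example tokens are placeholders of ours rather than the configuration reported in the paper.

    from transformers import AutoTokenizer, AutoModelForTokenClassification
    import torch

    labels = ["O", "B-ORG", "I-ORG", "B-ECO", "I-ECO"]   # hypothetical tag set
    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    model = AutoModelForTokenClassification.from_pretrained(
        "bert-base-multilingual-cased", num_labels=len(labels)
    )

    tokens = ["Минэкономразвития", "утвердило", "программу"]
    enc = tokenizer(tokens, is_split_into_words=True, return_tensors="pt")

    # Real training would align per-word labels to sub-tokens; here dummy labels
    # of the correct shape are used only to show a single optimisation step.
    dummy_labels = torch.zeros_like(enc["input_ids"])
    out = model(**enc, labels=dummy_labels)
    out.loss.backward()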
The presented work analyzes the efficiency of a recognition feature based on argumentative statements occurring within the same topical fragment of a text. The study is performed with the purpose of using this feature for automatic recognition of argumentative structures in popular science texts written in Russian. The topic model of a text is constructed on the basis of superphrasal units (text fragments united by one topic), which are identified by detecting clusters of words and word combinations with the use of scan statistics. Potential relations extracted from topic models are verified against texts with manually annotated argumentation structures. The comparison between potential (topic-model-based) and manually constructed relations is performed automatically. Macro-average precision and recall are 48.6% and 76.2%, respectively.
State identification is a well-known problem in the theory of Finite State Machines (FSMs), where homing sequences (HS) are used to identify the current FSM state; this technique is widely used in software testing and verification. For various kinds of FSMs, there exist necessary and sufficient conditions for the existence of preset and adaptive HS, as well as algorithms for their derivation. Nowadays, timed aspects are becoming very important for hardware and software systems. In this work, we address the problem of checking the existence of homing sequences for FSMs with timed guards and deriving such sequences. The investigation is based on the FSM abstraction of a Timed FSM.
Sequential reactive systems are formal models of programs that interact with the environment by receiving inputs and producing corresponding outputs. Such formal models are widely used in software engineering, computational linguistics, telecommunications, etc. In real life, the behavior of a reactive system depends not only on the flow of input data, but also on the times at which the input data arrive and the delays that occur when generating responses. To capture these aspects, a timed finite state machine (TFSM) is used as a formal model of a real-time sequential reactive system. However, in most previous works this model was considered under a simplified semantics: the responses in the output stream, regardless of their timestamps, follow in the same order in which the corresponding inputs are delivered to the machine. This simplification makes the model easier to analyze and manipulate, but it misses many important aspects of real-time computation. In this paper we study a refined semantics of TFSMs and show how to represent it by means of Labelled Transition Systems. This opens up the possibility of applying traditional formal methods to verify more subtle properties of real-time reactive behavior which were previously ignored.
Sequential reactive systems such as controllers, device drivers, and computer interpreters operate with two data streams, transforming input streams of data (control signals, instructions) into output streams of control signals (instructions, data). Finite state transducers are widely used as an adequate formal model for information processing systems of this kind. Since runs of transducers develop over time, temporal logics could obviously serve as a simple yet expressive formalism for specifying the behavior of sequential reactive systems. However, the conventional applied temporal logics (LTL, CTL) do not suit this purpose well, since their formulae are interpreted over omega-languages, whereas the behavior of transducers is represented by binary relations on infinite sequences, i.e. omega-transductions. To give temporal logic the ability to take this general feature of reactive system behavior into account, we introduced new extensions of this logic. Two distinctive features characterize these extensions: 1) temporal operators are parameterized by sets of streams (languages) admissible as input, and 2) sets (languages) of expected output streams are used as basic predicates. In a previous series of works we studied the expressive power and the model checking problem for Reg-LTL and Reg-CTL, which are extensions of LTL and CTL in which the languages mentioned above are regular. We discovered that such an extension of temporal logics increases their expressive capability while retaining the decidability of the model checking problem. Our next step in the systematic study of the expressive and algorithmic properties of the new extensions of temporal logics is the analysis of the model checking problem for finite state transducers against Reg-CTL* formulae. In this paper we develop a model checking algorithm for Reg-CTL* and show that this problem is in ExpSpace.
We address the formal verification of the control software of critical systems, i.e., ensuring the absence of design errors in a system with respect to its requirements. Control systems are usually based on industrial controllers, also known as Programmable Logic Controllers (PLCs). A specific feature of a PLC is the scan cycle: 1) the inputs are read, 2) the PLC state changes, and 3) the outputs are written. Therefore, to formally verify a PLC, e.g., by model checking, it is necessary to reason both in terms of state transitions within a cycle and in terms of the larger state transitions induced by the scan-cycle semantics.
We develop a formalization of PLCs as hyperprocess transition systems and an LTL-based temporal logic, cycle-LTL, for reasoning about PLCs.
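A minimal Python sketch of the scan-cycle semantics described above: in each cycle the inputs are latched, the controller state is updated, and only then are the outputs written. The function names are illustrative and do not come from the paper's formal hyperprocess model.

    def scan_cycle(state, read_inputs, step, write_outputs):
        inputs = read_inputs()                   # 1) latch the inputs for this cycle
        state, outputs = step(state, inputs)     # 2) compute the new internal state
        write_outputs(outputs)                   # 3) publish the outputs at the end of the cycle
        return state

    def run_plc(initial_state, read_inputs, step, write_outputs, cycles):
        """Run a fixed number of scan cycles; inputs read mid-cycle never become visible."""
        state = initial_state
        for _ in range(cycles):
            state = scan_cycle(state, read_inputs, step, write_outputs)
        return state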
Deductive verification of the string-to-integer conversion program kstrtoul from the Linux OS kernel library is described. The kstrtoul program computes the integer value represented by a character sequence of digits. To simplify program verification, transformations replacing pointer operators with equivalent pointer-free actions were performed. A model of the inner program state was constructed to enhance the program specification. Deductive verification was conducted in the Why3 and Coq tools.
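As a rough orientation, the following simplified Python analogue (a sketch of ours, not the verified program) shows the contract of such a conversion: parse an unsigned integer written as a character sequence, with Python's int() standing in for the kernel's digit parsing and prefix handling.

    def kstrtoul_like(s: str, base: int = 10) -> int:
        """Simplified analogue of a string-to-unsigned-integer conversion.
        Overflow checks against the unsigned long range, which the verified C
        code performs, are omitted here.
        """
        s = s.strip()
        if s.startswith("+"):
            s = s[1:]
        value = int(s, base)          # raises ValueError on non-digit characters
        if value < 0:
            raise ValueError("unsigned value expected")
        return value

    print(kstrtoul_like("4096"))      # -> 4096
    print(kstrtoul_like("0x1f", 0))   # -> 31 (base 0 auto-detects the 0x prefix)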
The transformation and verification of the bus_sort_breadthfirst program, which belongs to the Linux OS kernel and implements sorting of devices, are described. The C program is transformed into the cP language by performing macro unfolding, structure changes, and elimination of pointers. The transformed program is translated into the WhyML functional language. For the resulting program, a specification is constructed. Deductive verification is carried out in the Why3 tool.
The article presents a database model based on a sequence of objects. The issues considered include serialization/deserialization of objects, index construction, the structure and methods of the universal sequence and the universal index, and the implementation of a complete editing basis through the use of a primary key, a timestamp, and an "empty" value.
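An illustrative Python sketch (names are ours, not the article's) of the underlying idea: the database is an append-only sequence of objects, and editing is expressed through the primary key, a timestamp, and an "empty" value acting as a deletion marker.

    import time

    EMPTY = object()   # the "empty" value: the latest record with it hides the key

    class ObjectSequence:
        def __init__(self):
            self._log = []                      # the sequence of objects

        def put(self, key, value):
            self._log.append((key, time.time(), value))

        def delete(self, key):
            self._log.append((key, time.time(), EMPTY))

        def get(self, key):
            # The current state of a key is its latest record in the sequence.
            for k, _ts, v in reversed(self._log):
                if k == key:
                    return None if v is EMPTY else v
            return None

    db = ObjectSequence()
    db.put(1, {"name": "device A"})
    db.delete(1)
    print(db.get(1))   # -> None: the key is hidden by the "empty" value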
It is shown that a specification in the Event-B language can be represented by an automata-based program as a non-deterministic composition of simple conditional statements, which corresponds to a narrow subclass of automata-based programs. A specification in Event-B is monolithic: to build a specification, there are no means of composition other than refinement, which extends a previously built specification.
A comparison of the automata-based engineering method and the Event-B modeling method is carried out on two example tasks. Previous solutions to the bridge traffic control problem in the Event-B system are complicated; a simpler solution with deductive verification in the Rodin tool is proposed. The effectiveness of the Event-B verification methods is confirmed by the discovery of three non-trivial errors in our solution.
The article presents the results of experiments on the conjunctive decomposition of various representations of Boolean functions (ZDD, BDD, OKFDD, AIG) by methods derived by specializing a general decomposition algorithm. The test sets are random Boolean functions with various parameters, as well as a set of well-known benchmarks used to test algorithms for optimizing logic circuits. The comparison involves sequential and multi-threaded implementations of the algorithm.
Methods and technologies for constructing efficient and reliable programs and software systems based on graph models and semantic transformations
Since 2021, the project «Methods and technologies for constructing efficient and reliable programs and software systems based on graph models and semantic transformations» has been carried out at the Institute of Informatics Systems SB RAS by a team of researchers from the Program Construction and Optimization Laboratory and the System Programming Laboratory under the scientific leadership of D.Sc., Full Professor V.N. Kasyanov. The main executors of the project are Ph.D., Associate Professor E.V. Kasyanova and Ph.D. V.I. Shelekhov. This article briefly presents the results of the first stage of the project that were obtained by the staff of the Program Construction and Optimization Laboratory.
This paper presents the main scientific results of the Information Systems Laboratory in 2021.
The paper considers the problem of evaluating the possibilities for alleviating transportation discrimination of the population of Asian Russia. The authors suggest extending the MIX-PROSTOR system for transportation flow modeling with tools for evaluating options for multimodal passenger transportation.
This article describes methods for automatic term extraction and linking to Wikidata. The advantage of the proposed methods is their potential applicability to any field of knowledge where only unannotated texts and small term dictionaries are available. To carry out the experiments, the RuSERRC corpus of scientific texts was collected and annotated. The corpus and models are published on GitHub and may be useful to other research teams.
Event structures are a well-established model in concurrency theory. Two structurally different methods of associating transition system semantics with event-oriented models are distinguished in the literature. One of them is based on configurations (event sets), the other on residuals (model fragments). In this paper, we deal with a highly expressive model of event structures, namely event structures for resolvable conflict (RC-structures), and provide isomorphism results for these two types of transition systems constructed from RC-structures, in step and partial order semantics.
The article describes an approach to automating the extraction of terminology from texts in Russian in order to enrich the ontology of a scientific subject domain. The applicability of methods for automatic ontology enrichment from natural language texts depends on the characteristics of the text corpus and the language used. The specifics of the input language, characterized by rich inflection and free word order, together with the absence of a large corpus of texts, lead to the choice of a linguistic approach based on lexico-semantic patterns. The features of the proposed information extraction methodology are as follows: a) automatic replenishment of the subject vocabulary on the basis of the ontology and the text corpus, with annotation by a system of semantic features; b) definition of a small set of initial structural meta-patterns that establish the conceptual contexts for extracting ontological information; c) automatic generation, on the basis of the structural meta-patterns, of a set of lexico-semantic patterns defining the lexical, semantic, and syntactic properties of the contexts.
The method of subdefinite computations is one of the approaches to solving the constraint satisfaction problem. A number of programming systems have been developed on its basis, including the constraint programming system Nemo. This paper describes the basic notions underlying the Nemo system, their practical implementation, and the facilities provided to the user for solving various problems.
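To give a flavor of the idea behind subdefinite computations (and not of the actual Nemo implementation), the following Python sketch represents each variable by an interval of admissible values and repeatedly uses the single constraint x + y = z to narrow the intervals until a fixed point is reached.

    def narrow_sum(x, y, z):
        """Narrow intervals (lo, hi) of x, y, z under the constraint x + y = z."""
        zz = (max(z[0], x[0] + y[0]), min(z[1], x[1] + y[1]))
        xx = (max(x[0], z[0] - y[1]), min(x[1], z[1] - y[0]))
        yy = (max(y[0], z[0] - x[1]), min(y[1], z[1] - x[0]))
        return xx, yy, zz

    x, y, z = (0, 10), (3, 7), (5, 6)
    while True:
        nx, ny, nz = narrow_sum(x, y, z)
        if (nx, ny, nz) == (x, y, z):   # fixed point: no interval can be narrowed further
            break
        x, y, z = nx, ny, nz
    print(x, y, z)   # -> (0, 3) (3, 6) (5, 6), the narrowed subdefinite values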
An introduction to quality assurance methods is essential for understanding the development of complex and reliable software. Nevertheless, the modern software industry requires the earliest possible launch of a product to the market, and methods of formal specification and verification of programs attract little interest among the broad mass of future programmers.
In this article, the author proposes to organize a dedicated discipline that provides seamless training in testing, test-driven development, and formal verification, using various methods for writing program specifications and software tools for program checking.
The purpose of the discussed discipline is to reshape the attitude of future developers towards software quality, its specification, and automatic checking.
Within the framework of this article, the author considers his own discipline, which combines two courses -- software testing and formal verification. The proposed approach to teaching is primarily practice-oriented and includes teamwork. In accordance with the current curriculum, the discipline is held in the last semester for undergraduate students (4th year). The material of the article is based on the author's five-year experience of teaching the subject to students of the Software Engineering specialty. The article offers rather extensive and descriptive examples of specifications and programs in model languages.
This work is dedicated to the Estonian computer scientist Enn Tyugu (1935–2020). The article focuses on two significant events in his biography: the deportation of 1941 and his interest in computers. The topic appears relevant since in the post-Soviet environment (as in the USSR) research on the life paths of representatives of deported nations has been scarce; we know little about their lives and the lives of their descendants. We will not elaborate on the issue of the access of science and technology specialists (technocrats) to political power and administrative decision-making, and will limit our interest to their socio-professional identities.
The work is devoted to the problem of automating the creation of ontologies of scientific subject areas using methods of automatic analysis of natural language texts. The aim of the work is to develop methods for the automatic generation of lexico-syntactic patterns for information extraction and ontology augmentation, based on the analysis of content ontology design patterns for scientific knowledge areas developed within the Semantic Web framework. Ontology design patterns are structured descriptions of top-level concepts in terms of classes, attributes, and relationships; they also include competency questions in natural language that help users understand and correctly interpret the properties and relationships of a concept. The article proposes an approach to the generation of lexico-syntactic patterns based on competency questions. The process of generating lexico-syntactic patterns includes the generation of a subject dictionary, the extraction of ontology entities, the formation of the pattern structure based on Data Properties and Object Properties, and the generation of semantic, grammatical, and positional constraints. Competency questions are used to identify the grammatical and positional restrictions necessary for searching for ontological relations in texts. The ontology "Decision support in weakly formalized areas" and a corpus of scientific texts from the same subject area were used in the experiment. The following results were obtained: the degree of ambiguity of the generated patterns was 1.5, and the F1-measure for the quality of extracting object attributes and relations was 0.77 for attributes and 0.55 for relations, respectively. A comparison of the results obtained for patterns without grammatical restrictions and for patterns with grammatical restrictions showed that adding the restrictions significantly improves the quality of extraction of ontology objects.
A memory model describes the memory consistency requirements in a multithreaded system. Compiler optimizations may violate the consistency requirements due to bugs, so the program behavior will differ from the required one. Bugs in compiler optimizations, such as incorrect instruction reordering, are very difficult to detect, because they may manifest with a very low probability in real execution on hardware. There are different approaches to formal verification of memory consistency requirements, but the challenge is that these approaches do not scale to industrial software. In the paper we present the MCC tool, which was evaluated on the industrial virtual machine ARK VM and was able to find a real bug in a compiler optimization. MCC is a static tool that checks all possible executions of a particular test without relying on hardware execution. The approach also includes test suite generation and the specification of memory consistency properties.
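A minimal sketch of the underlying idea of statically exploring all executions of a small litmus test instead of relying on hardware runs; the test below is the classic store-buffering pattern, and under sequential consistency the outcome r1 == r2 == 0 never appears, so observing it would reveal an illegal reordering. This is an illustration only, not the MCC tool.

    from itertools import permutations

    THREAD1 = [("store", "x", 1), ("load", "y", "r1")]
    THREAD2 = [("store", "y", 1), ("load", "x", "r2")]

    def interleavings(t1, t2):
        """All shuffles of two threads that preserve each thread's program order."""
        for choice in set(permutations([0] * len(t1) + [1] * len(t2))):
            it = [iter(t1), iter(t2)]
            yield [next(it[c]) for c in choice]

    outcomes = set()
    for run in interleavings(THREAD1, THREAD2):
        mem, regs = {"x": 0, "y": 0}, {}
        for op, loc, arg in run:
            if op == "store":
                mem[loc] = arg
            else:
                regs[arg] = mem[loc]
        outcomes.add((regs["r1"], regs["r2"]))

    print(sorted(outcomes))              # (0, 0) is absent under this model
    assert (0, 0) not in outcomes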
Process-oriented programming is a promising approach to the development of control software. Control software often has high reliability requirements, and formal verification methods, in particular deductive verification, are used to prove the correctness of such programs with respect to the requirements. Previously, a temporal requirements language was developed for specifying temporal requirements in the deductive verification of process-oriented programs. It was also shown that a significant part of the requirements falls into a small number of classes, and requirement patterns were developed for these classes. In this paper, we present a collection of process-oriented programs and requirements for them. The requirements are formalized in the temporal requirements language and classified according to the set of patterns. We also define a new requirement pattern. These results can be used in research on formal verification methods for process-oriented programs, in particular on methods of proving verification conditions.
Process-oriented programming is a paradigm based on the process concept, where each process is internally a concurrent finite state machine. The paradigm is intended for PLC (programmable logic controller) developers writing Industry 4.0-enabled software.
The poST language is a promising process-oriented extension of the IEC 61131-3 Structured Text (ST) language designed to provide conceptual consistency of the PLC source code with the technological description of the process under control. The language combines the advantages of FSM-based programming with the standard syntax of the ST language. We propose a transformational semantics of poST that provides rules for translating poST language statements to Promela -- the input language of the SPIN model checker. Following these semantic rules, our Xtext-based translator outputs a Promela model for a poST program.
Our contribution is a method for the automatic generation of Promela code from poST control programs. The resulting Promela program is ready to be verified with the SPIN model checker against linear temporal logic requirements on the source poST program.
SSA (static single-assignment) form is an intermediate representation for compiling imperative programs in which every variable is assigned only once. Properly defined, SSA form gives rise to a family of purely syntactic categories with some nice properties. We hope this lays the groundwork for a categorical approach to compiler optimization.
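A small worked example of the single-assignment discipline: the straight-line fragment below is written first in the usual imperative style and then in SSA form, where every variable is assigned exactly once (the renaming scheme is the standard one, not specific to the paper).

    # Imperative form: x is assigned twice.
    x = 1
    x = x + 2
    y = x * x

    # SSA form: each assignment introduces a fresh version of the variable.
    x1 = 1
    x2 = x1 + 2
    y1 = x2 * x2
    assert y == y1 == 9

    # At control-flow joins, SSA additionally inserts phi-functions, e.g.
    # x3 = phi(x1, x2) selects the version coming from the branch actually taken.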
Generative artificial intelligence systems should be carefully tested because they can generate low-quality content, but their testing runs into the test oracle problem. Metamorphic testing helps to test programs without test oracles. In this study, we consider an AI system that generates personalized stickers. A personalized sticker is a thematic picture with a person's face, some decorative elements, and phrases. The system is stochastic, complex, and consists of many parts. We formulate requirements for the whole system and check them with metamorphic relations. The requirements focus on the dependency on input image quality and the resulting quality of the generated stickers. Our testing methodology helps us to identify major issues in the system under test.
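A hedged sketch of what one such metamorphic relation could look like in Python; the functions passed in (generate_sticker, quality_score, degrade) are hypothetical stand-ins for the system under test, its quality metric, and an input perturbation, not the relations defined in the paper.

    def check_metamorphic_relation(image, generate_sticker, quality_score, degrade):
        """Check one metamorphic relation: degrading the input photo must not
        increase the quality of the generated sticker.

        `generate_sticker`, `quality_score` and `degrade` are supplied by the
        tester (e.g. degrade = blur or downscale the face photo).
        """
        original = quality_score(generate_sticker(image))
        perturbed = quality_score(generate_sticker(degrade(image)))
        # No oracle for the "correct" sticker is needed: only the expected
        # direction of change between the two runs is asserted.
        assert perturbed <= original, "sticker quality improved on a degraded input"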
This paper presents the work-in-progress implementation of a language server for the Rzk proof assistant. It analyzes relevant technologies related to the Language Server Protocol, VS Code extensions, and theorem provers, and builds on top of them to provide language support for Rzk.
The article presents some of the author's developments in the field of TRIZ, the theory of inventive problem solving. The developments are expected to be turned into an assistant program intended for a wide range of users and a wide range of tasks. A model for stating the problem is proposed in the form of an extended scenario analysis; the complexity that hinders the implementation of this scenario is then identified, and ways to resolve the contradictions are proposed. In general, the proposed methodology for stating and solving problems corresponds to the theory created and developed by G.S. Altshuller and his students. What is new is a model that unites different "branches" of TRIZ theory and practice.
The paper proposes an approach to the automatic construction of the terminological core of an ontology for computational linguistics. The issues of creating a top-level ontology, which defines the possible classes of terms for their further search and systematization, are considered. An algorithm for generating and initially populating a subject dictionary is proposed; it includes two main stages. At the first step, a system of lexical-semantic classes based on the ontology classes is built. At the second step, the dictionary is filled with terms and they are correlated with the dictionary classes using available resources: a universal ontology of scientific knowledge, a thesaurus, and a portal on computational linguistics. For the experiments, a corpus of analytical articles on computational linguistics was collected from the Habr website. In addition, datasets with term markup were created, comprising 1065 sentences in Russian. Experiments were carried out on two tasks: term detection and term classification according to the ontology classes. For the first task, three neural network models were considered: xlm-roberta-base, roberta-base-russian-v0, and ruRoberta-large. The best results were obtained with the last model: an F-measure of 0.91. An analysis of the classifier errors showed a high frequency of incomplete term extraction. For the second task, the ruRoberta-large model was chosen on the basis of its results on the first task. The average F-measure over the 12 ontology classes used was 0.89. A general architecture of a system for creating and populating ontologies is proposed, integrating linguistic approaches and machine learning methods.