ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction

Authors: Rusnachenko, N., Liang, H., Kalameyets, M. and Shi, L.

Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume: 14612 LNCS

Pages: 229-235

eISSN: 1611-3349

ISBN: 9783031560682

ISSN: 0302-9743

DOI: 10.1007/978-3-031-56069-9_23

Abstract:

The escalating volume of textual data necessitates adept and scalable Information Extraction (IE) systems in the field of Natural Language Processing (NLP) to analyse massive text collections in a detailed manner. While most deep learning systems are designed to handle textual information as it is, the gap in the existence of the interface between a document and the annotation of its parts is still poorly covered. Concurrently, one of the major limitations of most deep-learning models is a constrained input size caused by architectural and computational specifics. To address this, we introduce ARElight (https://github.com/nicolay-r/ARElight), a system designed to efficiently manage and extract information from sequences of large documents by dividing them into segments with mentioned object pairs. Through a pipeline comprising modules for text sampling, inference, optional graph operations, and visualisation, the proposed system transforms large volumes of text in a structured manner. Practical applications of ARElight are demonstrated across diverse use cases, including literature processing and social network analysis.

Source: Scopus