Incorporating Information Extraction in the Relational Database Model

Nahshon, Yoav; Peterfreund, Liat; Vansummeren, Stijn

doi:10.1145/2932194.2932200

Reference: WebDB (2016: June 26, 2016: San Francisco, CA, USA), Proceedings of the 19th International Workshop on Web and Databases, ACM
Publication: Published, 2016-06

Publication in proceedings

Abstract:

Modern information extraction pipelines are typically constructed by (1) loading textual data from a database into a special-purpose application, (2) applying a myriad of text-analytics functions to the text, which produce a structured relational table, and (3) storing this table in a database. Obviously, this approach can lead to laborious development processes, complex and tangled programs, and inefficient control flows. Towards solving these deficiencies, we embark on an effort to lay the foundations of a new generation of text-centric database management systems. Concretely, we extend the relational model by incorporating into it the theory of document spanners, which provides the means and methods for the model to engage the Information Extraction (IE) tasks. This extended model, called Spannerlog, provides a novel declarative method for defining and manipulating textual data, which makes possible the automation of the typical work method described above. In addition to formally defining Spannerlog and illustrating its usefulness for IE tasks, we also report on initial results concerning its expressive power.
