Peer-reviewed article
Abstract: Providing useful and efficient semantic annotations is a major challenge in knowledge design for any body of text, especially historical documents. In this article, we propose Topic Modeling as an important first step toward gathering semantic information beyond the lexicon, which can then be added as annotations in SHEBANQ. Through a case study, we discuss both the noise and the structure found when comparing topics extracted within different distributions, and show the value of such an approach, which we label a topic hierarchy. We also present a first result of applying this approach to the study of diachronic variation in the Bible, and show how the overall Topic Modeling approach can give users of the database more query options.