Giulia Venturi of l’Istituto di Linguistica Computazionale del CNR di Pisa (ILC-CNR) has published "Design and Development of TEMIS: a Syntactically and Semantically Annotated Corpus of Italian Legislative Texts", in LREC 2012 Conference Proceedings: Semantic Processing of Legal Texts (SPLeT-2012) Workshop, pp. 1-12.
Here is the abstract:
Methodological issues concerning the design and the development of TEMIS, a syntactically and semantically annotated corpus of Italian legislative texts, are presented and discussed in the paper. TEMIS is a heterogeneous collection of texts exemplifying different sub-varieties of Italian legal language, i.e. European, national and local texts. The whole corpus has been dependency annotated and a subset has been enriched with frame-based information by customizing the formalism of the FrameNet project. In both cases, a number of domain-specific extensions of the annotation criteria developed for the general language have been foreseen. The interest in building such a corpus stems from the increasing need for annotated collections of domain-specific texts recognized by both the Artificial Intelligence and Law (AI & Law) community and the Natural Language Processing (NLP) one. In these two research communities the benefits of having a resource where both domain-specific content and its underlying linguistic structure are made explicit and aligned are widely acknowledged. To the author's knowledge, this is the first annotated corpus of legal texts overtly devoted to be used for legal text processing applications based on NLP tools.
Tags: FrameNet, Giulia Venturi, Legal natural language processing, Legal text corpora, Natural language processing, Natural language processing and law, Natural language processing and legal texts, Semantic annotation of legal texts, SPLeT, SPLeT 2012, Syntactic annotation of legal texts, TEMIS, Workshop on Semantic Processing of Legal Texts