Posts Tagged ‘Semantic analysis of legal texts’

Oldfather et al. on Automated Content Analysis, Court Opinions, and Legal Scholarly Methodology

September 8, 2012

Professor Chad M. Oldfather of Marquette University School of Law, Professor Dr. Joseph P. Bockhorst of the University of Wisconsin Madison Department of Electrical Engineering and Computer Science, and Brian P. Dimmer, Esq., have published Triangulating Judicial Responsiveness: Automated Content Analysis, Judicial Opinions, and the Methodology of Legal Scholarship, Florida Law Review, 64, 1189-1242 (2012).

Here is the abstract:

The increasing availability of digital versions of court documents, coupled with increases in the power and sophistication of computational methods of textual analysis, promises to enable both the creation of new avenues of scholarly inquiry and the refinement of old ones. This Article advances that project in three respects. First, it examines the potential for automated content analysis to mitigate one of the methodological problems that afflicts both content analysis and traditional legal scholarship — their acceptance on faith of the proposition that judicial opinions accurately report information about the cases they resolve and courts’ decisional processes. Because automated methods can quickly process large amounts of text, they allow for assessment of the correspondence between opinions and other documents in the case, thereby providing a window into how closely opinions track the information provided by the litigants. Second, it explores one such novel measure — the responsiveness of opinions to briefs — in terms of its connection to both adjudicative theory and existing scholarship on the behavior of courts and judges. Finally, it reports our efforts to test the viability of automated methods for assessing responsiveness on a sample of briefs and opinions from the United States Court of Appeals for the First Circuit. Though we are focused primarily on validating our methodology, rather than on the results it generates, our initial investigation confirms that even basic approaches to automated content analysis provide useful information about responsiveness, and generates intriguing results that suggest avenues for further study.
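To make the idea of "responsiveness" concrete, here is a minimal sketch of one very basic way to compare an opinion with a brief: cosine similarity over word counts. This is an illustrative assumption, not the measure used in the article; the function names `term_counts`, `cosine_similarity`, and `responsiveness` are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def responsiveness(brief, opinion):
    """Hypothetical score: how much of the brief's vocabulary the opinion shares."""
    return cosine_similarity(term_counts(brief), term_counts(opinion))


# Toy texts, invented for illustration only.
brief = "the contract was breached"
opinion_a = "the contract was breached by the seller"
opinion_b = "jurisdiction is lacking"

score_a = responsiveness(brief, opinion_a)
score_b = responsiveness(brief, opinion_b)
```

On these toy texts the first opinion scores higher than the second, which shares no words with the brief. A real study would need stop-word handling, stemming, and validation against human judgments.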

Lu and Conrad on Bringing Order to Legal Documents: An Issue-based Recommendation System via Cluster Association

August 28, 2012

Qiang Lu and Jack G. Conrad, both of Thomson Reuters, will present a paper entitled Bringing Order to Legal Documents: An Issue-based Recommendation System via Cluster Association, at KEOD 2012: The 4th International Conference on Knowledge Engineering and Ontology Development, to be held 4-7 October 2012 in Barcelona, Catalonia, Spain.

Here is the abstract:

The task of recommending content to professionals (such as attorneys or brokers) differs greatly from the task of recommending news to casual readers. A casual reader may be satisfied with a couple of good recommendations, whereas an attorney will demand precise and comprehensive recommendations from various content sources when conducting legal research. Legal documents are intrinsically complex and multi-topical, contain carefully crafted, professional, domain-specific language, and possess a broad and unevenly distributed coverage of issues. Consequently, a high quality content recommendation system for legal documents requires the ability to detect significant topics from a document and recommend high quality content accordingly. Moreover, a litigation attorney preparing for a case needs to be thoroughly familiar not only with the principal arguments associated with various supporting opinions, but also with the secondary and tertiary arguments. This paper introduces an issue-based content recommendation system with a built-in topic detection/segmentation algorithm for the legal domain. The system leverages existing legal document metadata such as topical classifications, document citations, and click stream data from user behavior databases, to produce an accurate topic detection algorithm. It then links each individual topic to a comprehensive pre-defined topic (cluster) repository via an association process. A cluster labeling algorithm is designed and applied to provide a precise, meaningful label for each of the clusters in the repository, where each cluster is also populated with member documents from across different content types. This system has been applied successfully to very large collections of legal documents, O(100M), which include judicial opinions, statutes, regulations, court briefs, and analytical documents.
Extensive evaluations were conducted to determine the efficiency and effectiveness of the algorithms in topic detection, cluster association, and cluster labeling. Subsequent evaluations conducted by legal domain experts have demonstrated that the quality of the resulting recommendations across different content types is close to those created by human experts.

For full text of the paper, please contact the authors.

Thanks to Jack for allowing me to post the abstract.

Guissé et al.: From Regulatory Texts to BRMS: Guiding the Acquisition of Business Rules

August 22, 2012

Abdoulaye Guissé, Professor Dr. François Lévy, and Professor Dr. Adeline Nazarenko, all of Université Paris-Nord Laboratoire d’informatique (LIPN), will present a paper entitled From regulatory texts to BRMS: How to guide the acquisition of business rules? at RuleML 2012: International Symposium on Rules, Montpellier, France, August 27-29(30), 2012.

Here is the abstract:

This paper tackles the problem of rule acquisition, which is critical for the development of BRMS. The proposed approach assumes that regulations written in natural language (NL) are an important source of knowledge but that turning them into formal statements is a complex task that cannot be fully automated. The present paper focuses on the first phase of this acquisition process, the normalization phase that aims at transforming NL statements into controlled language (CL), rather than on their formalization into an operational rule base. We show that turning a NL text into a set of self-sufficient and independent CL rules is itself a complex task that involves some lexical and syntactic normalizations but also the restoration of contextual information and of implicit semantic entities to get a set of self-sufficient and unambiguous rule statements. We also present the SemEx tool that supports the proposed acquisition methodology based on the selection of the relevant text fragments and their progressive and interactive transformation into CL rule statements.

SemEx is:

a semantic explorer platform designed to assist business analysts in building a base of candidate business rules out of a policy document.

Papers Available for SPLeT 2012: Workshop on Semantic Processing of Legal Texts

May 27, 2012

Full text papers have been posted for SPLeT 2012: Workshop on Semantic Processing of Legal Texts, being held 27 May 2012 in Istanbul, Turkey.

Here is the list of papers:

  • Giulia Venturi: Design and Development of TEMIS: a Syntactically and Semantically Annotated Corpus of Italian Legislative Texts
  • Guido Boella, Luigi Di Caro, Llio Humphreys, Livio Robaldo: Using Legal Ontology to Improve Classification in the Eunomos Legal Document and Knowledge Management System
  • Antonio Lazari, Mª Ángeles Zarco-Tejada: JurWordNet and FrameNet Approaches to Meaning Representation: a Legal Case Study
  • Lorenzo Bacci, Enrico Francesconi, Maria Teresa Sagri: A Rule-based Parsing Approach for Detecting Case Law References in Italian Court Decisions
  • Adam Wyner, Wim Peters: Semantic Annotations for Legal Text Processing using GATE Teamware
  • Paulo Quaresma: Legal Information Extraction ← Machine Learning Algorithms + Linguistic Information
  • Adam Wyner: Problems and Prospects in the Automatic Semantic Analysis of Legal Texts
  • Felice Dell’Orletta, Simone Marchi, Simonetta Montemagni, Barbara Plank, Giulia Venturi: The SPLeT–2012 Shared Task on Dependency Parsing of Legal Texts
  • Giuseppe Attardi, Daniele Sartiano and Maria Simi: Active Learning for Domain Adaptation of Dependency Parsing on Legal Texts
  • Alessandro Mazzei, Cristina Bosco: Simple Parser Combination
  • Niklas Nisbeth, Anders Søgaard: Parser combination under sample bias

Boyd, Hoffman, et al. on Building a Taxonomy of Litigation: Clusters of Causes of Action in Federal Complaints

May 21, 2012

Professor Dr. Christina L. Boyd of the State University of New York (SUNY) – Department of Political Science, Professor David A. Hoffman of the Temple University School of Law and the Cultural Cognition Project at Yale Law School, and colleagues, have posted Building a Taxonomy of Litigation: Clusters of Causes of Action in Federal Complaints.

This article has been published in: Journal of Empirical Legal Studies, 10(2), 253-287 (2013): http://dx.doi.org/10.1111/jels.12010

Here is the abstract:

This project empirically explores civil litigation from its inception by examining the content of civil complaints. We utilize spectral cluster analysis on a newly compiled federal district court dataset of causes of action in complaints to illustrate the relationship of legal claims to one another, the broader composition of lawsuits in trial courts, and the breadth of pleading in individual complaints. Our results shed light not only on the networks of legal theories in civil litigation but also on how lawsuits are classified and the strategies that plaintiffs and their attorneys employ when commencing litigation. This approach permits us to lay the foundations for a more precise and useful taxonomy of federal litigation than has been previously available, one that, after the Supreme Court’s recent decisions in Bell Atlantic v. Twombly (2007) and Ashcroft v. Iqbal (2009), has also arguably never been more relevant than it is today.

This study is notable for several reasons, including that Computational Legal Studies founders Professor Dr. Daniel Martin Katz and Michael Bommarito commented on the statistical methodology used in the study, and that the study uses government data made public through RECAP, the open government data project developed by Harlan Yu, Stephen Schultze, and Timothy B. Lee, all of Princeton’s Center for Information Technology Policy.

Further, this study exemplifies the scholarly use of open government data predicted by David Robinson, Harlan Yu, and Ed Felten, in their influential article, Government Data and the Invisible Hand.

HT @freemoth.

JURIX 2010 Slides Available

January 16, 2011

Slides are now available for many papers given at JURIX 2010: The 23rd International Conference on Legal Knowledge and Information Systems, held 16-17 December 2010 at the University of Liverpool Computer Science Department, in Liverpool, England, UK.

HT JURIX Blog.

JURIX 2010

December 15, 2010

The final program has been posted for JURIX 2010: The International Conference on Legal Knowledge and Information Systems, being held 15-17 December 2010, at the University of Liverpool Department of Computer Science, in Liverpool, England, UK.

The Twitter hashtag for the conference is #jurix.

Papers are available from the 15 December workshop: Modelling Legal Cases and Legal Rules 2010.

Information is available about the invited speakers, who include John L. Sheridan of The National Archives (UK).

Information for conference participants is also available.

We wish our colleagues who are organizing, presenting at, or attending JURIX 2010 a very successful and rewarding conference.

JURIX 2010: Accepted Papers

October 9, 2010

Accepted papers have been announced for JURIX 2010: The International Conference on Legal Knowledge and Information Systems, to be held 16-17 December 2010, at the University of Liverpool Department of Computer Science, in Liverpool, England, UK.

Invited speakers for the conference have also been announced.

Stede & Kuhn on Identifying the Content Zones of German Court Decisions

May 22, 2010

Professor Dr. Manfred Stede and Florian Kuhn, both of Universität Potsdam Department Linguistik, have published Identifying the Content Zones of German Court Decisions, in Business Information Systems Workshops: BIS 2009 International Workshops, Poznan, Poland, April 27-29, 2009, Revised Papers (2009).

The paper was originally presented at LIT 2009: The 2nd Workshop on Legal Informatics and Legal Information Technology, held 28 April 2009 in Poznan, Poland.

Here is the abstract of the paper:

A central step in the automatic processing of court decisions is the identification of the various content zones, i.e., breaking up the document into functionally independent areas. We assembled a corpus of German court decisions and argue that this genre belongs to the class of semi-structured text documents. Currently, we are implementing zone identification by means of a set of recognition rules, following up on our earlier experiences with a different genre (film reviews).
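As a rough illustration of rule-based zone identification, the sketch below splits a decision into zones wherever a line matches a heading pattern. The rule set is an assumption made for this example (it uses common section headings of German judgments) and is not the authors' rule set; `ZONE_RULES` and `identify_zones` are hypothetical names.

```python
import re

# Hypothetical recognition rules: a heading pattern and the zone it opens.
ZONE_RULES = [
    (re.compile(r"^\s*Tenor\s*:?\s*$", re.IGNORECASE), "tenor"),
    (re.compile(r"^\s*Tatbestand\s*:?\s*$", re.IGNORECASE), "facts"),
    (re.compile(r"^\s*Entscheidungsgründe\s*:?\s*$", re.IGNORECASE), "reasons"),
]


def identify_zones(text):
    """Return a list of (zone_label, lines) pairs in document order.

    Text before the first recognized heading is labeled "header".
    """
    zones = []
    label, lines = "header", []
    for line in text.splitlines():
        matched = next((name for pat, name in ZONE_RULES if pat.match(line)), None)
        if matched is not None:
            # A heading closes the current zone and opens a new one.
            if lines:
                zones.append((label, lines))
            label, lines = matched, []
        elif line.strip():
            lines.append(line.strip())
    if lines:
        zones.append((label, lines))
    return zones


# Invented sample text, for illustration only.
sample = (
    "Amtsgericht Potsdam\n"
    "Urteil\n"
    "Tenor\n"
    "Die Klage wird abgewiesen.\n"
    "Tatbestand\n"
    "Die Parteien streiten um Miete.\n"
    "Entscheidungsgründe\n"
    "Die Klage ist unbegründet.\n"
)
zones = identify_zones(sample)
```

Real decisions vary far more than this sample, so a practical rule set would need many more patterns and some handling of headings that are missing or worded differently.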

