Kuhlman: Legal Synonyms Project, and word cloud of U.S. Code

Casey Kuhlman of the U.S. Open Data Institute has posted a word cloud of the U.S. Code.

Casey says the data are “actual word counts piped into a JQuery lib,” and that he is also working on “N grams and POS tags” for the U.S. Code.
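For readers new to the terms, word counts and n-grams can be computed in a few lines of Python. This is only a minimal sketch and not the pipeline Kuhlman used; the sample sentence, the simple alphabetic tokenizer, and the function names are assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical sample text, not taken from the U.S. Code.
sample = "The Secretary shall report to the Congress. The Secretary shall consult."

tokens = tokenize(sample)
word_counts = Counter(tokens)               # unigram frequencies (word cloud input)
bigram_counts = Counter(ngrams(tokens, 2))  # 2-gram frequencies

print(word_counts.most_common(3))
print(bigram_counts.most_common(2))
```

A word cloud library would typically take the word-to-count mapping as its input and scale each word by its count.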

This visualization is an outcome of his Legal Synonyms Project (HT @benbalter). Here is a description of that project, from the readme:

A synonym.txt for Solr Instances. Solr is a great search engine but it is even better with a bit of training. One of the most used ways to train Solr is to add a synonyms.txt file. Building a synonyms.txt file for a particular corpus of language is not an easy exercise. This repository is an attempt to build a synonyms.txt file for a legal corpus so that Solr can be used to search a corpus of documents of a legal nature.

The results of this effort rather than being strictly and traditionally versioned are contained in different synonyms.txt files. [...]
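As background, a Solr synonyms file is plain text: a line of comma-separated terms declares them equivalent, and a line with `=>` maps the terms on the left to the terms on the right. The sketch below shows one way such lines could be generated in Python. The helper name and the legal terms are hypothetical and are not drawn from the project's files.

```python
def format_synonym_lines(equivalents, mappings):
    """Build lines in the Solr synonyms file format.

    equivalents: list of term groups treated as interchangeable.
    mappings: dict from a source term to its list of replacement terms.
    """
    lines = []
    for group in equivalents:
        # Equivalent synonyms: "term1, term2, term3"
        lines.append(", ".join(group))
    for source, targets in mappings.items():
        # Explicit mapping: "source => target1, target2"
        lines.append(f"{source} => {', '.join(targets)}")
    return lines

# Hypothetical entries for illustration only.
example_lines = format_synonym_lines(
    [["attorney", "lawyer", "counsel"]],
    {"usc": ["united states code"]},
)
print("\n".join(example_lines))
```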

It will be interesting to compare this n-gram application to Daniel Martin Katz, Michael Bommarito, and colleagues’ Legal Language Explorer, which displays n-gram data for U.S. federal court decisions.

For more details, please see the Legal Synonyms Project repository.

HT @compleatang

Posted in Applications

Katz and Bommarito: Measuring the Complexity of the Law: The United States Code

Daniel Martin Katz and Michael J. Bommarito II have published Measuring the Complexity of the Law: The United States Code, forthcoming in Artificial Intelligence and Law.

A preprint of the paper is available on SSRN.

Samuel Arbesman has published a post about the paper at Wired Science.

Here is the abstract of the paper:

Einstein’s razor, a corollary of Ockham’s razor, is often paraphrased as follows: make everything as simple as possible, but not simpler. This rule of thumb describes the challenge that designers of a legal system face — to craft simple laws that produce desired ends, but not to pursue simplicity so far as to undermine those ends. Complexity, simplicity’s inverse, taxes cognition and increases the likelihood of suboptimal decisions. In addition, unnecessary legal complexity can drive a misallocation of human capital toward comprehending and complying with legal rules and away from other productive ends. While many scholars have offered descriptive accounts or theoretical models of legal complexity, most empirical research to date has been limited to simple measures of size, such as the number of pages in a bill. No extant research rigorously applies a meaningful model to real data. As a consequence, we have no reliable means to determine whether a new bill, regulation, order, or precedent substantially effects legal complexity. In this paper, we begin to address this need by developing a proposed empirical framework for measuring relative legal complexity. This framework is based on “knowledge acquisition”, an approach at the intersection of psychology and computer science, which can take into account the structure, language, and interdependence of law. We then demonstrate the descriptive value of this framework by applying it to the U.S. Code’s Titles, scoring and ranking them by their relative complexity. We measure various features of a title including its structural size, the net flow of its intra-title citations and its linguistic entropy. Our framework is flexible, intuitive, and transparent, and we offer this approach as a first step in developing a practical methodology for assessing legal complexity.
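To give a sense of what an entropy measure over language looks like, the sketch below computes the Shannon entropy of a text's word-frequency distribution. Whether this matches the paper's exact definition of linguistic entropy is an assumption; the tokenizer and the function name are illustrative only.

```python
import math
import re
from collections import Counter

def word_entropy(text):
    """Shannon entropy, in bits, of the word-frequency distribution of text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    # H = -sum(p * log2(p)) over the relative frequency p of each distinct word
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A text that repeats one word has zero entropy; four equally
# frequent words give log2(4) = 2 bits.
print(word_entropy("shall shall shall shall"))
print(word_entropy("shall may must will"))
```

Under this measure, a text with a more varied and evenly spread vocabulary scores higher than one that repeats a few words.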

Posted in Applications, Articles and papers, Methodology, Technology developments

Legal transactional hackathon: Code the Deal: 19-21 September 2014, NYC

Code the Deal, a legal transactional hackathon, is scheduled to be held 19-21 September 2014, in New York City.

The event is organized by Legal Hackers and Nixon Peabody.

Here is a description, from the event’s Website:

Code the Deal is a hackathon that will take place on September 19 to 21, 2014, at Dev Bootcamp’s NYC Lower Manhattan campus.

Participants will compete to create tech-enabled products that will improve transactional legal practice–tools that aid in counseling businesses through the legal and regulatory hurdles of consummating a sale or purchase. We believe that this is a huge, untapped market for entrepreneurship. Join us as we build some amazing products and compete for generous prizes.

What’s at Stake?

Grand Prize: $2500

Second Prize: $1000

Third Prize: $500

All participants will be invited for drinks and food at our opening reception on the evening of September 19th. We will also take care of your lunch while working (September 20-21). [...]

Twitter hashtags for the event include #codethedeal and #legalhack.

Resources and other details are available at the Hacker League site for the event.

For more details, please see the event Website.

HT @legalhackNYC

Posted in Applications, Conference Announcements, Hackathons, Hacking, Technology developments, Technology tools

Legal informatics papers at ICFIS 2014

A number of papers on legal informatics, legal data analysis, or legal information were presented at ICFIS 2014: International Conference on Forensic Inference and Statistics, held 19-22 August 2014 in Leiden.

The program is available at: https://dl.dropboxusercontent.com/u/6795661/ICFIS2014/program.html

HT Bart Verheij

Posted in Applications, Conference resources, Methodology, Technology developments

Zvenyach: Coding for Lawyers

V. David Zvenyach has released a book entitled Coding for Lawyers.

The book currently covers regular expressions, Markdown, HTML, data types, using arrays, and coding in Python.

The GitHub repository for the book is at: https://github.com/vzvenyach/codingforlawyers

Here is a description, from the FAQ:

[...] It’s true. Lawyers can code. In fact, if you’re a lawyer, the truth is that it’s easier than you think. I am a lawyer, and a coder. In the course of two years, I have gone from knowing essentially nothing to being a decent coder in several languages. This book is intended to drastically shorten that time for others who, like me, decide that they want to learn to code. [...] At the moment, I am still making many decisions about this project, and I want your feedback. Is it worth it? Are any lawyers actually interested? Are the chapters too dense? Too easy? Are there topics that you definitely want covered? A great way to help would be to send me an email at [the email address listed in the FAQ] and let me know what you think. An even better way is to submit an issue on GitHub or submit a pull request. [...]

For more details, please see the book or the FAQ.

HT @compleatang

Posted in Applications, Guides, Monographs

Ramakrishna and Paschke: Semi-Automated Vocabulary Building for Structured Legal English

Shashishekar Ramakrishna and Adrian Paschke presented a paper entitled Semi-Automated Vocabulary Building for Structured Legal English, at RuleML 2014: International Web Rule Symposium, held 18-20 August 2014 in Prague.

Here is the abstract of the paper:

Structured English has been applied as computational independent language for defining business vocabularies and business rules, e.g., in the context of OMG’s Semantics and Business Vocabulary Representation (SBVR). It allows non-technical domain experts to engineer knowledge in natural language, but with an underlying semi-formal semantics which eases the automation of machine transformation into formal knowledge representations and logic-based machine interpretation. We adapt this approach to the legal domain in order to support legal domain experts in their task to build legal vocabularies and legal rules in Structured English from legal texts. In this paper we contribute with a semi-automated vocabulary and rule development process which is supported by automated suggestions of legal concepts computed by a semantic legal text analysis. We implement a proof-of-concept in the KR4IPLaw tool, which enables legal domain experts to represent their knowledge in Structured English. We evaluate the proposed approach on the basis of use cases in the domain of IP and patent law.

Posted in Applications, Articles and papers, Conference papers, Technology developments, Technology tools