Posts Tagged ‘Relevance in legal information retrieval’

Sinha on Speed and Relevance Improvements to Indian Kanoon

October 25, 2012

Dr. Sushant Sinha of Indian Kanoon has posted Faster and More Relevant Kanoon!, at the Indian Kanoon forums.

He writes:

A new release of IndianKanoon brings in the following changes:
1. A new tiering function that slims down the top tier and significantly improves the time taken to execute a query.
2. A new ranking function to improve relevance.
3. Improved word matching and abbreviations. A search of “adm jabalpur” will match “additional district magistrate jabalpur”
http://www.indiankanoon.org/search/?formInput=adm+jabalpur
4. New operators ANDD, ORR and NOTT that can be used with words
5. Clicking on a document after a search query shows the contexts in the document in which the query appears.
6. Performance improvements coming from upgrade to Postgresql 9.2
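
As a rough illustration of the search URL quoted in item 3, the sketch below builds the same kind of link in Python. The `formInput` parameter name and base address are taken from that URL. The helper name `build_search_url` is hypothetical, and the second query only shows that an operator such as ANDD is passed as an ordinary word in the query string; how Indian Kanoon interprets it is not described in the announcement.

```python
from urllib.parse import urlencode

# Base address copied from the example URL in the announcement.
BASE_URL = "http://www.indiankanoon.org/search/"


def build_search_url(query: str) -> str:
    """Return a search URL with the query encoded in the formInput parameter."""
    # urlencode turns spaces into '+' by default, as in the quoted URL.
    return BASE_URL + "?" + urlencode({"formInput": query})


abbreviation_url = build_search_url("adm jabalpur")
operator_url = build_search_url("murder ANDD appeal")  # hypothetical query

print(abbreviation_url)  # http://www.indiankanoon.org/search/?formInput=adm+jabalpur
print(operator_url)
```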

For more information or to provide comments, please see the complete post.

HT @sushantsinha

Ghosh et al. on Cluster-based Relevance Feedback in Legal Information Retrieval

July 30, 2012

Kripabandhu Ghosh of the Indian Statistical Institute Information Retrieval Lab, and colleagues, have published Cluster-based Relevance Feedback: TREC Legal Track 2011, in The Twentieth Text REtrieval Conference (TREC 2011) Proceedings.

Here is the abstract:

This is our second participation in the TREC Legal Track. The TREC Legal Track 2011 featured only the Learning Task. We participated in Topics 401 and 403. We used Lemur 4.11 for Boolean retrieval and followed it with a clustering technique, where we chose members from each cluster (which we called seeds) for relevance judgement by the TA and assumed all other members of the cluster whose seeds are assessed as relevant to be relevant. Based on the relevance information from seeds and their clusters, we applied Rocchio relevance feedback technique implemented in Terrier 3.0. Then, we used the feedback terms for the expansion of both the text queries and the Boolean queries. Finally, we used Z-fusion, a data fusion technique, on two of our runs.
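
The seed-and-cluster step and the Rocchio step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' Lemur or Terrier setup: the function names, the document ids, the term weights, and the `alpha` and `beta` values are all assumptions, and only positive feedback is modelled because the abstract mentions only clusters whose seeds were judged relevant.

```python
from collections import defaultdict


def propagate_seed_labels(clusters, seed_judgments):
    """Treat every member of a cluster as relevant if its seed was judged relevant."""
    relevant = []
    for cluster_id, members in clusters.items():
        if seed_judgments.get(cluster_id, False):
            relevant.extend(members)
    return relevant


def rocchio_expand(query_vec, relevant_docs, alpha=1.0, beta=0.75):
    """Positive-only Rocchio: alpha * query + beta * centroid of relevant documents."""
    centroid = defaultdict(float)
    for doc in relevant_docs:
        for term, weight in doc.items():
            centroid[term] += weight / len(relevant_docs)
    expanded = defaultdict(float)
    for term, weight in query_vec.items():
        expanded[term] += alpha * weight
    for term, weight in centroid.items():
        expanded[term] += beta * weight
    return dict(expanded)


# Hypothetical toy data: two clusters, one seed judged relevant.
clusters = {"c1": ["d1", "d2"], "c2": ["d3"]}
seed_judgments = {"c1": True, "c2": False}
doc_vectors = {
    "d1": {"fraud": 1.0, "audit": 1.0},
    "d2": {"fraud": 1.0},
    "d3": {"lunch": 1.0},
}

relevant_ids = propagate_seed_labels(clusters, seed_judgments)
expanded = rocchio_expand({"fraud": 1.0}, [doc_vectors[d] for d in relevant_ids])
print(expanded)  # {'fraud': 1.75, 'audit': 0.375}
```

In this toy run, "audit" enters the query as a feedback term because it occurs in a document assumed relevant, while "lunch" does not.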

Click here for reports of the authors’ results (scroll down to INDIAN STATISTICAL INSTITUTE, KOLKATA).

Click here for the overview paper on the 2011 TREC Legal Track.

Nevelow Mart & Luftig on Curation of Legal Resources, and Digest and Citator Results in Wexis

July 24, 2012

Professor Susan Nevelow Mart of the University of Colorado Boulder School of Law, and Professor Dr. Jeffrey T. Luftig of the University of Colorado, Boulder, have posted the abstract of a new paper entitled The Case for Curation: The Relevance of Digest and Citator Results in Westlaw and Lexis.

Here is the abstract:

Humans and machines are both involved in the creation of legal research resources. For legal information retrieval systems, the human-curated finding aid is being overtaken by the computer algorithm. But human-curated finding aids still exist. One of them is the West Key Number system. The Key Number system’s headnote classification of case law, started back in the nineteenth century, was and is the creation of humans. The retrospective headnote classification of the cases in Lexis’s case databases, started in 1999, was created primarily although not exclusively with computer algorithms. So how do these two very different systems deal with a similar headnote from the same case, when they link the headnote to the digesting and citator functions in their respective databases? This paper continues an investigation into this question, looking at the relevance of results from digest and citator search run on matching headnotes in ninety important federal and state cases, to see how each performs. For digests, where the results are curated – where a human has made a judgment about the meaning of a case and placed it in a classification system – humans still have an advantage. For citators, where algorithm is battling algorithm to find relevant results, it is a matter of the better algorithm winning. But no one algorithm is doing a very good job of finding all the relevant results; the overlap between the two citator systems is not that large. The lesson for researchers: know how your legal research system was created, what involvement, if any, humans had in the curation of the system, and what a researcher can and cannot expect from the system you are using.

This paper was presented at AALL 2012: American Association of Law Libraries’ Annual Meeting, held 21-24 July 2012, in Boston, Massachusetts, USA.

Zhang et al. on Legal Information Discovery Based on Relevant Feedback

July 17, 2012

Jiayue Zhang and colleagues of the Beijing University of Posts and Telecommunications School of Information and Communication Engineering have published PRIS at TREC 2011 Legal Track: Discovery Based on Relevant Feedback, in The Twentieth Text REtrieval Conference (TREC 2011) Proceedings.

Here is the abstract:

In order to finish the task of TREC 2011 Legal Track, this paper puts forward an experiment method, which combines indri and relevant feedback to evaluate the probability of relevance of every document in a collection.

For reports of the authors’ results, please see:

Click here for the overview paper on the 2011 TREC Legal Track.

Grossman et al.: Overview of the TREC 2011 Legal Track

July 13, 2012

Maura R. Grossman, Esq., of Wachtell, Lipton, Rosen & Katz, and colleagues, have published Overview of the TREC 2011 Legal Track, in The Twentieth Text REtrieval Conference (TREC 2011) Proceedings.

Here is the abstract:

The TREC 2011 Legal Track consisted of a single task: the learning task, which captured elements of both the TREC 2010 learning and interactive tasks. Participants were required to rank the entire corpus of 685,592 documents by their estimate of the probability of responsiveness to each of three topics, and also to provide a quantitative estimate of that probability. Participants were permitted to request up to 1,000 responsiveness determinations from a Topic Authority for each topic. Participants elected either to use only these responsiveness determinations in preparing automatic submissions, or to augment these determinations with their own manual review in preparing technology-assisted submissions. We provide an overview of the task and a summary of the results. More detailed results are available in the Appendix to the TREC 2011 Proceedings.
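
To make the shape of a submission concrete, here is a small sketch of ranking documents by an estimated probability of responsiveness. The document ids and probability values are invented for illustration, and summing the estimates to get an expected count of responsive documents is one simple use of such probabilities, not something the overview abstract prescribes.

```python
def rank_by_probability(estimates):
    """Sort (doc_id, probability) pairs from most to least likely responsive."""
    return sorted(estimates.items(), key=lambda pair: pair[1], reverse=True)


def expected_responsive(ranking, cutoff):
    """Expected number of responsive documents in the top `cutoff` results,
    taking the probability estimates at face value."""
    return sum(probability for _, probability in ranking[:cutoff])


# Hypothetical estimates for three documents on one topic.
estimates = {"doc-a": 0.75, "doc-b": 0.25, "doc-c": 0.5}

ranking = rank_by_probability(estimates)
top_two_expected = expected_responsive(ranking, cutoff=2)

print(ranking)           # [('doc-a', 0.75), ('doc-c', 0.5), ('doc-b', 0.25)]
print(top_two_expected)  # 1.25
```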

HT Gordon V. Cormack.

Geist on Demystifying Legal Relevance

March 31, 2012

Anton Geist, LL.M., of Wirtschaftsuniversität Wien, has posted The Need to Demystify Legal Relevance, on the VoxPopuLII blog, published by the Legal Information Institute at Cornell University Law School.

In this post, Mr. Geist argues that the new online information environment, characterized by mass data, requires relinquishing the traditional information retrieval concepts of recall and precision, in favor of the concept of relevance. Mr. Geist recommends ways in which the concept of relevance should be operationalized to meet the particular needs of legal information systems users.

For more information, please see the complete post.

Malmgren: Towards a Theory of Jurisprudential Relevance Ranking – Using Link Analysis on EU Case Law

September 22, 2011

Staffan Malmgren of Stockholm University and the free access to law service of Sweden, lagen.nu, has posted his Master’s thesis, Towards a Theory of Jurisprudential Relevance Ranking – Using Link Analysis on EU Case Law (2011). Here is the abstract:

The concept of relevance is central to both jurisprudence and information retrieval. But what do we mean when we say that something is relevant? Is there a difference between how relevance is understood in jurisprudence and in information science? Which aspects that are unique to legal information have effect on relevance? And can we use this to build better information retrieval systems for legal information?

This thesis discusses the concept of relevance, both as it is used in general and in legal contexts. It describes the retrieval models used in modern information systems, and what notion these models have of relevance. By examining the legal reasoning process, in particular the process of finding legal information, it attempts to find a retrieval model and a function for ranking that is adapted to legal information.

This function is implemented and evaluated against a traditional probabilistic ranking algorithm. It is shown to perform substantially better for all tested information need scenarios.
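
The abstract does not say which link analysis method the thesis uses. Purely as an illustration of the general idea of ranking cases by their citation links, the sketch below runs a basic PageRank over a tiny, made-up citation graph. The case names, the damping factor, and the iteration count are all assumptions and are not drawn from the thesis.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Basic PageRank by power iteration.

    `citations` maps each case to the list of cases it cites.
    """
    nodes = set(citations)
    for cited_list in citations.values():
        nodes.update(cited_list)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by cases that cite nothing is spread evenly over all cases.
        dangling = sum(rank[node] for node in nodes if not citations.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for citing, cited_list in citations.items():
            for cited in cited_list:
                new_rank[cited] += damping * rank[citing] / len(cited_list)
        rank = new_rank
    return rank


# Hypothetical graph: two later cases both cite an earlier one.
example_citations = {"case_a": ["case_c"], "case_b": ["case_c"], "case_c": []}
ranks = pagerank(example_citations)
most_cited = max(ranks, key=ranks.get)
print(most_cited)  # case_c
```

A score of this kind reflects only how a case is cited, so a real system would presumably combine it with a text-based relevance score.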

New on Slaw.ca: Indian Kanoon: Sushant Sinha on Innovation and Free Law in India

June 1, 2011

Dr. Sushant Sinha’s free access to law service for India, Indian Kanoon, is the subject of my new, in-depth article on Slaw.ca, the Canadian legal blog.

The article provides a great deal of detailed information about Indian Kanoon, including information on technology and open source, users and usage, business models and sustainability, partnerships, product differentiation, advertising, Indian Kanoon’s online forums, and Dr. Sinha’s innovative concept of “the thirst for law”: the idea, first expressed in Dr. Sinha’s recent VoxPopuLII post, that free access to law online stimulates the public’s demand for such access, in a virtuous circle.

The post also describes Indian Kanoon’s new partnership with PRS Legislative Research, in which Indian Kanoon is adding to its content full text of the debates of the Parliament of India, which PRS will then use in providing legislative history and research services to members of India’s national and state parliaments. This partnership exemplifies how civil-society-based free-access-to-law services can function as “extensions” of e-Government, consistent with the vision set out by Robinson et al., in Government Data and the Invisible Hand.

I’m very grateful to Dr. Sinha for taking time for extensive interviews that provided the content of the article.

New on VoxPopuLII: Sinha on Indian Kanoon: The Genesis and the Legal Thirst

March 18, 2011

Dr. Sushant Sinha of Yahoo! India has posted Indian Kanoon: The Genesis and the Legal Thirst, on the VoxPopuLII Blog, published by the Legal Information Institute at Cornell University Law School.

In this post, Dr. Sinha describes the origins and development of Indian Kanoon, the free legal search engine for India, for which Dr. Sinha was recently named one of “18 Young Innovators under 35 in India” by MIT’s Technology Review India.

Indian Kanoon provides free online access to Indian statutes, judicial and administrative decisions, debates of India’s constituent assemblies, reports of the Indian Law Commission, and articles from selected law journals. Indian Kanoon also hosts several discussion forums, in which users can ask and receive responses to questions concerning substantive legal issues or Indian Kanoon’s functionality.

In his post, Dr. Sinha identifies as the principal goal of Indian Kanoon the “empower[ment of] citizens” by enabling them to become informed about “their rights and privileges” under the law.

Dr. Sinha observes that the number of visitors to Indian Kanoon is extremely large and steadily rising; and that the average visitor to Indian Kanoon spends substantial time viewing each retrieved document. Dr. Sinha concludes that these data indicate a growing demand among the Indian people for access to the law — a demand he calls The Legal Thirst — and considers possible causes for this increasing demand.

Dr. Sinha suggests that two factors in particular — the provision of access to law free of charge, and improvements in search technology, including “forgiving” keyword search functionality and the ranking of results by relevance — are fueling the desire of the Indian public to read the full text of the laws that govern them.

Dr. Sinha’s post will be of interest to legal information systems developers, legal publishers, the ICT for development community, and all those interested in the free access to law movement.

Hedges at CITP: Confidentiality, Public Access to Court Records, & Automated Relevance & Privilege Review

August 3, 2010

Former U.S. Magistrate Judge Ronald J. Hedges, Esq. of The Sedona Conference has become a Visiting Research Collaborator with Princeton University’s Center for Information Technology Policy (CITP), according to his recent post at the CITP blog, Freedom to Tinker.

In that post, Judge Hedges describes the topics on which he will conduct research and organize scholarly discussion at CITP. Those topics include:

  • Protecting the confidentiality of personally identifying information and other sensitive information contained in electronic court records;
  • Providing public access to electronic court records;
  • Automatically reviewing electronically stored information (ESI) in the litigation context — and particularly in the eDiscovery context — for relevance and the application of legal privilege.

For more information on Judge Hedges’s work in this area, follow his posts at Freedom to Tinker.

