Posts Tagged ‘Enron corpus’

New Discussion: Bommarito on Hands-on Examples of Legal Search

May 18, 2012

Michael J. Bommarito II of Computational Legal Studies has started a discussion entitled Hands-on Examples of Legal Search, in the LinkedIn group of the International Association for Artificial Intelligence and Law (IAAIL).

The examples come from Mr. Bommarito’s recent work: three posts describing cloud-based approaches to legal information retrieval and eDiscovery using AWS CloudSearch, and a post on information retrieval of statutes using Lucene.

For more information, please see the complete post.

Bommarito on AWS CloudSearch for eDiscovery: Generating AWS CloudSearch SDF for Emails

May 8, 2012

Michael J. Bommarito II of Computational Legal Studies has published two posts on his blog: Generating AWS CloudSearch SDF for Emails, and “Google” for subpoenaed emails: AWS CloudSearch for eDiscovery.

In these posts, Mr. Bommarito proposes using AWS CloudSearch for eDiscovery and provides a worked example, which he describes as follows:

I thought I’d share a proof-of-concept email parser based on the Enron email dataset. The Python script below takes a directory of RFC822 email messages and returns an AWS CloudSearch JSON SDF with fields from the Date, From, To, Subject, and Body fields of the email. There is no special handling for attachments or encoding in this example, but it can be used to populate a CloudSearch domain from the Enron emails. Sample usage below, as well as the output sample here. [...]
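To give a rough sense of what such a parser involves, here is a minimal sketch in Python. It is not Mr. Bommarito’s script; his source code is in the complete posts. The function names (message_to_sdf, directory_to_sdf), the field names (date, sender, recipient, subject, body), the use of a SHA-1 digest as the document id, and the "add" document layout with type, id, version, lang, and fields keys are all assumptions made for illustration, and should be checked against the CloudSearch documentation for the API version in use. Like the example described above, it does no special handling of attachments or unusual encodings.

```python
import email
import hashlib
import json
from email import policy
from pathlib import Path


def message_to_sdf(raw_bytes, version=1):
    """Turn one raw RFC 822 message into a single SDF-style 'add' document.

    The key names and field names below are illustrative assumptions,
    not a confirmed CloudSearch schema.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)

    # Take the plain-text body if there is one; attachments are ignored.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""

    return {
        "type": "add",
        # Assumed id scheme: hex digest of the raw message (lowercase, stable).
        "id": hashlib.sha1(raw_bytes).hexdigest(),
        "version": version,
        "lang": "en",
        "fields": {
            "date": str(msg["Date"] or ""),
            "sender": str(msg["From"] or ""),
            "recipient": str(msg["To"] or ""),
            "subject": str(msg["Subject"] or ""),
            "body": body,
        },
    }


def directory_to_sdf(directory):
    """Walk a directory tree and return a JSON batch of 'add' documents."""
    paths = sorted(p for p in Path(directory).rglob("*") if p.is_file())
    return json.dumps([message_to_sdf(p.read_bytes()) for p in paths], indent=2)
```

Calling directory_to_sdf on a folder of messages (for instance, one custodian’s mailbox from the Enron dataset) returns a JSON string that could then be split into batches and uploaded to a search domain; the upload step is outside this sketch.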

For more information, including source code, please see the complete posts: Generating AWS CloudSearch SDF for Emails, and “Google” for subpoenaed emails: AWS CloudSearch for eDiscovery.

