Jaquith: Two Mini-Projects spun off from The State Decoded: Subsection Identifier & Definition Scraper

Waldo Jaquith has posted Two Mini-Projects: Subsection Identifier and Definition Scraper, at The State Decoded blog.

Here are excerpts from the post:

The State Decoded project has spun off a couple of sub-projects, components of the larger project that can be useful for other purposes, and that deserve to stand alone. (Both are found on our GitHub repository.)

The first is Subsection Identifier, which turns theoretically structured text into actually structured text. It is common for documents in outline form (contracts, laws, and other documents that need to be able to cross-reference specific passages) to be provided in a format in which the structural labels flow into the text. […]

The second mini-project is Definition Scraper, which extracts defined terms from passages of text. Many legal documents begin by defining words that are then used throughout the document, and knowing those definitions can be crucial to understanding that document. So it can be helpful to be able to extract a list of terms and their definitions. Definition Scraper needs only be handed a passage of text, and it will determine whether it contains defined terms and, if it does, it will return a dictionary of those terms and their definitions. […]

For more details, please see the complete post.

The State Decoded is Waldo’s free and open legal data and e-participation platform for U.S. states.

Click here for other posts about The State Decoded.

HT @StateDecoded here and here

This entry was posted in Applications, Technology developments, Technology tools and tagged , , , , , , , , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s