Shaffer: Using NLP to measure executive discretion provided for by constitutions and statutes

Robert Shaffer of the University of Texas will present a paper entitled "Power in Text: Grammar and Language in Comparative Delegation" this week at APSA 2013.

Here is the abstract:

Throughout the political science and legal literatures, scholars often use statutes and national constitutions as key sources of data. However, most of these analyses rely on labor-intensive coding schemes, offering precise results but requiring long hours from trained researchers. Existing quantitative measures (e.g., word counts of particular documents) have produced insightful results, but provide imprecise measures for variables of interest. Computational linguistics techniques, including natural language processing (NLP), provide an alternative approach; because legal documents are written so systematically, these texts lend themselves well to automated analysis, allowing computers to extract information in a repeatable fashion. By combining NLP tools with existing coding schemes and close readings of individual documents, scholars can identify and measure key traits of particular texts, creating powerful and innovative measurement schemes.

As a sample application of these techniques, I use NLP programming packages to develop a new measure for the level of executive discretion offered by a particular legal document. I conceptualize “discretion” as the average number of other players involved at each decision point in a statute or national constitution. I then use computational linguistics tools to develop two proposed measures, based on normalized word count and on proper noun incidence, respectively. In particular, I attempt to measure both the number of powers offered by a document, and the number of veto players involved in each power. Finally, I conduct validity tests on each measure, as well as on competing approaches from the literature. For my validity tests, I use data obtained from Elkins, Ginsburg, and Melton’s Comparative Constitutions Project (CCP), which hand-codes national constitutions based on a wide array of attributes. Using CCP data, I generate summary “discretion” statistics for a sample of post-1945 constitutions, which I treat as the “true values” for each document. I then compare the results for each of my measures to the CCP data. Generally speaking, I find that my NLP-based measures are more strongly correlated with these “true values” than the competing approaches, highlighting the potential power of these tools.
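To make the idea concrete, here is a minimal Python sketch of what a proper-noun-based measure and its validity check might look like. It is not the paper's implementation: the abstract does not say which NLP package, tagging rules, or normalization the author uses. The sketch assumes a crude stand-in for part-of-speech tagging (a capitalized word that does not open a sentence counts as a proper noun), and the function names `proper_noun_incidence` and `pearson` are hypothetical.

```python
import math
import re


def proper_noun_incidence(text):
    """Share of word tokens that look like proper nouns.

    Assumption (not from the paper): a token counts as a proper noun if it
    is capitalized and is not the first word of its sentence. A real
    analysis would use a part-of-speech tagger instead of this heuristic.
    """
    sentences = re.split(r"(?<=[.!?;])\s+", text.strip())
    total = 0
    proper = 0
    for sentence in sentences:
        tokens = re.findall(r"[A-Za-z]+", sentence)
        total += len(tokens)
        # Skip the first token: it is capitalized regardless of its role.
        proper += sum(1 for tok in tokens[1:] if tok[0].isupper())
    return proper / total if total else 0.0


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Under these assumptions, one would compute `proper_noun_incidence` for each constitution in the sample and pass the resulting scores, together with the hand-coded CCP discretion values for the same documents, to `pearson`. A higher correlation would suggest that the automated measure tracks the hand-coded one more closely.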

HT @aabibliographer

