Lee: What Gets Redacted in Pacer?

Timothy B. Lee of the Princeton University Department of Computer Science and Center for Information Technology Policy (CITP) has posted What Gets Redacted in Pacer? on the CITP blog, Freedom to Tinker.

In this post, Mr. Lee reports on research into documents from the U.S. federal courts’ PACER database. Working with customized software and a non-random sample of 1.8 million PACER documents, of which 11,000 appeared to contain redactions, Mr. Lee identifies the types of information most frequently redacted in PACER documents. In this sample, Social Security numbers were the most frequently redacted type of information. Mr. Lee summarizes:

[…][O]ut of 6208 redacted documents, there are 4315 Social Security [numbers] that can be redacted automatically by machine, 449 addresses whose redaction doesn’t seem to be required by the rules of procedure, and 419 “trade secrets” whose release will typically only harm the party who fails to redact it.

That leaves around 1000 documents that would expose risky confidential information if not properly redacted, or about 0.05 percent of the 1.8 million documents I started with. A thousand documents is worth taking seriously (especially given that there are likely to be tens of thousands in the full PACER corpus). The courts should take additional steps to monitor compliance with the redaction rules and sanction parties who fail to comply with them, and they should explore techniques to automate the detection of redaction failures in these categories.
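To illustrate what redacting Social Security numbers "automatically by machine" might look like, here is a minimal Python sketch. It is not Mr. Lee's software, which the post summarized here does not describe in detail. The pattern, the function name `redact_ssns`, and the choice to keep only the last four digits are all assumptions made for illustration, and the pattern only catches numbers written in the dashed 3-2-4 form.

```python
import re

# Assumption: SSNs appear as plain text in the dashed form 123-45-6789.
# Numbers written without dashes, or split across lines, are not caught.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def redact_ssns(text: str) -> str:
    """Mask the first five digits of each dashed SSN, keeping the last four."""
    return SSN_PATTERN.sub(r"XXX-XX-\1", text)


def contains_ssn(text: str) -> bool:
    """Report whether the text still holds an unmasked dashed SSN."""
    return SSN_PATTERN.search(text) is not None


if __name__ == "__main__":
    sample = "Debtor SSN: 123-45-6789, case no. 09-12345"
    print(redact_ssns(sample))
    print(contains_ssn(redact_ssns(sample)))
```

A check like `contains_ssn` could also serve as a rough screen for redaction failures of the kind the quoted passage mentions, though a real system would need to handle many more formats and would produce false positives on other nine-digit identifiers.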

Mr. Lee’s post does not appear to explain the discrepancy between the 11,000 documents found to contain redactions and the 6,208 documents covered by his statistical analysis.

Mr. Lee concludes:

This tiny fraction of PACER documents with confidential information in them is a cause for concern, but it probably isn’t a good reason to limit public access to the roughly 99.9 percent of documents that contain no sensitive information and may be of significant benefit to the public.
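The figures quoted above can be checked with a few lines of arithmetic. The Python sketch below uses only the numbers given in this summary; the variable names are illustrative, and it assumes the three named categories do not overlap, which the quoted passage implies but does not state.

```python
# Figures as quoted in this summary of Mr. Lee's post.
total_sample = 1_800_000      # documents in the non-random sample
apparent_redactions = 11_000  # documents that appeared to contain redactions
analyzed = 6208               # redacted documents in the statistical breakdown
ssn = 4315                    # Social Security numbers
addresses = 449               # addresses
trade_secrets = 419           # "trade secrets"

# Documents left after the three named categories (assumed non-overlapping).
remaining = analyzed - ssn - addresses - trade_secrets

# Share of the full sample, as a percentage, and its complement.
share_pct = 100 * remaining / total_sample
complement_pct = 100 - share_pct

# Gap between the two redaction counts that the post leaves unexplained.
unexplained = apparent_redactions - analyzed

print(remaining)                 # 1025
print(round(share_pct, 3))       # 0.057
print(round(complement_pct, 1))  # 99.9
print(unexplained)               # 4792
```

The remainder of 1,025 matches "around 1000", and its share of the sample, roughly 0.057 percent, is consistent with the quoted "about 0.05 percent" and "roughly 99.9 percent" once rounding is allowed for.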

For more information, please see the complete post.

