Timothy B. Lee of the Princeton University Department of Computer Science and Center for Information Technology Policy (CITP) has posted "Studying the Frequency of Redaction Failures in PACER" on CITP's blog, Freedom to Tinker.
In this post, Mr. Lee reports on research into documents from the U.S. federal courts' PACER database. Using custom software, he found that in some of these documents redactions had been attempted but had failed. The information left unredacted included:
trade secrets such as sales figures and confidential product information. Other improperly redacted documents contain sensitive medical information, addresses, and dates of birth. Still others contain the names of witnesses, jurors, plaintiffs, and one minor.
Mr. Lee then offers recommendations to the U.S. federal judiciary on how to avoid this problem. He links to a letter that he recently sent to a committee of the Judicial Conference of the United States; the letter states many of these recommendations.
Mr. Lee has also posted the software code that he used to identify the unsuccessfully redacted documents.
Mr. Lee says that this research was funded by Public.Resource.Org.
For more information on CITP's PACER-related research, please see Stephen Schultze's recent VoxPopuLII post, "PACER, RECAP, and the Movement to Free American Case Law."