The addition of case law and patent search functions to Google Scholar, and subsequent comments from Richard Nash, prompted the following thoughts, which are mine alone:
In my view, the U.S. market for high-end computer-assisted legal research (CALR) currently requires at least the following system features:
- complete and current collections of primary law;
- large collections of quality secondary legal resources;
- retrieval systems that can search across multiple primary and secondary legal resource databases; and
- specialized and current descriptive metadata applied to document segments, especially citator systems and knowledge representation for points of law.
This high-end market appears to be shrinking, for several reasons:
- clients are forcing cost reductions;
- the low-end legal research market (which typically offers only “plain-vanilla” retrieval of primary legal documents, with little if any descriptive metadata) is improving its metadata (currently by adding citators and, probably before long, subject access to whole documents, if not to segments); and
- many lawyers and paralegals appear to be learning to live with the features offered by low-end systems, according to recent survey evidence, such as that from the latest ABA Legal Technology Survey Report.
Google’s U.S. CALR strategy so far seems to have been to engage with this sector from the bottom up. Google first developed its presence in the low-end market by indexing the free case law available on the Web. This week, it has advanced “up market” by integrating that primary legal resource retrieval system with the secondary legal resource retrieval systems of Google Scholar and Google Book Search.
I think that, if Google really wants to compete for the high-end market, its next steps would be:
- to fill the holes in the primary legal resource collections that it indexes (perhaps by offering incentives to organizations that publish primary legal resources free of charge on the Web);
- to build an effective, automated legal citator, as Andrew Plumb-Larrick discusses in his fine post today; and
- to develop a good-quality, automated knowledge representation system to provide subject access to individual primary legal documents (i.e., individual statutes, cases, and regulations).
An interesting white paper published earlier this year by Tim O’Reilly and John Battelle may shed light on Google’s next steps. That paper notes Google’s aptitude for developing sophisticated automatic metadata creation systems, such as the one underlying Google Mobile App, that incorporate improvements drawn from the study of large numbers of user searches. This aptitude suggests that Google probably won’t need very long to build a good-quality automatic legal citator and subject indexing system, if it has a mind to. If Google takes those further steps, then I think it could take a big share of the high-end CALR market. At the very least, its efforts, coupled with Bloomberg’s, should result in increased competition, lower prices, and more innovation, yielding better retrieval tools for users in the U.S. CALR sector. (On the obstacles to innovation in the U.S. CALR sector, see Professor Viktor Mayer-Schönberger’s recent post on VoxPopuLII.)
Google’s ongoing development of its legal research system could also yield additional benefits for the legal community. The O’Reilly-Battelle white paper notes Google’s skill at identifying implicit structures in large data sets. These implicit structures can complement express, formal descriptive metadata, such as, in the legal realm, West’s Key Number System, Lexis’s Headnotes, or Bloomberg’s Points of Law. One great potential benefit of Google’s development of its legal research system is the automated discovery of previously unknown, implicit structures in primary and secondary legal information. Some legal scholars are already exploring this area by using techniques developed in connection with the study of complex adaptive systems. I think Google may also have important contributions to make to this area of research, particularly if it chooses to publish its research findings.
Tags: Automatic subject classification of legal documents, Automatic subject indexing of legal documents, Computer assisted legal research, Free access to law, Google Mobile App, Google Mobile Application, Google Scholar, Implicit metadata in legal information, Implicit structure of legal information, John Battelle, Law as a complex adaptive system, Legal descriptive metadata, Legal knowledge representation, Legal metadata, Legal research, O'Reilly White Papers, Public access to legal information, Subject access to legal information, Tim O'Reilly, Web Squared: Web 2.0 Five Years On