`
`No. 18-1150
`================================================================================================================
`
`In The
`Supreme Court of the United States
`--------------------------------- ---------------------------------
`STATE OF GEORGIA, et al.,
`Petitioners,
`
`v.
`
`PUBLIC.RESOURCE.ORG, INC.,
`Respondent.
`
`--------------------------------- ---------------------------------
`On Writ Of Certiorari To The
`United States Court Of Appeals
`For The Eleventh Circuit
`--------------------------------- ---------------------------------
`BRIEF OF 36 COMPUTATIONAL
`LAW SCHOLARS AS AMICI CURIAE
`IN SUPPORT OF RESPONDENT
`--------------------------------- ---------------------------------
`MICHAEL A. LIVERMORE
` Counsel of Record
`Professor of Law
`UNIVERSITY OF VIRGINIA SCHOOL OF LAW
`580 Massie Road
`Charlottesville, VA 22903
`Tel: (434) 982-6224
`mlivermore@virginia.edu
`CHARLOTTE S. ALEXANDER
`Associate Professor of Law and Analytics
`J. MACK ROBINSON COLLEGE OF BUSINESS
`GEORGIA STATE UNIVERSITY
`35 Broad Street, NW
`Atlanta, GA 30303
`Tel: (404) 413-7468
`calexander@gsu.edu
`ANNE M. TUCKER
`Professor of Law
`GEORGIA STATE UNIVERSITY COLLEGE OF LAW
`85 Park Place, NE
`Atlanta, GA 30303
`Tel: (404) 413-9179
`amtucker@gsu.edu
`
`================================================================================================================
`
`
`
`
`
`TABLE OF CONTENTS
`
`Page
`Interest of Amici Curiae ......................................... 1
`Summary of the Argument ..................................... 1
`Argument ................................................................ 4
`
`I. Extending copyright protection to official
`annotations of state statutes will inhibit
`legal scholarship ............................................. 4
`   A. Digitized, publicly available legal texts
`   facilitate the use of computational tools
`   in legal scholarship ................................ 4
`   B. Open access to large volumes of digital
`   texts is required for many forms of
`   computational legal analysis ....................... 9
`   C. Official annotations are legal texts that
`   can be usefully analyzed by scholars .... 12
`II. Computational legal scholarship builds on
`a long tradition of scholarly synthesis of
`legal materials that courts have found
`useful .................................................................. 14
`   A. Commentaries, treatises, and related
`   forms of legal scholarship have proven
`   useful to courts ...................................... 16
`   B. Computational tools are well suited to
`   continue the tradition of interpretive
`   legal scholarship by synthesizing large
`   volumes of texts ..................................... 23
`III. Official annotations such as Georgia’s are
`created by state action, and as such are
`government edicts rather than legal
`commentary ............................................................... 27
`   A. The Eleventh Circuit’s “hallmarks” clearly
`   identify official legal texts ..................... 28
`   B. Copyright will continue to protect works
`   of law scholarship, including annotations,
`   that have not been officially endorsed
`   by the state ................................ 30
`Conclusion ............................................................... 31
`
`APPENDIX
`List of Signatories ...................................................... 1a
`
`
`
`
`TABLE OF AUTHORITIES
`
`Page
`
`CASES
`Betterman v. Montana, 136 S. Ct. 1609 (2016) .......... 18
`Bradlie v. Md. Ins. Co., 37 U.S. 378 (1838) ................. 18
`Code Revision Comm’n for General Assembly of
`Georgia v. Public.Resource.Org, Inc., 906 F.3d
`1229 (11th Cir. 2018) ............................................... 28
`Craig v. Provo City, 389 P.3d 423 (Utah 2016) ............ 8
`D.C. v. Heller, 554 U.S. 570 (2008) ........................ 17, 18
`Fire Ins. Exch. v. Oltmanns, 416 P.3d 1148 (Utah
`2018) .......................................................................... 8
`Glebe v. Frost, 574 U.S. 21 (2014) ............................... 14
`Green v. Bock Laundry Mach. Co., 490 U.S. 504
`(1989) ....................................................................... 16
`Kansas v. Nebraska, 135 S. Ct. 1042 (2015) ............... 20
`Kernan v. Cuero, 138 S. Ct. 4 (2017) .......................... 14
`Kisor v. Wilkie, 139 S. Ct. 2400 (2019) ................. 19, 21
`Muddy Boys, Inc. v. DOC, 440 P.3d 741 (Utah Ct.
`App. 2019) .................................................................. 8
`Norfolk & W. Ry. Co. v. Ayers, 538 U.S. 135 (2003) ........ 20
`People v. Harris, 885 N.W.2d 832 (Mich. 2016) ...... 8, 13
`Richards v. Cox, No. 20180033, 2019 Utah LEXIS
`157 (Utah Sept. 11, 2019) ................................... 8, 13
`Rimini Street Inc. et al. v. Oracle USA, Inc., 139
`S. Ct. 873 (2019) ........................................................ 6
`
`
`
`
`State v. Lantis, No. 46171, 2019 Ida. LEXIS 127
`(Idaho Aug. 23, 2019) .......................................... 8, 13
`Stoneridge Inv. Partners, LLC v. Scientific-Atlanta,
`Inc., 552 U.S. 148 (2008) .......................................... 18
`Sun Oil Co. v. Wortman, 486 U.S. 717 (1988) ............. 17
`Surplus Trading Co. v. Cook, 281 U.S. 647 (1930) ........ 18
`The William Bageley, 72 U.S. 377 (1866) ................... 18
`U.S. v. Maine, 475 U.S. 89 (1986) ................................ 18
`U.S. v. Wong Kim Ark, 169 U.S. 649 (1898) ................ 18
`Washington v. Glucksberg, 521 U.S. 702 (1997) ......... 18
`Wilson v. Safelite Grp., Inc., 930 F.3d 429 (6th
`Cir. 2019) ............................................................. 8, 13
`
`
`OTHER AUTHORITIES
`1767 ANNUAL REGISTER 286 (8th ed. 1809) ................. 17
`Albert W. Alschuler, Rediscovering Blackstone,
`145 U. PA. L. REV. 1 (1996) ......................... 15, 17, 18
`Charlotte S. Alexander, Litigation Migrants, 56
`AM. BUS. L.J. 235 (2019) ......................................... 12
`Charlotte S. Alexander, #MeToo and the Litiga-
`tion Funnel, 22 EMPL. RTS. & EMPL. POL’Y J.
`101 (2019) ................................................................ 12
`American Law Reports, 23 A.L.R. FED. 878
`(1975) ....................................................................... 22
`Henry W. Ballantine, BALLANTINE ON CORPORA-
`TIONS (rev. ed. 1946) ................................................. 19
`
`
`
`
`Oren Bar-Gill, Omri Ben-Shahar & Florencia
`Marotta-Wurgler, Searching for the Common
`Law: The Quantitative Approach of the Re-
`statement of Consumer Contracts, 84 U. CHI.
`L. REV. 7 (2017) ....................................................... 23
`Robert C. Berring, Full-Text Databases and Le-
`gal Research: Backing into the Future, 1 HIGH
`TECH L.J. 27 (1986) ................................................... 4
`David M. Blei, Probabilistic Topic Models, 55
`COMM. ACM 77 (2012) ............................................... 6
`David M. Blei & John D. Lafferty, A Correlated
`Topic Model of Science, 1 ANN. APPL. STAT. 17
`(2007) ......................................................................... 6
`Keith Carlson, Michael A. Livermore & Daniel
`Rockmore, A Quantitative Analysis of Writing
`Style on the US Supreme Court, 93 WASH. U.
`L. REV. 1461 (2016) ................................................. 24
`Tom S. Clark & Benjamin E. Lauderdale, The Ge-
`nealogy of Law, 20 POL. ANALYSIS 329 (2012) ......... 25
`Edward W. Cleary et al., MCCORMICK ON EVI-
`DENCE (3d ed. 1984) ................................................. 16
`Arthur L. Corbin et al., CORBIN ON CONTRACTS
`(rev. ed. 2019)........................................................... 19
`Pamela C. Corley, Paul M. Collins, Jr. & Bryan
`Calvin, Lower Court Influence on U.S. Su-
`preme Court Opinion Content, 73 J. POL. 31
`(2011) ......................................................................... 6
`
`
`
`
`Corpus of Contemporary American English, https://
`www.english-corpora.org/coca/ ................................. 13
`Corpus of Historical American English, https://
`www.english-corpora.org/coha/ ............................... 13
`CourtListener, https://www.courtlistener.com/ ............ 9
`Mattias Derlén & Johan Lindholm, Is it good
`law? Network Analysis and the CJEU’s Inter-
`nal Market Jurisprudence, 20 J. INT’L ECON. L.
`257 (2017) ................................................................ 25
`A.V. Dicey, Blackstone’s Commentaries, 4 CAM-
`BRIDGE L.J. 286 (1932) ............................................. 17
`Frank Fagan, Successor Liability from the Per-
`spective of Big Data, 9 VA. L. & BUS. REV. 391
`(2014) ....................................................................... 25
`Adam Feldman, With A Little Help from Aca-
`demic Scholarship, Empirical SCOTUS, Oct.
`31, 2018, https://empiricalscotus.com/2018/10/
`31/academic-scholarship/ ........................................ 16
`William M. Fletcher et al., FLETCHER CYCLOPEDIA
`OF THE LAW OF PRIVATE CORPORATIONS (perm.
`ed., rev. vol. 1999) .............................................. 19, 22
`James H. Fowler & Sangick Jeon, The Authority
`of Supreme Court Precedent, 30 SOC. NETWORKS
`16 (2008) .................................................................. 24
`James H. Fowler et al., Network Analysis and
`the Law: Measuring the Legal Importance
`of Precedents at the U.S. Supreme Court, 15
`POL. ANALYSIS 324 (2007) .......................................... 7
`
`
`
`
`Nikhil Garg et al., Word Embeddings Quantify
`100 Years of Gender and Ethnic Stereotypes,
`115 PROC. NAT’L ACAD. SCI. E3635 (2018) .............. 11
`Andrew Hamm, Retired Justice Kennedy prom-
`ises message of civility at American Law Insti-
`tute’s annual meeting, SCOTUSBlog (May 20,
`2019, 5:17 PM), https://www.scotusblog.com/2019/
`05/retired-justice-kennedy-promises-message-
`of-civility-at-american-law-institutes-annual-
`meeting .................................................................... 20
`Mireille Hildebrandt, The Force of Law and the
`Force of Technology, in THE ROUTLEDGE HAND-
`BOOK OF TECHNOLOGY, CRIME AND JUSTICE 597
`(M.R. McGuire & Thomas J. Holt eds., 2017) ........... 4
`H.F. Jolowicz & Barry Nicholas, HISTORICAL IN-
`TRODUCTION TO THE STUDY OF ROMAN LAW (3d
`ed. 1972) .................................................................. 15
`Justia, https://www.justia.com/ .................................... 9
`Eamonn Keogh & Abdullah Mueen, Curse of
`Dimensionality, in ENCYCLOPEDIA OF MACHINE
`LEARNING AND DATA MINING 314 (Claude Sammut
`& Geoffrey I. Webb eds., 2017) ........................ 10
`Friedrich Kessler, Corbin on Contracts: Part I:
`Formation of Contract, 61 YALE L.J. 1092
`(1952) ....................................................................... 19
`Sara Klingenstein, Tim Hitchcock & Simon DeDeo,
`The Civilizing Process in London’s Old Bailey,
`111 PROC. NAT’L ACAD. SCI. 9419 (2014) ................. 26
`
`
`
`
`Mason Ladd, Credibility Tests—Current Trends,
`89 U. PA. L. REV. 166 (1940) ................................... 16
`LAW AS DATA: COMPUTATION, TEXT, AND THE FU-
`TURE OF LEGAL ANALYSIS (Michael A. Liver-
`more & Daniel L. Rockmore eds., 2019) ................... 5
`David S. Law, Constitutional Archetypes, 95 TEX.
`L. REV. 153 (2016) ................................................... 25
`David S. Law & David Zaring, Law Versus Ideol-
`ogy: The Supreme Court and the Use of Legis-
`lative History, 51 WM. & MARY L. REV. 1653
`(2010) ....................................................................... 12
`Thomas R. Lee & Stephen C. Mouritsen, Judg-
`ing Ordinary Meaning, 127 YALE L.J. 788
`(2018) ......................................................................... 7
`Bing Liu, SENTIMENT ANALYSIS: MINING OPINIONS,
`SENTIMENTS, AND EMOTIONS (2015) ........................... 6
`Michael A. Livermore, Vladimir Eidelman & Brian
`Grom, Computationally Assisted Regulatory
`Participation, 93 NOTRE DAME L. REV. 977
`(2018) ....................................................................... 12
`Michael A. Livermore, Allen B. Riddell & Daniel
`N. Rockmore, The Supreme Court and the Ju-
`dicial Genre, 59 ARIZ. L. REV. 837 (2017) ................. 7
`Jonathan Macey & Joshua Mitts, Finding Order
`in the Morass: The Three Real Justifications
`for Piercing the Corporate Veil, 100 CORNELL
`L. REV. 99 (2014) ..................................................... 25
`
`
`
`
`John Manning, Constitutional Structure and Ju-
`dicial Deference to Agency Interpretations of
`Agency Rules, 96 COLUM. L. REV. 612 (1996) ......... 21
`María José Marín, Legalese as Seen Through the
`Lens of Corpus Linguistics—An Introduction
`to Software Tools for Terminological Analysis,
`6 INT’L J. LANGUAGE & L. 18 (2017) ........................ 13
`Carl McGowan, Impeachment of Criminal De-
`fendants by Prior Conviction, 1970 L. & SOC.
`ORDER 1 (1970) ........................................................ 16
`Joseph Scott Miller, Error Costs and IP Law,
`2014 U. ILL. L. REV. 175 .......................................... 30
`James Moore & Helen Bendix, MOORE’S FEDERAL
`PRACTICE (2d ed. 1988) ............................................ 16
`Eric C. Nystrom & David S. Tanenhaus, The Fu-
`ture of Digital Legal History: No Magic, No Sil-
`ver Bullets, 56 AM. J. LEGAL HIST. 150 (2016) ........ 27
`Eric C. Nystrom & David S. Tanenhaus, “Let’s
`Change the Law”: Arkansas and the Puzzle of
`Juvenile Justice Reform in the 1990s, 34 L. &
`HIST. REV. 957 (2016) .............................................. 27
`Richard R. Powell, POWELL ON REAL PROPERTY
`(Patrick J. Rohan ed., 1995) .................................... 19
`Douglas Rice, The Impact of Supreme Court Ac-
`tivity on the Judicial Agenda, 48 L. & SOC’Y
`REV. 63 (2014).......................................................... 12
`
`
`
`
`
`
`
`Allen Riddell, How to Read 22,198 Journal Arti-
`cles: Studying the History of German Studies
`with Topic Models, in DISTANT READINGS: TO-
`POLOGIES OF GERMAN CULTURE IN THE LONG
`NINETEENTH CENTURY 91 (Matt Erlin & Lynne
`Tatlock eds., 2014) ................................................... 24
`Daniel Rockmore et al., The Cultural Evolution
`of National Constitutions, 69 J. ASS’N INFO.
`SCI. & TECH. 483 (2017) .......................................... 12
`Charles W. Romney, Using Vector Space Models
`to Understand the Circulation of Habeas Cor-
`pus in Hawai’i, 1852-92, 34 L. & HIST. REV.
`999 (2016) ................................................................ 26
`J.B. Ruhl, John Nay & Jonathan M. Gilligan,
`Topic Modeling the President: Conventional
`and Computational Methods, 86 GEO. WASH.
`L. REV. 1243 (2018) ................................................. 25
`J.B. Ruhl, Daniel Martin Katz & Michael J.
`Bommarito II, Harnessing Legal Complexity:
`Bringing Tools of Complexity Science to Bear
`on Improving Law, 355 SCIENCE 1377 (2017) ........... 7
`Stephen Skinner, Blackstone’s Support for the
`Militia, 44 AM. J. LEGAL HIST. 1 (Jan. 2000) .......... 15
`Cass R. Sunstein & Adrian Vermeule, The Unbearable
`Rightness of Auer, 84 U. CHI. L. REV. 297
`(2017) ....................................................................... 21
`
`
`
`
`Samuel Williston & Richard A. Lord, A TREATISE
`ON THE LAW OF CONTRACTS (4th ed. 1993 &
`Supp. 1999) ........................................................ 19, 22
`Jack Weinstein & Margaret Berger, WEINSTEIN’S
`EVIDENCE (rev. ed. 1988) .......................................... 16
`John H. Wigmore, A TREATISE ON THE ANGLO-
`AMERICAN SYSTEM OF EVIDENCE IN TRIALS AT
`COMMON LAW (3d ed. 1940) ..................................... 19
`
`
`
`INTEREST OF AMICI CURIAE¹
`
`Amici are a group of 36 scholars who study the law
`using computational methodologies and whose research
`requires access to digital versions of legal texts.²
`Amici scholars have a range of disciplinary backgrounds,
`including law, political science, history, economics,
`finance, computer science, and mathematics. Common
`to amici’s scholarly work is the need for unfettered,
`copyright-free access to legal texts, in order to
`synthesize, interpret, and study the law.
`
`¹ Counsel of record for all parties received notice at least 10
`days prior to the due date of the amici curiae’s intention to file
`this brief. All parties have given consent. No counsel for a party
`authored this brief in whole or in part, and no counsel or party
`made a monetary contribution intended to fund the preparation
`or submission of this brief. No person other than amici curiae
`made a monetary contribution to this brief’s preparation or
`submission. A list of all of the amici is set forth in the Appendix to
`this brief.
`² The views expressed herein are those of the amici in their
`capacity as scholars. No part of this brief purports to express the
`views of any institution, including the University of Virginia
`School of Law and Georgia State University.
`
`--------------------------------- ---------------------------------
`
`SUMMARY OF THE ARGUMENT
`
`The People have interests both in knowing what
`the law says and in understanding what it means. The
`government edicts doctrine furthers these interests
`by granting unfettered, copyright-free access to legal
`texts. This access is available to legal subjects so that
`they can have notice of the rules that apply to their
`conduct. Access to legal texts is also foundational to
`law scholarship, which has informed shared
`understanding of the law and shaped legal development for
`centuries.
`
`In a digital age, digital access to the law is the
`touchstone. Such access not only lowers barriers to
`the public; it also facilitates the application of new re-
`search tools, such as natural language processing, com-
`putational text analysis, and machine learning, that
`can help illuminate the law’s meaning. Narrowing the
`scope of the government edicts doctrine will inhibit
`scholars’ ability to access the law and apply these new
`tools and techniques.
`
`Since at least the sixth century, when Justinian I
`ordered the organization and codification of Roman
`Law, the work of law scholars, jurists, and legal practi-
`tioners has been intertwined. As societies grew in scale
`and complexity, the law became a learned profession,
`requiring substantial study to come to grips with the
`rules and rulings issued by government bodies diffused
`across increasingly sprawling states. In the common
`law tradition, legal commentaries, treatises, and other
`works of law scholarship have played a particularly
`important role, aggregating and synthesizing what
`would otherwise be an unmanageable body of case law.
`Such scholarship has frequently been referenced by
`state and federal courts, including this Court, since the
`founding of the Republic.
`
`From the printing press to the internet, law scholarship
`has evolved alongside information technology. With the
`growing availability of digital versions of legal
`texts, scholars of the law have begun to take
`advantage of related advances in mathematics, computer
`science, statistics, and machine learning. Recent work
`applies these tools to a range of legal texts, and re-
`searchers are actively developing methodologies and
`techniques that can help improve both scholarly and
`public understanding of the law. Unfettered, copyright-
`free access to large bodies of legal texts in digital form
`is a precondition for future development in this area.
`
`Official annotations to state statutory codes fall
`into the heartland of the types of texts that law
`scholars can usefully analyze with computational techniques.
`Such annotations are state-endorsed interpretations of
`and commentary on state statutes. Whatever their
`officially binding character, they are used by courts and
`other legal actors, including scholars of the law, to
`understand and apply the law. Other forms of legal
`commentary and scholarship created by private
`authors are informative and persuasive, as evidenced by
`the Court’s long history of reliance on such works, but
`are not authoritative. Legislative endorsement of
`official annotations confers the legitimacy of the state on
`these interpretations, distinguishing them from other
`forms of legal commentary and raising their status to
`the level of a government edict.
`
`
`
`--------------------------------- ---------------------------------
`
`
`
`
`
`
`ARGUMENT
`I. Extending copyright protection to official
`annotations of state statutes will inhibit
`legal scholarship
`
`New technologies such as natural language processing
`and other methods of computational text analysis
`have created new approaches to legal scholarship.
`This work has already borne early fruit as scholars
`have developed new insights and courts have looked to
`techniques such as corpus linguistics to aid legal inter-
`pretation. This type of scholarship requires access to
`large data sets of legal texts. Statutory annotations fall
`squarely within the types of data that can be usefully
`subjected to computational legal analysis. As a conse-
`quence, subjecting official annotations to copyright
`would unnecessarily hinder the growth of this new
`form of legal scholarship.
`
`
`
`A. Digitized, publicly available legal texts
`facilitate the use of computational tools
`in legal scholarship
`
`Law and legal scholarship have long been inter-
`twined with information technology. Robert C. Berring,
`Full-Text Databases and Legal Research: Backing into
`the Future, 1 HIGH TECH L.J. 27 (1986). A technological
`change enabled the transition from exclusive reliance
`on cultural norms to formal, consistent legal rules:
`the advent of the written word. Mireille Hildebrandt,
`The Force of Law and the Force of Technology, in
`THE ROUTLEDGE HANDBOOK OF TECHNOLOGY, CRIME AND
`JUSTICE 597, 599 (M.R. McGuire & Thomas J. Holt eds.,
`2017). The printing press, and later the creation of
`searchable legal databases, also profoundly influenced
`how law was distributed, understood, and studied. See
`Berring, supra.
`
`More recently, two related trends have been transforming
`the practice and study of law: the large-scale
`digitization of legal texts and advances in information
`processing technology and theory. Michael A. Liver-
`more & Daniel L. Rockmore, Introduction: From Ana-
`logue to Digital Legal Scholarship, in LAW AS DATA:
`COMPUTATION, TEXT, AND THE FUTURE OF LEGAL ANALY-
`SIS xv (Michael A. Livermore & Daniel L. Rockmore
`eds., 2019). In the 1970s, commercial databases led
`the digitization of legal texts, which later spread
`through the burgeoning internet. Now, growing public
`data availability has intersected with developments in
`the fields of artificial intelligence, natural language
`processing, text mining, and machine learning to in-
`crease the role of computational methods in the profes-
`sional lives of lawyers, law scholars, and courts.
`
`Researchers engaged in the computational analysis of
`legal texts depend not only on access to digital legal
`texts, however, but also, and critically, on open access
`to them. This is because proprietary databases such as
`Lexis and Westlaw prevent researchers (even those
`with paid subscriptions) from downloading textual data
`in bulk using automated approaches. If data is available
`on an open-source site such as Public.Resource.Org,
`researchers can automate the data collection process,
`essentially programming their computers to collect the
`data for them. When legal texts are not publicly
`available, but are instead locked away in proprietary
`databases, computational research is extremely costly and
`inefficient, at best, and may be entirely infeasible.
`
`Once text is assembled in bulk, however, techniques
`in natural language processing can be used to
`extract quantitatively useful information from com-
`plex legal texts. For example, similarities between doc-
`uments can be discovered and used to sort and classify
`documents into meaningful categories, which is the
`basis of e-discovery, a legal practice at issue in the
`Court’s recent decision in Rimini Street Inc. et al. v.
`Oracle USA, Inc., 139 S. Ct. 873 (2019). Further, algo-
`rithms originally designed to detect plagiarism can be
`used to measure similarity between texts, enabling
`researchers to explore, for example, how lower federal
`court opinions influence the content of Supreme Court
`opinions. Pamela C. Corley, Paul M. Collins, Jr. &
`Bryan Calvin, Lower Court Influence on U.S. Supreme
`Court Opinion Content, 73 J. POL. 31 (2011). Another
`technique, known as sentiment analysis, uses the
`presence of positive or negative words to estimate the
`emotional content of texts, and accordingly facilitates
`research into attitudes, feelings, and biases of authors
`and institutions. Bing Liu, SENTIMENT ANALYSIS: MINING
`OPINIONS, SENTIMENTS, AND EMOTIONS (2015). Topic
`modelling, which extracts semantic content from textual
`data, David M. Blei, Probabilistic Topic Models, 55
`COMM. ACM 77 (2012); David M. Blei & John D. Lafferty,
`A Correlated Topic Model of Science, 1 ANN. APPL.
`STAT. 17 (2007), is another approach that legal scholars
`employ to quantitatively represent legal texts and
`study similarities and differences among them. Mi-
`chael A. Livermore, Allen B. Riddell & Daniel N. Rock-
`more, The Supreme Court and the Judicial Genre, 59
`ARIZ. L. REV. 837 (2017). Moreover, text mining tools
`can extract citation information from legal texts, which
`can then be coupled with various forms of network
`analysis to, for example, examine legal complexity or
`reveal patterns of influence among courts. J.B. Ruhl,
`Daniel Martin Katz & Michael J. Bommarito II,
`Harnessing Legal Complexity: Bringing Tools of
`Complexity Science to Bear on Improving Law, 355 SCIENCE
`1377 (2017); James H. Fowler et al., Network Analysis
`and the Law: Measuring the Legal Importance of Prec-
`edents at the U.S. Supreme Court, 15 POL. ANALYSIS
`324, 325 (2007). The information extracted through
`computational methods such as these can be analyzed
`using traditional statistical models as well as newer,
`machine-learning approaches, to generate both descrip-
`tive and predictive results of widespread interest—
`both to the legal community and broader society.
`
`Technology-driven analytic methods applied to legal
`texts can inform long-standing inquiries in the law.
`One approach that has received considerable recent
`attention is the use of corpus linguistics in legal anal-
`ysis. Thomas R. Lee & Stephen C. Mouritsen, Judging
`Ordinary Meaning, 127 YALE L.J. 788 (2018). Corpus
`linguistics is a computer-based method of collecting
`information regarding the use and context of a phrase
`or word by interrogating a large body, or corpus, of
`naturally occurring language. Id. This tool permits
`scholars, parties to litigation, and judges to address
`ambiguity in a law by considering the ordinary mean-
`ing of a word or phrase in the historical context of the
`legislation’s enactment.
`
`Corpus linguistics has already been recognized as
`a valuable tool by courts, some of which have employed
`the analytic method to inform their legal interpreta-
`tions. In recent years, courts including the Sixth Cir-
`cuit Court of Appeals, Wilson v. Safelite Grp., Inc., 930
`F.3d 429, 438-39 (6th Cir. 2019), and the Supreme
`Courts of Utah, Richards v. Cox, No. 20180033, 2019
`Utah LEXIS 157, at *10-14 (Utah Sept. 11, 2019);
`Michigan, People v. Harris, 885 N.W.2d 832, 838-39
`(Mich. 2016); and Idaho, State v. Lantis, No. 46171,
`2019 Ida. LEXIS 127, at *13-17 (Idaho Aug. 23, 2019),
`have independently conducted corpus linguistic inquir-
`ies and reported their results in published opinions.
`Other courts have noted positively the value of such
`analysis and have encouraged parties to present em-
`pirical support derived from the method. See, e.g.,
`Muddy Boys, Inc. v. DOC, 440 P.3d 741, 749 (Utah Ct.
`App. 2019) (“[O]ne of the chief benefits of a corpus-
`linguistics-style analysis is that it offers a systematic,
`nonrandom look at the way words are used across a
`large body of sources.”); Craig v. Provo City, 389 P.3d
`423, 428 (Utah 2016). The Utah Supreme Court en-
`couraged lawyers to “provide courts with meaningful
`tools using the best available methods when the court
`is tasked with determining ordinary meaning,” noting
`that there is a general shortcoming in human ability
`to select the most common meaning of language. Fire
`Ins. Exch. v. Oltmanns, 416 P.3d 1148, 1163 n.9 (Utah
`2018).
`
`Computationally based legal scholarship promises
`to shed light on classic questions of legal
`interpretation and may open up entirely new avenues for
`future legal development. But the success of this re-
`search depends on open access to unbiased data—the
`very type of access that Public.Resource.Org as well as
`similar open-source sites, such as CourtListener and
`Justia, are working to facilitate. CourtListener, https://
`www.courtlistener.com/; Justia, https://www.justia.com/.
`Scholars in this area are taking advantage of open ac-
`cess to the law and legal materials to leverage increas-
`ingly sophisticated analytic techniques. This trend has
`already shown important potential to contribute to the
`long-standing and productive dialogue between jurists
`and legal scholars.
`
`
`
`B. Open access to large volumes of digital
`texts is required for many forms of com-
`putational legal analysis
` Many of the computational techniques discussed
`above share an important characteristic: their ability
`to return useful results is a function of the quantity
`and quality of data available to them. Without a large
`amount of textual data, these forms of analysis are less
`effective and, in some cases, cannot be conducted at all.
`Limiting access to legal texts, and particularly those
`legal texts that have the most value for computational
`legal analysis, directly interferes with the field’s
`ability to grow.
`
`There are two primary reasons why bulk data is needed
`for computational legal analysis. The first concerns
`complexity. Legal texts, when represented in a
`quantitative fashion, can be understood as complex,
`high-dimensional objects. A single judicial opinion, for
`example, might address multiple different legal claims,
`analyze multiple different strands of precedent rele-
`vant to each, and conclude with different rulings on
`different sub-parts of the parties’ arguments. To render
`all of this information quantitatively and in high fidel-
`ity requires an extremely large number of variables
`(i.e., dimensions). In other words, many factors are
`necessary for computational tools to be precise. Even
`when dimensionality reduction tools are used to make
`the data more tractable, there are often limits to the
`simplicity with which legal texts can be accurately rep-
`resented. As the number of dimensions in a data set
`grows, the number of observations (amount of text)
`needed to carry out meaningful analysis also grows. In
`the technical literature, this fact is sometimes referred
`to as the “curse of dimensionality.” Eamonn Keogh &
`Abdullah Mueen, Curse of Dimensionality, in ENCY-
`CLOPEDIA OF MACHINE LEARNING AND DATA MINING 314
`(Claude Sammut & Geoffrey I. Webb eds., 2017). For
`analysis that is sensitive to fine distinctions between
`legal documents (implying a relatively large number of
`variables), a large number of observations is needed, in
`the form of large masses of legal text.
`
`
`
`A second consideration that favors large volumes
`of data is the problem of bias. Nikhil Garg et al., Word
`Embeddings Quantify 100 Years of Gender and Ethnic
`Stereotypes, 115 PROC. NAT’L ACAD. SCI. E3635 (2018).
`Datasets that are systematically limited create the
`risk that conclusions drawn from this data will be
`skewed in some unobservable fashion that makes
`analysis and interpretation difficult. To take one exam-
`ple, corpus linguistics examines how words are used in
`context. Were a corpus to systematically exclude texts
`produced by certain groups of language users, then the
`resulting analyses would be biased, in the sense that
`alternative usages that are common among the e