An Autopsy on Submarine Patents
`
`A Window into Expectations of the World Technological Frontier
`
`Honors Thesis
`Alexander M. Bell∗
`
`Departments of Economics and Computer Science
`
`Brown University
`
`Abstract
`
`A “submarine patent” is one whose issuance and publication has been intentionally delayed
`
`by its inventor with the hope that firms will independently discover and come to rely on
`
`a similar innovation at some later time, at which point the inventor causes his patent to
`
`issue and claims infringement. Although submarine patents are harmful to an innovation-led
`
`economy, previous research has struggled with how to discriminate between these subversive
`
`patents and patents whose issuance was delayed for legitimate reasons. I propose a novel
`
`identification strategy that exploits self-sorting by inventors around a 1995 policy change
`
`that was unfavorable to submarine patents. Using a regression-discontinuity design, I find
`
`that submarine patents are on average much more likely to be asserted for infringement in
`
`court cases. In addition, I conclude that submarines were more common in certain industries
`
`than others as evidenced by differential responses of industries to the policy change. Finally,
`
`I show that the failure of submarine patents within specific industries to ultimately assert
`
`infringement seems to be an indicator of which industries in the world economy experienced
`
`shifts in technological paradigms during this time, providing an additional method to assess
`
`the determinants of differences in income per capita across countries. I also describe how this
`
`result can be generalized across time periods.
`
`April 17, 2013
`
`∗The author acknowledges the guidance of his thesis advisor, Oded Galor, and comments from his reader in the computer
`science department, Tom Doeppner. The author is also grateful for helpful comments from Erez Yoeli and a number of friends,
`including those in other disciplines who helped relate these findings to the nature and history of technological progress in other
`fields. Finally, the author acknowledges previous work that laid the framework for this paper while studying the patent backlog
`at the USPTO in combination with Alan Marco, Stu Graham, and Conny Chen.
`
`

`

`
`1 Introduction
`
`
`Personalized Media Communications, a little-known Texas-based company, filed a patent infringement complaint
`
`against social gaming giant Zynga in early 2012, alleging infringement of four patents. The four patents
`
`issued between June of 2010 and March of 2011, but all were filed in 1995: one on May 23, one
`
`on June 6, and two on June 7.
`
`These four patents, and others like them, are sometimes termed “submarine patents.” This term refers
`
`to a patent whose inventor does not wish to market his invention. Instead, after filing his non-public patent
`
`application, the inventor hopes for another inventor to discover the same invention and develop it into a
`
`successful product. When this happens, like a submarine emerging from the depths, the inventor finishes
`
`the paperwork for his patent to issue and sues the now-infringing producer for a share of his profits.
`
`Submarine patenting can be thought of in the context of the free-rider problem. If the costs of developing
`
`an invention into a marketable good are high, an inventor might choose to wait for another firm to invest in
`
`development, then extract royalties from that firm’s success. Submarine patenting may also be considered
`
`to cause a deadweight loss to society in the sense that some fraction of innovators who wish to sell their
`
`products must pay an additional “tax” on their products to submarine patenters.
`
`The economic rationale for patents is to encourage innovation. The government grants a fixed-term
`
`monopoly to an inventor in exchange for fully disclosing his invention early so that society might learn from
`
`it. At the same time, there are standards for patentability. Patents must be useful, novel, and non-obvious.
`
`The applicant carries the burden of proof of patentability, but the US Patent and Trademark Office would
`
`not be doing its job of aiding innovation if it risked depriving inventors of their patents. Thus, if an
`
`inventor botches his application, he is given leniency to amend it without fear of losing his patent rights to
`
`others who would file after his original application. Section 2 discusses in more depth other legitimate
`
`reasons why a patent application may be delayed.
`
`This paper proposes a novel strategy for identifying a large group of submarine patents. I first show that
`
`patent applicants intending to take advantage of this loophole self-sorted to file before a policy change that
`
`made submarining future patents infeasible. Having shown that patents on either side of the discontinuity
`
`are similar in all ways other than their likelihood of being submarine patents, I examine characteristics
`
`of submarine patents and find that they are on average much more likely to be involved in infringement
`
`litigation. However, I find starkly different litigation outcomes for different industry classes, and conclude
`
`that this is due to shifts in technological paradigms within certain industries.
`
`I discuss the relevance of
`
`submarine patents to measure the expectations of technological change and provide a strategy to generalize
`
`the results of this paper to construct a measure of technological expectations and identify shifts in technological
`
`paradigms over time and within industries. Such information can be useful for assessing differences
`
`in income across countries if economies that are engaged in the use of technologies that undergo paradigm
`
`shifts benefit from such technological revolutions.
`
`The remainder of this paper is organized as follows. Section 2 provides background on the policy change and a
`
`review of literature within the economics of innovation. Section 3 describes the dataset of patent grants I have
`
`compiled, including outward linkages to outcomes such as litigation. Section 4 puts forward a theoretical
`
`model for submarining and testable hypotheses. Section 5 presents my findings, Section 6 discusses
`
`their wider implications for the study of the economics of innovation, and Section 7
`
`concludes.
`
`2 Background and Motivation
`
`This section describes the history of the submarine loophole before proceeding to survey relevant literature.
`
`2.1 A 21st Century Vantage Point
`
`In order for a patent system to be susceptible to submarine patents, I argue that it must have two traits:
`
`(1) While the patent application is pending, other firms must have no knowledge of it. If they do, they will
`
`not use the technology.
`
`(2) Regardless of how long the application is pending, once granted the patent must have a long enough term in force to
`
`be worth enforcing against profitable firms that rely on the invention.
`
`Issue (1) was resolved by the American Inventors Protection Act of 1999. Since 2000, most US applications
`are published 18 months after filing, regardless of their status as denied, issued, or still pending.∗ This policy
`
`change seems to have been aimed at bringing the US in line with what other countries were doing, speeding
`
`the diffusion of knowledge, and reducing the feasibility of submarine patents.
`
`Issue (2), however, was resolved earlier, when an agreement was signed by member nations of what was to
`
`become the World Trade Organization in 1994. With the goal of a more globally homogeneous system of IP
`
`enforcement to foster international trade, the Agreement on Trade-Related Aspects of Intellectual Property
`
`Rights contained a number of standards for laws pertaining to copyright, patenting, and other intellectual
`
`property.
`
`∗There are a few exceptions to mandatory application publication, the most notable of which occurs when applicants certify
`that they do not intend to file for the same invention in other countries.
`
`One of the many standards introduced was a harmonization of patent term. TRIPS, agreed on near the
`
`end of the 1994 Uruguay Round of the General Agreement on Tariffs and Trade (GATT), mandated that
`
`WTO members grant patent protection of at least 20 years, starting the clock at the filing date of a patent.
`
`Prior to TRIPS, applicants in the US were granted a patent term of 17 years from issue date. President
`
`Clinton signed the GATT on December 8, 1994, with patent term reforms set to take effect six months later,
`
`on June 8, 1995.
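The arithmetic behind the two term rules can be illustrated with a short sketch. It is a simplification under stated assumptions: the function names and the example dates are hypothetical, term adjustments and transitional provisions are ignored, and the same application is evaluated under each rule in turn purely for comparison.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def expiry_pre_trips(issue: date) -> date:
    """Old US rule described above: 17 years measured from the issue date."""
    return add_years(issue, 17)


def expiry_post_trips(filing: date) -> date:
    """Rule after the change: 20 years measured from the filing date."""
    return add_years(filing, 20)


# Hypothetical application that is pending for 15 years before it issues.
filing = date(1995, 6, 7)
issue = date(2010, 6, 7)

# Years of enforceable term remaining on the day the patent issues.
old_rule_years_left = (expiry_pre_trips(issue) - issue).days / 365.25
new_rule_years_left = (expiry_post_trips(filing) - issue).days / 365.25
print(round(old_rule_years_left), round(new_rule_years_left))
```

Under the old rule the length of pendency does not shorten the enforceable term at all, whereas under the new rule every year of delay is a year of protection lost, which is why long delays become unattractive.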
`
`The effect of the policy change was a tremendous flood of patent applications just prior to the shift. On
`
`June 7, 1995, ten times as many applications were filed as on any other day outside the month leading up to the change. The
`
`Appendix contains a press release from the USPTO from June 28 explaining that it received a quarter of
`
`the year’s projected filings in just nine days. From the vantage point of 1995, this is a curious anomaly. But
`
`now that most of these applications have either issued or been abandoned, we see that this cohort of patent
`
`applications differs in important ways from other cohorts, and I will argue in Section 4 that it offers a unique
`
`window into the behavior of submarine patents.
`
`2.2 Related Literature
`
`This section begins with a survey of metrics that other studies have used to measure patent value, then
`
`discusses ways that economists believe inventors appropriate revenue from their inventions in different industries.
`
`After a brief summary of the sparse literature on submarine patents, I review the vast literature
`
`concerned with the effects on the economy of differing rates of technological progress.
`
`2.2.1 Measures of Patent Value
`
`The patent literature is rich with a variety of metrics for patent value and quality. Much of the earliest
`
`work in defining the role of patent statistics within economics was done by Griliches (1991) and others at
`
`the NBER’s research program in productivity. The underlying motive was to better understand economic
`
`processes that lead to productivity gains – pursuing “the dream of getting hold of an output indicator of
`
`inventive activity,” in Griliches’ words. Following Scherer (1984), who linked 15,000 patents to the
`
`443 largest US manufacturing firms in the FTC’s Line of Business Survey, Griliches and others explored
`
`outward linkages to R&D figures and stock market data for publicly traded corporations. 16 Griliches (1991)
`
`summarizes that a strong relationship can be identified at the cross-sectional level between R&D expenditure
`
`and the number of patents a firm has received. He further concludes that there may be evidence of diminishing
`
`returns to R&D expenditures. 8
`
`Hall, Jaffe, and Trajtenberg (2001) provide a more modern approach to patent data characteristic of
`
`the growing availability of digital information, particularly for patents. Of the 400 three-digit classes the
`
`USPTO groups patents into, the authors condensed the data into 36 two-digit technological sub-categories,
`
`
`and ultimately into six higher-level categories: Chemical (excluding drugs), Computers and Communications,
`
`Drugs and Medical, Electrical and Electronic, Mechanical, and Others. However, their study reflects
`
`the difficulties of others who have attempted similar groupings, and they suggest that “while convenient,
`
`the present classification should be used with great care, and reexamined critically for specific applications.”
`
`They also discuss the usefulness of backward citations (citations a patent makes) as constituting a “paper
`
`trail” to measure knowledge spillovers and forward citations (citations received) as indicative of the “importance”
`
`of a patent. They put forward new measures based on Herfindahl concentration indices:
`
`Generality, the dispersion across technology classes of the citations a patent receives, and Originality, the
`
`dispersion across technology classes of the citations a patent makes. They briefly discuss some validation strategies for these
`
`metrics. For example, Computers and Communications scores high on Generality, consistent with the view
`
`that it is a general purpose technology, and high on Originality, in accordance with a view that it tends to
`
`break traditional models in terms of innovation. 11
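As a minimal sketch of how such Herfindahl-based measures can be computed, the following follows the definitions summarized in Table 1 (Generality from forward citations, Originality from backward citations) and takes each measure as one minus the sum of squared class shares. The function name and the example class labels are hypothetical, and no bias correction for small citation counts is applied.

```python
from collections import Counter


def herfindahl_dispersion(classes):
    """1 minus the sum of squared class shares; 0 means all citations share one class."""
    counts = Counter(classes)
    total = sum(counts.values())
    if total == 0:
        return None  # undefined for a patent with no citations
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical technology classes for the citations of a single patent.
forward_classes = ["A", "A", "B", "C"]    # classes of the patents citing it
backward_classes = ["A", "A", "A", "A"]   # classes of the patents it cites

generality = herfindahl_dispersion(forward_classes)    # from citations received
originality = herfindahl_dispersion(backward_classes)  # from citations made
print(generality, originality)
```

A patent cited from many different classes scores close to one on Generality, while a patent whose own citations all fall in a single class scores zero on Originality.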
`
`Hall, Jaffe, and Trajtenberg (2000) provide further insight into outside linkages of patent data. The study
`
`found that, in predicting firms’ market value from patent counts, weighting patents by their citation counts
`
`improved the predictions, indicating that forward citations are in some way tied to a notion
`
`of a patent’s “value.” 10
`
`In a different vein, a comprehensive survey by Scherer and others of US and German firms found payment
`
`of renewal fees to be a reliable proxy for patent value. They also confirm that the distribution of patent
`
`values is highly skewed, with a few patents being extremely profitable. 17
`
`Table 1 summarizes and expands upon a dichotomy proposed in van Zeebroeck et al (2008), which
`
`classifies the strategies used by economists to view patent data as either patent-based or market-based. 18
`
`Many of these techniques are revisited in Section 4, in which the feasibility and applicability of their use for
`
`this project are discussed.
`
`2.2.2 Appropriability of Inventions
`
`Within the field of economics, two major investigations have been carried out into how firms appropriate
`
`rents from their innovations. The first, published in 1987, was a survey of 650 R&D executives in 130 different
`
`lines of business (as defined by the FTC). It is sometimes referred to as the Yale survey. 13 The second was
`
`administered in 1994 to 1478 R&D labs, and is sometimes referred to as the Carnegie Mellon survey. 5
`
`The Yale survey divided its questions into product and process patents.
`
`In general, firms reported
`
`capturing profits from product innovations with patents more often than with process innovations, perhaps
`
`because it is more difficult (and less desirable) to keep product innovations secret. For processes, lead time
`
`and learning advantages were rated much more useful than patents. In general, patents were rated more
`
`important to businesses for preventing duplication by competitors than for amassing royalties. The industries
`
`that reported relying on patents the most to capture revenues on their innovations were chemicals and drugs,
`
`perhaps because in those industries, infringement is more clear-cut. The authors found that responses to
`
`their appropriability survey were consistently significant predictors of industries’ R&D intensities.
`
Table 1: Established Metrics of Patent Value and Related Dimensions

Metric                     Description

Patent-Based:
Backward Citations         citations the patent makes; commonly used to track knowledge spillovers
Forward Citations          citations the patent receives; generally accepted to be a measure of “importance”
Generality                 diverse classes of forward citations
Originality                diverse classes of backward citations
Maintenance fee payments   indicator of how much owner values patent
Legal disputes             incidences and outcomes of infringement suits; usually believed to be indicative of valuable patents
Parents                    count of parents or earliest parent’s filing date to show “entrance” into system

Market-Based:
Firm value                 Stock market performance, R&D statistics, Tobin’s q
Estimated patent value     Royalties, valuation by inventors or managers, buy-outs
`
`The Carnegie Mellon survey some years later asked similar questions to the Yale survey, but extended the
`
`investigation into why firms use patents, even if they are not a valuable means of protecting their inventions
`
`(as most industries indicated). The most common reasons for not patenting were ease of inventing around
`
`products and concern over disclosing the invention. For small firms, the cost of defending their patents was
`
`a commonly raised issue. In examining individuals’ responses as to why their firms patent, the authors saw
`
`two groups naturally emerging. They define “complex” industries as those in which new products typically
`
`contain many patented inventions (eg, electronics) compared to “discrete industries” such as drugs and
`
`chemicals. Respondents in discrete industries tended to report use of patents not only for maintaining a
`
`monopoly on their innovations, but also to block rivals’ entry via similar inventions (“patent fences”). In
`
`contrast, the bulk of respondents who reported using patents to enter into licensing negotiations were in
`
`complex industries.
`
`2.2.3 Submarine Patents
`
`A limited literature exists within economics on issues relating to submarine patents. It seems the reason for
`
`the dearth of research on this topic is that submarine patents are generally difficult to identify.
`
`Graham and Mowery (2002) was one of the first papers within economics to analyze the effect of the
`
`
`patent application continuation process. A continuation is a legal term within patenting; it allows a rejected
`
`application to be restarted while claiming its original priority (filing) date. Of note, they found an increase in
`
`the use of continuations prior to 1995, with a sharp drop-off for applications filed after. They reported that
`
`software companies seemed to be using the continuations process the most. They also found continuations to
`
`be positively correlated with the number of forward citations, patent originality, and incidence of post-grant
`
`litigation (as measured by linkages to the Derwent patent litigation database). 7
`
`Hegde, Mowery, and Graham (2007) use more recent data, current as of 2004 (the NBER patent database
`
`described in Hall, Jaffe, and Trajtenberg (2001)). They describe the uniqueness of continuations to the US
`
`patent system and their suspected involvement with submarine patents. At the same time, they explain the
`
`stance of some patent attorneys and industry groups: these long-pending continued applications may also
`
`be the result of “high-risk investments of ‘pioneering inventors’ in ‘young’ fields of invention that are subject
`
`to uncertainty.” They admit that little empirical evidence has been brought to bear on the characteristics of
`
`those applicants who exploit the US continuation process. The paper empirically examines which types of
`
`industries have used different types of continuations: Continuations, Continuations in Part, and Divisionals.
`
`All three types introduce a delay in the application process while allowing the final patent to retain the
`
`priority date of the initial application. 12
`
`2.2.4 “Skill-Biased” Technical Change
`
`A number of studies, particularly in the 1990s and early 2000s, have attempted to account for the growing wage
`
`gap observed in developed countries between skilled and unskilled workers. The gap can be characterized by
`
`an increase in wages of skilled (educated) workers above those of the unskilled, accompanied by a growing
`
`abundance of skilled workers in the labor force relative to unskilled. 2
`
`A concept of “skill-biased technical change” became a prominent explanation. The theory was that
`
`certain types of workers may fare better than others during periods of rapid technological growth. Namely,
`
`economists hypothesized that more educated or otherwise more able workers could better adjust to changing
`
`workplaces. This increase in demand for skilled workers could account for the relative rise in returns to skill
`
`in the midst of a relative increase in supply of skilled workers.
`
`Caselli divides technological revolutions into two categories: skill-biased (eg, the information technology
`
`revolution) and de-skilling (eg, the assembly line, which replaced skilled artisans in the production of cars).
`
`With the focus of modeling skill-biased revolutions, he develops a model in which productivity-augmenting
`
`technology spurs increases in wages, but particularly so for quick-learning workers, while the slow-learning
`
`workers continue to use the old technology for a certain period of time (the old capital is not immediately
`
`valueless). He confirms that recent increases in wage inequality within industries are in fact associated with
`
`
`increased inequality of capital-to-labor ratios. Furthermore, he theorizes that education may make workers
`
`better able to adjust to new technology; if this is the case, then technological revolutions within an industry
`
`would increase the returns to education for workers in those industries. 3
`
`In another model of technological change and wages, Galor and Moav theorize that regardless of whether
`
`a technological revolution ultimately brings about a new paradigm that is biased toward or away from skill
`
`(or ability), in the short run the transition to this new technological state will be skill (ability)-biased.
`
`Their model predicts several effects that are confirmed by data. 6
`
`Machin and van Reenen used R&D intensities in the manufacturing sectors of seven OECD countries
`
`as a proxy for technological change. They found strong results across developed nations that industries
`
`experiencing technical change also experience a shift in composition of their labor forces favoring skilled
`
`workers, and conclude that this evidence is in line with the skill-biased technical change hypothesis. 14
`
`In a similar vein, Berman, Bound, and Machin address the spread of skill-biased technical change throughout
`
`the world economy. Using similar industry-level manufacturing data, they examine within-industry
`
`changes among developed countries in the skilled-to-unskilled labor ratio. They find that industry changes
`
`are highly correlated among countries. This is strong evidence that skill-biased technical change within
`
`industries is not localized to the country from which the technology originates, but rather the change permeates
`
`the industry and affects workers in the broader world economy. The authors also offer
`
`some evidence that skill-biased technical change may account for a trend of skill upgrading witnessed in the
`
`manufacturing sectors of less-developed countries. 2
`
`Bartel and Sicherman use a wide array of measures for technical change in manufacturing industries,
`
`including use of patents, investment in R&D, and various measures of TFP (ie, residuals from estimating
`
`sales after controlling for capital). Tracking workers through the National Longitudinal Survey of Youth,
`
`the authors find that after controlling for individual-level fixed effects, the relationship between
`
`wages and technological change is weakened. They conclude that a significant part of industry-level changes
`
`in wage inequality and skill composition is due to sorting by workers who move between industries. 1
`
`A major limitation of studies examining technological change on industry wages and skill compositions
`
`has been a tendency to rely exclusively on trends in the manufacturing sector, where statistics such as
`
`industries’ R&D expenditures to sales ratios, total factor productivity, and wages are easiest to measure
`
`and most freely available from the government due to the manufacturing sector’s relevance to economic
`
`indicators. 9 With the method proposed in my paper to identify technological revolutions, combined with the
`
`increased use of individual-level tax data on earnings within economics, 4 it may soon be possible to extend
`
`the literature beyond the manufacturing industry.
`
`
`3 Data
`
`
`We ourselves do not put enough emphasis on the value of data and data collection in
`our training of graduate students and in the reward structure of our profession. It is
`the preparation skill of the econometric chef that catches the professional eye, not the
`quality of the raw materials in the meal, or the effort that went into procuring them.
`(Zvi Griliches, address to the American Economic Association, January 4, 1994 9)
`
`In the following section, I summarize my ingredients.
`
`3.1 My Data
`
`The USPTO has made remarkable strides in the past few years toward making raw patent data available to
`
`researchers. With this much data, the next question economists are confronting is what can be done with
`
`it. I first discuss what I have done with it, then what others are doing.
`
`Through a partnership with Google, the USPTO has made available the full text of every patent granted
`
`from January 1976 to present (updated weekly). To download the data, one must download and unzip nearly
`
`2,000 weekly files. None of these files is in a format directly usable for the kind of research that economists
`
`would be interested in.
`
`There are four different file formats, spanning different time periods of operation:
`
1. pftaps*.txt (1365 files from 1976-2001). These are text files that are organized in some vaguely
hierarchical manner. Certain text strings recursively separate different data fields.
`
2. pg*.sgm (54 files mid-2001 and a few in 2002). These files are in Standard Generalized Markup
Language, an ISO standard that is similar in some ways to XML.
`
3. pg*.xml (156 files 2002-2004). Similar to above, but somewhat more compatible with modern XML
readers.
`
4. ipg*.xml (currently about 400 files, from 2005-present). The most recent XML generation. Names
of tags were redone to be more descriptive.
`
`Weekly patent grant files were parsed in Python with a script run over the course of several days.
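Because each of the four formats needs its own parser, a first step in any such script is deciding which format a weekly file belongs to. The sketch below does this from the file name alone. It is not the script used for this paper: the wildcard patterns, the labels, the function name, and the example file names are all illustrative assumptions.

```python
from fnmatch import fnmatch

# Name patterns for the four format families listed above. The most specific
# pattern for .xml files is checked first. Labels are illustrative only.
FORMATS = [
    ("pftaps*.txt", "delimited text (1976-2001)"),
    ("ipg*.xml", "XML, most recent generation (2005-present)"),
    ("pg*.sgm", "SGML (2001-2002)"),
    ("pg*.xml", "XML, earlier generation (2002-2004)"),
]


def classify_weekly_file(filename: str) -> str:
    """Return the format family of a weekly grant file, based on its name only."""
    name = filename.lower()
    for pattern, label in FORMATS:
        if fnmatch(name, pattern):
            return label
    raise ValueError(f"unrecognized weekly file name: {filename}")


# Hypothetical file names, used only to exercise the function.
for example in ["pftaps_example.txt", "pg_example.sgm", "pg_example.xml", "ipg_example.xml"]:
    print(example, "->", classify_weekly_file(example))
```

Dispatching on the name keeps format detection separate from parsing, so each format's parser can be written and tested on its own.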
`
`Figure 1 presents a stylized Entity-Relationship model of how I conceptualize the data contained within
`
`a patent grant.
`
`In translating the data from a document-oriented model to the relational model, my
`
`ultimate goal was to produce a relational database in Boyce-Codd Normal Form – that is, one without
`
`unnecessary duplication of data. However, determining the data’s functional dependencies has proven more
`
`complex than expected. For example, we know the filing date for all five million patents in the database –
`
`let this be contained in the relation BasicInfo(patent, filing_date, issue_date), in which patent is the primary
`
`key. Yet when a citation is recorded within a patent grant, the USPTO lists not only the cited
`
`patent’s number, but also the cited patent’s application and issue dates.
`
`If we have a relation linking a
`
`patent to its citations, Cites(patent, cited_patent), storing the application and issue dates of cited_patent
`
`in that relation seems redundant, because this information can be assembled by joining the relations on
`
`BasicInfo.patent = Cites.cited_patent. Given that the dataset contains five million patents but 55 million
`
`citations, this kind of duplication would be enormously costly. However, this foreign key relationship does
`
`not always hold true: a cited patent may have been issued prior to 1976, or it may be a patent in another
`
`country. If either of these is the case, its filing and issue dates would not be contained in the BasicInfo relation.
`
`Similar problems exist in recording details of patents associated with parent relationships and elsewhere in
`
`the construction of the relational database. When confronted with these dilemmas, I generally erred on the
`
`side of preserving information, at the risk of redundancy.
`
Figure 1: Entity-Relationship Diagram of Available Data. Compustat linkages discussed in next section.
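The trade-off between normalization and preserving information can be made concrete with a small in-memory database. This is a sketch only: the two relation names come from the text, but the extra column names, the patent identifiers, and the dates are invented for illustration and do not describe the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BasicInfo (
        patent      TEXT PRIMARY KEY,
        filing_date TEXT,
        issue_date  TEXT
    );
    -- Dates are duplicated here on purpose: a cited patent may be absent
    -- from BasicInfo (issued before 1976, or granted in another country).
    CREATE TABLE Cites (
        patent            TEXT,
        cited_patent      TEXT,
        cited_filing_date TEXT,
        cited_issue_date  TEXT,
        PRIMARY KEY (patent, cited_patent)
    );
""")

# Made-up identifiers and dates, for illustration only.
conn.execute("INSERT INTO BasicInfo VALUES ('P1', '1995-06-07', '2010-06-15')")
conn.execute("INSERT INTO BasicInfo VALUES ('P2', '1990-01-02', '1992-03-04')")
conn.executemany(
    "INSERT INTO Cites VALUES (?, ?, ?, ?)",
    [
        ("P1", "P2", "1990-01-02", "1992-03-04"),    # also recoverable by a join
        ("P1", "OLD1", "1970-05-06", "1972-07-08"),  # cited patent not in BasicInfo
    ],
)

# Compare the issue date obtained by joining with the copy stored in Cites.
rows = conn.execute("""
    SELECT c.cited_patent, b.issue_date AS joined, c.cited_issue_date AS stored
    FROM Cites AS c
    LEFT JOIN BasicInfo AS b ON b.patent = c.cited_patent
    ORDER BY c.cited_patent
""").fetchall()
for row in rows:
    print(row)
```

For the cited patent that is missing from BasicInfo, the joined date comes back empty while the stored copy survives, which is the case for keeping the redundant columns.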
`
`Another takeaway of Figure 1 is the many-to-many cardinalities that tend to characterize patent data.
`
`One patent may have many inventors, and one inventor may have many patents. The diagram shows the
`
`same for other relationships, such as assignment at grant and citations. Fields that a patent may strictly
`
`have one of are listed in the Patent entity (with the addition of primary class, shown by arrows pointing
`
`away from the Patent entity).
`
`
`3.1.1 Linkages External to Patent Grant from the USPTO
`
`I incorporate linkages to two relations provided by the USPTO through Google. The first is maintenance
`
`fee events. These are provided as a table indicating patent number, date of event, and type of event (eg,
`
`payment of a certain type of fee or refund for various reasons). Also relevant to the analysis, each event
`
`contains an indicator for whether the assignee claims small entity status, which affords the inventor a reduced
`
`financial burden. In the US, the patent fee schedule is highly back-loaded, as shown in Table 2. Because
`
`a relatively small number of patents are renewed to full maturity, this data offers an indication for patents
`
`that have issued several years in the past as to how much their owner values them.
`
Table 2: Summary of Current Major Fee Schedule (USD)

Type of fee          Standard Amount   Amount for Small Entity
Utility issue fee    1,770             885
Due at 3.5 years     1,150             575
Due at 7.5 years     2,900             1,450
Due at 11.5 years    4,810             2,405
`
`At the time of grant, the patent examiner places a patent into one of more than 400 technology classifications,
`
`primarily for the purpose of facilitating future searches for relevant prior art. These classes are
`
`further broken down into thousands of sub-classes. The class system frequently changes as new classes are
`
`created, merged, or obsoleted. A very recent addition to the USPTO’s bulk data downloads is a file linking
`
`each patent to its current classification. These are the technology classes analyzed in this paper. Still, an
`
`overwhelming challenge to all researchers working with patent data has been finding meaningful groupings
`
`for patents by economically relevant industries, as opposed to by the USPTO’s large number of technology
`
`classes.
`
`3.2 Litigation Data
`
`No branch of the government, including the USPTO, keeps track of patent infringement suits for individual
`
`patents. However, the US court system makes documents pertinent to legal cases available online through
`
`services such as Public Access to Court Electronic Records (PACER) for a fee of a few cents per page
`
`downloaded.
`
`Lex Machina is a private firm spun out of a joint research project between Stanford’s law school and computer
`
`science department called the IP Litigation Clearinghouse, which was designed to bring transparency
`
`to IP law. The business of Lex Machina is providing IP litigation data and analytics to businesses. The
`
`company does this by crawling court records including those from PACER and district court databases daily.
`
`
`Using natural language processing algorithms, they determine which cases are instances of IP litigation, and
`
`in the cases of patent infringement, link the case with the patent being infringed upon (also known as the
`
`patent being “asserted”).
`
`Although Lex Machina sells this information to businesses, they agreed to contribute to this project a
`
`dataset of linkages from patent numbers to legal assertions for all patents filed in 1995.
`
`In Section 5, I
`
`examine how the likelihood of assertion changes for patents filed just before and just after the policy change.
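As a rough illustration of that comparison, the sketch below computes the share of patents ever asserted among filings in a window just before and just after the June 8, 1995 cutoff. It is only a raw difference in means, not the regression-discontinuity estimation used in this paper; the function name, the 30-day bandwidth, and the records are all invented for illustration.

```python
from datetime import date, timedelta

CUTOFF = date(1995, 6, 8)  # first filing date subject to the new term rule


def assertion_rates(patents, bandwidth_days=30):
    """Share of patents ever asserted, for filings just before vs. just after CUTOFF.

    `patents` is an iterable of (filing_date, was_asserted) pairs.
    """
    window = timedelta(days=bandwidth_days)
    before = [a for d, a in patents if CUTOFF - window <= d < CUTOFF]
    after = [a for d, a in patents if CUTOFF <= d < CUTOFF + window]

    def rate(flags):
        return sum(flags) / len(flags) if flags else float("nan")

    return rate(before), rate(after)


# Synthetic records, invented purely to exercise the function.
toy = [
    (date(1995, 6, 6), True),
    (date(1995, 6, 7), True),
    (date(1995, 6, 7), False),
    (date(1995, 6, 7), False),
    (date(1995, 6, 8), False),
    (date(1995, 6, 9), True),
    (date(1995, 6, 12), False),
    (date(1995, 6, 13), False),
]
rate_before, rate_after = assertion_rates(toy)
print(rate_before, rate_after)
```

Narrowing `bandwidth_days` restricts the comparison to patents filed closer to the cutoff, at the cost of a smaller sample on each side.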
`
`3.3 NBER Patent Data Project and Name Matching
`
`This paper does not use the NBER database, but it is worth discussing as prior work built on raw data similar
`
`to mine. The work done by the Patent Data Project (PDP) and made public to researchers has
`
`been instrumental in speeding the diffusion of this data through the empirical research community. The data
`
`provided by the Patent Data Project is outlined in Hall, Jaffe, and Trajtenberg (2001). It contains data
`
`primarily on patent citations and assignees.
`
`The most significant drawback of this dataset is that it is only current as of 2006. Because long-pending
`
`patents filed around the 1995 discontinuity may have issued after 2006, a sample restricted to patents
`
`issued by 2006 would omit them and give a different picture.
`
`A tremendous contribution of this dataset is the authors’ attempts to conduct meaningful assignee name
`
`disambiguation and matching to Compustat firms. This is not a trivial task for several reasons. Consider a
`
`company identified in Compustat as IBM. In addition to IBM itself, the patents it files might list
`
`as assignee:
`
`1. An unabbreviated name, such as International Business Machines
`
`2. Some formal legal name, such as IBM, Inc. (or other languages’ variants). This is extremely common.
`
`3. A division of the company (e.g., “IBM R&D”, “IBM Circuits Division”, or “IBM India”). This is also
`extremely common.
`
`4. A misspelling of any of the above, perhaps due to data entry error.
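
The four cases above can be made concrete with a minimal sketch of assignee name normalization. This is not the Patent Data Project's routine: the suffix list, the function names, and the 0.85 similarity cutoff are hypothetical choices. The sketch strips punctuation and trailing legal suffixes, which addresses case (2), and then uses the standard library's `difflib` for approximate matching, which addresses case (4).

```python
import difflib
import re
from typing import List, Optional

# Hypothetical suffix list for illustration; a real routine would be far longer.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd", "llc", "gmbh", "ag", "sa"}

def normalize_assignee(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop trailing legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9& ]+", " ", name.lower())
    tokens = cleaned.split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def match_firm(assignee: str, firms: List[str], cutoff: float = 0.85) -> Optional[str]:
    """Return the firm whose normalized name is closest to the assignee, or None."""
    by_norm = {normalize_assignee(f): f for f in firms}
    hits = difflib.get_close_matches(
        normalize_assignee(assignee), list(by_norm), n=1, cutoff=cutoff
    )
    return by_norm[hits[0]] if hits else None

firms = ["International Business Machines", "IBM"]
suffix_case = normalize_assignee("IBM, Inc.")
typo_case = match_firm("Internatonal Business Machines Corporation", firms)
division_case = match_firm("IBM India", ["IBM"])
```

In this sketch the suffix and misspelling cases resolve, while a division name such as “IBM India” falls below the cutoff and is left unmatched. Linking an abbreviation to its unabbreviated form, as in case (1), is likewise out of reach for string similarity alone.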
`
`The architects of the dataset share their name standardization routines on their site. There isn’t much
`
`to do about (1). By either removing or standardizing most common suffixes and other elements of firm
`
`names, the authors seem to do a good job of mitigating lost matches due to (2). To
