An Autopsy on Submarine Patents
`A Window into Expectations of the World Technological Frontier
`Honors Thesis
`Alexander M. Bell∗
`Departments of Economics and Computer Science
`Brown University
`A “submarine patent” is one whose issuance and publication have been intentionally delayed
`by its inventor with the hope that firms will independently discover and come to rely on
`a similar innovation at some later time, at which point the inventor causes his patent to
`issue and claims infringement. Although submarine patents are harmful to an innovation-led
`economy, previous research has struggled with how to discriminate between these subversive
`patents and patents whose issuance was delayed for legitimate reasons. I propose a novel
`identification strategy that exploits self-sorting by inventors around a 1995 policy change
`that was unfavorable to submarine patents. Using a regression-discontinuity design, I find
`that submarine patents are on average much more likely to be asserted for infringement in
`court cases. In addition, I conclude that submarines were more common in certain industries
`than others as evidenced by differential responses of industries to the policy change. Finally,
`I show that the failure of submarine patents within specific industries to ultimately assert
`infringement seems to be an indicator of which industries in the world economy experienced
`shifts in technological paradigms during this time, providing an additional method to assess
`the determinants of differences in income per capita across countries. I also describe how this
`result can be generalized across time periods.
`April 17, 2013
`∗The author acknowledges the guidance of his thesis advisor, Oded Galor, and comments from his reader in the computer
`science department, Tom Doeppner. The author is also grateful for helpful comments from Erez Yoeli and a number of friends,
`including those in other disciplines who helped relate these findings to the nature and history of technological progress in other
`fields. Finally, the author acknowledges previous work that laid the framework for this paper while studying the patent backlog
`at the USPTO in combination with Alan Marco, Stu Graham, and Conny Chen.

`Alex Bell
`1 Introduction
`A little-known Texas-based company, Personalized Media Communications, filed a patent infringement
`complaint against social gaming giant Zynga in early 2012, alleging infringement of four patents. The four
`patents issued between June of 2010 and March of 2011, but all were filed in 1995: one on May 23, one
`on June 6, and two on June 7.
`These four patents, and others like them, are sometimes termed “submarine patents.” This term refers
`to a patent whose inventor does not wish to market his invention. Instead, after filing his non-public patent
`application, the inventor hopes for another inventor to discover the same invention and develop it into a
`successful product. When this happens, like a submarine emerging from the depths, the inventor finishes
`the paperwork for his patent to issue and sues the now-infringing producer for a share of his profits.
`Submarine patenting can be thought of in the context of the free-rider problem. If the costs of developing
`an invention into a marketable good are high, an inventor might choose to wait for another firm to invest in
`development, then extract royalties from that firm’s success. Submarine patenting may also be considered
`to cause a deadweight loss to society in the sense that some fraction of innovators who wish to sell their
`products must pay an additional “tax” on their products to submarine patenters.
`The economic rationale for patents is to encourage innovation. The government grants a fixed-term
`monopoly to an inventor in exchange for fully disclosing his invention early so that society might learn from
`it. At the same time, there are standards for patentability. Patents must be useful, novel, and non-obvious.
`The applicant carries the burden of proof of patentability, but the US Patent and Trademark Office would
`not be doing its job of aiding innovation if it risked depriving inventors of their patents. Thus, if an
`inventor botches his application, he is given leniency to amend it without fear of losing his patent rights to
`others who would file after his original application. Section 2 discusses in more depth other legitimate
`reasons why a patent application may be delayed.
`This paper proposes a novel strategy for identifying a large group of submarine patents. I first show that
`patent applicants intending to take advantage of this loophole self-sorted to file before a policy change that
`made submarining future patents infeasible. Having shown that patents on either side of the discontinuity
`are similar in all ways other than their likelihood of being submarine patents, I examine characteristics
`of submarine patents and find that they are on average much more likely to be involved in infringement
`litigation. However, I find starkly different litigation outcomes for different industry classes, and conclude
`that this is due to shifts in technological paradigms within certain industries.
`I discuss the relevance of submarine patents to measure the expectations of technological change and
`provide a strategy to generalize the results of this paper to construct a measure of technological
`expectations and identify shifts in technological paradigms over time and within industries. Such
`information can be useful for assessing differences in income across countries if economies that are
`engaged in the use of technologies that undergo paradigm shifts benefit from such technological revolutions.
`The remainder of this paper is organized as follows. Section 2 provides background on the policy change and a
`review of literature within the economics of innovation. Section 3 describes the dataset of patent grants I have
`compiled, including outward linkages to outcomes such as litigation. Section 4 puts forward a theoretical
`model for submarining and testable hypotheses. Section 5 conveys my findings, Section 6 discusses
`the wider implications of these findings for studying the economics of innovation, and Section 7
`concludes.
`2 Background and Motivation
`This section describes the history of the submarine loophole before proceeding to survey relevant literature.
`2.1 A 21st Century Vantage Point
`In order for a patent system to be susceptible to submarine patents, I argue that it must have two traits:
`(1) While the patent application is pending, other firms must have no knowledge of it. If they do, they will
`not use the technology.
`(2) Regardless of how long the application is pending, once granted the patent must have a long enough term
`of force to be worth enforcing against profitable firms that rely on the invention.
`Issue (1) was resolved by the American Inventors Protection Act of 1999. Since 2000, most US applications
`are published 18 months after filing, regardless of their status as denied, issued, or still pending.∗ This policy
`change seems to have been aimed at bringing the US in line with what other countries were doing, speeding
`the diffusion of knowledge, and reducing the feasibility of submarine patents.
`Issue (2), however, was resolved earlier, when an agreement was signed by member nations of what was to
`become the World Trade Organization in 1994. With the goal of a more globally homogeneous system of IP
`enforcement to foster international trade, the Agreement on Trade-Related Aspects of Intellectual Property
`Rights (TRIPS) contained a number of standards for laws pertaining to copyright, patenting, and other
`intellectual property.
`One of the many standards introduced was a harmonization of patent term. TRIPS, agreed on near the
`end of the 1994 Uruguay Round of the General Agreement on Tariffs and Trade (GATT), mandated that
`WTO members grant patent protection of at least 20 years, starting the clock at the filing date of a patent.
`Prior to TRIPS, applicants in the US were granted a patent term of 17 years from issue date. President
`Clinton signed the GATT on December 8, 1994, with patent term reforms set to take effect six months later,
`on June 8, 1995.
`∗There are a few exceptions to mandatory application publication, the most notable of which occurs when applicants certify
`that they do not intend to file for the same invention in other countries.
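The difference between the two term rules can be made concrete with a little date arithmetic. The sketch below is illustrative only: the function names and the example dates are assumptions, and it deliberately ignores transitional provisions, term adjustments, and maintenance-fee lapses. It simply applies "17 years from issue" to applications filed before the cutoff and "20 years from filing" to those filed on or after it.

```python
from datetime import date

# Assumed cutoff: the effective date of the patent term reform.
CUTOFF = date(1995, 6, 8)


def add_years(d, years):
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def expiration(filing, issue, cutoff=CUTOFF):
    """Simplified expiry date under the two regimes described in the text."""
    if filing < cutoff:
        # Old regime: the clock starts at issue, however long the pendency.
        return add_years(issue, 17)
    # New regime: the clock starts at filing, so pendency eats into the term.
    return add_years(filing, 20)


# Hypothetical application that pends for fifteen years.
old_rule = expiration(date(1995, 6, 7), date(2010, 6, 8))
new_rule = expiration(date(1995, 6, 8), date(2010, 6, 8))
print(old_rule, new_rule)
```

Under these assumptions, filing one day earlier leaves the long-pending patent with a full 17 years of enforceable term after issue, while filing on the cutoff date leaves only five.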
`The effect of the policy change was a tremendous flood of patent applications just prior to the shift. On
`June 7, 1995, ten times as many applications were filed as on any other day outside the month leading up to
`the change. The Appendix contains a press release from the USPTO from June 28 explaining that it received
`a quarter of the year’s projected filings in just nine days. From the vantage point of 1995, this is a curious anomaly. But
`now that most of these applications have either issued or been abandoned, we see that this cohort of patent
`applications differs in important ways from other cohorts, and I will argue in Section 4 that it offers a unique
`window into the behavior of submarine patents.
`2.2 Related Literature
`This section begins with a survey of metrics that other studies have used to measure patent value, then
`discusses ways that economists believe inventors appropriate revenue from their inventions in different
`industries. After a brief summary of the sparse literature on submarine patents, I give an overview of the
`vast literature concerned with the effects on the economy of differing rates of technological progress.
`2.2.1 Measures of Patent Value
`The patent literature is rich with a variety of metrics for patent value and quality. Much of the earliest
`work in defining the role of patent statistics within economics was done by Griliches (1991) and others at
`the NBER’s research program in productivity. The underlying motive was to better understand economic
`processes that lead to productivity gains – pursuing “the dream of getting hold of an output indicator of
`inventive activity,” in Griliches’ words. Following Scherer (1984), who linked 15,000 patents to the
`443 largest US manufacturing firms in the FTC’s Line of Business Survey, Griliches and others explored
`outward linkages to R&D figures and stock market data for publicly traded corporations. 16 Griliches (1991)
`summarizes that a strong relationship can be identified at the cross-sectional level between R&D expenditure
`and the number of patents a firm has received. He further concludes that there may be evidence of diminishing
`returns to R&D expenditures. 8
`Hall, Jaffe, and Trajtenberg (2001) provide a more modern approach to patent data characteristic of
`the growing availability of digital information, particularly for patents. Of the 400 three-digit classes the
`USPTO groups patents into, the authors condensed the data into 36 two-digit technological sub-categories,
`and ultimately into six higher-level categories: Chemical (excluding drugs), Computers and Communications,
`Drugs and Medical, Electrical and Electronic, Mechanical, and Others. However, their study reflects
`the difficulties of others who have attempted similar groupings, and they suggest that “while convenient,
`the present classification should be used with great care, and reexamined critically for specific applications.”
`They also discuss the usefulness of backward citations (citations a patent makes) as constituting a “paper
`trail” to measure knowledge spillovers and forward citations (citations received) as indicative of the
`“importance” of a patent. They put forward new measures based on Herfindahl concentration indices:
`Generality, which captures how widely the citations a patent receives are spread across technology classes,
`and Originality, which captures how widely the citations a patent makes are spread across classes. They
`briefly discuss some validation strategies for these
`metrics. For example, Computers and Communications scores high on Generality, consistent with the view
`that it is a general purpose technology, and high on Originality, in accordance with a view that it tends to
`break traditional models in terms of innovation. 11
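As a rough illustration of a Herfindahl-based dispersion measure of this kind, the sketch below computes one minus the sum of squared class shares over a list of technology classes. Applied to the classes of the patents that cite a given patent (forward citations), it gives a Generality-style score; applied to the classes of the patents it cites (backward citations), an Originality-style score. The function name and the toy class labels are assumptions, and no small-sample bias correction is applied.

```python
from collections import Counter


def herfindahl_dispersion(classes):
    """Return 1 - sum of squared class shares, or None if the list is empty.

    `classes` holds one technology class label per citation.
    A score of 0 means every citation falls in a single class;
    scores near 1 mean citations are spread over many classes.
    """
    counts = Counter(classes)
    total = sum(counts.values())
    if total == 0:
        return None  # undefined for a patent with no citations
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Toy example with hypothetical class labels.
citing_classes = ["A", "B", "C", "D"]   # classes of patents citing the focal patent
cited_classes = ["A", "A", "A", "A"]    # classes of patents the focal patent cites
generality_like = herfindahl_dispersion(citing_classes)
originality_like = herfindahl_dispersion(cited_classes)
print(generality_like, originality_like)
```

In this toy case the focal patent draws only on its own class but is cited from four different classes, so it scores low on the Originality-style measure and high on the Generality-style one.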
`Hall, Jaffe, and Trajtenberg (2000) provide further insight into outside linkages of patent data. The study
`found that weighting patents by their citation counts predicted firms’ market value better than raw patent
`counts did, indicating that forward citations are in some way tied to a notion
`of a patent’s “value.” 10
`In a different vein, a comprehensive survey by Scherer and others of US and German firms found payment
`of renewal fees to be a reliable proxy for patent value. They also confirm that the distribution of patent
`values is highly skewed, with a few patents being extremely profitable. 17
`Table 1 summarizes and expands upon a dichotomy proposed in van Zeebroeck et al (2008), which
`classifies the strategies used by economists to view patent data as either patent-based or market-based. 18
`Many of these techniques are revisited in Section 4, in which the feasibility and applicability of their use for
`this project are discussed.
`2.2.2 Appropriability of Inventions
`Within the field of economics, two major investigations have been carried out into how firms appropriate
`rents from their innovations. The first, published in 1987, was a survey of 650 R&D executives in 130 different
`lines of business (as defined by the FTC). It is sometimes referred to as the Yale survey. 13 The second was
`administered in 1994 to 1478 R&D labs, and is sometimes referred to as the Carnegie Mellon survey. 5
`The Yale survey divided its questions into product and process patents. In general, firms reported
`capturing profits from product innovations with patents more often than with process innovations, perhaps
`because it is more difficult (and less desirable) to keep product innovations secret. For processes, lead time
`and learning advantages were rated much more useful than patents. In general, patents were rated more
`important to businesses for preventing duplication by competitors than for amassing royalties. The industries
`that reported relying on patents the most to capture revenues on their innovations were chemicals and drugs,
`perhaps because in those industries, infringement is more clear-cut. The authors found that responses to
`their appropriability survey were consistently significant predictors of industries’ R&D intensities.
`
`Table 1: Established Metrics of Patent Value and Related Dimensions
`Metric                     Description
`— Patent-Based —
`Backward Citations         citations the patent makes; commonly used to track knowledge
`Forward Citations          citations the patent receives; generally accepted to be measure of
`Generality                 diverse classes of forward citations
`Originality                diverse classes of backwards citations
`Maintenance fee payments   indicator of how much owner values patent
`Legal disputes             incidences and outcomes of infringement suits; usually believed to
`                           be indicative of valuable patents
`                           count of parents or earliest parent’s filing date to show “entrance”
`                           into system
`— Market-Based —
`Firm value                 Stock market performance, R&D statistics, Tobin’s q
`Estimated patent value     Royalties, valuation by inventors or managers, buy-outs
`
`The Carnegie Mellon survey some years later asked similar questions to the Yale survey, but extended the
`investigation into why firms use patents, even if they are not a valuable means of protecting their inventions
`(as most industries indicated). The most common reasons for not patenting were ease of inventing around
`products and concern over disclosing the invention. For small firms, the cost of defending their patents was
`a commonly raised issue. In examining individuals’ responses as to why their firms patent, the authors saw
`two groups naturally emerging. They define “complex” industries as those in which new products typically
`contain many patented inventions (eg, electronics) compared to “discrete industries” such as drugs and
`chemicals. Respondents in discrete industries tended to report use of patents not only for maintaining a
`monopoly on their innovations, but also to block rivals’ entry via similar inventions (“patent fences”). In
`contrast, the bulk of respondents who reported using patents to enter into licensing negotiations were in
`complex industries.
`2.2.3 Submarine Patents
`A limited literature exists within economics on issues relating to submarine patents. It seems the reason for
`the dearth of research on this topic is that submarine patents are generally difficult to identify.
`Graham and Mowery (2002) was one of the first papers within economics to analyze the effect of the
`patent application continuation process. A continuation is a legal term within patenting; it allows a rejected
`application to be restarted while claiming its original priority (filing) date. Of note, they found an increase in
`the use of continuations prior to 1995, with a sharp drop-off for applications filed after. They reported that
`software companies seemed to be using the continuations process the most. They also found continuations to
`be positively correlated with the number of forward citations, patent originality, and incidence of post-grant
`litigation (as measured by linkages to the Derwent patent litigation database). 7
`Hegde, Mowery, and Graham (2007) use more recent data, current as of 2004 (the NBER patent database
`described in Hall, Jaffe, and Trajtenberg (2001)). They describe the uniqueness of continuations to the US
`patent system and their suspected involvement with submarine patents. At the same time, they explain the
`stance of some patent attorneys and industry groups: these long-pending continued applications may also
`be the result of “high-risk investments of ‘pioneering inventors’ in ‘young’ fields of invention that are subject
`to uncertainty.” They admit that little empirical evidence has been brought to bear on the characteristics of
`those applicants who exploit the US continuation process. The paper empirically examines which types of
`industries have used different types of continuations: Continuations, Continuations in Part, and Divisionals.
`All three types introduce a delay in the application process while allowing the final patent to retain the
`priority date of the initial application. 12
`2.2.4 “Skill-Biased” Technical Change
`A number of studies, particularly in the 1990s and early 2000s, have attempted to account for the growing wage
`gap observed in developed countries between skilled and unskilled workers. The gap can be characterized by
`an increase in wages of skilled (educated) workers above those of the unskilled, accompanied by a growing
`abundance of skilled workers in the labor force relative to unskilled. 2
`A concept of “skill-biased technical change” became a prominent explanation. The theory was that
`certain types of workers may fare better than others during periods of rapid technological growth. Namely,
`economists hypothesized that more educated or otherwise more able workers could better adjust to changing
`workplaces. This increase in demand for skilled workers could account for the relative rise in returns to skill
`in the midst of a relative increase in supply of skilled workers.
`Caselli divides technological revolutions into two categories: skill-biased (eg, the information technology
`revolution) and de-skilling (eg, the assembly line, which replaced skilled artisans in the production of cars).
`With the focus of modeling skill-biased revolutions, he develops a model in which productivity-augmenting
`technology spurs increases in wages, but particularly so for quick-learning workers, while the slow-learning
`workers continue to use the old technology for a certain period of time (the old capital is not immediately
`valueless). He confirms that recent increases in wage inequality within industries are in fact associated with
`increased inequality of capital-to-labor ratios. Furthermore, he theorizes that education may make workers
`better able to adjust to new technology; if this is the case, then technological revolutions within an industry
`would increase the returns to education for workers in those industries. 3
`In another model of technological change and wages, Galor and Moav theorize that regardless of whether
`a technological revolution ultimately brings about a new paradigm that is biased toward or away from skill
`(or ability), the transition to this new technological state will be skill (ability)-biased in
`the short run. Their model predicts several effects that are confirmed by data. 6
`Machin and van Reenen used R&D intensities in the manufacturing sectors of seven OECD countries
`as a proxy for technological change. They found strong results across developed nations that industries
`experiencing technical change also experience a shift in composition of their labor forces favoring skilled
`workers, and conclude that this evidence is in line with the skill-biased technical change hypothesis. 14
`In a similar vein, Berman, Bound, and Machin address the spread of skill-biased technical change throughout
`the world economy. Using similar industry-level manufacturing data, they examine within-industry
`changes among developed countries in the skilled-to-unskilled labor ratio. They find that industry changes
`are highly correlated among countries. This is strong evidence that skill-biased technical change within
`industries is not localized to the country from which the technology originates, but rather the change
`permeates throughout the industry and affects workers in the broader world economy. The authors also offer
`some evidence that skill-biased technical change may account for a trend of skill upgrading witnessed in the
`manufacturing sectors of less-developed countries. 2
`Bartel and Sicherman use a wide array of measures for technical change in manufacturing industries,
`including use of patents, investment in R&D, and various measures of TFP (ie, residuals from estimating
`sales after controlling for capital). Tracking workers through the National Longitudinal Survey of Youth,
`the authors find that after controlling for individual-level fixed effects, the relationship between
`wages and technological change is weakened. They conclude that a significant part of industry-level changes
`in wage inequality and skill composition is due to sorting by workers who move between industries. 1
`A major limitation of studies examining technological change on industry wages and skill compositions
`has been a tendency to rely exclusively on trends in the manufacturing sector, where statistics such as
`industries’ R&D expenditures to sales ratios, total factor productivity, and wages are easiest to measure
`and most freely available from the government due to the manufacturing sector’s relevance to economic
`indicators. 9 With the method proposed in my paper to identify technological revolutions, combined with the
`increased use of individual-level tax data on earnings within economics, 4 it may soon be possible to extend
`the literature beyond the manufacturing industry.

`3 Data
`We ourselves do not put enough emphasis on the value of data and data collection in
`our training of graduate students and in the reward structure of our profession. It is
`the preparation skill of the econometric chef that catches the professional eye, not the
`quality of the raw materials in the meal, or the effort that went into procuring them.
`(Zvi Griliches, address to the American Economic Association, January 4, 1994 9)
`In the following section, I summarize my ingredients.
`3.1 My data
`The USPTO has made remarkable strides in the past few years toward making raw patent data available to
`researchers. With this much data, the next question economists are confronting is what can be done with
`it. I first discuss what I have done with it, then what others are doing.
`Through a partnership with Google, the USPTO has made available the full text of every patent granted
`from January 1976 to present (updated weekly). To obtain the data, one must download and unzip nearly
`2,000 weekly files. None of these files is in a format that is directly usable for the kind of research
`economists typically conduct.
`There are four different file formats, spanning different time periods of operation:
`1. pftaps*.txt (1365 files from 1976-2001). These are text files that are organized in some vaguely
`hierarchical manner. Certain text strings recursively separate different data fields.
`2. pg*.sgm (54 files mid-2001 and a few in 2002). These files are in Standard Generalized Markup
`Language, an ISO standard that is similar in some ways to XML.
`3. pg*.xml (156 files 2002-2004). Similar to above, but somewhat more compatible with modern XML.
`4. ipg*.xml (currently about 400 files, from 2005-present). The most recent XML generation. Names
`of tags were redone to be more descriptive.
`Weekly patent grant files were parsed in Python with a script run over the course of several days.
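To give a flavor of what parsing the oldest, tag-delimited text format involves, here is a minimal sketch. It is not the script used for this paper. It assumes a simplified layout in which each record begins with a line reading PATN and every other line is a short tag followed by a value; the tag names in the example (WKU, APD, ISD) and the sample values are illustrative assumptions rather than a statement of the actual file specification.

```python
def parse_records(lines):
    """Split tagged lines into one dict per patent record.

    Assumed layout: a line 'PATN' opens a new record; any other
    non-blank line is '<TAG> <value>'. Values are collected in lists
    because a tag may repeat within a record (e.g., several inventors).
    """
    records = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        tag, _, value = line.partition(" ")
        if tag == "PATN":
            current = {}
            records.append(current)
        elif current is not None:
            current.setdefault(tag, []).append(value.strip())
    return records


# Hypothetical two-record sample in the assumed layout.
sample = [
    "PATN",
    "WKU  1234567",
    "APD  19950607",
    "ISD  20100608",
    "PATN",
    "WKU  7654321",
]
parsed = parse_records(sample)
print(parsed[0]["APD"], len(parsed))
```

The later SGML and XML generations would instead be handled with a markup parser, with a separate mapping from each generation's tag names onto a common set of fields.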
`Figure 1: Entity-Relationship Diagram of Available Data
`(Compustat linkages discussed in next section)
`
`Figure 1 presents a stylized Entity-Relationship model of how I conceptualize the data contained within
`a patent grant. In translating the data from a document-oriented model to the relational model, my
`ultimate goal was to produce a relational database in Boyce-Codd Normal Form – that is, one without
`unnecessary duplication of data. However, determining the data’s functional dependencies has proven more
`complex than expected. For example, we know the filing date for all five million patents in the database –
`let this be contained in the relation BasicInfo(patent, filing_date, issue_date), in which patent is the
`primary key. Yet when a citation is recorded within a patent grant, the USPTO lists not only the cited
`patent’s number, but also the cited patent’s application and issue dates. If we have a relation linking a
`patent to its citations, Cites(patent, cited_patent), storing the application and issue dates of cited_patent
`in that relation seems redundant, because this information can be assembled by joining the relations on
`BasicInfo.patent = Cites.cited_patent. Given that the dataset contains five million patents but 55 million
`citations, this kind of duplication would be enormously costly. However, this foreign key relationship does
`not always hold true: a cited patent may have been issued prior to 1976, or it may be a patent in another
`country. If either of these is the case, its filing and issue dates would not be contained in the BasicInfo
`relation, and the join could not recover them.
`Similar problems exist in recording details of patents associated with parent relationships and elsewhere in
`the construction of the relational database. When confronted with these dilemmas, I generally erred on the
`side of preserving information, at the risk of redundancy.
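The trade-off described above can be sketched with an in-memory SQLite database. This is an illustration under assumed column names and made-up patent identifiers, not the schema actually used for this paper. It shows that dates for a cited patent can be recovered by a join only when the cited patent has its own row in the basic-information relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BasicInfo (
    patent      TEXT PRIMARY KEY,
    filing_date TEXT,
    issue_date  TEXT
);
CREATE TABLE Cites (
    patent       TEXT,
    cited_patent TEXT,
    PRIMARY KEY (patent, cited_patent)
);
""")

# Hypothetical rows: P1 and P2 are in the database; OLD-1970 is not.
conn.executemany("INSERT INTO BasicInfo VALUES (?, ?, ?)", [
    ("P1", "1995-06-07", "2010-06-08"),
    ("P2", "1990-01-15", "1992-03-03"),
])
conn.executemany("INSERT INTO Cites VALUES (?, ?)", [
    ("P1", "P2"),        # cited patent has a BasicInfo row
    ("P1", "OLD-1970"),  # cited patent issued before 1976: no BasicInfo row
])

# LEFT JOIN keeps every citation and yields NULL dates where the
# cited patent is missing from BasicInfo.
rows = conn.execute("""
    SELECT c.cited_patent, b.filing_date, b.issue_date
    FROM Cites AS c
    LEFT JOIN BasicInfo AS b ON b.patent = c.cited_patent
    WHERE c.patent = 'P1'
    ORDER BY c.cited_patent
""").fetchall()
print(rows)
```

The NULL dates for the out-of-database citation are exactly the information that would be lost by dropping the dates recorded alongside each citation, which is why keeping them, at the cost of redundancy, can be the safer choice.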
`Another takeaway of Figure 1 is the many-to-many cardinalities that tend to characterize patent data.
`One patent may have many inventors, and one inventor may have many patents. The diagram shows the
`same for other relationships, such as assignment at grant and citations. Fields for which a patent has exactly
`one value are listed in the Patent entity (along with primary class, shown by arrows pointing
`away from the Patent entity).

`3.1.1 Linkages External to Patent Grant from the USPTO
`I incorporate linkages to two relations provided by the USPTO through Google. The first is maintenance
`fee events. These are provided as a table indicating patent number, date of event, and type of event (eg,
`payment of a certain type of fee or refund for various reasons). Also relevant to the analysis, each event
`contains an indicator for whether the assignee claims small entity status, which affords the inventor a reduced
`financial burden. In the US, the patent fee schedule is highly back-loaded, as shown in Table 2. Because
`a relatively small number of patents are renewed to full maturity, this data offers an indication for patents
`that have issued several years in the past as to how much their owner values them.
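One simple way to turn a table of fee events into a per-patent indicator of how long the owner kept paying is sketched below. The stage labels ("3.5", "7.5", "11.5") and the event tuples are hypothetical stand-ins; the real event table uses its own event-type codes, which would first need to be mapped onto stages like these.

```python
# Assumed ordering of the renewal stages, by years after grant.
STAGE_ORDER = {"3.5": 1, "7.5": 2, "11.5": 3}


def highest_stage_paid(events):
    """Map each patent to the latest renewal stage with a recorded payment.

    `events` is an iterable of (patent, event_date, stage) tuples.
    Events whose stage is not a known payment stage (refunds and the
    like) are ignored. Patents with no payment events are absent from
    the result.
    """
    best = {}
    for patent, _event_date, stage in events:
        if stage not in STAGE_ORDER:
            continue
        if patent not in best or STAGE_ORDER[stage] > STAGE_ORDER[best[patent]]:
            best[patent] = stage
    return best


# Hypothetical events for two patents.
events = [
    ("P1", "2003-01-10", "3.5"),
    ("P1", "2007-01-12", "7.5"),
    ("P2", "2004-05-01", "3.5"),
    ("P2", "2004-06-01", "refund"),
]
print(highest_stage_paid(events))
```

Because only patents issued long enough ago have had the chance to reach the later stages, any comparison built on such an indicator would need to account for issue date.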
`Table 2: Summary of Current Major Fee Schedule (USD)
`Type of fee          Standard Amount    Amount for Small Entity
`Utility issue fee
`Due at 3.5 years
`Due at 7.5 years
`Due at 11.5 years
`At the time of grant, the patent examiner places a patent into one of more than 400 technology
`classifications, primarily for the purpose of facilitating future searches for relevant prior art. These classes are
`further broken down into thousands of sub-classes. The class system frequently changes as new classes are
`created, merged, or retired. A very recent addition to the USPTO’s bulk data downloads is a file linking
`each patent to its current classification. These are the technology classes analyzed in this paper. Still, an
`overwhelming challenge to all researchers working with patent data has been finding meaningful groupings
`for patents by economically relevant industries, as opposed to by the USPTO’s large number of technology
`classes.
`3.2 Litigation Data
`No branch of the government, including the USPTO, keeps track of patent infringement suits for individual
`patents. However, the US court system makes documents pertinent to legal cases available online through
`services such as Public Access to Court Electronic Records (PACER) for a fee of a few cents per page.
`Lex Machina is a private firm spun out of the IP Litigation Clearinghouse, a joint research project between
`Stanford’s law school and computer science department that was designed to bring transparency
`to IP law. The business of Lex Machina is providing IP litigation data and analytics to businesses. The
`company does this by crawling court records including those from PACER and district court databases daily.
`Using natural language processing algorithms, they determine which cases are instances of IP litigation, and
`in the cases of patent infringement, link the case with the patent being infringed upon (also known as the
`patent being “asserted”).
`Although Lex Machina sells this information to businesses, they agreed to contribute to this project a
`dataset of linkages from patent numbers to legal assertions for all patents filed in 1995. In Section 5, I
`examine how the likelihood of assertion changes for patents filed just before and just after the policy change.
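The basic before-and-after comparison can be sketched as a difference in assertion rates within a window around the cutoff. This is a deliberately naive illustration, not the estimator used in this paper: a full regression-discontinuity analysis would fit local regressions on each side of the cutoff and examine sensitivity to the bandwidth. The function name, the 30-day bandwidth, and the toy data are all assumptions.

```python
from datetime import date

# Assumed cutoff: first filing date subject to the new term rule.
CUTOFF = date(1995, 6, 8)


def assertion_rates(patents, bandwidth_days=30):
    """Return (rate_before, rate_after) within the bandwidth around CUTOFF.

    `patents` is an iterable of (filing_date, asserted) pairs, where
    `asserted` is True if the patent was ever asserted in litigation.
    A rate is None when no patents fall on that side of the window.
    """
    before, after = [], []
    for filing, asserted in patents:
        gap = (filing - CUTOFF).days
        if -bandwidth_days <= gap < 0:
            before.append(asserted)
        elif 0 <= gap < bandwidth_days:
            after.append(asserted)

    def rate(flags):
        return sum(flags) / len(flags) if flags else None

    return rate(before), rate(after)


# Hypothetical toy data; the 1994 filing falls outside the window.
toy = [
    (date(1995, 6, 7), True),
    (date(1995, 6, 6), False),
    (date(1995, 6, 8), False),
    (date(1995, 6, 20), False),
    (date(1994, 1, 1), True),
]
print(assertion_rates(toy))
```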
`3.3 NBER Patent Data Project and Name Matching
`This paper does not use the NBER database, but it is useful to discuss in terms of work that has been done
`with raw data similar to what I am using. The work done by the NBER Patent Data Project (PDP) and made
`public to researchers has
`been instrumental in speeding the diffusion of this data through the empirical research community. The data
`provided by the PDP is outlined in Hall, Jaffe, and Trajtenberg (2001). It contains data
`primarily on patent citations and assignees.
`The most significant drawback of this dataset is that it is only current as of 2006. In examining long-
`pending patents filed around the 1995 discontinuity, the picture would look different if we examined only
`patents issued by 2006.
`A tremendous contribution of this dataset is the authors’ attempts to conduct meaningful assignee name
`disambiguation and matching to Compustat firms. This is not a trivial task for several reasons. Consider a
`company identified in Compustat as IBM. In addition to filing patents as IBM, patents they file might list
`as assignee:
`1. An unabbreviated name, such as International Business Machines
`2. Some formal legal name, such as IBM, Inc. (or other languages’ variants). This is extremely common.
`3. A division of the company (eg “IBM R&D”, “IBM Circuits Division”, or “IBM India”). This is also
`extremely common.
`4. A misspelling of any of the above, perhaps due to data entry error.
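The variants listed above suggest why some normalization is needed before assignee names can be matched. The sketch below is a minimal illustration and not the routine distributed with the NBER dataset: the suffix list is a small hypothetical sample, and the function only lowercases the name, drops punctuation, and strips trailing legal suffixes, which addresses variants of type (2) alone.

```python
import re

# Hypothetical, deliberately short list of legal-form suffixes.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd", "llc", "gmbh"}


def normalize_assignee(name):
    """Lowercase, remove punctuation, and strip trailing legal suffixes."""
    # Keep letters, digits, ampersands, and spaces; everything else becomes a space.
    tokens = re.sub(r"[^a-z0-9& ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


for raw in ["IBM, Inc.", "IBM Corp", "IBM R&D", "International Business Machines"]:
    print(repr(raw), "->", repr(normalize_assignee(raw)))
```

Under these assumptions the first two names collapse to the same key, while the division name and the unabbreviated name remain distinct, mirroring the difficulty of cases (1) and (3).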
`The architects of the dataset share their name standardization routines on their site. There isn’t much
`to do about (1). By either removing or standardizing most common suffixes and other elements of firm
`names, the authors seem to do a good job of mitigating lost matches due to (2). To deal with

