`
`PCT/US2017/046176
`
`Technology Validation And Ownership
`
`
`FIELD
`
`[0001]
`
`The present application relates to validation, and more particularly to
`
`protecting ownership of software source code and hardware design.
`
`BACKGROUND
`
`[0002]
`
Today, most software products are assembled from components in much the same way that physical products are assembled from parts. Software components help get a product to market faster, and often result in cost savings because functionality does not need to be developed from scratch. Whether the product is a mobile application, a medical device, an industrial controller, a system on chip (SoC), or firmware used in an airplane, it is created in part by assembling software components.
`
`[0003]
`
First party proprietary code is a software component that is developed by a product team in an organization. This code is usually what makes the magic happen, and contains various levels of proprietary, often classified, intellectual property. First party code is what, in part, differentiates products from competitors and makes them unique. Sometimes, the builder of the product may choose to license this code for a fee to others, or offer it for free under various open source licensing models. First party code also ties together third party code and other components used by the product. Third party code components are often open source components, but can also be commercial. For example, a product team might use an open source component for securing network communication, a purchased commercial library for generating reports, and an internally developed component, maintained by another team and containing proprietary IP, for video encoding and decoding.
`
`
`
`WO 2018/031703
`
`
`[0004]
`
Similarly, hardware devices are often designed using a combination of proprietary hardware components (e.g., electrical circuits that perform a certain function) and third party or open source hardware components. Prior to instantiation into a physical device, the hardware components are designed using hardware description language (HDL) code. Collectively, the combination of software and hardware components or code used to create a product can have great commercial value. These software and hardware components are typically treated as valuable intellectual property assets of the builder.
`
`[0005]
`
Leakage of a company’s intellectual property assets, such as source code and/or HDL code, to the public domain or to a competitor can have dramatic negative consequences. This may be a result of any number of events including, but not limited to, deliberate industrial espionage, hackers penetrating the company and posting all or some of the IP to the public domain, an ex-employee stealing the company’s source code and using it at a new employer with or without the new employer’s knowledge, or a careless commit that accidentally makes IP available to unlicensed entities or exposes it to the public. Similarly, uncontrolled introduction of IP under incompatible licenses into a proprietary code base has the potential to lead to unbounded damages, or to a need to, by way of example, open source the affected IP blocks.
`
`
`
`
`BRIEF DESCRIPTION OF THE FIGURES
`
`[0006]
`
`The present invention is illustrated by way of example, and not by way
`
`of limitation, in the figures of the accompanying drawings and in which like reference
`
`numerals refer to similar elements and in which:
`
`[0007]
`
`Figure 1 is a network diagram showing one embodiment of a
`
`technology and ownership validation system, at a high level.
`
`[0008]
`
`Figures 2A-2F are diagrams illustrating various use cases for the
`
`system.
`
`[0009]
`
`Figure 3 is a block diagram of one embodiment of the system.
`
`[0010]
`
`Figure 4 is an overview flowchart of one embodiment of the system.
`
`[0011]
`
`Figure 5 is a flowchart of one embodiment of generating signatures.
`
`(Taken from 2841 U801)
`
`[0012]
`
Figure 6 is a flowchart of one embodiment of enumerating matched signatures. The process starts at block 610.
`
`[0013]
`
`Figure 7 is a flowchart of one embodiment of verifying proprietary data
`
`against open source.
`
`[0014]
`
`Figure 8 is a flowchart of one embodiment of verifying proprietary data
`
`against other proprietary data.
`
`[0015]
`
`Figure 9 is a flowchart of one embodiment of licensing and
`
`authentication using the system.
`
`[0016]
`
`Figure 10 is a flowchart of one embodiment of updating data in an
`
`existing signature.
`
`[0017]
`
`Figure 11 is a flowchart of one embodiment of resolving conflicts.
`
`
`
`
`[0018]
`
`Figure 12 is a block diagram of a computer system which may be used
`
`with the present system.
`
`[0019]
`
Figure 13 shows a simplified representation of one embodiment of an electronic design automation (EDA) design flow.
`
`
`
`
`DETAILED DESCRIPTION
`
`[0020]
`
An automated solution is described that enables the creation of one-way signatures of proprietary technology data, such as software code (which may be in either source code or object code format) or hardware code, such as HDL code or code in another hardware description language or format, and records them in a Global signature database. In one embodiment, prior to recording, the signatures are validated for uniqueness and origin. In one embodiment, once signatures are in the Global signature database, a builder, perhaps more commonly referred to as a vendor, may receive alerts if their data is seen in the public domain or outside the organization. In one embodiment, the system may be used to alert vendor A that IP belonging to someone else is being introduced into their proprietary code base or IP. In one embodiment, the system may be used to track where the proprietary IP is being detected. In one embodiment, the system also ensures that the components used in the proprietary IP are of high quality and can be legally used, without risk of contaminating the proprietary code bases with incompatible or 'toxic' free or open source software (FOSS) licenses, commercial licenses, or potentially illegally obtained commercial IP. In one embodiment, the system allows effective protection of vendor proprietary technology data, management of the risk of using third party code, and alerting if IP theft or leakage is detected. In situations where ownership is contested, it can provide proof of existence and ownership at a given point in time. In one embodiment, the Global signature database may be a distributed database, and the system may use public blockchains as ledgers to record signatures in a decentralized, difficult to forge manner.
`
`
`
`
`[0021]
`
`The following detailed description of embodiments of the invention
`
`makes reference to the accompanying drawings in which like references indicate similar
`
`elements, showing by way of illustration specific embodiments of practicing the
`
`invention. Description of these embodiments is in sufficient detail to enable those
`
`skilled in the art to practice the invention. One skilled in the art understands that other
`
`embodiments may be utilized and that logical, mechanical, electrical, functional and
`
`other changes may be made without departing from the scope of the present invention.
`
`The following detailed description is, therefore, not to be taken in a limiting sense, and
`
`the scope of the present invention is defined only by the appended claims.
`
`[0022]
`
Figure 1 is a network diagram showing one embodiment of a technology and ownership validation system, at a high level. The system includes a plurality of vendors 110, 120, with proprietary files. The proprietary files may be software code, hardware description language (HDL) code, IP blocks in various languages, or other proprietary files representing code for software, hardware, or a combination. Note that the proprietary files may include FPGA code, and other descriptors.
`
`[0023]
`
The protection server 160 is designed to create a system in which vendors can, in some embodiments, track their own proprietary files securely without disclosing them to any third party, verify that their files are not leaking (being released as open source), and verify that they are not unknowingly bringing on board the proprietary files of others, or open source code. The protection server 160 in one embodiment makes local signature generators 115, 125 available to vendors. The vendors 110, 120 can use the signature generators to generate unique, trackable, unforgeable, and non-reverse-engineerable signatures for their proprietary files.
`
`
`
`
`[0024]
`
Those signatures are then shared with protection server 160. In one embodiment, the signatures may be made available via a distributed database 190. The distributed database 190, in one embodiment, stores blockchain-signed versions of the signatures, which in one embodiment are generated by signature generation system 170.
`
`[0025]
`
In one embodiment, in addition to the proprietary files of vendors 110, 120, the system may also obtain files from one or more open source databases 130 and repositories 180 and other sources 185. Other sources 185 may include drop file sources, such as Pastebin, WikiLeaks, and other drop sites. The signature generation system 170 may process these files to generate the unique signatures for open source files. This enables the IP protection server 160 to perform comparisons not only between the files of different vendors, but also between the files of vendors and open source files.
`
`[0026]
`
`The protection server 160 performs comparisons, and provides alerts
`
`to vendors, as will be described below.
`
`In one embodiment, the IP protection server
`
`160 also provides validation of ownership, and chain of use.
`
`[0027]
`
`Figures 2A-2F are diagrams illustrating various use cases for the
`
`system. Figure 2A illustrates an exemplary use case.
`
`In this scenario, a vendor creates
`
`a signature of all or a portion of their proprietary files, or code base.
`
`In one
`
`embodiment, metadata is added. Metadata may include the vendor identity, copyright
`
`date, license data, and other relevant information. Other relevant information may
`
`include supported chipsets/devices, compilation targets, memory requirements,
`
`associated other files, etc.
`
`In one embodiment, each signature is of a segment of a file.
`
`
`
`
`In one embodiment, the metadata associated with one segment indicates the other
`
`signature segments associated in a particular complete file.
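The signature-plus-metadata record described above can be pictured as a small data structure. The following is a minimal sketch only: the class name, field names, and the `link_segments` helper are hypothetical and are not taken from the application, which does not specify a record layout.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SignatureRecord:
    """One signature of one file segment, plus metadata (hypothetical layout)."""
    signature: str                        # opaque one-way signature of the segment
    vendor_id: str                        # vendor identity
    copyright_date: Optional[str] = None
    license: Optional[str] = None
    # Signatures of the other segments that belong to the same complete file.
    sibling_signatures: Tuple[str, ...] = ()


def link_segments(signatures, vendor_id, license=None):
    """Build one record per segment, each pointing at the file's other segments."""
    records = []
    for sig in signatures:
        siblings = tuple(s for s in signatures if s != sig)
        records.append(SignatureRecord(sig, vendor_id, license=license,
                                       sibling_signatures=siblings))
    return records
```

With this layout, any single matched segment is enough to look up the remaining segments of the file it came from.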
`
`[0028]
`
The signatures are processed at the vendor site, enabling the system to be used without providing copies of the proprietary files to the system. The signatures are then submitted to the signature database. In one embodiment, the proprietary files may instead be sent to another system, to enable processing of the files off-premise. In one embodiment, the database may be a database maintained by the IP protection server. In one embodiment, the database may be a publicly available distributed database.
`
`[0029]
`
The system validates that the signatures are unique and of high quality. The validated signatures are then added to the database.
`
`[0030]
`
Open source data is obtained from various publicly available databases and sources, such as GitHub, SourceForge, Bitbucket, Pastebin, WikiLeaks, and others. The files from these open source repositories are processed to generate signatures as well.
`
`[0031]
`
The system then monitors the proprietary code to check whether any open source file signatures are found in the proprietary data. Such a match would indicate either that open source material has entered the vendor’s proprietary files or that the vendor’s proprietary data / IP exists in some public database. If such a match is found, the vendor may be alerted, to enable them to take action.
`
`[0032]
`
Figure 2B illustrates one embodiment of another use case. In this use case, the signatures are matched against signatures from another vendor. When a match is found, an alert is sent to the vendor whose files are contaminated.
`
`
`
`
`[0033]
`
`Figure 2C illustrates another example use case, in which when a
`
`match is found between the files of two vendors, the alert is sent to the vendor whose
`
`files are leaked/misappropriated.
`
`[0034]
`
`Figure 2D illustrates another example use case in which a vendor
`
`creates signatures of licensed files, with metadata. The metadata may identify the type
`
`of licenses provided, and other relevant data. When the data of other vendors, and
`
`optionally open source files, are scanned, the use of the licensed code is identified.
`
`Furthermore, it enables identification of the code that is not properly licensed.
`
`[0035]
`
Figure 2E illustrates another example use case, in which proof of authorship, ownership, and existence is incorporated into the GIPS protection database. This enables the system to become a central registrar for authenticity of source code, based on proprietary data. This may be provided as an effective proof, without storing the actual source code.
`
In one embodiment, the system also permits the owner of the IP to register multiple different versions of software, with similar, overlapping signatures for some parts of the IP. In one embodiment, the system also permits moving a portfolio between companies, due to mergers and acquisitions (M&A), technology transfers, etc. In one embodiment, if such a transfer occurs, the system may also provide a complete audit trail of such transactions.
`
`[0036]
`
Figure 2F illustrates an example use case, in which the signature data is stored in the form of blockchains. A blockchain represents a public ledger, which is used in one embodiment to provide a one-way unforgeable signature of the files. The blockchain selected may be Bitcoin, or some other active blockchain (Ethereum, Litecoin, Doge, NXT, etc.). This enables the system to push verified, unchallenged signatures to a public blockchain, which is made freely available. This may be used to establish proof of existence, proof of ‘first’ creation, and proof of ‘prior art’. In one embodiment, an ‘open’ signature algorithm is used. In one embodiment, the signature algorithm, unlike a plain hash, supports partial matching. The signature algorithm is robust against code alterations, and thus supports matching partial code snippets; simple modifications such as renaming functions and variables, or removing comments, do not impact the match. This enables the system to match code snippets, such as a single copied function, rather than only entire source files.
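To make the idea of a rename-tolerant, partially matching signature concrete, here is a minimal sketch. It is not the signature algorithm of this application, which is not spelled out here; it swaps in a generic technique (token normalization followed by hashed k-token windows) and, as an assumption, handles only Python source via the standard `tokenize` module. The function names and the window size `k` are illustrative.

```python
import hashlib
import io
import keyword
import tokenize

# Token types that carry layout or comments only and are ignored.
_SKIPPED = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)


def normalized_tokens(source):
    """Tokenize Python source, dropping comments/layout and masking identifiers."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")      # renamed functions and variables look identical
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")
        elif tok.type == tokenize.STRING:
            out.append("STR")
        else:
            out.append(tok.string)
    return out


def fingerprints(source, k=5):
    """Set of SHA-256 hashes, one per k-token window of the normalized source."""
    toks = normalized_tokens(source)
    return {
        hashlib.sha256(" ".join(toks[i:i + k]).encode()).hexdigest()
        for i in range(max(len(toks) - k + 1, 0))
    }


def containment(snippet, candidate, k=5):
    """Fraction of the snippet's fingerprints that also occur in candidate."""
    a, b = fingerprints(snippet, k), fingerprints(candidate, k)
    return len(a & b) / len(a) if a else 0.0
```

Because each window is hashed separately, a copied function still shares its windows with the original file even after renaming and comment removal, while the hashes themselves do not reveal the source text.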
`
`[0037]
`
`Figure 3 is a block diagram of one embodiment of the system.
`
`In one
`
`embodiment, the system includes a vendor system 305, central signature processor
`
`360, a signature validator 330, and a matching and authentication server 380. Although
`
`shown separately, the signature validator 330, central signature processor 360, and
`
`matching and authentication server 380 may be parts of the same system, located on
`
`the same server, or located on a distributed system which works together.
`
`[0038]
`
The vendor system 305, in one embodiment, is a downloadable tool, which is made available to a vendor. In one embodiment, the vendor system 305 enables a vendor to process their proprietary files locally, without providing them to the system. This enables the vendor to maintain trade secrets, and not reveal the exact details of their files. The vendor system 305 includes local signature generator 310, and signature store 315. In one embodiment, the signatures have associated metadata. The metadata may include the vendor’s identification, licensing information, file associations, and other relevant data. The signatures and associated metadata generated are stored in signature store 315, and in one embodiment communicated via communication system 320 to the signature validator 330. Communication system 320, in one embodiment, comprises a network connection, a secure upload mechanism, a cloud storage mechanism, or another way of providing the signatures to the IP protection server. In one embodiment, the vendor may choose to send some or all of their proprietary files to the central signature processor 360, which can generate signatures, instead of generating them on-site.
`
`[0039]
`
Central signature processor 360, in one embodiment, processes open source files, and optionally files provided by vendors who want off-site signature generation. The open source scraper 365 obtains open source files from repositories such as GitHub and SourceForge, as well as sites that provide a way to download files, such as WikiLeaks, Tor, and Pastebin, or other known sources of open source files.
`
`[0040]
`
Signature & metadata generator 370 generates the signatures and metadata for open source files. For files obtained from vendors, the vendor provides the metadata for inclusion. The metadata for open source files in one embodiment includes source (e.g., GitHub), file associations, license, creation date, version, and other relevant information.
`
`[0041]
`
`Signature store 375 temporarily stores the generated signatures, while
`
`communications system 363 provides the files to the signature validator 330.
`
`[0042]
`
`Signature validator 330 includes comparator 335 to compare the
`
`signatures from vendor system 305, and central signature processor 360, which are
`
`stored in its storage 355.
`
`If a conflict is identified, validator 340 attempts to resolve the
`
`conflict, and if there is insufficient information, alerts the vendor.
`
`In one embodiment,
`
`
`signature validator 330 is used to ensure that signatures are unique, and that multiple
`
`copies of the same file are not claimed by different originators.
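The uniqueness check performed by the signature validator can be sketched as a grouping of submitted signatures by claimed originator. This is an illustrative sketch under the assumption that each submission is a (signature, originator) pair; the function name `find_conflicts` is hypothetical.

```python
from collections import defaultdict


def find_conflicts(submissions):
    """Return signatures that are claimed by more than one originator.

    submissions: iterable of (signature, originator) pairs.
    Result maps each conflicting signature to its sorted list of claimants.
    """
    claimants = defaultdict(set)
    for signature, originator in submissions:
        claimants[signature].add(originator)
    return {sig: sorted(owners)
            for sig, owners in claimants.items() if len(owners) > 1}
```

A repeated submission by the same originator is not reported; only signatures with two or more distinct claimants would be passed on for conflict resolution.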
`
`[0043]
`
In one embodiment, signature validator 330 includes a blockchain generator 345. Blockchain generation creates a unique validation key for each signature, in one embodiment, once the signatures are validated as being unique. Using blockchain enables the use of a distributed database 399, which can serve as an escrow and validation mechanism, as will be described below. The signature data is sent, via communication system 350, to matching and authentication server 380 and distributed database 399.
`
[0044] Matching and authentication server 380 in one embodiment maintains a global database 385 of signatures. Since the signatures are validated by validator 330, each signature in the database 385 is unique. The signatures also include metadata, providing information about the file(s) associated with the signature.
`
`[0045]
`
`In one embodiment, the matching and authentication server 380
`
`includes a signature matcher 390, which enables matching of signatures in the
`
`database, whether proprietary or open source to identify leakage/misappropriation
`
`(when proprietary files of one vendor appear in the files of an open source project or
`
`another vendor) and contamination (when open source files, or files of another vendor
`
`appear in the files of a vendor). Alert system 395 sends out alerts, via communication
`
`system 383, to the appropriate vendor(s).
`
`In one embodiment, a vendor is informed of
`
`leakage/misappropriation or contamination, to enable them to take action.
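The distinction between leakage/misappropriation and contamination can be illustrated with a short sketch. It assumes, purely for illustration, that the registry maps each signature to its registered owner and that open source owners carry an `oss:` prefix; neither convention comes from the application.

```python
def classify_matches(registry, scanned_owner, scanned_signatures):
    """Classify matches found while scanning one party's files.

    registry: dict mapping signature -> registered owner.
    Returns a list of (recipient, kind, signature) alerts.
    """
    alerts = []
    for sig in scanned_signatures:
        owner = registry.get(sig)
        if owner is None or owner == scanned_owner:
            continue  # unknown signature, or the party's own code
        if not owner.startswith("oss:"):
            # A vendor's proprietary code appeared in someone else's files.
            alerts.append((owner, "leakage/misappropriation", sig))
        if not scanned_owner.startswith("oss:"):
            # Foreign code (open source or another vendor's) is in a vendor's files.
            alerts.append((scanned_owner, "contamination", sig))
    return alerts
```

A single match between two vendors therefore produces two alerts, one for each side, while a match against an open source project produces one.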
`
`[0046]
`
`Updater/versioning logic enables the system to update signatures
`
`when new versions of products or files are released.
`
`In one embodiment, the system
`
`
`does not re-generate all signatures, but only tracks alterations, and provides versioning
`
`and changes in ownership or licensing.
`
`In one embodiment, the blockchain generator
`
`345 is used to update the blockchain to reflect such changes.
`
`In another embodiment,
`
`a new blockchain transaction may be generated when such changes are made.
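Tracking only alterations between versions can be sketched as a set difference over segment signatures. This is a simplified illustration; the function name and the shape of the returned dictionary are assumptions.

```python
def signature_delta(previous, current):
    """Compare the segment signatures of two releases and report only changes."""
    prev, curr = set(previous), set(current)
    return {
        "added": sorted(curr - prev),      # segments new in this release
        "removed": sorted(prev - curr),    # segments no longer present
        "unchanged": len(prev & curr),     # segments that need no new record
    }
```

Only the signatures listed under "added" would need new records, which avoids re-registering the unchanged bulk of a code base.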
`
`[0047]
`
Each of the systems and logics described herein runs on a computer system or processor, and is an algorithmic implementation that solves the technological problem presented by validating the authenticity and uniqueness of code. In one embodiment, the algorithms are implemented in software, in languages such as C/C++, Go, Java, and Python. This problem, and thus this solution, is inherently linked to computing technology, since this problem only occurs because computer software and hardware IP have issues of leakage and contamination.
`
`[0048]
`
`In one embodiment, signature generators 115, 125 are embedded in
`
`one or more electronic design automation (EDA) tools and automatically generate
`
`signatures each time the tool is invoked by the vendor throughout the EDA flow. An
`
`EDA flow can include multiple steps, and each step can involve using one or more EDA
`
`software tools. Some EDA steps and software tools are described below, with respect to
`
`Figure 13. These examples of EDA steps and software tools are for illustrative purposes
`
`only and are not intended to limit the embodiments to the forms disclosed.
`
`[0049]
`
`To illustrate the EDA flow, consider an EDA system that receives one
`
`or more high level behavioral descriptions of an IC device (e.g., in HDL languages like
`
`VHDL, Verilog, etc.) and translates (“synthesizes”) this high level design language
`
`description into netlists of various levels of abstraction. A netlist describes the
`
`IC design and is composed of nodes (functional elements) and edges, e.g., connections
`
`
`between nodes. At a higher level of abstraction, a generic netlist is typically produced
`
`based on technology independent primitives.
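A netlist as described, nodes for functional elements and edges for connections, maps naturally onto a small graph structure. The sketch below is a toy model with hypothetical class and primitive names; real EDA netlist formats carry far more information (ports, nets, hierarchy).

```python
from dataclasses import dataclass, field


@dataclass
class Netlist:
    """Toy netlist: nodes are functional elements, edges are connections."""
    nodes: dict = field(default_factory=dict)   # node name -> primitive type
    edges: list = field(default_factory=list)   # (driver name, sink name)

    def add_node(self, name, primitive):
        self.nodes[name] = primitive

    def connect(self, driver, sink):
        if driver not in self.nodes or sink not in self.nodes:
            raise KeyError("both endpoints must be declared as nodes first")
        self.edges.append((driver, sink))

    def fanout(self, name):
        """Names of the nodes driven by `name`, in sorted order."""
        return sorted(sink for driver, sink in self.edges if driver == name)


# A half adder built from technology-independent primitives.
half_adder = Netlist()
for name, primitive in [("a", "INPUT"), ("b", "INPUT"),
                        ("xor1", "XOR2"), ("and1", "AND2"),
                        ("sum", "OUTPUT"), ("carry", "OUTPUT")]:
    half_adder.add_node(name, primitive)
for driver, sink in [("a", "xor1"), ("b", "xor1"),
                     ("a", "and1"), ("b", "and1"),
                     ("xor1", "sum"), ("and1", "carry")]:
    half_adder.connect(driver, sink)
```

A generic netlist like this one names abstract primitives such as XOR2; a technology-specific netlist would replace them with cells from a characterized library.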
`
`[0050]
`
The generic netlist can be translated into a lower level technology-specific netlist based on a technology-specific (characterized) cell library that has gate-specific models for each cell (functional element). The models define performance parameters for the cells; e.g., parameters related to the operational behavior of the cells, such as power consumption, delay, transition time, and noise. The netlist and cell library are typically stored in computer readable media within the EDA system and are processed and verified using many well-known techniques.
`
`[0051]
`
Before proceeding further with the description, it may be helpful to place these processes in context. Figure 13 shows a simplified representation of an exemplary digital ASIC design flow. At a high level, the process starts with the product idea (step E100) and is realized in an EDA software design process (step E110). When the design is finalized, it can be taped out (event E140). After tape out, the fabrication process (step E150) and packaging and assembly processes (step E160) occur, resulting, ultimately, in finished chips (result E170).
`
`[0052]
`
`The EDA software design process (step E110) is actually composed of
`
`a number of steps E112-E130, shown in linear fashion for simplicity. In an actual
`
`ASIC design process, the particular design might have to go back through steps until
`
`certain tests are passed. Similarly, in any actual design process, these steps may occur
`
`in different orders and combinations. This description is therefore provided by way of
`
`context and general explanation rather than as a specific, or recommended, design flow
`
`for a particular ASIC.
`
`
`[0053]
`
A brief description of the component steps of the EDA software design process (step E110) will now be provided:
`
`[0054]
`
`System design (step E112): The designers describe the functionality
`
`that they want to implement and can perform what-if planning to refine functionality,
`
`check costs, etc. Hardware-software architecture partitioning can occur at this stage.
`
`Exemplary EDA software products from Synopsys, Inc. that can be used at this step
`
`include Model Architect, Saber, System Studio, and DesignWare® products.
`
`[0055]
`
`Logic design and functional verification (step E114): At this stage, the
`
`VHDL or Verilog code for modules in the system is written and the design is checked for
`
`functional accuracy. More specifically, the design is checked to ensure that it produces
`
`the correct outputs. Exemplary EDA software products from Synopsys, Inc. that can be
`
`used at this step include VCS, VERA, DesignWare®, Magellan, Formality, ESP and
`
`LEDA products.
`
`[0056]
`
`Synthesis and design for test (step E116): Here, the VHDL/Verilog is
`
`translated into a netlist. The netlist can be optimized for the target technology.
`
`Additionally, the design and implementation of tests to permit checking of the finished
`
`chip occurs. Exemplary EDA software products from Synopsys, Inc. that can be used at
`
`this step include Design Compiler®, Physical Compiler, Test Compiler, Power Compiler,
`
`FPGA Compiler, Tetramax, and DesignWare® products.
`
`[0057]
`
`Design planning (step E118): Here, an overall floorplan for the chip is
`
`constructed and analyzed for timing and top-level routing. Exemplary EDA software
`
`products from Synopsys, Inc. that can be used at this step include Jupiter and
`
`Floorplan Compiler products.
`
`
`[0058]
`
`Netlist verification (step E120): At this step, the netlist is checked for
`
`compliance with timing constraints and for correspondence with the VHDL/Verilog
`
`source code. Exemplary EDA software products from Synopsys, Inc. that can be used
`
`at this step include VCS, VERA, Formality and PrimeTime products.
`
`[0059]
`
`Physical implementation (step E122): The placement (positioning of
`
`circuit elements) and routing (connection of the same) occurs at this step.
`
`Exemplary EDA software products from Synopsys, Inc. that can be used at this step
`
`include the Astro product.
`
`[0060]
`
Analysis and extraction (step E124): At this step, the circuit function is verified at a transistor level; this in turn permits what-if refinement. Exemplary EDA software products from Synopsys, Inc. that can be used at this step include Star RC/XT, Raphael, and Aurora products.
`
`[0061]
`
Physical verification (step E126): At this step, various checking functions are performed to ensure correctness for manufacturing, electrical issues, lithographic issues, and circuitry. Exemplary EDA software products from Synopsys, Inc. that can be used at this step include the Hercules product.
`
`[0062]
`
`Resolution enhancement (step E128): This step involves geometric
`
`manipulations of the layout to improve manufacturability of the design.
`
`Exemplary EDA software products from Synopsys, Inc. that can be used at this step
`
`include iN-Phase, Proteus, and AFGen products.
`
[0063] Mask data preparation (step E130): This step provides the “tape-out” data for production of masks for lithographic use to produce finished chips. Exemplary EDA software products from Synopsys, Inc. that can be used at this step include the CATS® family of products.
`
[0064] With embedded signature generators 115, 125, each of the above described EDA tools can generate and transmit unique signatures upon completion of each portion of the EDA flow. Thus a signature can be generated at the HDL stage, the netlist stage, or after completion of place and route. Similarly, the software design flow can include various tools, each of which can include signature generators 115, 125. By way of example, Synopsys Software Security includes various tools such as Synopsys’ state-of-the-art static application security testing (SAST) product, Coverity. The Coverity tool can generate signatures on code following the completion of a static check, prior to checking new code into a build. For the present application, regardless of which version of the design is used, the application will reference “language,” “code,” and “code segment,” for simplicity. However, it should be understood that these terms are meant to encompass the various versions of the EDA generated elements.
`
`[0065]
`
`Figure 4 is an overview flowchart of one embodiment of the system.
`
`The process starts at block 410. At block 420, signatures are generated locally for
`
`proprietary files.
`
`In one embodiment, the proprietary files may be hardware description
`
`language, such as HDL files. The signatures are generated, in one embodiment, using
`
`the process described below.
`
`[0066]
`
`At block 430, the system determines whether the signatures are
`
`unique. This ensures that the system can uniquely identify the file segment associated
`
`with the signature. Note that the signature generation algorithm is such that the
`
`signatures are unique. Therefore, if the signature is not unique, that means that the
`
`
`same code was submitted multiple times to signature generation.
`
`If the signatures are
`
`unique, they are added to a database at block 440.
`
`In one embodiment, in addition to
`
`the signature, the relevant metadata is also added to the database. The metadata may
`
`include information about the vendor, license, and other relevant information.
`
`[0067]
`
At block 445, a blockchain transaction is generated for each of the validated signatures, and the transactions are recorded to the blockchain that acts as a distributed database. The distributed database makes the signature available. This enables the use of the signature for authentication and for proof of authorship, ownership, and existence. In one embodiment, this enables the distributed database to become a central ‘registrar’ for authenticity of the proprietary files. In one embodiment, the blockchain acts as a sort of ‘escrow’ in validation that does not require users to store their proprietary files. This is cheaper to manage than traditional escrow services. In one embodiment, submissions to the blockchain are securely signed to identify the submitting organization, and carry associated metadata to support a trail of ownership, licensing, and other information.
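The tamper-evidence property that makes a blockchain useful as a registrar can be shown with a hash-chained, append-only ledger. This sketch is a local stand-in, not a real public blockchain client: there is no network, consensus, or digital signing of submissions, and the `Ledger` class and its field names are hypothetical.

```python
import hashlib
import json

_GENESIS = "0" * 64
_FIELDS = ("signature", "submitter", "timestamp", "prev")


def _digest(body):
    """SHA-256 over a canonical JSON encoding of one entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class Ledger:
    """Append-only list of entries, each bound to the hash of its predecessor."""

    def __init__(self):
        self.entries = []

    def record(self, signature, submitter, timestamp):
        prev = self.entries[-1]["hash"] if self.entries else _GENESIS
        body = {"signature": signature, "submitter": submitter,
                "timestamp": timestamp, "prev": prev}
        self.entries.append({**body, "hash": _digest(body)})
        return self.entries[-1]["hash"]

    def verify(self):
        """Recompute every hash and link; False if any entry was altered."""
        prev = _GENESIS
        for entry in self.entries:
            body = {key: entry[key] for key in _FIELDS}
            if entry["prev"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True
```

Because each entry commits to the one before it, changing the recorded submitter or timestamp of an early signature invalidates the chain, which is what supports proof of existence at a given point in time.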
`
`[0068]
`
`The process then continues to block 460.
`
`If the signature was not
`
`unique, at block 450 the vendor is alerted to the policy violation, and directed to resolve
`
`it.
`
`In one embodiment, such issues may be resolved by identifying licensed content,
`
`acquisitions, or other reasons for overlap.
`
`[0069]
`
`At block 460, the system processes open source content to generate
`
`signatures.
`
`In one embodiment, the system scrapes multiple repositories of open
`
`source data.
`
`In one embodiment, the system scrapes data from appropriate type(s) of
`
`repositories. For example, there may be repositories of hardware description language
`
`
(HDL), which may be processed for a system which evaluates HDL. One example of an open source hardware repository is OpenCores, found at http://opencores.org/
`
`[0070]
`
`At block 470, the process determines whether there are any overlaps.
`
`Overlaps may be evidence of open source data contaminating a vendor’s product, or
`
`the vendor’s proprietary code being leaked into open source.
`
`If overlap is detected, at
`
`block 480 the vendor is alerted to the policy violation, and the open source issue
`
`detected. The process then ends, at block 490.
`
`In one embodiment, this process runs
`
`continuously as new data is acquired from vendors and/or open source repositories.
`
`In
`
`one embodiment, as versions are released and updated, the process is again run.
`
`In
`
`one embodiment, the process is only run on newly added content.
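The flow of Figure 4 (blocks 430 through 480) can be condensed into a short sketch. It assumes signatures have already been generated, that the database is a plain dictionary from signature to owner, and that a resubmission by the same vendor is silently accepted; the function name and alert strings are hypothetical.

```python
def run_validation(vendor, vendor_sigs, database, open_source_sigs):
    """Validate a vendor's signatures and return the alerts to send.

    database: dict mapping signature -> owner; updated in place.
    open_source_sigs: signatures generated from scraped open source files.
    """
    alerts = []
    for sig in vendor_sigs:                        # block 430: unique?
        owner = database.get(sig)
        if owner is None:
            database[sig] = vendor                 # block 440: add to database
        elif owner != vendor:
            alerts.append((vendor, "not unique", sig))             # block 450
    overlaps = set(vendor_sigs) & set(open_source_sigs)            # block 470
    for sig in sorted(overlaps):
        alerts.append((vendor, "open source overlap", sig))        # block 480
    return alerts
```

Running the function only on newly added signatures, as one embodiment describes, amounts to passing just the new signatures as `vendor_sigs`.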
`
`[0071]
`
`Of course, though this is shown as a flowchart, in one embodiment it
`
`may be implemented as an interrupt-driven system, or executed over multiple devices
`
`and in multiple time frames. For example, signature uniqueness verification may occur
`
`periodically, and at a remote system from the system which generates the signatures.
`
`Similarly, open source processing may occur in parallel with other processes.
`
`Therefore, one of skill in the art should understand this flowchart, and all other
`
`flowcharts in this application to describe a set of actions that are related to a particular
`
`process, but not assume the ordering of the elements of the flowchart cannot be altered
`
`while staying within the scope of the disclosure.
`
`[0072]
`
`Figure 5 is a flowchart of one embodiment of generating a code
`
`signature for a source file. The process begins at stage 504 by determining a language
`
`of the source file. In an embodiment, the language may be detected based on the file
`
`extension. For example, the file extension “py” may indicate the Python programming
`
`
`language. In an embodiment, the programming language may also be determined
`
`through analysis of the file content. For example, presence of ‘magic numbers,’ unique
`
`language—specific reserved keywor