Nearest neighbor search

From Wikipedia, the free encyclopedia

Nearest neighbor search (NNS), also known as proximity search, similarity search or closest point search, is an optimization problem for finding closest (or most similar) points. Closeness is typically expressed in terms of a dissimilarity function: the less similar the objects, the larger the function values. Formally, the nearest-neighbor (NN) search problem is defined as follows: given a set S of points in a space M and a query point q ∈ M, find the closest point in S to q. Donald Knuth in vol. 3 of The Art of Computer Programming (1973) called it the post-office problem, referring to an application of assigning to a residence the nearest post office. A direct generalization of this problem is a k-NN search, where we need to find the k closest points.

Most commonly M is a metric space and the dissimilarity is expressed as a distance metric, which is symmetric and satisfies the triangle inequality. Even more commonly, M is taken to be the d-dimensional vector space, where dissimilarity is measured using the Euclidean distance, Manhattan distance or another distance metric. However, the dissimilarity function can be arbitrary. One example is the class of asymmetric Bregman divergences, for which the triangle inequality does not hold.[1]

Contents

- 1 Applications
- 2 Methods
  - 2.1 Linear search
  - 2.2 Space partitioning
  - 2.3 Locality sensitive hashing
  - 2.4 Nearest neighbor search in spaces with small intrinsic dimension
  - 2.5 Projected radial search
  - 2.6 Vector approximation files
  - 2.7 Compression/clustering based search
  - 2.8 Greedy walk search in small-world graphs
- 3 Variants
  - 3.1 k-nearest neighbor
  - 3.2 Approximate nearest neighbor
  - 3.3 Nearest neighbor distance ratio
  - 3.4 Fixed-radius near neighbors
  - 3.5 All nearest neighbors
- 4 See also
- 5 Notes
- 6 References
- 7 Further reading
- 8 External links

Applications

The nearest neighbor search problem arises in numerous fields of application, including:
- Pattern recognition – in particular for optical character recognition
- Statistical classification – see k-nearest neighbor algorithm
- Computer vision
- Computational geometry – see Closest pair of points problem
- Databases – e.g. content-based image retrieval
- Coding theory – see maximum likelihood decoding
- Data compression – see MPEG-2 standard
- Robotic sensing[2]
- Recommendation systems, e.g. see Collaborative filtering
- Internet marketing – see contextual advertising and behavioral targeting
- DNA sequencing
- Spell checking – suggesting correct spelling
- Plagiarism detection
- Contact searching algorithms in FEA
- Similarity scores for predicting career paths of professional athletes
- Cluster analysis – assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense, usually based on Euclidean distance
- Chemical similarity
- Sampling-based motion planning

Methods

Various solutions to the NNS problem have been proposed. The quality and usefulness of the algorithms are determined by the time complexity of queries as well as the space complexity of any search data structures that must be maintained. The informal observation usually referred to as the curse of dimensionality states that there is no general-purpose exact solution for NNS in high-dimensional Euclidean space using polynomial preprocessing and polylogarithmic search time.

Linear search

The simplest solution to the NNS problem is to compute the distance from the query point to every other point in the database, keeping track of the "best so far". This algorithm, sometimes referred to as the naive approach, has a running time of O(dN), where N is the cardinality of S and d is the dimensionality of M. There are no search data structures to maintain, so linear search has no space complexity beyond the storage of the database. Naive search can, on average, outperform space-partitioning approaches in higher-dimensional spaces.[3]
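
To make the naive approach concrete, here is a minimal sketch in Python; the function name and the use of NumPy are illustrative assumptions, not part of the article.

```python
# A minimal sketch of naive linear search, assuming points are NumPy arrays
# and dissimilarity is the Euclidean distance.
import numpy as np

def linear_search(S: np.ndarray, q: np.ndarray) -> int:
    """Return the index of the point in S closest to q by scanning all N points."""
    best_idx, best_dist = -1, np.inf
    for i, p in enumerate(S):           # O(N) iterations ...
        d = np.linalg.norm(p - q)       # ... each costing O(d), hence O(dN) total
        if d < best_dist:
            best_idx, best_dist = i, d  # keep track of the "best so far"
    return best_idx

# Example usage with random data:
S = np.random.rand(1000, 3)
q = np.random.rand(3)
print(linear_search(S, q))
```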

Space partitioning

Since the 1970s, the branch-and-bound methodology has been applied to the problem. In the case of Euclidean space this approach is known as spatial index or spatial access methods. Several space-partitioning methods have been developed for solving the NNS problem. Perhaps the simplest is the k-d tree, which iteratively bisects the search space into two regions containing half of the points of the parent region. Queries are performed via traversal of the tree from the root to a leaf by evaluating the query point at each split. Depending on the distance specified in the query, neighboring branches that might contain hits may also need to be evaluated. For constant-dimension query time, the average complexity is O(log N) in the case of randomly distributed points;[4] worst-case complexity analyses have also been performed.[5] Alternatively, the R-tree data structure was designed to support nearest neighbor search in a dynamic context, as it has efficient algorithms for insertions and deletions, such as the R* tree.[6] R-trees can yield nearest neighbors not only for Euclidean distance, but can also be used with other distances.
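
As an illustration of k-d tree querying, the following sketch uses SciPy's cKDTree; the choice of library is an assumption made here for concreteness.

```python
# A sketch of k-d tree construction and querying with SciPy.
import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(10000, 3)   # the set S in 3-dimensional space
tree = cKDTree(points)              # recursively bisects space along coordinate axes

q = np.random.rand(3)               # query point
dist, idx = tree.query(q, k=1)      # nearest neighbor: traverses root-to-leaf,
                                    # then backtracks into branches that may contain hits
print(points[idx], dist)
```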

In the case of a general metric space, the branch-and-bound approach is known under the name of metric trees. Particular examples include the vp-tree and the BK-tree.

Given a set of points taken from a 3-dimensional space and put into a BSP tree, and a query point taken from the same space, a possible solution to the problem of finding the nearest point-cloud point to the query point is given in the following description of an algorithm. (Strictly speaking, no such point may exist, because it may not be unique. But in practice, usually we only care about finding any one of the subset of all point-cloud points that exist at the shortest distance to a given query point.) The idea is, for each branching of the tree, to guess that the closest point in the cloud resides in the half-space containing the query point. This may not be the case, but it is a good heuristic. After recursively solving the problem for the guessed half-space, compare the distance returned by this result with the shortest distance from the query point to the partitioning plane. This latter distance is that between the query point and the closest possible point that could exist in the half-space not searched. If this distance is greater than that returned in the earlier result, then clearly there is no need to search the other half-space. If there is such a need, solve the problem for the other half-space, compare its result to the former result, and return the better of the two. The performance of this algorithm is nearer to logarithmic time than linear time when the query point is near the cloud, because as the distance between the query point and the closest point-cloud point approaches zero, the algorithm needs only to perform a look-up using the query point as a key to get the correct result. A sketch of this recursion is given below.
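
A compact sketch of this recursion, assuming the axis-aligned k-d tree special case of a BSP tree (the class and function names are hypothetical):

```python
# Sketch of the recursive branch-and-bound descent described above,
# for a k-d tree (an axis-aligned special case of a BSP tree).
import numpy as np

class Node:
    def __init__(self, point, axis, left=None, right=None):
        self.point, self.axis = point, axis
        self.left, self.right = left, right

def build(points, depth=0):
    """Bisect the point set along alternating axes, as in a k-d tree."""
    if len(points) == 0:
        return None
    axis = depth % points.shape[1]
    points = points[points[:, axis].argsort()]
    mid = len(points) // 2
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))

def nearest(node, q, best=None):
    if node is None:
        return best
    if best is None or np.linalg.norm(q - node.point) < np.linalg.norm(q - best):
        best = node.point
    # Guess: the nearest neighbor lies in the half-space containing q.
    diff = q[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, q, best)
    # Search the other half-space only if the partitioning plane is closer
    # than the best distance found so far.
    if abs(diff) < np.linalg.norm(q - best):
        best = nearest(far, q, best)
    return best
```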

Locality sensitive hashing

Locality sensitive hashing (LSH) is a technique for grouping points in space into "buckets" based on some distance metric operating on the points. Points that are close to each other under the chosen metric are mapped to the same bucket with high probability.[7]
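
One common construction, sketched below under the assumption of angular similarity, hashes each point by the side of random hyperplanes it falls on; colliding points become bucket-mates.

```python
# A minimal random-hyperplane LSH sketch: nearby vectors (by angle) collide
# in the same bucket with high probability.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
d, n_bits = 64, 16
planes = rng.normal(size=(n_bits, d))   # random hyperplanes define the hash

def lsh_hash(v: np.ndarray) -> int:
    """One bit per hyperplane: which side of the plane does v fall on?"""
    bits = (planes @ v) > 0
    return int(bits.dot(1 << np.arange(n_bits)))

buckets = defaultdict(list)
data = rng.normal(size=(10000, d))
for i, v in enumerate(data):
    buckets[lsh_hash(v)].append(i)      # index all points into buckets

q = rng.normal(size=d)
candidates = buckets[lsh_hash(q)]       # only points in q's bucket are checked
```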

Nearest neighbor search in spaces with small intrinsic dimension

The cover tree has a theoretical bound that is based on the dataset's doubling constant. The bound on search time is O(c^12 log n), where c is the expansion constant of the dataset.

Projected radial search

In the special case where the data is a dense 3D map of geometric points, the projection geometry of the sensing technique can be used to dramatically simplify the search problem. This approach requires that the 3D data be organized by a projection onto a two-dimensional grid, and it assumes that the data is spatially smooth across neighboring grid cells except at object boundaries. These assumptions are valid when dealing with 3D sensor data in applications such as surveying, robotics and stereo vision, but may not hold for unorganized data in general. In practice this technique has an average search time of O(1) or O(K) for the k-nearest neighbor problem when applied to real-world stereo vision data.[2]
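
A sketch of the idea, assuming an organized point cloud stored as an H×W×3 range image and a caller-supplied projection function (all names and the window size are illustrative assumptions):

```python
# Sketch: nearest neighbor in an organized point cloud (H x W x 3 range image).
# The query is projected to a pixel; only a small window of grid cells around
# that pixel is scanned, instead of the whole cloud.
import numpy as np

def projected_nn(cloud, q, project, w=3):
    """cloud: H x W x 3 organized points; project maps a 3D point to (row, col)."""
    H, W, _ = cloud.shape
    r0, c0 = project(q)
    best, best_d = None, np.inf
    # Smoothness assumption: the true neighbor projects near (r0, c0),
    # so only a (2w+1) x (2w+1) window of grid cells is examined: O(1) on average.
    for r in range(max(0, r0 - w), min(H, r0 + w + 1)):
        for c in range(max(0, c0 - w), min(W, c0 + w + 1)):
            d = np.linalg.norm(cloud[r, c] - q)
            if d < best_d:
                best, best_d = cloud[r, c], d
    return best
```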

Vector approximation files

In high-dimensional spaces, tree indexing structures become useless because an increasing percentage of the nodes need to be examined anyway. To speed up linear search, a compressed version of the feature vectors stored in RAM is used to prefilter the datasets in a first run. The final candidates are determined in a second stage using the uncompressed data from disk for the distance calculation.[8]
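
A toy two-stage sketch follows; it uses a simple per-dimension quantizer and an approximate prefilter, which simplifies away the exact lower-bound machinery of a real VA-file.

```python
# Toy VA-file-style sketch: quantize each dimension to a few bits, prefilter
# with cheap approximate distances, then re-rank survivors exactly.
import numpy as np

bits = 4
levels = 2 ** bits

def quantize(X, lo, hi):
    """Uniform per-dimension quantization of X into small integer cells."""
    return ((X - lo) / (hi - lo) * (levels - 1)).round().astype(np.uint8)

rng = np.random.default_rng(0)
X = rng.random((100000, 32))             # "on disk": full-precision vectors
lo, hi = X.min(0), X.max(0)
approx = quantize(X, lo, hi)             # "in RAM": compressed approximations

q = rng.random(32)
qa = quantize(q[None], lo, hi)[0]

# Stage 1: cheap filter on compressed data (cell distance as a proxy).
proxy = np.abs(approx.astype(int) - qa.astype(int)).sum(1)
candidates = np.argsort(proxy)[:100]     # keep the most promising candidates

# Stage 2: exact distances on uncompressed candidates only.
exact = np.linalg.norm(X[candidates] - q, axis=1)
print(candidates[exact.argmin()])
```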

Compression/clustering based search

The VA-file approach is a special case of compression-based search, where each feature component is compressed uniformly and independently. The optimal compression technique in multidimensional spaces is vector quantization (VQ), implemented through clustering. The database is clustered and the most "promising" clusters are retrieved. Huge gains over the VA-file, tree-based indexes and sequential scan have been observed.[9][10] Note also the parallels between clustering and LSH.
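
An illustrative sketch using k-means as the vector quantizer; the scikit-learn API and the probe count are assumptions made for concreteness.

```python
# Sketch of clustering-based search: partition the database with k-means,
# then scan only the few most "promising" clusters for a query.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.random((50000, 16))
km = KMeans(n_clusters=256, n_init=1, random_state=0).fit(X)
labels = km.labels_

def search(q, n_probe=4):
    # Rank clusters by centroid distance; probe only the closest few.
    order = np.linalg.norm(km.cluster_centers_ - q, axis=1).argsort()[:n_probe]
    cand = np.flatnonzero(np.isin(labels, order))
    d = np.linalg.norm(X[cand] - q, axis=1)
    return cand[d.argmin()]

print(search(rng.random(16)))
```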

Greedy walk search in small-world graphs

One possible way to solve NNS is to construct a graph G(V, E), where every point x_i ∈ S is uniquely associated with a vertex v_i ∈ V. The search for the point in the set S closest to the query q takes the form of a search for a vertex in the graph G(V, E). One of the basic vertex search algorithms in graphs with metric objects is the greedy search algorithm. It starts from a random vertex v_i ∈ V. The algorithm computes the distance from the query q to each vertex in the neighborhood {v_j : (v_i, v_j) ∈ E} of the current vertex v_i, and then selects the vertex with the minimal distance value. If the distance between the query and the selected vertex is smaller than that between the query and the current element, then the algorithm moves to the selected vertex, and it becomes the new current vertex. The algorithm stops when it reaches a local minimum: a vertex whose neighborhood does not contain a vertex that is closer to the query than the vertex itself. This idea was exploited in the VoroNet system [11] for the plane, in the RayNet system [12] for Euclidean space, and in the Metrized Small World algorithm [13] for the general metric space. Java and C++ implementations of the algorithm, together with sources, are hosted on GitHub (Non-Metric Space Library (https://github.com/searchivarius/NonMetricSpaceLib), Metrized Small World library (http://aponom84.github.io/MetrizedSmallWorld/)).
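
A direct transcription of the greedy walk described above (the graph representation and names are illustrative assumptions; the graph is assumed connected with nonempty neighborhoods):

```python
# Greedy walk on a proximity graph: repeatedly hop to the neighbor closest
# to the query; stop at a local minimum.
import random
import numpy as np

def greedy_search(points, neighbors, q):
    """points: dict vertex -> vector; neighbors: dict vertex -> list of vertices."""
    current = random.choice(list(points))            # start from a random vertex
    while True:
        best = min(neighbors[current],
                   key=lambda v: np.linalg.norm(points[v] - q))
        if np.linalg.norm(points[best] - q) < np.linalg.norm(points[current] - q):
            current = best                           # move closer to the query
        else:
            return current                           # local minimum reached
```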

Variants

There are numerous variants of the NNS problem; the two best-known are the k-nearest neighbor search and the ε-approximate nearest neighbor search.

k-nearest neighbor

k-nearest neighbor search identifies the top k nearest neighbors to the query. This technique is commonly used in predictive analytics to estimate or classify a point based on the consensus of its neighbors. k-nearest neighbor graphs are graphs in which every point is connected to its k nearest neighbors.
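
A brute-force k-NN fits in a few lines; heapq.nsmallest keeps the k best candidates in a single pass (a sketch, not a tuned implementation):

```python
# Brute-force k-nearest neighbors using a single pass with a bounded heap.
import heapq
import numpy as np

def knn(S, q, k=3):
    """Return the k points of S closest to q (Euclidean distance)."""
    return heapq.nsmallest(k, S, key=lambda p: np.linalg.norm(p - q))

S = [np.array(p) for p in [(0, 0), (1, 1), (2, 2), (5, 5)]]
print(knn(S, np.array((1.2, 0.9)), k=2))
```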

Approximate nearest neighbor

In some applications it may be acceptable to retrieve a "good guess" of the nearest neighbor. In those cases, we can use an algorithm which does not guarantee to return the actual nearest neighbor in every case, in return for improved speed or memory savings. Often such an algorithm will find the nearest neighbor in a majority of cases, but this depends strongly on the dataset being queried.

Algorithms that support approximate nearest neighbor search include locality-sensitive hashing, best bin first and balanced box-decomposition tree based search.[14]
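
As one concrete example, SciPy's k-d tree exposes an ε-approximate mode: with eps > 0, branches are pruned more aggressively, and the returned distance is guaranteed to be within a factor (1 + eps) of the true nearest distance.

```python
# ε-approximate nearest neighbor with SciPy: trade exactness for speed.
import numpy as np
from scipy.spatial import cKDTree

X = np.random.rand(100000, 8)
tree = cKDTree(X)
q = np.random.rand(8)

d_exact, i_exact = tree.query(q)             # exact nearest neighbor
d_approx, i_approx = tree.query(q, eps=0.5)  # within (1 + 0.5) of the true distance
```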

Nearest neighbor distance ratio

Nearest neighbor distance ratio does not apply the threshold to the direct distance from the original point to the challenger neighbor, but to a ratio of it that depends on the distance to the previous neighbor. It is used in CBIR to retrieve pictures through a "query by example" using the similarity between local features. More generally, it is involved in several matching problems.
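
A common form is the ratio test used when matching local image features: accept the nearest neighbor only if it is clearly closer than the second nearest. The 0.8 threshold below is an illustrative value.

```python
# Sketch of a nearest neighbor distance ratio (ratio test) for feature matching.
import numpy as np

def ratio_test_match(descriptors, q, threshold=0.8):
    """Accept the best match only if it beats the runner-up by a clear margin."""
    d = np.linalg.norm(descriptors - q, axis=1)
    i1, i2 = np.argsort(d)[:2]          # nearest and second-nearest neighbors
    if d[i1] < threshold * d[i2]:       # ratio of distances, not absolute distance
        return i1
    return None                         # ambiguous match: reject
```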

Fixed-radius near neighbors

Fixed-radius near neighbors is the problem of efficiently finding all points in Euclidean space within a given fixed distance from a specified point. The distance is assumed to be fixed, but the query point is arbitrary.
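
Because the radius r is known in advance, a uniform grid with cell side r works well: every neighbor within r of q lies in q's cell or an adjacent one. A 2D sketch follows (names are illustrative, and coordinates are assumed non-negative):

```python
# Fixed-radius near neighbors via a uniform grid with cell side r:
# all neighbors within r of q lie in q's grid cell or the 8 adjacent cells.
from collections import defaultdict
import numpy as np

def build_grid(points, r):
    grid = defaultdict(list)
    for p in points:
        grid[(int(p[0] // r), int(p[1] // r))].append(p)
    return grid

def near(grid, q, r):
    cx, cy = int(q[0] // r), int(q[1] // r)
    hits = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for p in grid[(cx + dx, cy + dy)]:
                if np.linalg.norm(p - q) <= r:
                    hits.append(p)
    return hits

pts = [np.random.rand(2) for _ in range(1000)]
g = build_grid(pts, 0.05)
print(len(near(g, np.array([0.5, 0.5]), 0.05)))
```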

All nearest neighbors

For some applications (e.g. entropy estimation), we may have N data points and wish to know which is the nearest neighbor for every one of those N points. This could of course be achieved by running a nearest-neighbor search once for every point, but an improved strategy would be an algorithm that exploits the information redundancy between these N queries to produce a more efficient search. As a simple example: when we find the distance from point X to point Y, that also tells us the distance from point Y to point X, so the same calculation can be reused in two different queries.
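
The symmetry observation alone halves the work of the brute-force approach, as in this sketch:

```python
# Brute-force all nearest neighbors, computing each pairwise distance once
# and reusing it for both endpoints (d(X, Y) == d(Y, X)).
import numpy as np

def all_nearest(X):
    n = len(X)
    best = np.full(n, -1)
    best_d = np.full(n, np.inf)
    for i in range(n):
        for j in range(i + 1, n):        # each unordered pair examined once
            d = np.linalg.norm(X[i] - X[j])
            if d < best_d[i]:
                best_d[i], best[i] = d, j
            if d < best_d[j]:            # the same distance serves query j
                best_d[j], best[j] = d, i
    return best

X = np.random.rand(200, 3)
print(all_nearest(X)[:10])
```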

Given a fixed dimension, a semi-definite positive norm (thereby including every Lp norm), and n points in this space, the nearest neighbour of every point can be found in O(n log n) time and the m nearest neighbours of every point can be found in O(mn log n) time.[15][16]

See also

- Ball tree
- Closest pair of points problem
- Cluster analysis
- Content-based image retrieval
- Curse of dimensionality
- Digital signal processing
- Dimension reduction
- Fixed-radius near neighbors
- Fourier analysis
- Instance-based learning
- k-nearest neighbor algorithm
- Linear least squares
- Locality sensitive hashing
- MinHash
- Multidimensional analysis
- Nearest-neighbor interpolation
- Neighbor joining
- Principal component analysis
- Range search
- Set cover problem
- Singular value decomposition
- Sparse distributed memory
- Statistical distance
- Time series
- Voronoi diagram
- Wavelet

Notes

1. Cayton, Lawrence (2008). "Fast nearest neighbor retrieval for Bregman divergences". Proceedings of the 25th International Conference on Machine Learning: 112–119.
2. Bewley, A.; Upcroft, B. (2013). "Advantages of Exploiting Projection Structure for Segmenting Dense 3D Point Clouds". Australian Conference on Robotics and Automation. (http://www.araa.asn.au/acra/acra2013/papers/pap148s1-file1.pdf)
3. Weber, R.; Schek, H.-J.; Blott, S. "A quantitative analysis and performance study for similarity search methods in high dimensional spaces" (http://www.vldb.org/conf/1998/p194.pdf) (PDF).
4. Moore, Andrew. "An introductory tutorial on KD trees" (http://www.autonlab.com/autonweb/14665/version/2/part/5/data/moore-tutorial.pdf?branch=main&language=en) (PDF).
5. Lee, D. T.; Wong, C. K. (1977). "Worst-case analysis for region and partial region searches in multidimensional binary search trees and balanced quad trees". Acta Informatica 9 (1): 23–29. doi:10.1007/BF00263763.
6. Roussopoulos, N.; Kelley, S.; Vincent, F. D. R. (1995). "Nearest neighbor queries". Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data – SIGMOD '95. p. 71. doi:10.1145/223784.223794. ISBN 0897917316.
7. Rajaraman, A.; Ullman, J. (2010). "Mining of Massive Datasets, Ch. 3" (http://infolab.stanford.edu/~ullman/mmds.html).
8. Weber, R.; Blott, S. "An Approximation-Based Data Structure for Similarity Search".
9. Ramaswamy, S.; Rose, K. (2007). "Adaptive cluster-distance bounding for similarity search in image databases". ICIP.
10. Ramaswamy, S.; Rose, K. (2010). "Adaptive cluster-distance bounding for high-dimensional indexing". TKDE.
11. Beaumont, Olivier; Kermarrec, Anne-Marie; Marchal, Loris; Rivière, Etienne (2006). "VoroNet: A scalable object network based on Voronoi tessellations". INRIA. RR-5833 (1): 23–29.
12. Beaumont, Olivier; Kermarrec, Anne-Marie; Rivière, Etienne (2007). "Peer to Peer Multidimensional Overlays: Approximating Complex Structures". Principles of Distributed Systems 4878: 315–328. doi:10.1007/978-3-540-77096-1_23. ISBN 978-3-540-77095-4.
13. Malkov, Yury; Ponomarenko, Alexander; Krylov, Vladimir; Logvinov, Andrey (2014). "Approximate nearest neighbor algorithm based on navigable small world graphs". Information Systems 45: 61–68. doi:10.1016/j.is.2013.10.006.
14. Arya, S.; Mount, D. M.; Netanyahu, N. S.; Silverman, R.; Wu, A. (1998). "An optimal algorithm for approximate nearest neighbor searching". Journal of the ACM 45 (6): 891–923. (http://www.cse.ust.hk/faculty/arya/pub/JACM.pdf)
15. Clarkson, Kenneth L. (1983). "Fast algorithms for the all nearest neighbors problem". 24th IEEE Symp. Foundations of Computer Science (FOCS '83), pp. 226–232. doi:10.1109/SFCS.1983.16. ISBN 0-8186-0508-1.
16. Vaidya, P. M. (1989). "An O(n log n) Algorithm for the All-Nearest-Neighbors Problem" (http://www.springerlink.com/content/p4mk2608787r7281/). Discrete and Computational Geometry 4 (1): 101–115. doi:10.1007/BF02187718.

References

- Andrews, L. "A template for the nearest neighbor problem". C/C++ Users Journal, vol. 19, no. 11 (November 2001), pp. 40–49. ISSN 1075-2838. (http://www.ddj.com/architect/184401449)
- Arya, S.; Mount, D. M.; Netanyahu, N. S.; Silverman, R.; Wu, A. Y. "An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions". Journal of the ACM, vol. 45, no. 6, pp. 891–923.
- Beyer, K.; Goldstein, J.; Ramakrishnan, R.; Shaft, U. (1999). "When is nearest neighbor meaningful?". Proceedings of the 7th ICDT, Jerusalem, Israel.
- Chen, Chung-Min; Ling, Yibei. "A Sampling-Based Estimator for Top-k Query". ICDE 2002: 617–627.
- Samet, H. (2006). Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann. ISBN 0-12-369446-9.
- Zezula, P.; Amato, G.; Dohnal, V.; Batko, M. (2006). Similarity Search – The Metric Space Approach. Springer. ISBN 0-387-29146-6.

Further reading

- Shasha, Dennis (2004). High Performance Discovery in Time Series. Berlin: Springer. ISBN 0-387-00857-8.

External links

- Nearest Neighbors and Similarity Search (http://simsearch.yury.name/tutorial.html) – a website dedicated to educational materials, software, literature, researchers, open problems and events related to NN searching. Maintained by Yury Lifshits.
- Wikimedia Commons has media related to Nearest neighbours search.
- Similarity Search Wiki (http://sswiki.tierra-aoi.net) – a collection of links, people, ideas, keywords, papers, slides, code and data sets on nearest neighbours
- dD Spatial Searching (http://www.cgal.org/Pkg/SpatialSearchingD) in CGAL – the Computational Geometry Algorithms Library
- Non-Metric Space Library (https://github.com/searchivarius/NonMetricSpaceLib) – an open source similarity search library containing realisations of various nearest neighbor search methods

Retrieved from "https://en.wikipedia.org/w/index.php?title=Nearest_neighbor_search&oldid=679970516"

Categories: Approximation algorithms | Classification algorithms | Data mining | Discrete geometry | Geometric algorithms | Machine learning | Numerical analysis | Mathematical optimization | Search algorithms

- This page was last modified on 7 September 2015, at 22:59.
- Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.
