US008160400B2

(12) United States Patent
Snavely et al.

(10) Patent No.: US 8,160,400 B2
(45) Date of Patent: Apr. 17, 2012
(54) NAVIGATING IMAGES USING IMAGE BASED GEOMETRIC ALIGNMENT AND OBJECT BASED CONTROLS

(75) Inventors: Keith Noah Snavely, Seattle, WA (US); Steven Maxwell Seitz, Seattle, WA (US); Richard Szeliski, Bellevue, WA (US)

(73) Assignee: Microsoft Corporation, Redmond, WA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1245 days.

(21) Appl. No.: 11/493,436

(22) Filed: Jul. 25, 2006

(65) Prior Publication Data
     US 2007/0110338 A1    May 17, 2007
Related U.S. Application Data

(60) Provisional application No. 60/737,908, filed on Nov. 17, 2005.
(51) Int. Cl.
     G06K 9/54    (2006.01)
     G06K 9/46    (2006.01)
(52) U.S. Cl. ........ 382/305; 382/100; 382/154; 382/190; 382/201; 382/206; 707/E17.029
(58) Field of Classification Search .......... 382/100, 382/154, 190, 201, 206, 214, 216, 305, 325; 707/E17.029
     See application file for complete search history.
(56) References Cited

U.S. PATENT DOCUMENTS

7,263,230 B2 *    8/2007  Tashman ................. 382/232
7,353,114 B1 *    4/2008  Rohlf et al. .............. 702/5
7,693,702 B1 *    4/2010  Kerner et al. ............. 703/22
2002/0113872 A1 * 8/2002  Kinjo ................... 348/116
2008/0150890 A1 * 6/2008  Bell et al. ............. 345/156
2008/0150913 A1 * 6/2008  Bell et al. ............. 345/175
OTHER PUBLICATIONS

TED Talk "Blaise Aguera y Arcas demos Photosynth," filmed Mar. 2007 at the TED conference in Monterey, California, available at: http://www.ted.com/talks/lang/eng/blaise_aguera_y_arcas_demos_photosynth.html.*
Arya, S. et al., "An optimal algorithm for approximate nearest neighbor searching fixed dimensions," Journal of the ACM, 1998, 45(6), 891-923.
Brown, M. et al., "Unsupervised 3D object recognition and reconstruction in unordered datasets," International Conference on 3D Imaging and Modeling, Ontario, Canada, Jun. 13-16, 2005, 56-63.
Canny, J., "A computational approach to edge detection," IEEE Trans. Pattern Anal. Mach. Intell., 1986, 8(6), 679-698.
Primary Examiner — Stephen Koziol
(74) Attorney, Agent, or Firm — Woodcock Washburn LLP
(57) ABSTRACT

Over the past few years there has been a dramatic proliferation of digital cameras, and it has become increasingly easy to share large numbers of photographs with many other people. These trends have contributed to the availability of large databases of photographs. Effectively organizing, browsing, and visualizing such seas of images, as well as finding a particular image, can be difficult tasks. In this paper, we demonstrate that knowledge of where images were taken and where they were pointed makes it possible to visualize large sets of photographs in powerful, intuitive new ways. We present and evaluate a set of novel tools that use location and orientation information, derived semi-automatically using structure from motion, to enhance the experience of exploring such large collections of images.

9 Claims, 10 Drawing Sheets
[Representative drawing: Map 501; Information and Search Tools 502; Selectable Digital Photographs 503.]
Debevec, P. E. et al., "Modeling and rendering architecture from photographs: a hybrid geometry- and image-based approach," SIGGRAPH '96: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, ACM Press, New York, NY, USA, 1996, 11-20.
Yahoo, Inc., "Popular Tags on Flickr Photo Sharing," Flickr, http://www.flickr.com/photos/tags, 2006, 2 pages.
Hartley, R. I. et al., Multiple View Geometry in Computer Vision, second ed., Cambridge University Press, 2004.
Johansson, B. et al., "A system for automatic pose-estimation from a single image in a city scene," IASTED Int. Conf. Signal Processing, Pattern Recognition and Applications, Crete, Greece, Jun. 25-28, 2002, 68-73.
Lourakis, M. I. et al., "The design and implementation of a generic sparse bundle adjustment software package based on the Levenberg-Marquardt algorithm," Tech. Rep. 340, Institute of Computer Science—FORTH, Heraklion, Crete, Greece, Aug. 2004.
Mikolajczyk, K. et al., "A performance evaluation of local descriptors," IEEE Transactions on Pattern Analysis & Machine Intelligence, 2005, 27(10), 1615-1630.
Rubner, Y. et al., "A metric for distributions with applications to image databases," Int'l Conf. on Computer Vision (ICCV), 1998, 59-66.
Schaffalitzky, F. et al., "Multi-view matching for unordered image sets, or 'How do I organize my holiday snaps?'" Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, May 28-31, 2002, 1, 414-431.
Sutherland, I. E., "Sketchpad: a man-machine graphical communication system," Proceedings Spring Joint Computer Conference, 1963, 329-346.
Szeliski, R., "Image alignment and stitching: A tutorial," Tech. Rep. MSR-TR-2004-92, Microsoft Research, 2004, 1-57.
Werner, T. et al., "New techniques for automated architecture reconstruction from photographs," Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, May 28-31, 2002, 2, 541-555.
Microsoft Co., "What can you do with a gazillion photos on a single database indexed by their locations?" World-Wide Media eXchange, WWMX, http://www.wwmx.org, Apr. 7, 2005, downloaded Sep. 27, 2006, 2 pages.
Yeh, T. et al., "Searching the web with mobile images for location recognition," CVPR (2), 2004, 76-81.

* cited by examiner
[FIG. 1 (Sheet 1 of 10): Server 100; Database 101; Image Processing 102; Computer 110; Display 120; Image Processing 112; Database 111; Selection Device 130.]
[FIG. 2 (Sheet 2 of 10): flowchart; legible label: "registration to map."]
[FIG. 3 (Sheet 3 of 10): Frusta Showing Camera Positions and Orientations; Map View of 3D Geometry.]
[FIGS. 4 and 5 (Sheet 4 of 10): Map 401; Frustum 401; Information and Search Tools 502; Selectable Digital Photographs 503.]
[FIGS. 6 and 7 (Sheet 5 of 10): translucent projection of a digital photograph 601; Selectable Digital Photograph 703.]
[FIG. 8 (Sheet 6 of 10): sample triangulation superimposed on an image.]
[FIG. 9 (Sheet 7 of 10): Search Tools 900; First Row 91.]
[FIGS. 10 and 11 (Sheet 8 of 10): Search Tools 1001; Object 1000; Main Location 1002; Photograph Information 1010; Search Object 1011; Zoom In, Zoom Out, Full Size 1012; Step Left, Step Right, Step Back 1013; Search Tools 501; Main Location 1100; Sequentially Ranked Alternate Images 1102.]
[FIG. 12 (Sheet 9 of 10): Photo 1200; Portion of Photo 1201; Tags 1202-1204; Attributes 1205-1208; Annotations 1210-1211.]
[FIG. 13 (Sheet 10 of 10): Photos 1301-1304 taken at alternate times t1-t4; Annotated Portion 1310; Transfer Location 1320.]
NAVIGATING IMAGES USING IMAGE BASED GEOMETRIC ALIGNMENT AND OBJECT BASED CONTROLS

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application 60/737,908, filed Nov. 17, 2005.

GOVERNMENT RIGHTS

This invention was funded in part with grants (No. IIS-0413198 and DGE-0203031) by the National Science Foundation. The University of Washington has granted a royalty-free non-exclusive license to the U.S. government pursuant to 35 U.S.C. Section 202(c)(4) for any patent claiming an invention subject to 35 U.S.C. Section 201.
BACKGROUND

Digital cameras have become commonplace, and advances in technology have made it easy for a single person to take thousands of photographs and store all of them on a hard drive. At the same time, it has become much easier to share photographs with others, whether by posting them on a personal web site, or making them available to a community of enthusiasts using a photo-sharing service. As a result, anyone can have access to millions of photographs through the Internet. Sorting through and browsing such huge numbers of photographs, however, is a challenge. At the same time, large collections of photographs, whether belonging to a single person, or contributed by thousands of people, create exciting opportunities for enhancing the browsing experience by gathering information across multiple photographs. Some photo-sharing services, such as FLICKR®, available at www.flickr.com, allow users to tag photos with keywords, and provide a text search interface for finding photos. However, tags alone often lack the level of specificity required for fine-grained searches, and can rarely be used to organize the results of a search effectively. For example, searching for "Notre Dame" in FLICKR® results in a list of thousands of photographs, sorted either by date or by other users' interest in each photo. Within this list, photographs of both the inside and the outside of Notre Dame cathedral in Paris are interspersed with photographs taken in and around the University of Notre Dame. Finding a photograph showing a particular object, for instance, the door of the cathedral, amounts to inspecting each image in the list. Searching for both "Notre Dame" and "door" limits the number of images to a manageable number, but almost certainly excludes relevant images whose owners simply omitted the tag "door."

The computer vision community has conducted work on recovering camera parameters and scene geometry from sets of images. The work of Brown and Lowe [2005] and of Schaffalitzky and Zisserman [2002] involves application of automatic structure from motion to unordered data sets. A more specific line of research focuses on reconstructing architecture from multiple photographs, using semi-automatic or fully automatic methods. The semi-automatic Facade system of Debevec et al. [1996] has been used to create compelling fly-throughs of architectural scenes from photographs. Werner and Zisserman [2002] developed an automatic system for reconstructing architecture, but it was only demonstrated on small sets of photographs.

Techniques have been developed for visualizing or searching through large sets of images based on a measure of image similarity (histogram distances such as the Earth Mover's Distance [Rubner et al. 1998] are often used). A similarity score gives a basis for performing tasks such as creating spatial layouts of sets of images or finding images that are similar to a given image, but often the score is computed in a way that is agnostic to the objects in the scene (for instance, the score might just compare the distributions of colors in two objects). Therefore, these methods are most suitable for organizing images of classes of objects, such as mountains or sunsets.

Finally, several tools have been developed for organizing large sets of images contributed by a community of photographers. For example, the World-Wide Media eXchange (WWMX) is one such tool. WWMX allows users to contribute photographs and provide geo-location information by using a GPS receiver or dragging and dropping photos onto a map. However, the location information may not be extremely accurate, and the browsing interface of WWMX is limited to an overhead map view. Other photo-sharing tools, such as FLICKR®, do not explicitly use location information to organize users' photographs, although FLICKR® supports tools such as "Mappr" for annotating photos with location, and it is possible to link images in FLICKR® to external mapping tools such as GOOGLE® Earth.

Finally, the following references are relevant to the description of the invention.

ARYA, S., MOUNT, D. M., NETANYAHU, N. S., SILVERMAN, R., AND WU, A. Y. 1998. An optimal algorithm for approximate nearest neighbor searching fixed dimensions. Journal of the ACM 45, 6, 891-923.
BROWN, M., AND LOWE, D. G. 2005. Unsupervised 3D object recognition and reconstruction in unordered datasets. In International Conference on 3D Imaging and Modeling.
CANNY, J. 1986. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8, 6, 679-698.
DEBEVEC, P. E., TAYLOR, C. J., AND MALIK, J. 1996. Modeling and rendering architecture from photographs: a hybrid geometry- and image-based approach. In SIGGRAPH '96: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, ACM Press, New York, NY, USA, 11-20.
Flickr. http://www.flickr.com.
HARTLEY, R. I., AND ZISSERMAN, A. 2004. Multiple View Geometry in Computer Vision, second ed. Cambridge University Press, ISBN: 0521540518.
JOHANSSON, B., AND CIPOLLA, R. 2002. A system for automatic pose-estimation from a single image in a city scene. In IASTED Int. Conf. Signal Processing, Pattern Recognition and Applications.
LOURAKIS, M. I., AND ARGYROS, A. A. 2004. The design and implementation of a generic sparse bundle adjustment software package based on the Levenberg-Marquardt algorithm. Tech. Rep. 340, Institute of Computer Science—FORTH, Heraklion, Crete, Greece, August.
MIKOLAJCZYK, K., AND SCHMID, C. 2005. A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis & Machine Intelligence 27, 10, 1615-1630.
RUBNER, Y., TOMASI, C., AND GUIBAS, L. J. 1998. A metric for distributions with applications to image databases. In Int'l Conf. on Computer Vision (ICCV), 59-66.
SCHAFFALITZKY, F., AND ZISSERMAN, A. 2002. Multi-view matching for unordered image sets, or "How do I organize my holiday snaps?" In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, vol. 1, 414-431.
SUTHERLAND, I. E. 1964. Sketchpad: a man-machine graphical communication system. In DAC '64: Proceedings
of the SHARE design automation workshop, ACM Press, New York, NY, USA, 6.329-6.346.
SZELISKI, R. 2005. Image alignment and stitching: A tutorial. Tech. Rep. MSR-TR-2004-92, Microsoft Research.
WERNER, T., AND ZISSERMAN, A. 2002. New techniques for automated architecture reconstruction from photographs. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, vol. 2, 541-555.
WWMX. World-Wide Media eXchange. http://www.wwmx.org.
YEH, T., TOLLMAR, K., AND DARRELL, T. 2004. Searching the web with mobile images for location recognition. In CVPR (2), 76-81.
SUMMARY
Many collections of photos can be organized, browsed, and visualized more effectively using more fine-grained knowledge of location and orientation. As a simple example, if in addition to knowing simply that a photograph was taken at a place called "Notre Dame" we know the latitude and longitude at which the photographer was standing, along with the precise direction he was facing, then an image of the door to Notre Dame cathedral can be found more easily by displaying search hits on a map interface, and searching only among the images that appear in front of the cathedral door.

As well as improving existing search tools, knowing where a photo was taken makes many other browsing modes possible. For instance, relating images by proximity makes it possible to find images that were taken nearby, or to the left of or north of a selected image, or to find images that contain a close-up of a part of another image. With knowledge of location and orientation, it is easier to generate morphs between similar photographs, which can make the relationship between different images more explicit, and a browsing experience more compelling. Location and orientation information can be combined with other metadata, such as date, time, photographer, and knowledge of correspondence between images, to create other interesting visualizations, such as an animation of a building through time. With additional knowledge of the geometry of the scene, location information also allows tags associated with parts of one photograph to be transferred to other similar photographs. This ability can improve text searches, and the access to additional information for each photo can further enhance the browsing experience.

These browsing tools can be applied to a single user's photo collection, a collection of photos taken for a special purpose (such as creating a virtual tour of a museum), or a database containing photos taken by many different people.

We also describe herein new tools and interfaces for visualizing and exploring sets of images based on knowledge of three-dimensional (3D) location and orientation information, and image correspondence. We present semi-automatic techniques for determining the relative and absolute locations and orientations of the photos in a large collection. We present an interactive image exploration system. These and other aspects and embodiments of the invention are described in detail below.

BRIEF DESCRIPTION OF THE DRAWINGS

The systems and methods for navigating images using image based geometric alignment and object based controls in accordance with the present invention are further described with reference to the accompanying drawings in which:

FIG. 1 illustrates a general operating environment for the invention.

FIG. 2 illustrates an exemplary method for determining relative and absolute location information for a plurality of digital photographs.

FIG. 3 illustrates an exemplary overhead map interface that may be used, in one embodiment, for registering new photographs in a photo set, and in another embodiment, for browsing photos by selecting from the map a camera location that is desired for viewing.

FIG. 4 illustrates a plurality of user interface features in an exemplary "free-flight" browsing mode, in which a user can move a virtual camera in a representation of a 3D geometry and select desired camera positions for viewing a corresponding digital photo.

FIG. 5 illustrates a plurality of user interface features in an exemplary "image-based" browsing mode, in which a user may see a first photograph in a main location in the interface and also have access to a plurality of selectable alternate images that may have image content related to the first photograph.

FIG. 6 illustrates another exemplary embodiment of a "free-flight" browsing mode such as presented in FIG. 4.

FIG. 7 illustrates another exemplary embodiment of an "image-based" browsing mode such as presented in FIG. 5.

FIG. 8 illustrates a sample triangulation of a set of sparse 3D points and line segments, used for morphing. The triangulation is superimposed on the image that observed the 3D features.

FIG. 9 illustrates an exemplary information and search pane comprising a plurality of search tools that may be incorporated into embodiments of the invention.

FIG. 10 illustrates a plurality of user interface features in an exemplary "object-based" browsing mode, in which a user can select an object and find other images also containing the object, and moreover may sort images by which have "best" views of the selected object.

FIG. 11 illustrates a plurality of user interface features in an exemplary "object-based" browsing mode, in which a user selected an object in FIG. 10, and was presented with a best view of the object in FIG. 11 along with a plurality of other views of the object in 1102, which may be ordered according to which have best views of object 1000.

FIG. 12 illustrates an exemplary digital photograph 1200 which may be presented in various user interfaces presented herein, and metadata relating to image attributes, tags, and annotations to portions of the photograph which may also be presented along with the photograph 1200.

FIG. 13 illustrates images from a Notre Dame data set showing the cathedral from approximately the same viewpoint, but at different times. The various images 1301-1304 may be presented in a stabilized slide show. The annotation of the rose window 1310 has been transferred from image 1301 to the other three images 1302-1304.

DETAILED DESCRIPTION

Certain specific details are set forth in the following description and figures to provide a thorough understanding of various embodiments of the invention. Certain well-known details often associated with computing and software technology are not set forth in the following disclosure, however, to avoid unnecessarily obscuring the various embodiments of the invention. Further, those of ordinary skill in the relevant art will understand that they can practice other embodiments of the invention without one or more of the details described below. Finally, while various methods are described with reference to steps and sequences in the following disclosure, the description as such is for providing a clear implementation of embodiments of the invention, and the steps and sequences of steps should not be taken as required to practice this invention.
The systems and methods for navigating images using image based geometric alignment and object based controls described herein have been applied to a variety of data sets comprising images from various locations. Thus, the various techniques and figures discussed may make occasional reference to the tested data sets. Tested data sets include, for example, a set of photographs of the Old Town Square in Prague, Czech Republic, a set of photographs taken along the Great Wall of China, a set of photos resulting from an internet search for "notredame AND paris," a set of photos resulting from an internet search for "halfdome AND yosemite," a set of photos resulting from an internet search for "trevi AND rome," and a set of photos resulting from an internet search for "trafalgarsquare."

General Operating Environment
FIG. 1 presents a general operating environment for aspects of the invention. In general, computer hardware and software such as that depicted in FIG. 1 may be arranged in any configuration and using the full extent of presently available or later developed computing and networking technologies. In one configuration, a server 100 may be connected to a network 105 such as the internet. The server 100 may receive and respond to requests from client computers such as 110 that are also coupled to the network 105. Server 100 may be equipped with or otherwise coupled to a database or data store 101 containing images such as digital photographs, as well as metadata or other useful information that can be used to categorize and process the images. Server 100 may also be equipped with or otherwise coupled to image processing logic 102 for carrying out various processing tasks as discussed herein.

Thus, in one arrangement, a client 110 may request data from a server 100 via network 105. The request may be in the form of a browser request for a web page, or by other means as will be appreciated by those of skill in the art. The server 100 may provide the requested information, which can be used by the client 110 to present a user interface on display 120. A user can interact with the user interface by activating selectable objects, areas, icons, tools and the like using a selection device 130 such as a mouse, trackball or touchpad. In connection with providing such information, the server 100 may access database 101 for appropriate images and may apply image processing logic 102 as necessary. Certain image processing logic in 102 may also be applied before and after the client request to properly prepare for and, if necessary, recover from satisfaction of the client 110 request. In connection with displaying the requested information, client 110 may, in some embodiments, access its own database 111 and image processing 112, for example when the client 110 and server each contain information to be presented in a particular user interface on electronic display 120. In other embodiments, the client 110 may simply rely on the server 100 to provide substantially all of the image processing functions associated with carrying out the invention.

In another arrangement, the client 110 may implement the systems and methods of the invention without relying on server 100 or network 105. For example, client 110 may contain images in database 111, and may apply image processing logic 112 to the images to produce a user interface that can be presented to a user via display 120. Thus, while the invention can be performed over a network using client/server or distributed architectures as are known in the art, it is not limited to such configurations and may also be implemented on a stand-alone computing device.
The description and figures presented herein can be understood as generally directed to hardware and software aspects of carrying out image processing logic such as 102 and 112 that produces at least in part a user interface that may be presented on an electronic display 120. Many of the remaining figures, as will be appreciated, are directed to exemplary aspects of a user interface that may be presented on a display 120. Aspects of the invention comprise novel features of such user interfaces, as will be appreciated, and optionally also supporting logic 112, 102 that produces such aspects of user interfaces or that processes images such that they may be presented in a user interface as disclosed herein.
Determining Geo-Location

In order to effectively use our browsing tools on a particular set of images, we need fairly accurate information about the location and orientation of the camera used to take each photograph in the set. In addition to these extrinsic parameters, it is useful to know the intrinsic parameters, such as the focal length, of each camera. How can this information be derived? GPS is one way of determining position, and while it is not yet common for people to carry around GPS units, nor do all current GPS units have the accuracy we desire, a first solution is to equip digital cameras with GPS units so that location and orientation information can be gathered when a photograph is taken. As for the intrinsic parameters, many digital camera models embed the focal length with which a photo was taken (as well as other information, such as exposure, date, and time) in the Exchangeable Image File Format (EXIF) tags of the image files. EXIF is the present standard for image metadata, but any image metadata may also be used. However, EXIF and/or other metadata values are not always accurate.
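As a concrete illustration of reading such metadata, the following sketch extracts the focal length from a photo's EXIF tags using the Pillow library. This is an illustrative example of ours, not code from the patent; the hexadecimal tag IDs are the standard EXIF constants for the Exif sub-IFD (0x8769) and FocalLength (0x920A).

    from PIL import Image

    def read_focal_length_mm(path):
        """Return the focal length (mm) recorded in a photo's EXIF
        tags, or None if the tag is absent."""
        exif = Image.open(path).getexif()
        # FocalLength (0x920A) lives in the Exif sub-IFD (0x8769),
        # not in the base IFD returned by getexif().
        value = exif.get_ifd(0x8769).get(0x920A)
        return float(value) if value is not None else None

Consistent with the caveat above, a reconstruction system would treat such a value only as an initial estimate of the camera's intrinsics, to be refined later during bundle adjustment.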
A second solution does not rely on the camera to provide accurate location information; instead, we can derive location using computer vision techniques. Brown and Lowe [2005] provides useful background for this discussion. We first detect feature points in each image, then match feature points between pairs of images, keeping only geometrically consistent matches, and run an iterative, robust structure from motion procedure to recover the intrinsic and extrinsic camera parameters. Because structure from motion only estimates the relative position of each camera, and we are also interested in absolute coordinates (e.g., latitude and longitude), we use a novel interactive technique to register the recovered cameras to an overhead map. A flowchart of the overall process is shown in FIG. 2.
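Before walking through the individual steps, the following outline sketches how the stages of FIG. 2 fit together. It is only a sketch under assumed interfaces: the helper names (detect_keypoints, match_pair, filter_with_fundamental_matrix, register_to_map) and the MIN_MATCHES threshold are hypothetical placeholders for the components described in the sections below, not APIs defined by the patent; build_tracks and incremental_sfm are sketched after those sections.

    import itertools

    MIN_MATCHES = 20  # assumed threshold; the text only says "large enough"

    def estimate_locations(images, overhead_map):
        # Step 201: detect keypoints in every input image.
        keypoints = {im: detect_keypoints(im) for im in images}

        # Steps 202-203: match keypoints between image pairs, then keep
        # only matches consistent with an estimated epipolar geometry.
        matches = {}
        for a, b in itertools.combinations(images, 2):
            m = match_pair(keypoints[a], keypoints[b])
            if len(m) >= MIN_MATCHES:
                matches[(a, b)] = filter_with_fundamental_matrix(m)

        # Organize surviving matches into tracks of mutually matching
        # keypoints, then run structure from motion (step 204).
        tracks = build_tracks(matches)
        reconstruction = incremental_sfm(images, matches, tracks)

        # Step 211: interactive registration to an overhead map turns
        # relative coordinates into absolute ones (e.g., lat/long).
        return register_to_map(reconstruction, overhead_map)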
As can be observed in FIG. 2, a set of input images 200 can be processed through a variety of steps as may be carried out by one or more computer software and hardware components, to ultimately produce information regarding the absolute location of the input images (photographs), and the 3D points within such images 212. Exemplary steps can include keypoint detection 201, keypoint matching 202, estimating epipolar geometry and removing outliers 203, applying a structure from motion procedure 204 that produces an output comprising the relative locations of photographs and 3D points 210, and map registration 211.

The exemplary structure from motion procedure 204 may comprise choosing a pair of images I1 and I2 with a large number of matches and a wide baseline 205, running bundle adjustment 206, choosing a remaining image I with the most matches to existing points in the scene and adding image I to the optimization 207, again running bundle adjustment as necessary 208, and adding well-conditioned points to the optimization 209. Additional images can be processed as necessary by returning to step 206. After all images are processed, output 210 can be used in map registration 211 as described above. Various exemplary aspects of a system such as that of FIG. 2 are discussed in greater detail in the below sections, entitled "keypoint detection and matching," "structure from motion," "interactive registration to overhead map," "registering new photographs," and "line segment reconstruction."
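A minimal sketch of that incremental loop follows, with comments keyed to the reference numerals above. The helpers (initial_pair, initialize_reconstruction, bundle_adjust, points_seen, add_camera, add_well_conditioned_points) are hypothetical placeholders for the operations the text names, not functions the patent defines.

    def incremental_sfm(images, matches, tracks):
        # Step 205: seed with a pair of images having a large number of
        # matches and a wide baseline.
        i1, i2 = initial_pair(matches)
        rec = initialize_reconstruction(i1, i2, tracks)
        bundle_adjust(rec)  # step 206

        remaining = set(images) - {i1, i2}
        while remaining:
            # Step 207: add the remaining image with the most matches to
            # points already in the scene.
            nxt = max(remaining, key=lambda im: points_seen(im, rec))
            add_camera(rec, nxt)
            bundle_adjust(rec)                        # step 208
            add_well_conditioned_points(rec, tracks)  # step 209
            remaining.discard(nxt)

        # Output 210: relative camera poses and 3D points, ready for
        # map registration 211.
        return rec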
Keypoint Detection and Matching

Detecting feature points in a plurality of images and matching feature points between two or more of said plurality of images may comprise the following procedures for estimating image location. The first step is to use a keypoint detector, such as any of the various keypoint detectors described in Mikolajczyk and Schmid [2005]. A keypoint detector detects keypoints for each image. We then match keypoint descriptors between each pair of images. This can be done, for example, using the approximate nearest neighbors technique of Arya et al. [1998]. Any other acceleration technique could also be used, including but not limited to hashing or context-sensitive hashing. For each image pair with a large enough number of matches, we estimate a fundamental matrix using, for example, Random Sample Consensus (RANSAC), or any other robust estimation technique, and remove the matches that are outliers to the recovered fundamental matrix. After finding a set of putative, geometrically consistent matches, we organize the matches into a set of tracks, where a track is simply a set of mutually matching keypoints; each track ideally contains projections of the same 3D point.
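As one concrete way to realize this stage, the sketch below uses OpenCV's SIFT detector, a brute-force matcher with Lowe's ratio test (a simple stand-in for the approximate-nearest-neighbors search of Arya et al. that the text suggests), and RANSAC-based fundamental-matrix estimation to discard outliers. The particular detector, thresholds, and matcher are our assumptions; the text deliberately leaves them open.

    import cv2
    import numpy as np

    def geometric_matches(img1, img2, ratio=0.7, min_matches=20):
        """Match keypoints between two images, keeping only matches
        consistent with a robustly estimated fundamental matrix."""
        sift = cv2.SIFT_create()
        kp1, des1 = sift.detectAndCompute(img1, None)
        kp2, des2 = sift.detectAndCompute(img2, None)

        # Brute-force 2-NN matching with Lowe's ratio test; an ANN
        # structure would replace this for large collections.
        raw = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
        good = [m for m, n in raw if m.distance < ratio * n.distance]
        if len(good) < min_matches:
            return []

        pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
        pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

        # Estimate the fundamental matrix with RANSAC; the inlier mask
        # marks matches consistent with the epipolar geometry.
        F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3.0, 0.99)
        if F is None:
            return []
        return [m for m, keep in zip(good, mask.ravel()) if keep]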
If the keypoints in every image form the vertex set of a graph, and there is an edge in the graph between each pair of matching keypoints, then every connected component of this graph comprises a track. However, the tracks associated with some connected components might be inconsistent; in particular, a track is inconsistent if it contains more than one keypoint for the same image. We keep only the consistent tracks containing at least two keypoints for the next phase of the location estimation procedure. Note that this simple rejection of nominally inconsistent tracks will not reject all physically inconsistent tracks (i.e., tracks that contain keypoints that are projections of different 3D points).
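The following sketch implements this track construction with a union-find structure: pairwise matches define the graph edges, connected components become candidate tracks, and components that reuse an image are discarded. The match representation (an (image id, keypoint id) pair per endpoint) is our assumption, since the patent does not fix one.

    from collections import defaultdict

    def build_tracks(matches):
        """Group pairwise matches into consistent tracks. `matches` maps
        image pairs to lists of ((img, kp), (img, kp)) endpoint pairs."""
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        # Union the two endpoints of every match (the graph edges).
        for pair_matches in matches.values():
            for a, b in pair_matches:
                parent[find(a)] = find(b)

        # Collect the connected components of the match graph.
        components = defaultdict(list)
        for node in parent:
            components[find(node)].append(node)

        # Keep only consistent tracks: no image contributes more than
        # one keypoint, and each track has at least two keypoints.
        tracks = []
        for nodes in components.values():
            images = [img for img, _ in nodes]
            if len(set(images)) == len(images) and len(nodes) >= 2:
                tracks.append(nodes)
        return tracks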
Structure from Motion

Next, we wish to determine a plurality of relative locations of said images. This step can comprise recovering a set of camera parameters and a 3D location for each track. We make the common assumption that the intrinsic parameters of the camera have a single degree of freedom, the focal length.
