throbber
Generating Summaries and Visualization for Large
`Collections of Geo-Referenced Photographs
`
`Alexandar Jaffe,
`Mor Naaman
`Yahoo! Research Berkeley
`Berkeley, CA, USA
`ajaffe@cs.washington.edu,
`mor@yahoo-inc.com
`
`Tamir Tassa
`Department of Mathematics
`and Computer Science
`The Open University of Israel
`Ra’anana, Israel
`tamirta@openu.ac.il
`
`Marc Davis
`Yahoo! Inc.
`Sunnyvale, CA, USA
`marcd@yahoo-inc.com
`
`ABSTRACT
`We describe a framework for automatically selecting a sum-
`mary set of photos from a large collection of geo-referenced
`photographs. Such large collections are inherently difficult
`to browse, and become excessively so as they grow in size,
`making summaries an important tool in rendering these col-
`lections accessible. Our summary algorithm is based on spa-
`tial patterns in photo sets, as well as textual-topical patterns
`and user (photographer) identity cues. The algorithm can
`be expanded to support social, temporal, and other factors.
`The summary can thus be biased by the content of the query,
`the user making the query, and the context in which the
`query is made.
`A modified version of our summarization algorithm serves
`as a basis for a new map-based visualization of large collec-
`tions of geo-referenced photos, called Tag Maps. Tag Maps
`visualize the data by placing highly representative textual
`tags on relevant map locations in the viewed region, effec-
`tively providing a sense of the important concepts embodied
`in the collection.
`An initial evaluation of our implementation on a set of
`geo-referenced photos shows that our algorithm and visual-
`ization perform well, producing summaries and views that
`are highly rated by users.
`
`Categories and Subject Descriptors
`H.3.3 [Information Systems]: Information Storage and
`Retrieval—Information Search and Retrieval
`
`General Terms
`Algorithms, Human Factors
`
`Keywords
`Photo Collections, Geo-Referenced Photos, Summarization,
`Clustering, Image Search, Collection Visualization
`
`Permission to make digital or hard copies of all or part of this work for
`personal or classroom use is granted without fee provided that copies are
`not made or distributed for profit or commercial advantage and that copies
`bear this notice and the full citation on the first page. To copy otherwise, to
`republish, to post on servers or to redistribute to lists, requires prior specific
`permission and/or a fee.
`MIR’06, October 26–27, 2006, Santa Barbara, California, USA.
`Copyright 2006 ACM 1-59593-495-2/06/0010 ...$5.00.
`
`INTRODUCTION
`1.
`With the popularization of digital photography, people
`are now capturing and storing far more photographs than
`ever before. Indeed, we are moving towards Susan Sontag’s
`1977 vision of a world where “everything exists to end up
`in a photograph” [18]. As a result, billions of images, many
`of which are on the Web, constitute a growing record of
`our culture and shared experience. Viewing and interact-
`ing with such collections has a broad social and practical
`importance. However, these collections are inherently diffi-
`cult to navigate, due to their size and the inability of com-
`puters to understand the content of the photographs. The
`prospects of ‘making sense’ of these photo collections has
`become largely infeasible.
`Some steps forward have been made through geo-referencing
`of digital photographs, whereby photos are connected to
`metadata describing the geographic location in which they
`were taken [12, 19]. Capture devices such as camera-phones
`and GPS-enabled cameras can automatically associate geo-
`graphic data with images1 and will significantly increase the
`number of geo-referenced photos available online. Already,
`an increasing number of photographs on the Web are associ-
`ated with GPS coordinates. Such geo-referenced photos can
`be categorized geographically or displayed on a digital map,
`providing a rich spatial context in which to view subsets of a
`collection. Yet even here, we run into the problem of being
`able to filter, sort and summarize the data. The viewable
`space inevitably becomes cluttered after the data set has sur-
`passed a certain size, with overlapping photographs making
`viewing and finding specific photographs ever more difficult
`as the collection grows. Figure 1(a) exemplifies the problem
`by showing an unfiltered view of San Francisco photos.
`Our goal is thus to facilitate a system which can automat-
`ically select representative and relevant photographs from a
`particular spatial region. A result of our algorithm is illus-
`trated in Figure 1(b), where a limited set of eleven photos
`that were selected by our system are marked on the San
`Francisco map. Such collection summaries will enable users
`to find items more easily and browse more efficiently through
`large scale geo-referenced photo collections, in a manner that
`improves rather than degrades with the addition of more
`photos.
`Selecting the most representative photos from a given re-
`gion is a difficult task for several reasons. For instance:
`
`1See,
`ZoneTag
`the
`example,
`for
`http://zonetag.research.yahoo.com.
`
`application
`
`at
`
`MemoryWeb Ex. 2003
`Apple v. MemoryWeb – IPR2022-00032
`1 of 10
`
`

`

`• Image analysis alone is poor at understanding the se-
`mantic content of an image, making visual relevance in-
`sufficient for summarization.
`• In multi-user sets, the biases of one user’s data may skew
`the selection towards generally insignificant subjects.
`• It is difficult for an automated system to learn and assess
`the relevance of photos without appropriate models of
`human interest, as the notion of relevance is not well
`defined, and often subjective.
`
`(a) All San Francisco photos
`
`(b) An automatic summary of San
`Francisco photos
`
`Figure 1: All San Francisco photos from our dataset
`of 2200 geo-referenced photos, versus an automatic
`summary of photos, as generated by our system.
`One summary photo is enlarged for illustration.
`
`We have designed and implemented a simple algorithm
`that attempts to address the challenges stated above. Our
`algorithm utilizes metadata-based heuristics that capital-
`ize on patterns in users’ photographic behavior. Foremost
`among these heuristics is the premise that photographs taken
`at a location ‘vote’ for the presence of something interesting
`at that location.
`Our algorithm considers a multitude of spatial, social and
`temporal metadata (such as where the photo was taken, by
`whom, at what time), as well as textual-topical patterns in
`the data, such as textual tags associated with the photo.
`Furthermore, the algorithm can be tuned to bias the set
`of results using various factors such as the social network
`distance of the photographers to the user making the query.
`The summarization algorithm can be used in a number
`of applications. For example, the algorithm could be used
`for geographic image search, returning a summary of pho-
`tographs from a region in response to a search query (that
`can be specified as a text term or a map region). In addi-
`
`tion, the summarization can be used to assist in map-based
`browsing of images, for example, by selecting a subset of rep-
`resentative photos to show according to the map’s coverage
`and zoom level. With or without a map, summarization can
`help in browsing one’s photos or a group of individuals’ pho-
`tos to get an overview of a location or discover personally
`interesting areas for further exploration; automatic travel
`guide is a scenario that comes to mind.
`Key insights from our algorithm helped us generate a new
`way of visualizing large collections of geo-referenced pho-
`tographs. We use the techniques we developed to gener-
`ate map-based tag clouds, which are described in Section 6.
`“Tag Maps”, as we call them, can be used to visualize the
`contents of the collection, giving a quick overview of the
`textual-topical concepts that appear in the data as well as
`their location, importance and recency. The photos them-
`selves are not necessarily part of the visualization. Tag Maps
`concepts can be applied to many other multimedia (or other)
`applications that exhibit patterns in text and locations.
`To summarize, the contributions of this paper are:
`• A new approach for generating summaries of photo col-
`lections based on geographic as well as other contextual
`data associated with the photographic media (Section 3).
`• An outline of the requirements and the useful features
`for these context-based summaries (Section 3).
`• An implementation of an algorithm that generates such
`summaries using a public set of “geo-tagged” photographs
`(Section 4).
`• A new map-based visualization technique for photo col-
`lections that helps indicate both the important regions
`on the map and the textual concepts represented in those
`regions (Section 6).
`• A proposed evaluation for geo-referenced collection sum-
`maries; we use this evaluation to compare our algorithm
`to several baseline methods (Section 7).
`
`In addition, Section 5 briefly touches on potential applica-
`tions. We begin by discussing the related work.
`
`2. RELATED WORK
`Since 2003, a number of different research efforts have
`considered geographic location information associated with
`photographs. In [19], the authors describe WWMX, a map-
`based system for browsing a global collection of geo-referenced
`photos. Several similar map-based photo browsing systems
`appeared on the Web in the last few years2, most of them
`using “geo-tagged” images from Flickr [5] for content. All
`of those systems face the problem of clutter in the map in-
`terface: as the number of photos available in each location
`grow, the full set of images cannot possibly be shown on
`the map at once. While some systems default to showing
`the most recent photos, the WWMX system tries to handle
`clutter by consolidating multiple photograph markers into a
`single marker according to the zoom level. In our system, we
`avoid clutter by utilizing the additional metadata to select
`the best set of photographs from a region, providing poten-
`tially a better selection than the “most recent” strategy, and
`a more meaningful one than the “consolidation” approach.
`Several projects [12, 15] use geographic data to organize
`photo collections in novel ways, for example, by detecting
`
`2like http://geobloggers.com and http://mappr.com
`
`MemoryWeb Ex. 2003
`Apple v. MemoryWeb – IPR2022-00032
`2 of 10
`
`

`

`significant events and locations in a photo collection. Such
`structures could indeed be the basis for collection summa-
`rization. However, these projects considered personal photo
`collections only, and did not consider public shared pools of
`photos.
`Looking at shared collections, some research [3, 4, 11,
`14, 16] tries to use context (mostly location) information
`and sometimes visual features to identify landmarks in pho-
`tographs. Visual analysis could be integrated in our system—
`once our algorithm recognizes significant locations, it can
`attempt to select a photo of a prominent landmark there.
`Work in both [3, 11] considers, in a similar fashion to this
`work, patterns and distributions of textual terms that are
`associated with geo-referenced digital photos, and uses them
`to generate tag suggestions for new photographs. However,
`those projects are not designed to support collection sum-
`marization.
`In the absence of location metadata, temporal metadata
`was also considered in the past for the purpose of photo col-
`lection summarization.
`In [8], Graham et al. describe an
`algorithm to heuristically select representative photos for a
`given time period in a personal collection, utilizing patterns
`in human photo-taking habits (later studied in [6]). Ad-
`ditional time-based work aims to detect events in personal
`collections (e.g., [2]), which could be the basis for collection
`summarization. However, again, all these projects consid-
`ered single-photographer collections only. In public collec-
`tions of timestamped photos, only when additional meta-
`data is available (for example, the fact that all shared pho-
`tos were taken in the same event), there exists the potential
`for time-based summaries [13].
`Another possible approach for summarizing photo collec-
`tions is using textual tags that are associated with the im-
`age.
`In Flickr [5], popular tags have pre-computed clus-
`ters of related tags. For example, the “San Francisco” tag
`on Flickr has several associated tag clusters3: “california,
`bridge, goldengate”; “baseball, giants, sbcpark”, “deyoung,
`museum”, “sfo, airport” and “halloween, castro”. These
`clusters can potentially be used to generate a summary of
`San Francisco photos. This approach is not location-based,
`and the clusters often do not represent concepts that are dis-
`tinct (e.g., one of Boston’s clusters is “massachusetts, city,
`cambridge, building, architecture”). The tag clusters could
`possibly be used in conjunction with our method. In fact,
`we are using some tag-based computation to select summary
`photos. More directly related is a tag subsumption model
`[17] that can use the tag corpus to derive tags that are sub-
`sumed, for example, by the tag “San Francisco”. Again, this
`approach can be integrated with our location-based sum-
`maries.
`These projects, and others, consider various ways to al-
`leviate the difficulties of browsing large collections of pho-
`tographs, but do not provide effective ways to summarize
`multi-user photo collections or visualize them using maps.
`We believe that the potential of a geographic-based summa-
`rization method is significant, especially in conjunction with
`the current state of the art.
`
`3. THE SUMMARIZATION APPROACH
`In this section, we define the problem of summarizing a
`photo collection, then describe the guidelines and insights
`
`3http://flickr.com/photos/tags/sanfrancisco/clusters/
`
`that have informed the implementation of our summariza-
`tion algorithm. In Section 4 we provide the details of the
`algorithm.
`We formalize the summarization problem as that of pro-
`ducing a ranking on the collection in question.
`In other
`words, we summarize a set of photos by ordering the set
`and selecting the top ranked photos. More formally, we
`are looking at the following problem: Given an album of
`n photos, A = {P1, . . . , Pn}, we wish to find an ordering
`ω of A such that any k-length prefix of ω(A) is the best
`possible k-element summary of A. A summary is loosely
`defined as a subset that captures representativeness, rele-
`vance, and breadth in the original collection. These notions
`are captured through some of the following metadata at-
`tributes that are associated with the photos:
`• Location. Photo Pi was taken at location (xi, yi).4
`• Time. Photo Pi was taken at time ti.
`• Photographer. Photo Pi was taken by user ui.
`• Tags. Photo Pi was manually assigned the list of tags
`(i.e., textual labels) wi.
`• Quality. Photo Pi is associated with an externally de-
`rived parameter qi that represents its quality.
`• Relevance. Photo Pi is associated with a relevance fac-
`tor ri. Relevance can include arbitrary bias based on
`parameters such as recency, the time of day, the day of
`the week, the social network of the user, user attributes,
`and so forth.
`
`Note that The relevance attribute can introduce subjectiv-
`ity, allowing us, for example, to tune the results to the user
`who is making the query and the context of the query.
`While there is no accurate formal model for what con-
`stitutes a “good” summary of a collection of geo-referenced
`photographs, we follow a few simple heuristics that try to
`model and capture human attention, as reflected in the set
`of photos taken in a region. Among these heuristics are the
`notions that:
`• Photographs are taken at locations that provide views of
`some important object or landmark.
`• A location is more relevant if the photos around it were
`taken by a large number of distinct photographers.
`• If available, location-based patterns of textual tags can
`reflect the presence of an important landmarks in a lo-
`cation.
`
`In addition to the heuristics listed above, a desired sum-
`mary would also (a) represent a broad range of subjects,
`instead of thoroughly displaying a few, and (b) allow per-
`sonal or query bias to modify the algorithm’s results.
`In the next section we describe the summarization algo-
`rithm that we developed based on these guidelines.
`
`4. ALGORITHM FOR SUMMARIZATION
`As described in Section 3, our summarization algorithm
`produces a ranking of the photos in the collection; each pre-
`fix of this ranking can serve as a collection summary of the
`corresponding size. Producing this ranking is a two-step
`process, a clustering step followed by a ranking step on the
`resulting clustering hierarchy. In particular:
`
`4Notice that this ‘photo origin location’ is different than the
`‘target location’, the location of the photographed object.
`
`MemoryWeb Ex. 2003
`Apple v. MemoryWeb – IPR2022-00032
`3 of 10
`
`

`

`1. We apply a modified version of the Hungarian clus-
`tering algorithm [7] to our collection of photographs.
`This algorithm receives the photograph locations as an
`input, and organizes them into a hierarchical clustered
`structure.
`
`2. We compute a score for each cluster in the hierarchy.
`
`3. Finally, we generate a flat ordering of all photos in the
`dataset by recursively ranking the sub-clusters at each
`level, starting from the leaf clusters, and ending at the
`root.
`
`Note that while the clustering is a fixed one-time compu-
`tation, the ranking step can be re-evaluated, allowing users
`to specify a personal bias or preference towards any of the
`metadata features. Alternatively, the ranking can also be
`modified to utilize implicit bias in the query context (e.g.,
`the identity of the user making the query).
`To illustrate the process and the scoring mechanism we
`use a hypothetical example, presented in Figure 2. In this
`figure, a leaf node represents a single photograph, annotated
`with the identity of the photographer and a single textual
`tag (in practice, of course, more tags can be associated with
`each photo). The tree represents the hierarchy created by
`the clustering algorithm.
`Next, we describe the algorithm in detail. First, we dis-
`cuss the clustering algorithm that produces the clustering
`hierarchy. Then, we describe how to produce a ranking of
`all photos in a single node of the above mentioned clustering
`hierarchy, assuming that all nodes in the hierarchy are as-
`sociated with scores. Finally, we show how we can generate
`such scores for the nodes in the hierarchy.
`4.1 Clustering
`Our method requires a hierarchical clustering algorithm;
`as noted above, we use the Hungarian clustering algorithm
`[7]. This algorithm identifies a hierarchy of clusters within
`a given dataset of n points, based only on the distances
`between those points.
`In our system, the input to the clustering algorithm is a
`set of points in the plane, representing the locations of the
`photographs,5
`A = {(xi, yi) ∈ R2, 1 ≤ i ≤ n} .
`(1)
`The output is a clustering of these photo locations, C(A),
`where C(A) is a tree. Each node in the tree represents a
`subset of A, the root of the tree represents the entire set,
`the children of each node are a partition (or clustering) of
`the subset that is associated with that parent node, and the
`leaves of the tree are the points in A.
`The classical Hungarian method is an efficient algorithm
`for solving the problem of minimal-weight cycle cover. In
`that problem, one is given a weighted graph and needs to
`find a cover of that graph by disjoint cycles with minimal to-
`tal weight. This algorithm serves as the basic building block
`for a clustering method that is dubbed The Hungarian clus-
`tering method. Viewing A as a complete weighted graph,
`where the weight of each edge is the Euclidean (geographic,
`in this case) distance between the two nodes that it connects,
`the Hungarian clustering method subjects that graph to the
`classical Hungarian method. The disjoint cycles, produced
`5For convenience, we use the same notation, A, to denote
`the photo set as well as the set of photo locations.
`
`by the Hungarian method, are viewed as a partition of the
`data-set. The clustering algorithm then proceeds by hier-
`archical merging of the disjoint cycles, until the produced
`clusters are perceived as complete clusters (through some
`”completeness” criteria) and then the hierarchical merging
`stops. We use the Hungarian Clustering algorithm because
`of two features that it boasts: It is an hierarchical clustering
`algorithm, and it does not depend on the number of clusters
`as an input.
`The clustering hierarchy C(A) is used to create a ranking
`of all photos. In order to describe the ranking algorithm,
`let us first assume that the nodes in the hierarchy have been
`assigned a score that embodies the importance of the cluster
`of photos that corresponds to that node.
`4.2 Ranking Framework
`Given a hierarchical clustering C(A) on the locations of
`all photographs, and a score for every node (cluster) in that
`hierarchy, our goal is then to produce a ranking of all items
`in the collection. We describe a recursive interleaving algo-
`rithm that uses the clustered structure and the correspond-
`ing scores in order to produce a natural flat ordering. In the
`next section we outline a way to generate the scores.
`Going bottom up, the ranking algorithm considers each
`node B in the hierarchy C(A) and outputs an ordering ω(B)
`that represents a ranking of photos in B. Finally, when ex-
`ecuting on the root node that corresponds to the entire set
`A, we get the ordered sequence, S := ω(A), that describes
`a ranking of all photos in A. Applying this algorithm to the
`example in Figure 2, a possible output could be the ranking
`S = (6, 8, 4, 5, 7), where the numbers in the sequence corre-
`spond to the numerals of the leaves in the tree in Figure 2.
`For simplicity of notations, we describe the action of the
`algorithm on the root node, A. Actions on other nodes
`are performed in the same manner. We assume that we
`i=1 Ai; namely, node
`A has m direct descendents. In addition, assume that the
`photos in each sub-cluster of A have been ranked recursively
`according to this algorithm, and that each of the nodes Ai
`is associated with some score s(Ai) such that (without loss
`of generality)
`
`identified m sub-clusters in A, A =Sm
`
`s(A1) ≥ s(A2) ≥ · · · ≥ s(Am) .
`
`(2)
`
`Figure 2: A sample hierarchy; the leaves are photos,
`each associated with a user and a single tag.
`
`Our goal is to produce a ranking that would balance the
`contradicting properties of depth and breadth of coverage.
`In the field of Information Retrieval, some measures are used
`to balance results in terms of relevance (depth) and broad-
`ness (breadth) [1, 10, 20]; for various reasons, these measures
`are not applicable here. For our problem, depth requires
`
`1
`
`3
`
`6
`
`7
`
`8
`
`2
`
`5
`
`4
`
`(U1,Bridge) (U1,Car) (U2,Bridge) (U3,Car)(U3,Museum)
`
`MemoryWeb Ex. 2003
`Apple v. MemoryWeb – IPR2022-00032
`4 of 10
`
`

`

`that the photos in a cluster are selected from sub-clusters
`roughly according to the ratio of their scores. For example,
`consider the second level of the hierarchy in Figure 2, which
`consists of two clusters, denoted by C2 and C3, and assume
`that s(C2) : s(C3) = 5 : 3. We would like to interleave the
`photos from these two clusters so that in any section of the
`sequence S, the frequencies of photos from the two clusters
`relate to each other as closely as possible to their score ratio
`in the whole dataset, i.e., 5 : 3. On the other hand, breadth
`requires that each sub-cluster should be represented to some
`extent early in the ranking of its parent cluster.
`In order to attain some amount of depth, breadth, and
`consistency, we interleave photos from sub-clusters in the
`following manner. The ordered sequence of photos for A
`will have two parts: a short header H followed by a trailer
`T , where S(A) = H||T .
`The header H will include a photo from all prominent
`sub-clusters. To that end, we define a threshold 0 < w < 1,
`and then a cluster Ai is deemed prominent if
`Pm
`s(Ai)
`≥ w .
`j=1 s(Aj)
`Assume that there are m0 prominent sub-clusters among the
`m sub-clusters, with 0 ≤ m0 ≤ m. Then in view of assump-
`tion (2), the header is
`H = (A1,1, A2,1 · · · , Am0,1) ,
`where Ai,1 is the most relevant photo from cluster Ai. This
`header is then followed by a trailer, T .
`In order to gen-
`erate the trailer, we first remove from each sub-cluster the
`photo that was selected for the header, recalculate the sub-
`cluster scores, and then assign each sub-cluster a probabil-
`ity that equals its score divided by the sum of scores of all
`sub-clusters. Those probabilities are then used to randomly
`select a sub-cluster. If sub-cluster Ai was selected, we re-
`move its top-ranked photo, append it to T and repeat, until
`all photos have been selected.
`By now we have described how to generate the cluster
`hierarchy and produce a ranking on the photos in that hi-
`erarchy, under the assumption that all nodes are associated
`with scores. We therefore proceed to describe a key aspect
`of the algorithm: the computation of the scores for a given
`cluster (node).
`4.3 Scoring Clusters
`The score of a cluster Ai depends on several factors, in-
`cluding the following:
`
`1. The sum of relevance factors (see Section 3) of all pho-
`tos in the cluster,
`
`X
`
`ρi =
`
`rj .
`
`Pj∈Ai
`
`2. The tag-distinguishability of the cluster, τi (explained
`below).
`
`3. The photographer-distinguishability of the cluster, φi
`(explained below).
`
`We define the cluster density as
`
`δi = 1/(1 + σi) .
`
`(3)
`
`5. The sum of image qualities (see Section 3) of all photos
`in the cluster,
`
`X
`
`κi =
`
`qj .
`
`Pj∈Ai
`
`While most of the above factors are derived only from
`data that is contained in the photo collection, the relevance
`factor can introduce bias by subjective requirements. The
`relevance factor ri of a photo Pi can incorporate parameters
`such as recency, the time of day, the time of the week, the
`identity of the photographer, etc. These parameters can be
`specified by a user making the query, or set by the system
`according to the application or the query context. Each
`photo is assigned a score θ(Pi) in the range [0, 1] for each
`such parameter. The final relevance score, ri, may be a
`weighted average of all those parameter scores.
`The two interesting factors in the score computation are
`the tag- and photographer-distinguishability scores of clus-
`ters. These values are intended to represent how strongly a
`particular cluster is associated with specific tags or photog-
`raphers.
`4.3.1 Tag-distinguishability of clusters
`Tag-distinguishability aims at detecting distinct or unique
`concepts that are represented in a given cluster, as those
`may indicate the presence of some interesting landmarks
`or objects in that cluster. For example, in Figure 2, the
`tag “bridge” appears in two photos from Cluster C2, and
`does not appear elsewhere. As a consequence, C2’s score im-
`proves. On the other hand, the tag “car” appears in photos
`from both C2 and C3 and therefore does not help to distin-
`guish either of them.
`Formally, each photo Pj, 1 ≤ j ≤ n, is tagged with tags
`that are drawn from a finite dictionary, T . Hence, tagging
`may be viewed as a mapping Pj 7→ T (Pj) ⊂ T . For all t ∈ T
`and 1 ≤ i ≤ m, let
`
`|{Pj ∈ Ai : t ∈ T (Pj)}|
`|Ai|
`(4)
`tft,i =
`denote the relative frequency of the tag t in Ai, (or term
`frequency as it is referred to in Information Retrieval). We
`often found that this measure biases towards tags that have
`been used frequently by one user in the same cluster. An
`alternative frequency calculation can be based on the frac-
`tion of photographers in this cluster that have used the tag
`t:
`
`tft,i =
`
`(5)
`
`|{u ∈ Ui : t ∈ T (Pj), Pj ∈ Ai, Pj ∈ Bu}|
`|Ui|
`where Ui is the set of users that have taken photos in cluster
`Ai, and Bu is a set of photos taken by user u.
`We also use the inverse document frequency, which is a
`measure of the overall frequency of the tag t in the entire
`photo collection,
`
`4. The density of the cluster. More specifically, let σx,i
`and σy,i denote the standard deviation of the x and y
`coordinates, respectively, of all points in Ai, and let
`
`σi =`(σx,i)2 + (σy,i)2´1/2
`
`.
`
`idft =
`
`n
`|{Pj ∈ A : t ∈ T (Pj)}| .
`There are several ways to combine these two scores to mea-
`sure how the term t distinguishes the cluster Ai from other
`
`(6)
`
`MemoryWeb Ex. 2003
`Apple v. MemoryWeb – IPR2022-00032
`5 of 10
`
`

`

`!1/2
`
`
`
`X u
`
`φi =
`
`φ2
`u,i
`
`.
`
`(14)
`
`(15)
`
`According to the guidelines in Section 3, while large tag-
`distinguishabilities should contribute towards an increase in
`a cluster’s score, the photographer-distinguishability should
`have an opposite effect. The more a given cluster is associ-
`ated with a single photographer (or few photographers), the
`less we are interested in that cluster.
`Next, we describe how to merge all these factors into a
`single score for each cluster.
`4.3.3 Overall Cluster Score
`The score s(Ai) of the cluster Ai should depend in a
`monotonically increasing manner on the relevance factor,
`ρi, and the image quality factor, κi. The score should
`also depend in a monotonically increasing manner on the
`density measure of the cluster, δi (3), and on τi, the tag-
`distinguishability measure of the cluster. Finally, the score
`must depend in a monotonically decreasing manner on φi,
`the photographer-distinguishability measure of the cluster,
`as discussed above. Therefore, the overall score is:
`) · ρi
`s(Ai) = h(κi, δi, τi, φ
`−1
`i
`where h could be, typically, a geometric mean or a weighted
`average of its variables, and the weights may be chosen and
`fine-tuned by experimentation.
`4.4 Final Considerations
`We have described in this section the full framework of
`our algorithm: how the clustering is done, how scores are
`computed for each node in the cluster hierarchy, and how an
`ordering is produced on all photos given the scores and the
`cluster hierarchy. In practice, we found that it was necessary
`to prevent clusters from subdividing past some minimum
`size. Computing a ranking for a small cluster is meaningless.
`For example, there is not enough information (photos in
`the clusters) to compute a relevant tag- or photographer-
`distinguishability score. In order to solve this problem, we
`simply enforced a minimum size on all non-leaf clusters, by
`merging nodes at the lowest levels of the hierarchy.
`The way ranking is performed for these flat “edge” clus-
`ters is simple, yet different than the generalized method de-
`scribed above. The system computes the top-scoring tags
`for each flat cluster using the tag-distinguishability method,
`and then picks a photo with tags that best match these
`top tags. For example, in Figure 2, photo 6 is likely to
`be picked first because of the tag ‘bridge’, which is asso-
`ciated with that photo, and also appears often in C2 and
`rarely elsewhere. Other approaches for ranking photos for
`the flat leaf clusters may be choosing photos that maximize
`the visual similarity to other photos in this cluster, or some
`combination of such tag- and image-based similarity, as well
`as quality and relevance factors associated with the photos.
`Joshi et at [9] propose, in a different context, a solution that
`can be applied here.
`
`5. SAMPLE APPLICATION
`The summarization algorithm has a number of possible
`uses. As mentioned above, the summaries could be used to
`
`Finally, the overall photographer-distinguishability is de-
`fined as
`
`clusters. Let us denote such measures by τt,i. The usual
`measure in Information Retrieval is the tf-idf weight (term
`frequency – inverse document frequency). That measure is
`defined as
`
`τt,i := tfidft,i = tft,i · idft .
`Another alternative to (7) which is used in Information Re-
`trieval is
`
`(7)
`
`(8)
`
`τt,i := tfidft,i = tft,i · log (idft) .
`In both cases, large values of τt,i indicate that the number
`of occurrences of t in Ai is large with respect to its number
`of occurrences elsewhere.
`We would like to note that in the usual idf weight, the
`inverse document-frequency involves the number of clusters
`in which the tag appears, as opposed to the total number
`of actual tag occurrences, as given in (6). However, the
`usual definition is not suitable for cases where the number
`of clusters (documents) is small.
`In such cases, a single
`random occurrence of a tag in a cluster has a significant
`effect on the usual measure, while in the alternate approach
`we opted for it would be hardly noticeable.
`Next, we need to define an overall tag-distinguishability
`measure for Ai, denoted τi, based on the tag-distinguishability
`measures of all tags in the cluster. We compute the overall
`score by using the Euclidean measure based on the ‘2-norm,
`
`!1/2
`
`
`
`X t
`
`τi =
`
`∈T
`
`τ 2
`t,i
`
`.
`
`(9)
`
`We directly evaluate the effectiveness of our approach to
`“tag scoring” in Section 7.
`4.3.2 Photographer-distinguishability of clusters
`The measure of photographer-distinguishability (or user-
`distinguishability) is, roughly,
`inversely correlated to the
`number of photographers associated with a given cluster.
`The fewer active photographers in a cluster, the lower the
`likelihood the cluster will be semantically meaningful. For
`example, in Figure 2, all the

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket