WebSeer: An Image Search Engine for the World Wide Web¹

Charles Frankel, Michael J. Swain, and Vassilis Athitsos

The University of Chicago
Computer Science Department
1100 East 58th Street
Chicago, Illinois 60637

Technical Report 96-14

August 1, 1996

Abstract

Because of the size of the World Wide Web and its inherent lack of structure, finding what one is looking for can be a challenge. PC-Meter's March, 1996, survey found that three of the five most visited Web sites were search engines. However, while Web pages typically contain both text and images, all the currently available search engines only index text. This paper describes WebSeer, a system for locating images on the Web. WebSeer uses image content in addition to associated text to index images, presenting the user with a selection that potentially fits her needs.

¹ This work was supported in part by ONR contract N00014-93-1-0332 and NSF Grant No. IRI-9210763-A01.
`Petitioner Apple Inc. - Exhibit 1015, p. 1
Introduction

The explosive growth of the World Wide Web has proven to be a double-edged sword. While an immense amount of material is now easily accessible on the Web, locating specific information remains a difficult task. An inexperienced user may find it next to impossible to find the information she wants; even an experienced user may miss relevant Web pages. Users searching the World Wide Web have a number of options presently available to them. Lycos, Excite, Alta Vista, and Yahoo! are but a few examples of useful search engines. All of these systems have been designed primarily to find text-based information on the Web. WebSeer is designed to find the other major source of information currently available on the Web: images.

In order to find specific images, a method of discovering and indexing their contents must first be developed. The pixels alone are insufficient; some additional information is needed to make it possible to search for specific images. Discovering image content has proven to be an extremely difficult problem. One approach has been to supplement the image content with other types of information associated with the image. Ogle and Stonebraker's Cypress system [1][2] uses information contained in hand-keyed database fields to supplement image content information. Srihari's Piction system [3] uses the captions of newspaper photographs containing human faces to help locate the faces. IBM's QBIC system [4] relies on the user specifying specific visual cues or providing an example image (e.g. a sketch) to begin a query for an image. In Picard and Minka's Foureyes system [5][6], close interaction with a human user supplements information derived from the image content.

WebSeer uses the textual information surrounding an image and the image header to supplement the information derived from analyzing the image content. This additional information is used to create a context in which image analysis algorithms can operate effectively. Image analysis algorithms are then used to classify the image within a taxonomy of types (drawing, photograph, portrait, etc.) and to extract useful semantic information.

Like text search engines, WebSeer does not have to access the original data to respond to a query; all analysis of the image and surrounding text is done off-line during the creation of the database. In this way, WebSeer will be able to give fast query responses to a possibly huge number of users.

The first part of this report describes the types of information that an image search engine such as WebSeer could extract from the Web and how they could be stored in a database. The second part illustrates a number of example queries and how images could be found that fit the queries. The final part details the prototype WebSeer program and the algorithms that have been implemented to date.
How to Find Images on the Web
Information for finding images on the World Wide Web can come from two sources: the relevant text and the image itself. Using information from both sources, a program should be able to successfully retrieve requested images.

Cues from the Text and HTML Source Code

An HTML document is a structured document. Understanding the structure of a particular document can reveal valuable information about the images contained on that page. There are several places where relevant information about image content may be located within the document. The following are ordered by the likelihood of the text being useful in an image search.
1. Image File Names: File names contain important clues as to the content of the file. In many cases the full URL of an image (e.g. "http://www2.int-evry.fr/~hug/images/marilyn/mm-trans.gif") contains more information than a relative URL (e.g. "mm-trans.gif"), which may be included in the HTML source. We have found that the file name and the name of the top-level directory containing the file are most useful for indexing the content of the image. A caveat is that file names may be difficult to decipher because of abbreviations used for brevity or out of necessity (e.g. in the FAT file system used in DOS/Windows 3.1).
2. Image Captions: Often, images have captions that describe the picture. Although HTML does not have an explicit caption tag, captions are often signaled as text that is located within the same center tag as the image (http://www.lib.uchicago.edu/~mozart/Marilyn/monthly.html), within the same cell of a table as an image (http://studentweb.tulane.edu/~llovejo/marilyn/), or within the same paragraph (see Appendix A).
3. ALT= Text: Image fields in HTML documents may contain a special ALT= section for alternate text, which is displayed in the event that the image cannot be loaded. This alternate text may briefly describe the content of the image. For instance, the alternate text for a photograph of a person may contain simply the name of the person in the photograph. The alternate text may also give hints as to the intended "goal" of the image. For instance, an image containing an ad for Coca-Cola may have the alternate text "Buy Coca-Cola."
4. HTML Titles: The HTML title of the document (which is displayed above the viewed page by the browser, is used for history lists, and so on) often provides information about the content of the images contained within the document. For example, the page http://tomahawk.cf.ac.uk/~scotw/marilyn/page2/page2.html is entitled "Marilyn Monroe Images - The ventilator scene", which is highly descriptive of the contents, and more descriptive than any other cues that can be found on the page (see Appendix A).
5. HyperLinks: The text of a hyperlink usually gives clues about what the link refers to. In many cases, images are referenced by hyperlinks rather than being embedded within the text of the page. An example of such a link is "Marilyn JPG #21 (Jumping for joy)", at http://www.netins.net/showcase/agency/marilyn.html. The same photo is referenced as "Marilyn jumping for joy" at http://www.rtmol.stt.it/rtmol/onsite/stars/marilyn/m_imag~1.html, where only the italicized text is the text of the hyperlink. In many cases, text hyperlinks assume the reader understands that the pictures are of Monroe and do not give any information about who is in the picture. These links say "Simply gorgeous ....", "Leaning on a banister", and so on. Consequently, other text in the same sentence or phrase as the hyperlink can also be relevant to deciphering the image content.
6. Other text: In addition to the types mentioned above, other text can give cues about the images contained on the page. This text could be used to index the images on the page, with a lower likelihood of reference than the text categories listed above.
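
The file-name cue (item 1 above) is easy to sketch. The following function is illustrative rather than WebSeer's actual code; it splits an image URL's file name and enclosing directory into lowercase word tokens that could feed an index:

```python
import re
from urllib.parse import urlparse

def filename_tokens(image_url):
    """Split an image URL's file name and enclosing directory into
    lowercase word tokens for indexing (cue 1 above)."""
    path = urlparse(image_url).path
    parts = [p for p in path.split("/") if p]
    if not parts:
        return []
    name = parts[-1].rsplit(".", 1)[0]           # strip the extension
    directory = parts[-2] if len(parts) > 1 else ""
    # Break on non-alphanumeric separators such as "-", "_", "~".
    tokens = re.split(r"[^a-z0-9]+", (name + " " + directory).lower())
    return [t for t in tokens if t]

print(filename_tokens("http://www2.int-evry.fr/~hug/images/marilyn/mm-trans.gif"))
```

On the example URL above this yields the tokens "mm", "trans", and "marilyn"; as the report notes, the enclosing directory often carries the most descriptive word.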
Cues from the Image Content

Although it is clear that the context provided by the surrounding text can produce an effective image search engine, examining the images themselves can substantially enrich the search. In this section we summarize some of the relevant information that can be derived directly from the image content.
The following attributes can be easily obtained from the image header.

• Grayscale vs. color: Some users may wish to search for only color images. While color images can be converted to grayscale, those in search of black-and-white photography will want the real thing.

• Image Size: This is useful for those who wish to screen out images smaller than a certain size.

• File Type (JPEG, GIF, GIF89, etc.): Some searchers may wish to avoid the compromises in quality introduced by constraints of these formats, especially the limited color palette of the GIF format. GIF89s are animations, and so will often be of separate interest from the other, static image formats.

• File Size: For those with slow connections or limited RAM or hard drive space, overly large files may be of less interest.

• File Date (when present): If someone is looking for newer images, the date can be used to screen out images that are certainly older than a certain date.
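
Most of these attributes can be read without decoding the image at all. As an illustration (assuming a well-formed file; this is not WebSeer's code), the first 13 bytes of a GIF file already give the version, dimensions, and color depth:

```python
import struct

def gif_header_info(data):
    """Parse the first 13 bytes of a GIF file: version, logical screen
    width/height, and color-table depth (header cues listed above)."""
    if data[:3] != b"GIF":
        raise ValueError("not a GIF file")
    version = data[3:6].decode("ascii")          # "87a" or "89a"
    width, height = struct.unpack("<HH", data[6:10])
    packed = data[10]
    bits_per_pixel = (packed & 0x07) + 1         # global color table depth
    # Note: "89a" merely permits animation; the header alone cannot
    # confirm that a file actually animates.
    return {"version": version, "width": width, "height": height,
            "bits_per_pixel": bits_per_pixel}

# A minimal hand-built header: GIF89a, 204 x 107, 8-bit color table.
header = b"GIF89a" + struct.pack("<HH", 204, 107) + bytes([0xF7, 0, 0])
print(gif_header_info(header))
```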
Obtaining information beyond that listed above generally requires some image analysis. The most useful sorts of image analysis can be grouped into three overlapping categories: 1. classifying the image into one of several types (e.g. photograph, hand drawing, computer-generated drawing), 2. recognizing image structures (e.g. faces, horizons), and 3. distinguishing specific regions primarily by color/texture attributes (e.g. skin, sky, foliage).
Classifying Images
Creating a taxonomy of image types can facilitate separating images the user is interested in from those that she is not. However, the most important aspect of classification is creating the categories. The authors of an image search engine must either design a system that uses helpful categories or include a mechanism that allows the user to guide her own search using relevant schema. Ideally, a system would use aspects of both types of categorization.
One important classification that can be made is between photographs and drawings, whether hand-drawn or computer-created. In many cases users may wish to find only photographs of faces, and ignore the many other images that match the search string. Some of these non-photographs may be weeded out because they are too small (e.g. buttons, dividers, arrows). Others, such as image maps containing text, advertisements, or drawings, must be eliminated on other criteria. Of course, more realistic drawings are likely to be mistakenly categorized as photographs. Likewise, photographs can be thresholded and manipulated in other ways to make them look more like drawings. Nonetheless, we have found that many images can be successfully classified as either drawings or photographs. It may also be possible to separate computer-generated drawings (i.e. those that are artificial-looking, with precisely straight lines and precise color gradients) from hand drawings.
There are many other potentially useful categories. Images can be classified as portraits, which can be further classified by the number of people in the image and the amount of the body shown along with the face (close-up, head-and-shoulders, upper-body, etc.). Images can also be classified as indoor or outdoor, and outdoor images can be classified as landscapes.

Some of these categorizations may be best determined by analyzing global statistics, or the statistics of relatively large portions of the image (e.g. photo vs. drawing). Others require other sorts of analysis, which are described below.
Recognizing Image Structures
There are many types of objects whose identification in images would aid content-based indexing. We have identified two "objects" that are both important to the typical WebSeer user and within the capabilities of existing computer vision technology. They are:

1. Faces: Face-finding algorithms have recently become fairly successful at finding faces in grayscale photographs [7][8]. Faces in photographs are often seen upright, with the subject looking at the camera, which is the easiest situation for face-finding algorithms. The high quality of many photographs found on the Web, in many cases produced by professionals with careful control of the lighting and high-quality film and optics, improves the reliability of face detectors. Combining face detection with color information from skin detection should make a reliable face detector. A primary role of a face detector is to distinguish images of people from other types of images that may show up on the same page. However, face detection can do more. Users may wish to find a portrait of a single person, a group photo, or pictures of a "crowd" containing many faces. Portraits can be close-ups, head-and-shoulders shots, upper-body shots, or full-body shots; the size and location of the face can differentiate among these choices.
2. Horizon: A number of systems search images for the presence of a horizon [9] (e.g. the Cypress system). Detecting a horizon in a photograph tells us that the image has been taken outdoors and allows us to roughly classify it as a landscape photograph.

Face recognition is another possible task one might think of assigning to the system. For instance, WebSeer could have a database of people that users are likely to search for, and identify their faces when creating the index. Restricting possible matches to those people mentioned in the text would greatly reduce the number of false positives the face recognizer would return. Further research will likely define other categories of objects that could be searched for and indexed in the same way as faces and horizons.
Recognizing Regions by Color/Texture Analysis
Some color and texture analysis may be useful for finding image regions containing skin, sky, foliage, etc. in photographs. Skin may be used for finding faces and determining whether images are portraits; it is in itself of interest to some image searchers. Sky and foliage can also be used in image classification. Other, less common color textures, including sand, wood, etc., may also be identified. If more classes such as these are created, it may prove useful to look for them only when the context suggests they may be present.
The WebSeer Implementation
WebSeer was implemented with three guiding principles in mind:

First, WebSeer must have acceptable performance. We need to allow for extremely high-speed searches, as we expect a large number of people to be using our system simultaneously. Since indexing occurs off-line, the performance of the indexing engine is less crucial than that of the search engine. Nonetheless, indexing the entire Web in a reasonable amount of time requires the processing to proceed quite quickly. For example, assuming there are 10 million unique images on the Web, indexing them in two weeks would require eight images to be processed every second. Crawler speed is also important. Preliminary results indicate that for every 100 HTML pages on the Web, there are 40 (unique) GIF images and one (unique) JPEG image. In contrast to the file size of HTML pages, which averages around 6K, the average file size for GIF files is 11K, and the average file size for JPEGs is 35K.
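
The eight-images-per-second figure follows directly from those assumptions:

```python
# Back-of-the-envelope check of the indexing-rate figure above:
# 10 million images indexed in two weeks.
images = 10_000_000
seconds_in_two_weeks = 14 * 24 * 60 * 60
images_per_second = images / seconds_in_two_weeks
print(round(images_per_second, 1))  # → 8.3
```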
Second, we tried to incorporate standard commercial software and hardware whenever possible. Much work has been put into developing advanced database engines, and WebSeer's ability to leverage this technology adds significantly to its power. Microsoft's Visual C++ development environment was used for much of this project. Microsoft's Foundation Classes in conjunction with the Standard Template Library (STL) provided many useful functions and abstractions during the development of this project. On the hardware front, the use of relatively inexpensive PCs allowed us to ramp up our storage and processing capacities quickly and at reasonable cost.
Third, the project should provide a basis for experimentation. We foresee WebSeer evolving in the following ways:
• better image understanding algorithms

• advanced text indexing capabilities

• improved interactions between the image understanding and text indexing algorithms

• more complex transformations from form query submissions to search actions taken

To facilitate this type of research, the project is divided in such a way that each component can be worked on independently. Additionally, we wish to facilitate the incorporation of relevant new technologies as they become available. The WebSeer project is composed of five major executables. With the exception of the URL Server, which is written in Java, all executables are written in C++ and run on a Microsoft Windows NT 3.51 platform.
1) The WebSeer Crawler crawls the Web, downloading both HTML pages and images. The crawler is multi-threaded so that the delay in downloading pages is spread over multiple threads. Each thread is connected to a database of previously visited (and waiting to be visited) URLs using the ODBC 2.0 database protocol.

2) The URL Server is a multi-threaded Java application which receives requests to download URLs from the WebSeer Crawler. Separating the URL Server from the Crawler application allows us to download pages from multiple machines (with different operating systems) simultaneously.

3) The WebSeer Indexer creates the index which is searched by the users. The indexer parses the HTML code and executes the appropriate image understanding applications.

4) The WebSeer CGI script is called when the user submits (POSTs) a query from the WebSeer form. This script opens a TCP/IP connection to the WebSeer Search Server and formats the results for display to the user.

5) The WebSeer Search Server accepts requests from the WebSeer CGI script and performs the appropriate searches based on the form fields which the user has filled in.
The WebSeer Crawler largely obeys the Robot Exclusion Protocolᵃ. The Protocol depends on system administrators including a robots.txt file on their web site, which lists areas of the web site that robots should not visit. Most robots are designed to download only HTML files, and so visiting a directory which includes only images would be inappropriate. For this reason, some robots.txt files exclude directories which contain only image files (see Appendix B). Since the WebSeer Crawler is designed to download images, we obey the restrictions specified by the robots.txt file when deciding whether to download an HTML file, but not when downloading an image file.

ᵃ The Protocol is specified at http://info.webcrawler.com/mak/projects/robots/norobots.html
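
The stated policy can be sketched with Python's standard robots.txt parser; the robots.txt contents and the crawler name here are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that excludes an images-only directory.
robots_lines = [
    "User-agent: *",
    "Disallow: /images/",
]
rp = RobotFileParser()
rp.parse(robots_lines)

def may_download(url, is_image):
    """WebSeer's stated policy: honor robots.txt when fetching an
    HTML file, but not when fetching an image file."""
    if is_image:
        return True
    return rp.can_fetch("WebSeerCrawler", url)

print(may_download("http://example.com/images/photo.gif", is_image=True))    # → True
print(may_download("http://example.com/images/index.html", is_image=False))  # → False
```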
The WebSeer Indexer is perhaps the most interesting component of this system. The indexer processes a single HTML file at a time. As each image is encountered in the HTML file (either through an "<img src=…>" tag or an "<a href="this_image.gif">" tag), the appropriate text is extracted in the form of whole words. These words may be present in any of the ways described above. Some of the words are more likely to contain information about the content of the image they refer to. For this reason, we weight the words according to the likelihood that they contain useful information. Words contained in the title of the HTML page, for example, have a lower weight than those in the ALT tag of an image. When a user performs a search, the summed weights of the matching words are used as one criterion for sorting the resulting images. These words and their weights are stored in a single database table. Part of an example WebSeer text table is shown below.
Image ID   Text Fragment   Fragment Weight
31308      kathy           1
31308      ireland         1
31308      vivek           1
31309      tn              3
31309      ki              3
31309      vivek           1
31309      kathy           1
31309      ireland         1
31311      marilyn         3
31311      gallery         2
31311      lovejoy         1
31311      marilyn         1
31311      monroe          1
31311      collection      1
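
The lookup this table supports can be sketched as follows, using a subset of the rows above; the ranking simply sums the weights of the query words found among each image's text fragments:

```python
# A sketch of the weighted-word lookup backed by the table above
# (a subset of its rows). Each image accumulates the summed weight
# of the query words found among its text fragments.
text_table = [
    (31308, "kathy", 1), (31308, "ireland", 1), (31308, "vivek", 1),
    (31311, "marilyn", 3), (31311, "gallery", 2), (31311, "monroe", 1),
]

def score_images(query_words, table):
    scores = {}
    for image_id, fragment, weight in table:
        if fragment in query_words:
            scores[image_id] = scores.get(image_id, 0) + weight
    # Highest summed weight first, as in the result ranking.
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(score_images({"marilyn", "monroe"}, text_table))  # → [(31311, 4)]
```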
Image understanding algorithms are run whenever an image is encountered. The table below indicates the fields which WebSeer currently saves for each image.
Field Name                          Sample Data
File Name                           http://www.cdt.org/images/cdtlgo.gif
Source URL                          http://www.cdt.org/index.html
Color Depth                         8 bit color
File Size                           3,129 bytes
File Type                           gif
Image Width                         204
Image Height                        107
Is Image a Photograph?              No
Is Image an Image Map?              No
Is Image included as a Reference?   No
Number of Faces                     0
Largest Face Size                   0%
Three fields contain information about how the image was included in the HTML document. The "Is Image a Photograph?" field contains the result of the photograph/drawing algorithm described below. "Is Image an Image Map?" indicates whether the image appeared in the HTML code with an ISMAP attribute. These images appear to the user as clickable images.

"Is Image included as a Reference?" indicates whether the image appears as part of an HTML document, or whether only a reference is included. If an image is part of an HTML document, users viewing that document will immediately be able to see that image. If a reference is included, the user may have to click on some part of the document in order to view the image. References to images are common when the images are large and/or slow to download.

The "Number of Faces" field indicates the number of faces which were detected in the given image. Each detected face has four attributes associated with it: horizontal position, vertical position, height, and width. Since an image may contain a number of different faces, the attributes (fields) associated with each particular face are saved in a separate "face table".
The "Largest Face Size" field saves the height of the largest face detected in the given image, as a percentage of the image height. Although this information is also contained in the face table, saving it in the image table speeds some searches by eliminating the need to perform a JOIN of the image table with the face table. An example is a search for a "close-up" image. We define close-up images as images where the largest face has a height greater than 50% of the image height.
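
The close-up test then needs only fields from the image table. As a sketch, with field names that are illustrative stand-ins rather than WebSeer's schema:

```python
def is_close_up(image_record):
    """Close-up test using only image-table fields: at least one face
    was detected, and the largest face exceeds 50% of the image height.
    No JOIN with the face table is needed."""
    return (image_record["number_of_faces"] >= 1
            and image_record["largest_face_pct"] > 50)

record = {"number_of_faces": 1, "largest_face_pct": 62}
print(is_close_up(record))  # → True
```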
An Example Search
A user is interested in finding small, close-up images of Rebecca De Mornay. She types "Rebecca De Mornay" as the search text and makes selections as displayed in Figure 1.
The results page interface, shown in Figure 2, works as follows. Thumbnails of the resulting images are displayed above a bar which indicates the size of the original image. Clicking on the image will take the browser to the URL of the original image. Clicking on the page icon to the right of the image will take the browser to the URL of the page which contains the image.

Figure 1: An image search query

The results were obtained as follows. WebSeer searches for images which are associated with any of the words contained in the text, after eliminating words that are especially common (e.g. connecting words such as "and" and "the"). The results are sorted by the summed weight of the words associated with each image. For instance, if one image has the associated words "Rebecca", with a
weight of 3, and "Mornay", with a weight of 1, that image would have a summed weight of 4. Only images with a height or width < 150 pixels, with file size < 10K, which were determined to be photographs, and which contain exactly one face whose height is at least 50% of the image height are included in the results. Lastly, the results are sorted, first by the weighted sum of associated words, and then, since "close up" was selected, by the detected face size (with the largest faces appearing first).
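
The filtering and ordering described above can be sketched as follows; the dictionary keys are illustrative stand-ins for WebSeer's database fields:

```python
def example_search(images):
    """Filters and ordering for the example query above. The field
    names are illustrative, not WebSeer's actual schema."""
    hits = [im for im in images
            if (im["height"] < 150 or im["width"] < 150)
            and im["file_size"] < 10_000
            and im["is_photo"]
            and im["faces"] == 1
            and im["face_pct"] >= 50]
    # Sort by summed word weight, then (close-up search) by face size.
    return sorted(hits, key=lambda im: (-im["word_weight"], -im["face_pct"]))

images = [
    {"height": 120, "width": 90, "file_size": 8_000, "is_photo": True,
     "faces": 1, "face_pct": 70, "word_weight": 4, "url": "a.jpg"},
    {"height": 120, "width": 90, "file_size": 8_000, "is_photo": True,
     "faces": 1, "face_pct": 55, "word_weight": 4, "url": "b.jpg"},
    {"height": 400, "width": 300, "file_size": 90_000, "is_photo": True,
     "faces": 1, "face_pct": 70, "word_weight": 9, "url": "c.jpg"},
]
print([im["url"] for im in example_search(images)])  # → ['a.jpg', 'b.jpg']
```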
Figure 2: Results of the query shown in Figure 1.

Image Content Analysis: Classifying Photographs and Drawings

The algorithm classifies images into photographs and artificial images, where the terms "artificial images" and "drawings" are used to denote all images that are not photographs. There are some borderline cases. The most common is images that contain both a photographic part and an artificial part, which could be, for example, a computer-generated background or superimposed text. Currently, the algorithm is not designed to handle such images consistently.

The algorithm can be split into two independent modules. One module consists of tests that separate photographs from drawings. After the image has been submitted to all the tests, the decision-making module decides how it should be classified.
Tests

In every test an image receives a nonnegative real number as a score. The images are represented by three matrices, corresponding to their red (R), green (G), and blue (B) color bands, whose entries are integers from 0 to 255. Photographs and drawings tend to score in different ranges. These are the tests we currently use in the algorithm:
• The band difference test. We pick a threshold T between 0 and 255 and initialize the counter C to 0. For every pixel in the image, let r, g, b be its respective values in the R, G, B matrices. We look at |r - g|, |g - b|, and |r - b|, and we increase the counter C by one for each of those absolute values that is greater than T. The score of the image is C/S, where S is the number of pixels in the image. We expect drawings to score higher than photographs, because the colors used in drawings are usually highly saturated.
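
A minimal sketch of the band difference test (not WebSeer's code), over a flat list of (r, g, b) tuples:

```python
def band_difference_score(pixels, T):
    """Band difference test: count channel-difference exceedances over
    threshold T; the score is C / S for S pixels (it can exceed 1,
    since each pixel contributes up to three exceedances)."""
    C = 0
    for r, g, b in pixels:
        C += sum(1 for d in (abs(r - g), abs(g - b), abs(r - b)) if d > T)
    return C / len(pixels)

# A saturated red (drawing-like) pixel scores high; a gray one scores 0.
print(band_difference_score([(255, 0, 0), (120, 118, 121)], T=100))  # → 1.0
```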
• The farthest neighbor test. We pick a threshold T between 0 and 255 and initialize counters C and S to 0. For each inner pixel P of the image, we consider its top, bottom, left, and right neighbors. For each neighbor, we calculate the absolute difference of its R value from the R value of P, and we look at the maximum M of those four absolute differences. If M > 0, we increase S. If, in addition, M > T, we increase C. The score of the image is C/S. The rationale behind this test is to see how abrupt the color changes are in an image. In photographs they tend to be smoother than in drawings. Therefore, drawings tend to score higher than photographs in this test.
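
A minimal sketch of the farthest neighbor test, with the red band represented as a list of rows:

```python
def farthest_neighbor_score(R, T):
    """Farthest neighbor test on the red band R (a list of rows).
    S counts inner pixels whose maximum 4-neighbor difference M > 0;
    C counts those with M > T; the score is C / S."""
    C = S = 0
    for i in range(1, len(R) - 1):
        for j in range(1, len(R[0]) - 1):
            M = max(abs(R[i][j] - R[i - 1][j]), abs(R[i][j] - R[i + 1][j]),
                    abs(R[i][j] - R[i][j - 1]), abs(R[i][j] - R[i][j + 1]))
            if M > 0:
                S += 1
                if M > T:
                    C += 1
    return C / S if S else 0.0

# Abrupt, drawing-like edges push the score toward 1.
band = [[0, 0, 0],
        [0, 255, 0],
        [0, 0, 0]]
print(farthest_neighbor_score(band, T=100))  # → 1.0
```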
• The color test. We count the number of distinct colors in the image. Drawings tend to have fewer distinct colors than photographs.
• The most common color test. We find the color that occurs most frequently in the image. The score is C/S, where C is the number of pixels in the image that have that color, and S is the total number of pixels in the image. Again, artificial images usually score higher than photographs, because they tend to have a one-color background.
• The narrowness test. Let R be the number of rows and C be the number of columns in the image. If R > C, then N = R/C; otherwise N = C/R. The score of the image is N. Photographs tend to score between 1 and 2, whereas drawings often score above 2.
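
The remaining three tests are short enough to sketch together; the flat list of (r, g, b) tuples is again an assumption for illustration, not WebSeer's representation:

```python
from collections import Counter

def color_count(pixels):
    """Color test: number of distinct colors (drawings tend to have fewer)."""
    return len(set(pixels))

def most_common_color_score(pixels):
    """Most common color test: C / S for the modal color."""
    (_, C), = Counter(pixels).most_common(1)
    return C / len(pixels)

def narrowness_score(rows, cols):
    """Narrowness test: the larger image dimension over the smaller."""
    return rows / cols if rows > cols else cols / rows

# A one-color background (drawing-like) dominates this toy pixel list.
pixels = [(255, 255, 255)] * 8 + [(0, 0, 0), (200, 30, 30)]
print(color_count(pixels))              # → 3
print(most_common_color_score(pixels))  # → 0.8
print(narrowness_score(480, 640))       # larger/smaller dimension, about 1.33
```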
For images saved in the JPEG format, the color test doesn't work because of the way JPEG compression works. Neighboring pixels that have the same color in the original image usually have similar but not identical colors after the compression. In addition, abrupt color changes in the original images usually become smoother after compression. This makes the farthest neighbor test less powerful.
Decision making

Since GIF and JPEG images tend to follow different scoring patterns in the tests we use, the decision maker needs to know how images tend to score based both on their category (photograph or drawing) and their compression format. So, for each of the four cases (GIF photographs, GIF drawings, JPEG photographs, JPEG drawings), we created a training set of at least 100 images, selecting representative images from the Web. For each test, we calculated and stored the scores of all images in the training set. The minimum score is set so that 1% of the images score below it. The maximum is defined in an analogous way. The set of scores together with the minimum and the maximum is called the distribution of scores for that test. In particular, if the training set contains photographs, the distribution is called natural; otherwise it is called artificial.
Suppose we have a GIF image and calculate its score S on a particular test T. Consider the natural distribution of scores for that test. If the score S is between the minimum and the maximum score in that distribution, let L be the percentage of scores in the distribution that are less than or equal to S, and let M be the percentage of scores that are greater than or equal to S. We define the natural grade N of the image to be the minimum of L and M.
If the score S is less than the minimum, then R = S/minimum and the natural grade N is R/1000. If S = 0, however, we set N to 0.001. If S is greater than the maximum, then R = maximum/S and N = R/1000. We calculate the artificial grade A of the image in a similar way.
Based on these grades, we assign an overall grade between 0 and 1 to the image. The higher the grade, the more likely the image is to be a photograph. We fix a threshold T, and we classify all images whose grade is over T as photographs and all other images as drawings. This is how we calculate the grade:

• We initialize P and D to 1.

• For each test, we calculate the score of the image, and from the score we calculate the natural grade N and the artificial grade A for that test. We set P to PN and D to DA.

• After we are done with all tests, we set the grade of the image to P/(P + D).

The reader may have recognized that if, for each test, A were the conditional probability that an image got the score S given that it was a drawing, and N were the conditional probability that an image got the score S given that it was a photograph, then the grade of the image would be the probability that it was a photograph given all its test scores, assuming that the different tests have independent score distributions. In that case, we should classify all images scoring over 0.5 as photographs and the rest as drawings. However, A and N are not defined to be the conditional probabilities, so the optimal classification threshold is not necessarily 0.5.
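
The grading procedure can be sketched end to end. One simplification is labeled in the code: the real system trims the top and bottom 1% of training scores to set the distribution's minimum and maximum, while this sketch simply uses min and max:

```python
def distribution_grade(S, dist):
    """Grade a score S against a stored distribution of training scores,
    following the rules above. Simplification: plain min/max stand in
    for the 1%-trimmed minimum and maximum. Scores are assumed positive
    except for the explicit S = 0 case."""
    lo, hi = min(dist), max(dist)
    if lo <= S <= hi:
        L = sum(1 for x in dist if x <= S) / len(dist) * 100
        M = sum(1 for x in dist if x >= S) / len(dist) * 100
        return min(L, M)
    if S < lo:
        return 0.001 if S == 0 else (S / lo) / 1000
    return (hi / S) / 1000

def photo_grade(scores, natural_dists, artificial_dists):
    """Combine per-test grades: P accumulates natural grades, D
    artificial grades; the overall grade is P / (P + D)."""
    P = D = 1.0
    for S, nat, art in zip(scores, natural_dists, artificial_dists):
        P *= distribution_grade(S, nat)
        D *= distribution_grade(S, art)
    return P / (P + D)

# Toy distributions: a score typical of photographs grades near 1.
print(photo_grade([3], [[1, 2, 3, 4, 5]], [[10, 20, 30]]))
print(photo_grade([20], [[1, 2, 3, 4, 5]], [[10, 20, 30]]))
```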
Nevertheless, it remains true that the grade is more reliable when the tests we use are pairwise independent or approximately independent.
Results

The algorithm has been tested both on the training sets that we used and on images that we didn't include in those sets. For the group of JPEG artificial images, the algorithm was tested only on a training set. We have found that purely artificial images in the JPEG format occur less frequently than images in other formats.

The following tables give some experimental results:

• Table 1 gives results on GIF images. The first row gives the number of images for each group. After that, each entry gives the percentage of correct classifications of images in the given group when grades are calculated based on that test only. The group "GIF art1" is the training set of GIF drawings. "GIF nat1" is the training set of GIF photographs. "GIF art2" and "GIF nat2" are, respectively, sets of drawings and photographs that were not used in training.

• Table 2 gives results on JPEG images, in the same format as Table 1.

• Table 3 gives the values of all thresholds used.
Table 1: Results on GIF Images

                         GIF art1   GIF art2   GIF nat1   GIF nat2
number of images            395        339        129        112
band difference test       50.9       45.4       94.6       90.2
farthest neighbor test     86.6       89.1       89.1       83.1
color test                 60.8       79.4       83.7       85.7
most common color test     81.0       88.5       92.2       85.7
ratio test                 71.4       74.6       77.5       77.7
All tests combined         96.2       94.1       88.4       81.2
Table 2: Results on JPEG Images

                         JPEG art1   JPEG nat1   JPEG nat2
number of images             150         228         118
band test                   13.3        98.7        97.5
farthest neighbor test      55.3        95.2        97.5
most common color test      66.7        98.2        91.5
ratio test                  66.7        96.5        99.2
All tests combined          92.0        93.9        95.8
Table 3: Thresholds

                         GIF    JPEG
band difference test     233     185
farthest neighbor test   100     150
grade                    0.3     0.1
Current Work
We are currently working on improving the algorithm in three directions:

• Finding more tests and perfecting the ones we already have.