CONTENT-BASED TRANSCODING OF IMAGES IN THE INTERNET
`
John R. Smith, Rakesh Mohan and Chung-Sheng Li
`
`IBM T.J. Watson Research Center
`30 Saw Mill River Road
`Hawthorne, NY 10532
{jrsmith,rakesh,csli}@watson.ibm.com
`
`
`net in order to impr
`a wide range of CO
`display capabilities.
`
`delivery to client devices with
`ation, processing, storage and
`content-based image transcoder
`
`transcoding policies
`
`on the content classes to manip-
`
s and color PCs, and demon-
very speed and accessibility of

[Figure 1 panels. Devices: Workstation, Color PC, TV browser, HHC, PDA, Smart Phone. Size: 38 KB, 23 KB, 8 KB, 4 KB, 0.6 KB, 100 B. Color: 24 bit RGB, 24 bit RGB, 256 colors, 4 bit gray, b/w.]
`
`1. INTRODUCTION
`
`1.1. Related work
`
`for processing content in the net-
its accessibility. Recent transcoding efforts have focused on
compressing and caching images in the Internet in order to reduce
data transmission and speed up delivery. Fox et al. developed a
system for compressing Internet content lossily at a proxy in order
to deal with client variability and improve end-to-end performance
[2]. Ortega et al. investigated a new image caching policy that
reduces the resolution of infrequently accessed images in order to
conserve storage space and bandwidth [3]. Several commercial
systems, such as Intel's Quick Web [4] and Spyglass' Prism [5],
compress the images at the Internet service provider's proxy to
reduce download time.

0-8186-8821-1/98 $10.00 © 1998 IEEE

Figure 1: Image transcoding modifies the images along the
dimensions of size, fidelity and color in order to adapt them
to the client devices.

`We develop a more powerful image transcoding system
`that analyzes the images, the related text and Web doc-
`ument context in order to select policies for adapting the
`images to the client devices. The system transcodes the im-
`ages along the dimensions of size, fidelity, and color in order
`to better adapt them to the client device’s communication,
`processing, storage, and display capabilities.
`
`1.2. Outline
`In this paper, we present the content-based image transcoder
`system. In Section 2, we present the image content analysis
`system that classifies the images into image type and pur-
`pose classes. In Section 3, we describe the image transcod-
`ing functions and policies. Finally, in Section 4, we examine
`the potential improvement in accessibility of images for a
`growing diversity of client devices. We also demonstrate
`
`Authorized licensed use limited to: Finnegan Henderson Farabow Garrett & Dunner. Downloaded on April 09,2021 at 17:12:38 UTC from IEEE Xplore. Restrictions apply.
`
`

`

`the potential for end-to-end speed up in image access via a
`network-based image transcoding proxy.
`
`2. IMAGE CONTENT ANALYSIS
`
`The image analysis system classifies the images based on
`their content into image type and purpose classes. We de-
`fine the following image type classes: T = {BWG, BWP,
`GRG, GRP, SCG, CCG, and CP}, where
`1. BWG - b/w graphic
`2. BWP - b/w photo
`3. GRG - gray graphic
`4. GRP - gray photo
`5. SCG - simple color graphic
`6. CCG - complex color graphic
`7. CP - color photo
`The graphics vs. photographs distinction is loosely targeted
`for distinguishing between synthetic and natural images.
`Although the distinction is not always clear for images on
`the Web [6], the incentive for distinguishing between them is
`to use transcoding functions that are separately tuned for
handling these types. We also define the following image
purpose classes: P = {ADV, DEC, BUL, RUL, MAP, INF,
NAV, CON}, where
`1. ADV - advertisement, i.e., banner ads
`2. DEC - decoration, i.e., background textures
`3. BUL - bullets, points, balls, dots
`4. RUL - rules, lines, separators
`5. MAP - maps, i.e., images with click focus
`6. INF - information, i.e., icons, logos, mastheads
`7. NAV - navigation, i.e., arrows
`8. CON - content related, i.e., news photos
`We also map the images into subject classes using related
`text. The semantic information potentially provides sub-
`stitute text for the images for client devices that cannot
`handle images.
`
`2.1. Image type classification
`The image type classification system utilizes a decision tree
`classifier. The decision tree, depicted in Figure 2, classi-
`fies the images along the dimensions of color content (color,
`gray, b/w), and source (photographs, graphics). Distin-
`guishing between b/w, gray and color is often not trivial
`because of artifacts introduced in the image production and
`compression. Examples of the seven image type classes are
`illustrated at the bottom of Figure 2.
`The image type decision tree consists of five decision
`points, each of which utilises a set of features extracted from
`the images. Keeping in mind the need for real-time, on-
`line transcoding, the features are extracted only as needed
`for the tests in order to minimize processing. The image
`features are derived from several color and texture measures
`computed from the images. We obtained the classification
`parameters for these measures from a training set of 1,282
`images retrieved from the Web.
`
[Figure 2 leaf labels: b/w graphic (BWG), b/w photo (BWP), gray graphic (GRG), gray photo (GRP), simple color graphic (SCG), complex color graphic (CCG), color photo (CP)]
`
`Figure 2: Image type decision tree consisting of five decision
`points for classifying the images into image type classes.
`
Each image $X[m, n]$ has three color components, corresponding to
the RGB color channels as follows: $X_{rgb} = (x_r, x_g, x_b)$,
where $x_r, x_g, x_b \in \{0, \ldots, 255\}$. The decision tree
performs the following tests for each image $X$:
1. Color vs. non-color. The first test distinguishes between color
and non-color images using the measure of the mean saturation per
pixel $\mu_s$. The saturation channel $y_s$ of the image is
computed from $X$ as follows:

$$y_s = \max(x_r, x_g, x_b) - \min(x_r, x_g, x_b).$$

Then, $\mu_s = \frac{1}{MN} \sum_{m,n} y_s[m, n]$ gives the mean
saturation, where $M$, $N$ are the image width and height,
respectively. Table 1 shows the mean $E(\mu_s)$ and standard
deviation $\sigma(\mu_s)$ of the saturation measure for the set of
1,282 images. The mean saturation $\mu_s$ discriminates well
between color and non-color images since the presence of color
requires $\mu_s > 0$, while strictly non-color images have
$\mu_s = 0$. However, due to noise, a small number of saturated
colors often appear in non-color images. For example, for the 464
non-color images, $E(\mu_s) = 2.0$.
`
Test 1       #      $E(\mu_s)$   $\sigma(\mu_s)$
Non-color    464    2.0
Color        818    63.0         46.2
Table 1: The color vs. non-color test uses mean saturation
per pixel $\mu_s$.
`
2. B/W vs. Gray. The second test distinguishes between b/w and
gray images using the entropy $P_v$ and variance $V_v$ of the
intensity channel $y_v$. The intensity channel of the image is
computed as follows: $y_v = \ldots + 0.1 x_b$. Then, the intensity
entropy is given by $P_v = -\sum_{k=0}^{255} p[k] \log_2 p[k]$,
where

$$p[k] = \frac{1}{MN} \sum_{m,n} \begin{cases} 1 & \text{if } k = y_v[m, n] \\ 0 & \text{otherwise.} \end{cases}$$

The intensity variance is given by

$$V_v = \frac{1}{MN} \sum_{m,n} (y_v[m, n] - \mu_v)^2,$$

where $\mu_v = \frac{1}{MN} \sum_{m,n} y_v[m, n]$. Table 2 shows the
statistics of $P_v$ and $V_v$ for 464 non-color images. We can see
that for b/w images the expected entropy $P_v$ is low and expected
variance $V_v$ is high. The reverse is true for gray images.

[Table 2, surviving entries. Columns: Test 2, #, $E(P_v)$, $\sigma(P_v)$, $E(V_v)$, $\sigma(V_v)$. B/W row: # = 300, $E(V_v)$ = 11,644, $\sigma(V_v)$ = 4,993, entropy entries 1.1 and 1.4.]
`
Test 5              SCG     CCG     CP
#                   492     116     210
$E(\mu_s)$          69.7    71.2    42.5
$\sigma(\mu_s)$     50.8    46.2    23.5
$E(P_{166})$        2.1     3.1     3.3
$\sigma(P_{166})$   0.8     0.7     1.0
$E(W_{166})$        0.24    0.36    0.38
$\sigma(W_{166})$   0.16    0.16    0.15
Table 4: The SCG vs. CCG vs. CP test uses mean saturation
$\mu_s$, HSV entropy $P_{166}$ and HSV switches $W_{166}$.
We use the 166-HSV color entropy $P_{166}$ and mean color switch
per pixel $W_{166}$ measures. In the computation of the 166-HSV
color entropy, $p[k]$ gives the frequency of pixels with color
index value $k$. The color switch measure is defined as in the
test three measure, except that it is extracted from the 166-HSV
color image $y_{166}$. We also use the measure of mean saturation
per pixel $\mu_s$. Table 4 shows the statistics for $\mu_s$,
$P_{166}$, and $W_{166}$ for 818 color images. Color graphics have
a higher expected saturation $E(\mu_s)$ than color photos. But,
color photos and complex color graphics have higher expected
entropies $E(P_{166})$ and switch measures $E(W_{166})$ in the
quantized HSV color space.
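The measures used by the decision tree tests can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. Several points are assumptions: the intensity weights (0.3, 0.6, 0.1) stand in for the partly illegible formula, the switch measure is taken to be the fraction of pixels that differ from their horizontal neighbor, and the function names are hypothetical. The `entropy` and `mean_switches` helpers accept any single-channel image, so they could equally be applied to a quantized color-index image.

```python
import math
from collections import Counter

# Assumed luminance weights; the source formula is only partly legible.
INTENSITY_WEIGHTS = (0.3, 0.6, 0.1)


def mean_saturation(image):
    """mu_s: mean over all pixels of max(r, g, b) - min(r, g, b)."""
    pixels = [p for row in image for p in row]
    return sum(max(p) - min(p) for p in pixels) / len(pixels)


def intensity_channel(image, weights=INTENSITY_WEIGHTS):
    """y_v: weighted sum of the RGB components, rounded to an integer level."""
    return [[round(sum(w * c for w, c in zip(weights, p))) for p in row]
            for row in image]


def entropy(channel):
    """P = -sum_k p[k] log2 p[k], with p[k] the fraction of pixels at level k."""
    values = [v for row in channel for v in row]
    counts = Counter(values)
    return -sum((n / len(values)) * math.log2(n / len(values))
                for n in counts.values())


def variance(channel):
    """V = mean squared deviation of the channel from its mean."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def mean_switches(channel):
    """W: switches between horizontally adjacent pixels, per pixel (assumed)."""
    total = sum(len(row) for row in channel)
    switches = sum(a != b for row in channel for a, b in zip(row, row[1:]))
    return switches / total


# A 2 x 2 black-and-white checkerboard as a list of rows of (r, g, b) tuples.
checker = [[(0, 0, 0), (255, 255, 255)],
           [(255, 255, 255), (0, 0, 0)]]
y_v = intensity_channel(checker)
print(mean_saturation(checker), entropy(y_v), variance(y_v), mean_switches(y_v))
```

On this toy image the saturation is zero, the entropy is low and the variance is high, which matches the b/w profile described for Test 2.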
`
2.2. Image purpose classification
Web documents often contain information related to each image that
can be used to infer information about them [8, 9]. The system
uses this information with the image type to classify the images
into the image purpose classes P. The system makes use of five
contexts for the images in the Web documents: C = {BAK, INL, ISM,
REF, LIN}, defined in terms of HTML code as follows:
1. BAK - background, i.e., <body background=...>
2. INL - inline, i.e., <img src=...>
3. ISM - ismap, i.e., <img src=... ismap>
4. REF - referenced, i.e., <a href=...>
5. LIN - linked, i.e., <a href=...><img src=...></a>
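As a rough illustration of how the five contexts could be read off a Web document, the sketch below uses Python's standard `html.parser` module. It is not the authors' code. The class name `ImageContextParser` is hypothetical, and two choices are assumptions: a link counts as REF only when its target ends in a common image extension, and an `ismap` image is reported as ISM even when it sits inside a link.

```python
from html.parser import HTMLParser


class ImageContextParser(HTMLParser):
    """Collect (url, context) pairs for the images found in an HTML page."""

    # Assumption: a link target is treated as an image if it has one of these
    # extensions.
    IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")

    def __init__(self):
        super().__init__()
        self.contexts = []       # list of (url, context code)
        self._anchor_depth = 0   # > 0 while inside <a href=...>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "body" and attrs.get("background"):
            self.contexts.append((attrs["background"], "BAK"))
        elif tag == "a" and "href" in attrs:
            self._anchor_depth += 1
            href = attrs["href"] or ""
            if href.lower().endswith(self.IMAGE_EXTENSIONS):
                self.contexts.append((href, "REF"))
        elif tag == "img" and attrs.get("src"):
            if "ismap" in attrs:
                context = "ISM"
            elif self._anchor_depth > 0:
                context = "LIN"
            else:
                context = "INL"
            self.contexts.append((attrs["src"], context))

    def handle_endtag(self, tag):
        if tag == "a" and self._anchor_depth > 0:
            self._anchor_depth -= 1


page = ('<body background="bg.gif"><img src="a.gif">'
        '<a href="x.html"><img src="b.gif"></a>'
        '<a href="big.jpg">photo</a><img src="m.gif" ismap></body>')
parser = ImageContextParser()
parser.feed(page)
print(parser.contexts)
```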
The system also uses a dictionary of terms extracted from the text
related to the images. The terms are extracted from the "alt"
attribute text, the image URL address strings, and the text near
the images in the Web documents. The system makes use of terms
such as D = {"ad", "texture", "bullet", "map", "logo", "icon"}.
The system also extracts a number of image attributes, such as
image width (w), height (h), and aspect ratio (r = w/h).
Table 2: The b/w vs. gray test uses intensity entropy $P_v$
and variance $V_v$.

$$\begin{cases} 1 & \text{if } y_v[m-1, n] \neq y_v[m, n] \\ 0 & \text{otherwise.} \end{cases}$$

Table 3: The BWG vs. BWP test uses intensity switches $W_v$.
The GRG vs. GRP test uses $W_v$ and intensity entropy $P_v$.

The system classifies the images into the purpose classes using a
rule-based decision tree framework described in [10]. The rules
map the values for image type $t \in T$, context $c \in C$, terms
$d \in D$, and image attributes $a \in \{w, h, r\}$ into the
purpose classes. The following examples illustrate some of the
image purpose rules:
p = ADV ← t = SCG, c = REF, d = "ad"
p = DEC ← c = BAK, d = "texture"
p = MAP ← t = SCG, c = ISM, w > 256, h > 256
p = BUL ← t = SCG, r > 0.9, r < 1.1, w < 12
p = RUL ← t = SCG, r > 20, h < 12
p = INF ← t = SCG, c = INL, h < 96, w < 96
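The example rules above translate directly into a small rule chain. The following is a sketch under stated assumptions rather than the framework of [10]: the `ImageInfo` record and `classify_purpose` function are hypothetical names, the rules are tried in the order listed, and images that match no rule fall back to CON, which the paper does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class ImageInfo:
    image_type: str                 # t, e.g. "SCG"
    context: str                    # c, e.g. "INL"
    width: int                      # w, in pixels
    height: int                     # h, in pixels
    terms: set = field(default_factory=set)   # d, related key-terms


def classify_purpose(img):
    """Apply the example purpose rules in order; first match wins."""
    t, c, w, h, d = img.image_type, img.context, img.width, img.height, img.terms
    r = w / h  # aspect ratio
    if t == "SCG" and c == "REF" and "ad" in d:
        return "ADV"
    if c == "BAK" and "texture" in d:
        return "DEC"
    if t == "SCG" and c == "ISM" and w > 256 and h > 256:
        return "MAP"
    if t == "SCG" and 0.9 < r < 1.1 and w < 12:
        return "BUL"
    if t == "SCG" and r > 20 and h < 12:
        return "RUL"
    if t == "SCG" and c == "INL" and h < 96 and w < 96:
        return "INF"
    return "CON"  # assumed default, not stated in the source


print(classify_purpose(ImageInfo("SCG", "INL", 10, 10)))    # small square graphic
print(classify_purpose(ImageInfo("SCG", "INL", 400, 8)))    # long thin graphic
```

Rule order matters in this sketch: a 10 x 10 inline graphic satisfies both the BUL and INF conditions, and testing BUL first keeps bullets from being labeled as icons.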
`
`2.3. Image summarizer
In order to provide feedback about the embedded images for text
browsers, the system generates image summary information. The
summary information contains the assigned image type and purpose,
the Web document context, and related text. The system uses an
image subject classification system that maps images into subject
categories (s) using key-terms (d), i.e., d → s, which is described
`in [SI. The summary information is made available to the
transcoding engine to allow the substitution of the image
with text.
`
`3. IMAGE TRANSCODING
`
`The system transcodes the images using a set of transcoding
`policies. The policies apply the transcoding functions that
`are appropriate for the client devices.
`
`3.1. Transcoding functions
`The system provides a set of transcoding functions that
`manipulate the images along the dimensions of image size,
`fidelity, and color, and that substitute the images with text
or HTML code. Example transcoding functions include:
Size: minify, crop, and subsample.
Fidelity: JPEG compress, GIF compress, quantize, reduce
resolution, enhance edges, contrast stretch, histogram equalize,
gamma correct, smooth, sharpen, and de-noise.
Color content: reduce color, map to color table, convert to gray,
convert to b/w, threshold, and dither.
Substitution: substitute attributes (a), text (d), type (t),
purpose (p), and subject (s), and remove image.
`
3.2. Client device characteristics
The client devices gaining access to the Internet are growing in
number and vary in their communication, processing, storage and
display capabilities. Table 5 illustrates some of the variability
in device bandwidth, display size, display color and storage among
devices.

Client device   Bandwidth (bps)   Display size   Display color   Device storage
PDA             14.4K             320 x 200      b/w             1MB
HHC             28.8K             640 x 480      gray            4MB
TV browser      56K               544 x 384      NTSC            1GB
Color PC        56K               1024 x 768     RGB             2-4GB
Workstation     10M               1280 x 1024    RGB             >4GB

Since many devices are constrained in their capabilities, they
cannot simply access image content as-is on the Internet. For
example, many PDAs cannot handle JPEG images, regardless of size.
The HHCs cannot easily display Web pages loaded with images because
of screen size limitations. Color PCs often cannot access image
content quickly over dial-up connections. The presence of fully
saturated red or white images causes distortion on NTSC TV-browser
displays. The transcoder framework allows the content providers to
publish content at the highest fidelity, and the system manipulates
the content to adapt to the unique characteristics of the devices.
`
`3.3. Transcoding policies
`The transcoding system employs the transcoding functions
`in the transcoding policies. Consider the following exam-
`ple transcoding policies based upon image type and client
`device capabilities:
`
minify(X) ← type(X) = CP, device = HHC
subsample(X) ← type(X) = SCG, device = HHC
dither(X) ← type(X) = CP, device = PDA
threshold(X) ← type(X) = SCG, device = PDA
JPEG(X) ← type(X) = GRP, bandwidth ≤ 28.8K
GIF(X) ← type(X) = GRG, bandwidth ≤ 28.8K
`Notice that two methods of image size reduction are em-
`ployed: minify and subsample. The difference is that minify
`performs anti-aliasing filtering and subsampling. Minifying
`graphics often generates false colors during filtering, which
`increases the size of the file. This can be avoided by sub-
`sampling directly. We also distinguish between graphics
`and photographs for compressing and reducing the color of
`the images. For compression, JPEG works well for gray
`photographs but not for graphics. For GIF, the reverse
`is true. When converting color images to b/w, dithering
`the photographs improves their appearance, while simply
`thresholding the graphics improves their readability. By
`performing the image type content analysis, the system is
`able to better select the appropriate transcoding functions.
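The false-color effect described above is easy to reproduce on a toy image. The sketch below compares direct subsampling with a minify that averages each 2 x 2 block before keeping one sample. The box filter is an assumption standing in for whatever anti-aliasing filter the system uses, and the function names are illustrative only.

```python
def subsample(image, factor=2):
    """Keep every factor-th pixel in each direction, with no filtering."""
    return [row[::factor] for row in image[::factor]]


def minify(image, factor=2):
    """Average each factor x factor block (assumed box filter), then keep one
    sample per block."""
    out = []
    for top in range(0, len(image) - factor + 1, factor):
        row = []
        for left in range(0, len(image[0]) - factor + 1, factor):
            block = [image[top + i][left + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def distinct_values(image):
    """Number of different pixel values in the image."""
    return len({v for row in image for v in row})


# A two-level graphic whose edges do not align with the 2 x 2 blocks.
graphic = [
    [0, 0, 0, 255],
    [0, 0, 0, 255],
    [0, 255, 255, 255],
    [0, 255, 255, 255],
]
print(distinct_values(subsample(graphic)), distinct_values(minify(graphic)))
```

Subsampling keeps the original two levels, while the filtered version introduces an intermediate level at the edges. For palette-based graphics, extra levels of this kind are what enlarge the compressed file.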
`The transcoding policies also make use of the image pur-
`pose analysis. Consider the following example transcoding
`policies:
`
fullsize(X) ← purpose(X) = MAP
remove(X) ← purpose(X) = ADV, bandwidth ≤ 14.4K
substitute(X, "<li>") ← purpose(X) = BUL, device = PDA
substitute(X, t) ← purpose(X) = INF, display size = 320 x 200
The first policy makes sure that map images are not reduced in size
in order to preserve the click focus translation. The second policy
illustrates the removal of advertisement images if the bandwidth is
low. The third policy substitutes the bullet images with the HTML
code "<li>", which draws a bullet without requiring the image. A
similar policy substitutes rule images with "<hr>". The last policy
substitutes the information images, i.e., logos, icons, mastheads,
with related text if the device screen is small.
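One way to picture the example policies is as a table of (function, condition) pairs evaluated against a request. The sketch below is illustrative only: the `Request` record, the function labels, and the choice to return every matching policy in listed order are assumptions, since the paper does not say how conflicts between type-based and purpose-based policies are resolved.

```python
from typing import NamedTuple, Tuple


class Request(NamedTuple):
    image_type: str                 # BWG, BWP, GRG, GRP, SCG, CCG or CP
    purpose: str                    # ADV, DEC, BUL, RUL, MAP, INF, NAV or CON
    device: str                     # e.g. "PDA", "HHC"
    bandwidth_bps: int
    display_size: Tuple[int, int]


# Example policies from the text, purpose-based first, then type-based.
POLICIES = [
    ("fullsize", lambda r: r.purpose == "MAP"),
    ("remove", lambda r: r.purpose == "ADV" and r.bandwidth_bps <= 14_400),
    ("substitute_li", lambda r: r.purpose == "BUL" and r.device == "PDA"),
    ("substitute_text",
     lambda r: r.purpose == "INF" and r.display_size == (320, 200)),
    ("minify", lambda r: r.image_type == "CP" and r.device == "HHC"),
    ("subsample", lambda r: r.image_type == "SCG" and r.device == "HHC"),
    ("dither", lambda r: r.image_type == "CP" and r.device == "PDA"),
    ("threshold", lambda r: r.image_type == "SCG" and r.device == "PDA"),
    ("jpeg_compress",
     lambda r: r.image_type == "GRP" and r.bandwidth_bps <= 28_800),
    ("gif_compress",
     lambda r: r.image_type == "GRG" and r.bandwidth_bps <= 28_800),
]


def select_functions(request):
    """Return the labels of all policies whose conditions hold."""
    return [name for name, applies in POLICIES if applies(request)]


print(select_functions(Request("CP", "CON", "HHC", 28_800, (640, 480))))
print(select_functions(Request("SCG", "BUL", "PDA", 14_400, (320, 200))))
```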
`
3.4. Image transcoding proxy
The content-based image transcoder is part of a network-based
transcoding proxy, see Figure 3. The transcoding proxy handles the
requests from the client devices for Web documents and images. The
proxy retrieves the documents and images, analyzes, manipulates and
transcodes them, and delivers them to the devices.

4. EVALUATION
`
`the client. Reduc
`transcoding proxy
`duction can result
`accounting for the
`
`ata sizes of the images at the
`compression, size and color re-
`nd-to-end delivery, even when
`s introduced by the content analy-
`
`in retrieving the image via
`
by $L_t = D_s/B_p + D_s/B_t +$

in a net speed-up by a factor
sion ratio $D_s/D_t$ is
`
Consider a relatively high proxy-to-server bandwidth of
$B_p = 1000$ Kbps, a client-to-proxy bandwidth of $B_c = 20$ Kbps,
and a transcoder bandwidth of $B_t = 2400$ Kbps. A data compression
ratio at the proxy of $D_s/D_t \geq 1.03$ results in a net
end-to-end speed-up. If the data is compressed by a factor of
$D_s/D_t = 8$, the speed-up is by a factor of
$L_c/L_t \approx 6.5$. If $B_p = 50$ Kbps, the data compression
ratio needs to be increased to $D_s/D_t \geq 1.8$ to have a
speed-up in delivery. In this case, data compression of
$D_s/D_t = 8$ speeds up delivery by a factor of
$L_c/L_t \approx 1.9$.
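The figures quoted above can be checked with a simple latency model. Because the latency expressions are only partly legible in the source, the model below is an assumption: the direct latency is taken as $L_c = D_s/B_c$ and the transcoded latency as $L_t = D_s/B_p + D_s/B_t + D_t/B_c$, where the last term is a guess at the truncated part of the formula. Function and parameter names are hypothetical.

```python
def speedup(ratio, b_p, b_t, b_c):
    """L_c / L_t for a compression ratio D_s / D_t (assumed latency model).

    Latencies are computed per unit of source data, so D_s cancels out.
    """
    direct = 1.0 / b_c
    transcoded = 1.0 / b_p + 1.0 / b_t + (1.0 / ratio) / b_c
    return direct / transcoded


def break_even_ratio(b_p, b_t, b_c):
    """Smallest compression ratio for which transcoding is not slower."""
    margin = 1.0 / b_c - 1.0 / b_p - 1.0 / b_t
    if margin <= 0:
        raise ValueError("proxy overhead exceeds the direct latency")
    return (1.0 / b_c) / margin


# Bandwidths in Kbps, as in the text.
print(round(break_even_ratio(1000, 2400, 20), 2), round(speedup(8, 1000, 2400, 20), 1))
print(round(break_even_ratio(50, 2400, 20), 2), round(speedup(8, 50, 2400, 20), 1))
```

Under these assumptions the model reproduces the quoted 1.03, 6.5 and 1.9. It gives a break-even ratio of about 1.7 for the 50 Kbps case, slightly below the quoted 1.8, so the authors' exact expressions may differ in detail.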
`
`5. SUMMARY
`
`We presented a system for transcoding images in the Inter-
`net in order to adapt them to client devices with a wide
`range of communication, processing, storage and display
`capabilities. The content-based image transcoder analyzes
`the images and classifies them into image type and image
`purpose classes. The system then utilizes transcoding poli-
`cies based on the content classes to manipulate, transcode,
`and adapt the images. The image transcoding system im-
`proves access of a variety of client devices, including PDAs,
`HHCs, TV browsers and color PCs to the images in the
`Internet.
`
`6. REFERENCES
[1] J. R. Smith, R. Mohan, and C.-S. Li. Transcoding Internet
content for heterogeneous client devices. In Proc. IEEE Inter.
Symp. on Circuits and Syst. (ISCAS), June 1998. Special session on
Next Generation Internet.
[2] A. Fox, S. D. Gribble, E. A. Brewer, and E. Amir. Adapting to
network and client variability via on-demand dynamic distillation.
In ASPLOS-VII, Cambridge, MA, October 1996.
[3] A. Ortega, F. Carignano, S. Ayer, and M. Vetterli. Soft
caching: Web cache management techniques for images. In Workshop
on Multimedia Signal Processing, pages 475-480, Princeton, NJ,
June 1997. IEEE.
[4] Intel Quick Web. http://www.intel.com/quickweb.
[5] Spyglass-Prism. http://www.spyglass/products/prism.
[6] V. Athitsos, M. J. Swain, and C. Frankel. Distinguishing
photographs and graphics on the World-Wide Web. In Proc. IEEE
Workshop on Content-based Access of Image and Video Libraries,
June 1997.
[7] J. R. Smith and S.-F. Chang. Tools and techniques for color
image retrieval. In Symposium on Electronic Imaging: Science and
Technology - Storage & Retrieval for Image and Video Databases IV,
volume 2670, pages 426-437, San Jose, CA, February 1996.
IS&T/SPIE.
[8] N. C. Rowe and B. Frew. Finding photograph captions
multimodally on the World Wide Web. Technical Report Code CS/Rp,
Dept. of Computer Science, Naval Postgraduate School, 1997.
[9] J. R. Smith and S.-F. Chang. Visually searching the Web for
content. IEEE Multimedia Mag., 4(3):12-20, July-September 1997.
[10] S. Paek and J. R. Smith. Detecting image purpose in
World-Wide Web documents. In IS&T/SPIE Symposium on Electronic
Imaging: Science and Technology - Document Recognition, San Jose,
CA, January 1998.
`