Characteristics of Mobile Web Content
`
`Paul J. Timmins, Sean McCormick, Emmanuel Agu, Craig E. Wills
`Department of Computer Science
`Worcester Polytechnic Institute
`Worcester, MA U.S.A.
`Email: {ptimmins|mccorms}@wpi.edu, {emmanuel|cew}@cs.wpi.edu
`
`Abstract— The World Wide Web is no longer tethered to
`our desktops and laptops. The Web has gone mobile, providing
`instant access to information anywhere and anytime. The mobile
`Web can be considered a shadow of the World Wide Web,
`implemented using specialized markup languages and design
`techniques adapted for comparatively limited mobile phones and
PDAs. Despite the growing importance and usage of the mobile
Web, surprisingly little is known about it.
`This paper presents the results of a study of mobile Web
`content conducted in May and June of 2006. The study examines
`the content of over one-million mobile Web pages from around
`the world using a search-assisted crawling methodology to locate
`and study pages for three of the most popular mobile Web
`formats—WML 1.0, WML 2.0/XHTML Mobile Profile (XHTML-
`MP) and Compact HTML (C-HTML). The objective is to study
`the relative characteristics of these mobile Web content formats,
`as well as compare them with a similar sampling of non-mobile
`(HTML) content.
`We found that WML is the dominant mobile Web content
`type, although regional differences do exist. We found that all
`three mobile content types studied were on the same order of
`magnitude for average page characteristics such as number of
`links (under 10) and number of images (around 1), but pages in
`the newest format, XHTML-MP, are 50% larger on average than
`those in WML. Not surprisingly, all of these characteristics are
`much smaller than for HTML content pages gathered with the
`same methodology. In terms of specific features, only 7% of pages
`used WML cards, but 50% of XHTML-MP servers dynamically
`adapted the content served based on the user agent. Finally, we
`found less than 4% of mobile pages contained ad objects, which
`is much less than for HTML pages.
`
`I. INTRODUCTION
`The World Wide Web is no longer tethered to our desktops
`and laptops. Web content is increasingly available in mobile
`Web formats that facilitate information access by cell phones,
`PDAs, Internet connected watches and other portable comput-
`ing devices. Mobile users can now access sports, news, stock
`charts and other Web content while on the move.
Content targeted at mobile devices is typically designed to
mitigate the lower bandwidths of wireless networks as well as
the limited CPU and storage capacity of mobile devices. In
order to maintain reasonable download times, Web designers
reduce the size of mobile Web pages and the number of images
per page.
A number of early Web characterization studies informed
and influenced the development of the wired Web [1], [2],
[3]. As the mobile Web develops, it is important to understand
the characteristics of how it is being used. This
characterization can inform choices for configuring
network equipment and optimizing Web server parameters,
such as buffer sizes and the maximum number of incomplete
TCP connections. PDA and cell phone manufacturers can also
estimate minimum hardware specifications for future mobile
devices. Understanding the nature of mobile Web content can
also drive more accurate simulations of Web content in the
research community.
`In spite of these benefits, surprisingly few attempts have
`been made to measure mobile Web content and quantify
`typical page and site characteristics. In this paper, we present
`the results of a large-scale study to characterize mobile Web
`content. The study examines the content of over one-million
`mobile Web pages collected during search-assisted crawls in
`May and June 2006. The study located and studied pages
`for three of the most popular mobile Web formats—WML,
`XHTML Mobile Profile (XHTML-MP) and C-HTML. We also
`use this same methodology to retrieve and measure wired Web
`content as a baseline of comparison.
`We found that WML is the dominant mobile Web content
`type, although regional differences do exist—WML (WAP 1.0)
`is most popular in Europe and C-HTML (i-mode) is most
`popular in Japan. C-HTML crawling, and therefore C-HTML
`results, were significantly limited by the fact that many such
`sites are accessible only through NTT DoCoMo’s “i-mode
`menu” service. We found that all three mobile content types
`studied were on the same order of magnitude for average
`page characteristics such as number of links (under 10) and
`number of images (around 1), but pages in the newest format,
`XHTML-MP, are 50% larger on average than those in WML.
`Not surprisingly, all of these characteristics are much smaller
`than for HTML content pages gathered with the same method-
`ology. In terms of specific features, only 7% of pages used
`WML cards, but 50% of XHTML-MP servers dynamically
`adapted the content served based on the user agent. Finally,
`we found less than 4% of mobile pages contained ad objects,
`which is much less than for HTML pages.
`This paper is organized as follows. Section II provides
`background on mobile Web technologies. Section III outlines
`the questions we intend to answer with this study and Sec-
`tion IV details the methodology used. The results of our
`study are presented in Section V with a summary of these
`results in Section VI. Section VII describes related work
`and Section VIII outlines potential future work as well as
`summarizes the findings of this work.
`
`Zynga Ex. 1020, p. 1
` Zynga v. IGT
` IPR2022-00199
`
`II. MOBILE WEB BACKGROUND
`This section reviews technical details of three of the most
`popular mobile Web technologies (WAP 1.0, i-mode and WAP
`2.0) and compares them to the wired Web and HTML. These
`technologies can be distinguished by the devices that support
`them, their protocol stacks, and their markup language.
`WAP 1.0 was introduced in 1998 as the first Mobile Web
`standard [4]. Several companies including Nokia and Motorola
`teamed up to develop the initial Wireless Application Protocol
`(WAP) protocol stack. It was envisioned that WAP 1.0 would
`enable a wide range of devices including mobile phones,
`laptops and PDAs, to send email and access the Web. Due to
the limited resources of mobile devices, a lightweight,
XML-based standard called the Wireless Markup Language (WML)
was developed. WML also supports a “deck of cards” feature
that allows the Web programmer to aggregate multiple related
pages (cards) into a batch (deck). WAP 1.0 is
connection-oriented: a mobile user has to place a telephone call
to the Web server and keep it open while Web pages are being
downloaded.
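As an illustration of the deck-of-cards structure, the following minimal Python sketch parses a small WML deck and counts its cards. The sample deck, the function name `count_cards`, and the assumption that the markup is well-formed XML are ours and are not taken from the study; real crawled content may need a more tolerant parser.

```python
import xml.etree.ElementTree as ET

# A hypothetical two-card deck, written for illustration only.
SAMPLE_DECK = """<wml>
  <card id="home" title="Home">
    <p><a href="#scores">Scores</a></p>
  </card>
  <card id="scores" title="Scores">
    <p>No games today.</p>
  </card>
</wml>"""


def count_cards(wml_markup: str) -> int:
    """Return the number of <card> elements in a well-formed WML deck."""
    root = ET.fromstring(wml_markup)
    # Cards are direct children of the <wml> root element.
    return len(root.findall("card"))


if __name__ == "__main__":
    print(count_cards(SAMPLE_DECK))
```

Both cards arrive in one transmission, so following the link from the first card to the second needs no further request to the server.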
`The i-mode (information mode) system was created in Japan
`at about the same time as WAP 1.0. It was deployed by NTT
`DoCoMo, the Japanese mobile network operator, in early 1999
`and is loosely based on the WWW protocols. The i-mode
`system allows its users to email, surf the Web, and exchange
`images but requires specialized mobile handsets. Pages in i-
`mode are programmed using compact HTML (C-HTML) [5],
`a markup language that is similar to HTML 1.1. Most i-mode
`sites are accessible through NTT DoCoMo’s “i-mode menu”
`service, which limits access to most i-mode sites to paying
`customers.
The second-generation WAP 2.0 was introduced in 2001. It
was designed by the WAP Forum to be backwards compatible
with the WAP 1.X protocols and WML. It includes many of
the features of i-mode and Internet protocols, as well as new
features. Compared with WAP 1.0, which has a maximum
speed of 9.6 kbps, WAP 2.0 operates at speeds of up to 384
kbps. It supports XHTML, a new markup language that was
developed for a variety of low computing power devices such
as televisions, vending machines, mobile phones, PDAs, and
watches. XHTML-MP [6], a mobile profile, extends XHTML
Basic by adding features to enhance the Web experience on
resource-constrained mobile devices. WAP 2.0 is designed to
run over packet-switched networks and supports both push and
pull models of content access.
`
`III. STUDY
`With this background, the broad goal of the study is to
`understand the characteristics of mobile Web content as it is
`currently being used. This broad goal encompasses a number
`of specific questions that form the basis for the methodology
`used in this work. These questions and their rationale are:
`1) Format usage: What is the relative usage of the three
`markup languages—WML vs. C-HTML vs. XHTML-
`MP? The answer to this question establishes the relative
`use of these three formats by content providers.
`
`2) Geographic distribution: What is the geographic distri-
`bution of content in the three markup languages? It is
`important to understand not only how much, but where
`the different content types are being used.
`3) Page sizes: What is the distribution of markup (base
`page) and total page size for each of the three formats
`as well as baseline HTML content? A standard charac-
`terization for Web content is to understand the size of
`pages in terms of the number of objects they contain,
`the number of servers these objects come from and the
`total number of bytes contained.
`4) Page connectivity: What is the degree of connectivity
`in terms of the number of links for mobile pages
`and are these links internal to the same domain or to
`different domains? This question examines how the level
`of connectedness compares amongst the three content
`formats and with HTML content.
`5) Image content: What are the characteristics of image
`objects in mobile Web content in terms of number on
`a page and size? Images are commonly used in wired
`Web content. It is important to understand how much
`they are used in mobile content.
`6) WML cards: Are unique schema features, such as WML
`cards, used? WML content can be served as bundles of
`pages or “decks.” An interesting question is to under-
`stand how much this feature is used.
7) User agent adaptation: To what degree do servers adapt
the markup type of the content based on the User-Agent
field of the HTTP request header? This question affords
a better understanding of whether users need to explicitly
identify the needed content type or whether servers can
and do make the appropriate transformation.
`8) Advertisement content: What is the presence of adver-
`tisement content in the mobile Web world? Previous
`work found that ad content exists on the majority of
`popular pages [7] and we are interested to understand
`its use in mobile pages.
`
`IV. METHODOLOGY
`This study was conducted in two phases. In the first phase,
`mobile Web servers were crawled to find and retrieve mobile
`Web content. An open source Web crawler was modified to
`classify content as mobile, non-mobile HTML, image or other.
`Mobile and non-mobile content was retrieved in its entirety,
`whereas only the size and URL of images were obtained.
`The second phase of this study analyzed the retrieved pages
`to measure page size distributions, connectedness, and design
`features. Key features of each page were summarized in a
`MySQL database, allowing detailed analysis through SQL
`queries.
`
`A. Mobile Content Crawler
To find and retrieve mobile Web pages, we modified
Larbin, an open source Web crawler
(http://larbin.sourceforge.net/index-eng.html), to create a
“Mobile Content Crawler.” Larbin provides a configurable and
extensible framework for Web crawling, with many options to
control the crawler’s behavior. The Mobile Content Crawler
extends Larbin to identify mobile content, record page and
image metadata (including HTTP response headers and
content size) to disk, and retrieve the individual pages. The
crawler was configured to use a 30-second delay between
consecutive retrievals from a single server, with no delay for
links to new servers. The effect is a preference for discovering
new servers, with continued discovery of new pages within a
site. Additionally, the crawler was modified so that it
filtered non-mobile HTML content, recorded each image’s
size and URL and then discarded the image file, and ignored
“robots.txt” (used by servers to prevent crawler access) so that
it could crawl search engine results.
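The scheduling policy described above (a 30-second gap per server, no wait for a newly seen server) can be sketched as a small priority queue. This is a simplified Python illustration and not Larbin's implementation; the class name `PoliteFrontier` and its methods are hypothetical, and the delay is measured from each URL's scheduled slot rather than from the actual fetch time.

```python
import heapq
from urllib.parse import urlsplit

PER_SERVER_DELAY = 30.0  # seconds between fetches from one server (from the text)


class PoliteFrontier:
    """Queue of URLs ordered by the earliest time each may be fetched."""

    def __init__(self, delay: float = PER_SERVER_DELAY):
        self.delay = delay
        self._heap = []        # entries: (ready_time, sequence, url)
        self._next_slot = {}   # host -> earliest time its next URL may be fetched
        self._seq = 0          # tie-breaker that preserves insertion order

    def add(self, url: str, now: float) -> None:
        host = urlsplit(url).netloc.lower()
        # A host never seen before is ready immediately; a known host waits its turn.
        ready = max(now, self._next_slot.get(host, now))
        self._next_slot[host] = ready + self.delay
        heapq.heappush(self._heap, (ready, self._seq, url))
        self._seq += 1

    def pop_ready(self, now: float):
        """Return the next URL whose delay has elapsed, or None if none is ready."""
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[2]
        return None
```

With two URLs queued for one server and one URL for a second server, the second server's URL is handed out before the first server's second URL, which is the breadth-first preference noted in the text.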
In trial runs, Larbin was modified only to support mobile
content, not to filter out non-mobile content, and was seeded
with a set of 14 starting mobile Web URLs, such as
mobile.espn.com and wap.yahoo.com. These starting URLs
were manually selected to represent a diverse population of
content from a variety of mobile markup languages. During
these trial runs, it was observed that a disproportionate number
of HTML (non-mobile) Web pages were retrieved. As later
results will show, this result is probably due to the higher
connectivity (in terms of hyperlinks) of HTML content. The
resulting effect was that retrieving HTML content reduced the
number of mobile pages retrieved in that run.
`To improve the crawler’s capability to retrieve the desired
`type of content, Larbin was modified to filter non-mobile con-
`tent and retrieve only pages that could be explicitly identified
`as being mobile. This filtering was based on the content-type
`response header and the document type, if present. To reduce
`the volume of data stored, Larbin was also modified to first
`download images and store the size of the image in the header,
`using a preprocessor directive.
In addition to problems in filtering out non-mobile
content, the trial runs also showed problems in using a
small fixed set of starting URLs. The results were not as diverse
as desired in either content type or subject matter. As
a consequence, a search-assisted strategy was employed.
`
`B. Search-Assisted Crawling
To address problems encountered in the trial runs, a
search-assisted crawling strategy was employed for this work. It
was noted that Google’s Mobile Web search engine
(mobile.google.com) provides access to a large index of mobile
Web sites, and thus its search results can serve as crawling
starting points. This is similar to the strategy used in a previous
study of spyware [8]. Rather than directly selecting a set of
starting URLs, we used the results of Google Mobile Web searches
as crawling starting points, passing specific keywords to the
search engine. Google Mobile Web search allows searching of
content by markup type (WML, XHTML-MP, or C-HTML).
As a comparison, we also issued a search for HTML-based
content using Google’s standard search engine.
A number of keywords were used to obtain search results
for each of the four types of content to seed our crawler. These
keywords, shown in Table I, were chosen to ensure diversity of
search results. The upper portion of the table shows
category-based keywords, while the lower portion shows that we
focused some searches on servers from specific country-based Top
Level Domains (TLDs). These keywords are intended to obtain
a broad set of pages for seeding.
`
TABLE I
SEARCH KEYWORDS USED FOR CRAWL SEEDING
Category keywords:        news, sports, weather, games, portal, science,
                          health, business, finance, arts, shopping, world
TLD-restricted keywords:  site:.jp, site:.uk, site:.ja, site:.au,
                          site:.ve, site:.cn, site:.kp, site:.ca
`
This study is based on four crawl runs done in May/June
2006, focusing on collecting HTML, WML, XHTML-MP/C-HTML,
and C-HTML-only content, all using the same keywords
but different Google Mobile search restriction options. We
found that, regardless of the search restriction option (e.g.,
mrestrict=wml), crawl results are dominated by the most
popular markup. Therefore, multiple runs were used, with later
runs filtering out HTML and WML results. Additionally, the
crawler was configured to report an appropriate User-Agent
string in the HTTP request header, depending on the type of
content we were attempting to crawl.
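Setting the User-Agent per target format can be sketched in Python as follows. The two strings are the ones listed for WML 1.0 and XHTML-MP/WML 2.0 in Table IX; whether the crawler itself sent exactly these strings is an assumption, and `build_request` is a hypothetical helper name.

```python
import urllib.request

# User-Agent strings as listed in Table IX; pairing them with crawl targets
# is an assumption made for this sketch.
USER_AGENTS = {
    "WML": "EricssonR320/R1A UP.Link/4.1.0.1",
    "XHTML-MP": "KDDI-KC31 UP.Browser/6.2.0.5 (GUI) MMP/2.0",
}


def build_request(url: str, target_format: str) -> urllib.request.Request:
    """Create an HTTP request that identifies itself as a device of the target format."""
    return urllib.request.Request(
        url, headers={"User-Agent": USER_AGENTS[target_format]}
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the fetch; a server that adapts its output would then return markup suited to the advertised device.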
`As with any crawling, the choice of starting points can
`bias the results. Search-engine assisted crawling, as used in
`[8], is biased by the search-engine’s results. Fewer than 15%
`of servers were directly linked from the search results, with
`the remaining servers being crawled indirectly by following
`hyperlinks. This high percentage of indirectly-crawled servers
`lessens the impact of bias caused by the search results.
An early goal of this research was to additionally contrast
content based on subject matter, such as comparing news
versus finance content. This goal evolved into the
search-assisted crawling strategy, with the assumption that content
could be characterized based on search keyword. However, it
was observed that a large share of pages were associated with
multiple search keywords; thus this goal was set aside for
future research. Alternative crawler strategies, including
shallower searches, might yield results that are distinguishable
by keywords.
`
`C. Identifying Mobile Content
`
Two techniques were employed to identify content as
mobile content. First, the Document Type Declaration
(DOCTYPE) identifies the DTD for a particular XML document.
DOCTYPEs are optional, but were present in over
75% of servers crawled. Second, the HTTP CONTENT-TYPE
response header identifies the expected type, such as
HTML or image content. WML content is identified with a
CONTENT-TYPE of “text/vnd.wap.wml”, and all other text
content is typically identified simply as “text/html”.
`
TABLE II
PAGE, SERVER AND DOMAIN CONTENT TYPE STATISTICS
Type        Num. Pages   Num. Servers   Num. Domains   Avg Pages/Server
WML         1,055,589    13,672         5,734          77
XHTML-MP    145,314      842            446            173
C-HTML      14,206       27             26             526
HTML        227,462      47,110         38,143         5
`
Content was first characterized by the CONTENT-TYPE,
then by the DOCTYPE. The DOCTYPE of each page was
used to classify content into the following categories:
• WML: DTD WML
• XHTML-MP: “XHTML Mobile Profile” or “XHTML Basic”
• C-HTML: “Compact HTML”
A possible limitation is our method of identifying content
based on the DOCTYPE XML tag, which was present only
in pages from 71% of servers. We observed that the majority
of the remaining pages contained “wap”, “imode”, “chtml”,
“wml” or “mobile” in their URLs, but did not contain the
necessary DOCTYPE. Manual inspections of these pages
indicated that the presence of these keywords in a URL did not
necessarily indicate mobile content, and therefore such content
was excluded from the study.
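The two-step classification described above can be sketched in Python as follows. This is a hedged illustration, not the crawler's code: the function name `classify`, the rule for which content types count as text, and the exact substrings matched are assumptions. In particular, the sketch matches “XHTML Mobile” because the public identifier of the XHTML Mobile Profile 1.0 DTD reads “DTD XHTML Mobile 1.0”.

```python
import re

# DOCTYPE substrings checked in order; the exact substrings are assumptions.
DOCTYPE_MARKERS = [
    ("DTD WML", "WML"),
    ("XHTML Mobile", "XHTML-MP"),
    ("XHTML Basic", "XHTML-MP"),
    ("Compact HTML", "C-HTML"),
]

DOCTYPE_RE = re.compile(r"<!DOCTYPE[^>]*>", re.IGNORECASE)


def classify(content_type: str, markup: str):
    """Return 'WML', 'XHTML-MP', 'C-HTML', or None if not identifiably mobile."""
    media_type = content_type.split(";")[0].strip().lower()
    # Step 1: only text-like content is considered (images and others are skipped).
    if not (media_type.startswith("text/") or "xml" in media_type):
        return None
    # Step 2: the DOCTYPE decides the category; pages without one are excluded.
    match = DOCTYPE_RE.search(markup)
    if match is None:
        return None
    doctype = match.group(0).lower()
    for marker, label in DOCTYPE_MARKERS:
        if marker.lower() in doctype:
            return label
    return None
```

A page served as “text/html” with no DOCTYPE yields `None`, which mirrors the exclusion rule stated in the text.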
`
`D. Analyzing Page Content
`
Once the pages were retrieved and data about them stored
in a MySQL database, the last phase of our study gathered the
data needed to answer the study questions. To obtain the size
of a mobile Web page, queries were written to retrieve image
and page sizes. This result indicates the amount of information
that is downloaded by a mobile browser. From a resource
perspective, large Web pages consume excessive amounts
of memory, CPU, battery power and wireless bandwidth. They
also take too long to download. However, small pages
may not contain all the information that a user wants, making
it necessary to establish new TCP connections to download
additional pages.
`We also examine the number of links on each Web page and
`distinguish whether the target pages are located on the same
`domain (internal link) or on a different domain (external link).
`Computing the internal versus external link numbers for both
`mobile and wired content is a means to compare their degrees
`of connectivity.
Links and images were counted uniquely per page, so
multiple links to the same URL counted as one link. The
total page size was computed by adding the markup size to
the size of each image in the page. Only complete pages (pages
where the size of every image was measured) are reported, so as
not to skew the results, and images were counted only once per
page, as most browsers will not retrieve the same image more
than once per page. In addition, because pages with images were
less likely to be completely retrieved, total page size was
normalized by taking a weighted average of the markup size of
pages with no images and the markup-plus-image size of pages
with images.
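The total-size computation and its normalization can be sketched in Python as follows. The text does not state the weights, so this sketch assumes each group is weighted by its share of all retrieved pages, whether complete or not; the `Page` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Page:
    markup_bytes: int
    # Image URL -> size in bytes, or None when the size could not be measured.
    # Keying by URL counts each unique image once.
    images: Dict[str, Optional[int]]


def total_size(page: Page) -> Optional[int]:
    """Markup plus each unique image once; None if any image size is missing."""
    if any(size is None for size in page.images.values()):
        return None
    return page.markup_bytes + sum(page.images.values())


def normalized_average_total(pages: List[Page]) -> float:
    """Weighted average of image-free pages and complete image-bearing pages."""
    if not pages:
        raise ValueError("no pages")
    without = [p for p in pages if not p.images]
    with_images = [p for p in pages if p.images]
    complete = [total_size(p) for p in with_images if total_size(p) is not None]
    avg_without = sum(p.markup_bytes for p in without) / len(without) if without else 0.0
    avg_with = sum(complete) / len(complete) if complete else 0.0
    # Assumed weights: share of all pages in each group, complete or not.
    share_with = len(with_images) / len(pages)
    return (1 - share_with) * avg_without + share_with * avg_with
```

Averaging only over complete pages would under-represent image-bearing pages, since those are the ones most often incomplete; the weighting restores their share.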
`
`V. RESULTS
`This section presents results from our search-based crawling
`approach and subsequent analysis. The results are presented
`in a parallel order to the study questions posed in Section III.
`
`A. Format Usage
Table II summarizes our crawl results concerning pages,
servers and “domains” for each of the four formats. We define
the domain of a server to be its 2nd-level domain (see footnote 1),
so servers such as www.cnn.com and images.cnn.com are each
part of the cnn.com domain.
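The domain definition above, together with the country-code rule in footnote 1 and the internal/external link distinction of Section IV-D, can be sketched in Python as follows. The set of second-level labels treated as recognizable subdivisions is an assumption (the footnote names only “com” and “co”), and the function names are hypothetical. Hosts given as IP addresses are not handled.

```python
from urllib.parse import urlsplit

# Labels treated as subdivisions of a country-code TLD. Only "com" and "co"
# come from the text; the rest are assumptions for illustration.
SUBDIVISION_LABELS = {"com", "co", "net", "org", "ac", "gov", "edu", "ne", "or"}


def domain_of(host: str) -> str:
    """Return the 2nd-level domain, or the 3rd-level one under a subdivided ccTLD."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) <= 2:
        return ".".join(labels)
    tld, second = labels[-1], labels[-2]
    # Country-code TLDs have two letters, e.g. news.bbc.co.uk -> bbc.co.uk
    if len(tld) == 2 and second in SUBDIVISION_LABELS:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def split_links(page_url: str, link_urls):
    """Return (internal, external) sets of unique absolute link URLs."""
    page_domain = domain_of(urlsplit(page_url).hostname or "")
    internal, external = set(), set()
    for url in set(link_urls):  # duplicates on a page count once
        host = urlsplit(url).hostname or ""
        (internal if domain_of(host) == page_domain else external).add(url)
    return internal, external
```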
A key observation from Table II is that the number of WML
pages found was an order of magnitude greater than for
XHTML-MP, and two orders of magnitude greater than for C-HTML.
This result could imply that there are indeed more
WML pages in existence, or it could be skewed
by the chosen search strategy. The low representation of
C-HTML content is presumably due to NTT DoCoMo’s “i-mode
menu” service, which provides paying customers access to
pre-approved i-mode sites and thus is not accessible by the general
public. Subsequent crawling runs using different search and
filtering tactics attempted to increase the number
of XHTML-MP and C-HTML pages collected, including using
Google’s international Web servers and filtering out WML
results. These subsequent crawling runs did not significantly
increase the number of pages.
Table II also provides the average number of pages crawled
per server. Recall from Section IV that the crawler was configured
to wait at least 30 seconds between page retrievals from the same
Web site, and would immediately retrieve pages from newly
identified servers. The resulting crawler behavior is to prefer
breadth over depth. Thus, the HTML results are expected: an
average of only 5 pages per server shows that the crawler
was continually identifying and crawling new servers, which
is consistent with HTML having the highest degree of external
links (Table VI). For mobile content, C-HTML and XHTML-MP
had much higher numbers of pages per server than WML, a
fact not explained by the number of external links per page.
This result can be attributed to C-HTML and XHTML-MP
pages linking to a less diverse set of servers than the WML
pages.
`
1. In cases where the Top Level Domain (TLD) is a country code and the
TLD is subdivided using recognizable domains such as “com” or “co”,
the domain of a server is its 3rd-level domain.
`
`B. Geographic Distribution
Tables III and IV summarize the percentage of unique Web
servers by top level domain, which provides some means to
understand the geographic distribution of the gathered pages.
The international flavor of the results shows that the usage
of mobile content is global. As expected, WML is more
popular in Europe and China, where WAP is mostly used.
The breakdown of C-HTML sites by top level domain is not
reported, due to the significantly smaller number of sites
included in the study.
`
TABLE III
TOP LEVEL DOMAIN BREAKDOWN FOR WML CONTENT
Domain                   % of WML Servers
.com                     30%
.ru (Russia)             22%
.cn (China)              13%
.net                     8%
.hu (Hungary)            3%
.de (Germany)            2%
.org                     2%
.cz (Czech Republic)     2%
.uk (United Kingdom)     2%
other                    18%
`
TABLE IV
TOP LEVEL DOMAIN BREAKDOWN FOR XHTML-MP CONTENT
Domain                     % of XHTML-MP Servers
.com                       47%
.ru (Russian Federation)   15%
.net                       10%
.jp (Japan)                4%
.de (Germany)              3%
.ch (China)                2%
.no (Norway)               2%
.cn (Canada)               2%
other                      14%
`
`Fig. 1. Distribution of Content (Markup) Sizes
`
`C. Page Sizes
`Figures 1 and 2 show the distribution of page sizes, in terms
`of markup alone and total page size, emphasizing the size
`difference between HTML and mobile content. Total page size
`was only reported for pages where all images were collected,
`so that partial page sizes were not reported.
Fig. 2. Distribution of Total Page Size (Images + Markup)

Table V shows average sizes of mobile markup, as well as
the average total (markup + images). Total size counts each
unique image in a page exactly once, but background images
are not included. WML markup objects are the smallest on
average, at 2,159 bytes. As a comparison, results published
in 2003 report an average size of 1,230 bytes [9]. XHTML-MP
markup objects were larger than WML pages by 40% on
average, at 3,018 bytes. C-HTML markup objects were close
on average to those of XHTML-MP, at an average size
of 2,911 bytes. As expected, HTML markup objects are the
largest by an order of magnitude, at an average size of 35,490
bytes. As shown in the last column of Table V, the relative
results for the average total page size are comparable for all
four content types, although the sizes of XHTML-MP and
C-HTML pages are more than 50% larger on average than WML
pages.
`Markup and image sizes are important considerations for
`content providers aiming to provide acceptable performance
`for their users. By providing content that is at or below average
`size, content providers can ensure that users will not suffer
`from abnormally long network transmission delays or memory
`consumption problems arising from large content. Mobile
`Web browsers and platforms should also be designed with
`expected content sizes in mind to ensure adequate memory
`and resources are provided to allow pages to be retrieved and
`cached. Results show that page sizes for the newer XHTML-
`MP format are an order of magnitude less than for HTML,
`but more than 50% larger than for WML.
`
TABLE V
CRAWLING RESULTS SUMMARY
Type        Avg Markup Size (Bytes)   Avg Total Size (Bytes)
WML         2,159                     3,223
XHTML-MP    3,018                     5,109
C-HTML      2,911                     4,835
HTML        35,490                    50,187
`
`D. Page Connectivity
`To examine the connectivity of pages, we measured the
`number of links from each page. Table VI shows the average
`number of internal and external links per page. Internal links
`are defined as links within the same domain, such as from
`www.cnn.com to edition.cnn.com. External links are
`defined as links across domains.
`
TABLE VI
AVERAGE NUMBER OF LINKS PER PAGE
Type        # Internal Links   # External Links   Link Density (bytes/link)
WML         7.6                1.1                248
XHTML-MP    8.3                1.1                321
C-HTML      6.2                2.6                331
HTML        56.7               24.2               439
`
`Fig. 3. Distribution of Internal Links Per Page
`
Not surprisingly, given the significantly smaller page size,
mobile content has significantly fewer links overall than
HTML content. Figure 3 shows that the median number of internal
links for HTML pages is around 35, while for mobile pages the
median is less than 5. Normalized by page size (see Table V),
mobile markup had significantly
higher link densities than non-mobile markup, with WML
having nearly double the number of links per byte of HTML.
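The link density column of Table VI appears to follow from the averages in Tables V and VI: dividing average markup size by the average number of links (internal plus external) reproduces the reported values. The short Python check below rests on that reading, which is our inference and is not stated in the text.

```python
# (avg markup bytes, avg internal links, avg external links), from Tables V and VI
AVERAGES = {
    "WML": (2159, 7.6, 1.1),
    "XHTML-MP": (3018, 8.3, 1.1),
    "C-HTML": (2911, 6.2, 2.6),
    "HTML": (35490, 56.7, 24.2),
}


def link_density(markup_bytes: float, internal: float, external: float) -> float:
    """Average markup bytes per link (lower means links are packed more densely)."""
    return markup_bytes / (internal + external)


for name, (size, internal, external) in AVERAGES.items():
    print(f"{name}: {link_density(size, internal, external):.0f} bytes/link")
```

The ratio of the HTML value to the WML value is about 1.8, which matches the “nearly double” comparison in the text.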
`By necessity, mobile pages tend to be more concise than
`non-mobile due to the limited display sizes and user interface
`capabilities of mobile devices. Small displays require concise
`page design, to minimize excessive scrolling/paging of infor-
`mation. However, it is notable that at the high end, 10% of
`mobile pages had greater than 20 internal links, a surprisingly
`high number that would require scrolling by the user on most
`mobile devices. Figure 4 shows the three content types have
`a comparable distribution of external links, although a longer
`tail for C-HTML results in a higher average in Table VI.
`
Fig. 4. Distribution of External Links Per Page

E. Image Content
Table VII lists the average number of internal and external
images per page, while the distribution of the number of images
per page is shown in Figure 5. A large number of pages
contained no images (in <IMG> tags): 44%
of XHTML-MP pages, 61% of WML pages, and 70% of
C-HTML pages contained no images. However, there is a long
tail of pages that make use of more than five images. HTML
pages, on the other hand, have significantly more images, with
a median of 12 images per page and only 9% containing
no images. Note that these results do not include background
images, which are included through stylesheets. In addition,
mobile content was less likely to contain external images
and was generally served from fewer servers.
The implication is that fewer TCP connections would need
to be established, resulting in reduced bandwidth, power and
overall page loading time.
Figure 6 shows the distribution of image sizes found for
each content type. What is of interest in these results is that
the median image size for each of the four content types is on
the order of 1000 bytes. This result arises because HTML pages
use many images that are small in nature. As expected, HTML
also has more large images.
`
TABLE VII
AVERAGE NUMBER OF IMAGES PER PAGE
Type        # Internal Images   # External Images   Pages w/ No Images
WML         0.7                 0.2                 61%
XHTML-MP    1.3                 0.3                 44%
C-HTML      0.4                 0.04                70%
HTML        11.9                3.7                 9%

Fig. 5. Distribution of Number of Images Per Page

Fig. 6. Distribution of Image Sizes

F. WML Cards
The unit of transmission of WML content is a single deck,
with each deck containing multiple individual pages, which are
referred to as cards. A card is the unit that is viewable in a
Web browser. This feature allows a server to deliver multiple
pages in a single transmission, eliminating a roundtrip to the
server in the event that a user accesses another page in the
deck. This is a unique feature of WML 1.0.
As part of this study, we set out to understand whether
WML cards were indeed used to improve user performance.
Table VIII shows that 93% of WML pages crawled contained
a single card, indicating rare usage of cards for improving
user performance. Of the pages that contained more than
a single card, 4% of pages contained exactly 2 cards. The
remaining 3% of pages contained more than 2 cards. Of the
pages containing more than one card, the mean number of
cards was 16, and the standard deviation was 88.4.
From a server perspective, of the total of 13,672 unique
WML servers, 15% contained at least one page with more
than 1 card in the deck. There was a long tail of remaining
servers, with 8% of servers containing more than 2 cards, 5%
containing more than 3 cards, and 0.8% containing more than
10 cards.
There was a small, but nonetheless surprising, number of
large outliers: over 2,000 pages from 16 servers had more than
100 cards in a deck and 83 pages from 2 servers contained
more than 1,000 cards in a deck.

TABLE VIII
NUMBER OF CARDS PER WML PAGE
Avg # Cards Per Page   Max # Cards in a Page   Pages w/ 1 Card   Pages w/ 2 Cards
1.8                    1,085                   93%               4%

The lack of widespread use of cards to improve user
perceived performance may be attributed to a number of
causes. First, it is difficult to identify which content should
be bundled with other pages. Second, the concept of cards
and decks deviates from typical HTML Web design, requiring
a shift by content providers. These results indicate that the
lack of an equivalent feature (bundling pages into decks) in
XHTML-MP is not currently a significant drawback, due
to the low usage of WML cards. Further, content providers
are assured that despite the benefits of cards, few servers used
this technique to improve the user experience.

G. User Agent Adaptation
Some servers were found to adapt the markup type of
content depending on the client browser, detected through the
User-Agent client header. To measure the extent to which this
technique is used, we conducted tests in which we probed a
given URL using six different User-Agent strings, shown in
Table IX. This test was repeated for servers selected at random
from our crawl results.
458 XHTML-MP servers were sampled, of which 50%
(230) served different DOCTYPEs depending on the
User-Agent. 98% (226) of these served WML content only to the
WML User-Agent. In contrast, of the 5,291 WML servers
sampled, only 0.5% (29) served different DOCTYPEs depending
on the User-Agent.
This large difference may be based on a concern for
backwards compatibility with older WML 1.0 browsers. As
XHTML-MP is a newer generation markup language, not
supported on all mobile browsers, the safest route is to serve
WML content to older WML 1.0 devices.

H. Advertisement Content
Another characteristic we examined is the presence of
advertisement content in mobile pages. Work in [7] found
that ad content was contained on the majority of popular
pages and we were interested in its use for mobile pages. To
test for the presence of ads, we used the same methodology
`
TABLE IX
SELECTED USER AGENT STRINGS
Browser Type             User-Agent String
WML 1.0                  EricssonR320/R1A UP.Link/4.1.0.1
XHTML-MP/WML 2.0         KDDI-KC31 UP.Browser/6.2.0.5 (GUI) MMP/2.0
i-mode                   DoCoMo/2.0 SH901iC(c1
Internet Explorer 6.0
FireFox 1.5
Palm Browser
