throbber
A Network View of Social Media Platform History: Social Structure,
`Dynamics and Content on YouTube
`
`John C. Paolillo
`Indiana University Bloomington
` paolillo@indiana.edu
`
`Sharad Ghule
`Indiana University Bloomington
` sharadsghule@gmail.com
`
`Brian P. Harper
`Indiana University Bloomington
` bpharper@indiana.edu
`
`Abstract
`
`Social media sites are prone to change from many
`internal and external causes, yet it is difficult to
`directly explore their histories in terms of the content
`itself. Search and browsing features are biased toward
`new and paid content, archives are difficult to navigate
`systematically, and their scale makes any observations
`challenging to contextualize. Here, we present results
`of an ongoing study of YouTube’s history (currently
`with more than 76 million videos) using a combination
`of iterative browsing, network crawling and clustering
`within and across time periods. Through this method,
`we are able to identify historical patterns in YouTube's
`content related to internal and external events. Our
`approach thus illustrates an adaptation of network
`analysis for understanding the content histories of
`social media platforms.
`
`1. Introduction
`
`Currently, YouTube is at a crossroads: YouTube’s
`dominance in online video is now challenged by
`Amazon, Facebook, Hulu, Netflix and Twitch.
`YouTube’s visibility has exposed it to regulatory
`scrutiny and advertiser protests, threatening revenue. In
`response, YouTube has changed
`its advertising
`algorithms and upset the economic viability of many
`channels, alienating channel owners. Any of these
`conditions could induce large changes on the site,
`shaping its content or what we can access of it.
`We therefore need a history that would chronicle
`the emergence and
`influence of
`the platform's
`dominant genres and content types since 2005, ideally
`indexed to changes in the platform's features and
`incentives as well as external world and media events.
`YouTube has archival properties, however, and the
`YouTube public data API reflects
`the historical
`character of the site through the publication dates of
`video and channel metadata. Channels and their videos
`are also structured as a network, via relations such as
`liking and favoriting videos. Can this information be
`used to further illuminate the history of the site?
`
`Our answer to this question is yes, based on a
`network analysis in which the publication dates of
`videos are used to segment the YouTube network into
`a sequence of time slices, covering its entire history
`from May 2005 to December 2016. This analysis
`reveals the evolution of a range of different genres of
`content, which can be read in terms of responses to
`historical events and platform changes. This work
`provides a potentially
`important frame for
`the
`interpretation of past and current studies of YouTube
`content.
`
`2. Literature Review
`
`From its initial pre-launch public availability in
`2005, YouTube rapidly became the dominant platform
`for the distribution of online video. This 12-year
`history has been unstable, punctuated by technical
`changes
`to
`the platform, purchase by Google,
`introduction of advertising, international expansion, for
`example. External events have also had effects: large
`user migrations, political events, copyright lawsuits,
`changes in national and international regulation of
`internet technology, major studio participation in
`YouTube, and the US presidential elections have all
`been felt in different ways by YouTube users.
`Empirical research
`insufficiently contextualizes
`YouTube’s content and its evolution. Early attempts at
`a global-scale analysis of YouTube’s content exist [1],
`but they are either small in comparison to its actual
`scale at the time [2], are based on specific events [3],
`or they do little to address the nature of the content or
`how it might relate to platform features [4, 5]. A
`representative compilation of early
`research on
`YouTube is The YouTube Reader [6]. Early histories of
`the platform exist [7], but numerous changes in the site
`have obscured the relationships among YouTube’s
`features, users, content and external events.
`Other YouTube research has addressed YouTube’s
`politics as a platform [8], the recommender system [9,
`10, 11], social network effects on content propagation
`[3, 12, 13], the features of memes [14], multichannel
`
`Proceedings of the 52nd Hawaii International Conference on System Sciences | 2019
`
`URI: https://hdl.handle.net/10125/59701
`ISBN: 978-0-9981331-2-6
`(CC BY-NC-ND 4.0)
`
`Page 2632
`
`Comcast, Ex. 1143
`
`

`

`
`
`networks [15], and even specific genres of content
`[16]. These pieces often exist
`in
`isolation of
`YouTube’s development over time, as can be seen in
`the contradictory findings at different times regarding
`the popularity of longer videos [17, 18].
`An important contextual component missing from
`the discussion of YouTube is the role of mutual
`support among channels in the cultivation of its genres.
`YouTube’s liked and favorited video playlists offer one
`record of such support, which also flows and ebbs over
`time, as channels become active or dormant. Such
`social processes have been shown to be instrumental in
`genre emergence [19], and a network analysis offers
`one approach for revealing them [20]. Time in network
`analyses, however, has no standardized treatment. We
`therefore ask: how can we use the network of likes and
`favorites among channels to read a history of genre
`evolution on YouTube?
`
`3. Method
`
`The method employed in this study has three main
`components: (i) construction of a sample using
`browsing and crawling and the Google/YouTube
`public data API, (ii) extraction of time-located network
`samples and clustering them, and (iii) organizing and
`interpreting the timeline of network clusters. Each of
`these corresponded to three distinct phases of research,
`discussed in turn below.
`
`3.1 Sampling YouTube
`
`YouTube is large and unwieldy, and its complete
`data are accessible only within Google. Data for
`individual videos are exposed only through search and
`browsing functions that are subject to unknown biases
`(e.g., sponsored search
`features and
`the video
`recommendation algorithm) and cannot be sampled in
`a truly random manner, and we must resort to crawling
`a large sample. Problematically, crawled samples miss
`unconnected components. Consequently, a diversified
`strategy for sampling is necessary, relying on searching
`and browsing to identify starting points for crawling,
`and iterative phases of both activities.
`The initial sample for this study was based on a
`collection of YouTube channel IDs identified for a
`project on conspiracy theory videos in 2015, using
`searching and browsing strategies. A script written for
`the Firefox Greasemonkey plugin was used to collect
`channel IDs into a PostgreSQL database directly while
`browsing. In addition, the script reports whether the
`channel for the current page was already recorded in
`
`the database. YouTube search was used to initiate
`browsing, and browsing strategies were developed so
`as to rapidly gather distinct channel IDs. On a typical
`video page, the first video listed on the right bar often
`comes from the same channel, and the second is an
`advertisement. Videos from the third on come from a
`range of channels: the same channel, related channels
`and "recommended" channels. The last of these are fed
`by a YouTube algorithm that references a user’s
`viewing history; typically these have already been
`visited. We therefore focused attention on videos after
`the first two with unfamiliar channel names, using the
`thumbnails and titles to help recognize if a particular
`video had already been seen. When the initial project
`was broadened beyond conspiracy theories, the same
`strategies were employed, merely using different
`YouTube searches from which to begin browsing.
`Channel IDs from browsing became the seed set for a
`crawl collected through the YouTube public data API.
`Each channel
`is associated with
`three playlists:
`uploads, likes and favorites. The first is merely the list
`of the videos uploaded by the channel; the second and
`third represent videos that users have identified as ones
`they
`like or favorite, using YouTube's
`interface
`features. Typically, these videos are ones produced by
`other channels (though they need not be). YouTube's
`recommendations are generated partly from videos that
`are co-liked or co-favorited with the video being
`watched. Hence, crawling these two playlists to obtain
`the video information and that of their associated
`channels tends to expand the set of channels observed
`while mirroring YouTube's video recommendations.
`Unfortunately, crawling via the API has limitations.
`It does not list channels that liked or favorited a
`particular video, so we must always identify channels
`first. This requires that all our channels post videos,
`when many do not. Relations to such profiles could be
`crawled through the comments feature, but this would
`expand the data collection beyond the capabilities of
`our current system architecture. Similarly, channel
`subscriptions are treated as private by the API, and for
`non-posting channels, likes and favorites can also be
`made private. Without appropriate searching and
`browsing strategies it is likely that sections of the
`network would be missed, especially less popular
`channels. For this reason, the searching/browsing and
`crawling processes were repeated several times from
`July 2015 to March 2017, ending with a sample of
`76,081,372 videos and 549,383 channels.
`The resulting database contains metadata for a
`small but popular and highly connected fraction of the
`total activity on YouTube. Although our sampling
`began with the conspiracy theory channels, these are a
`
`Page 2633
`
`

`

`
`
`small proportion of the final network, which is
`otherwise dominated by entertainment content (below).
`
`3.2 The Network Over Time
`
`Our network analysis of YouTube is based on the
`structure induced by likes and favorites; we treat these
`as indicating directed links between channels, i.e., a
`channel has a (directed) link to another channel as
`strong as the number of times the first channel likes or
`favorites videos uploaded by the second. We treat likes
`and favorites as equivalent because the two relations
`are strongly correlated [4]. Likes and favorites also
`tend to occur in a short window of time after a video is
`released [4]. For this reason, we use the video
`publication date of the liked/favorited video as a proxy
`for historically dating the relationship.
`Using 3-month intervals over the video publication
`dates as a moving window in which to examine
`connectivity of channels, we segmented the network
`into 141 samples, starting from April 2005, shifting the
`window by one month for each sample, and ending
`with the December 2016 sample. To keep our networks
`within a size that we could process, we used a
`threshold of a minimum of 10 likes/favorites from one
`channel to another within any given sample to include
`a link in the network.
`
`3.3 Clustering
`
`There are many approaches to clustering networks
`[21]; here, we employ the Louvain method of [22].
`This algorithm performs well for large networks,
`especially with a high clustering coefficient and a fat-
`tailed degree distribution, as occurs in the YouTube
`network [4]. It performs an agglomerative hierarchical
`clustering in which a node is assigned to a cluster if
`doing so maximizes the modularity of the network,
`continuing until either a single node remains or
`modularity cannot be increased further. Modularity
`clustering is not perfect: it sometimes infers non-
`existing relationships between clusters based on weak
`false positive links [21], and tends to give large
`numbers of clusters in sparse networks. Nonetheless, it
`works well for detecting well-defined but small
`clusters in very large networks, as we expect to be the
`case with YouTube. Since crawling biases our samples
`toward connectivity, we anticipate some issues with
`interpretability in larger clusters.
`Compatible implementations of this algorithm exist
`in Gephi [23] as "modularity class", and in the igraph
`package [24] as the function cluster_louvain(). For
`clustering the samples, we use the implementation in R
`
`[25]. This results in anywhere from 1 to 3747 clusters
`for each sample depending on its size and overall
`connectivity. Clusters are identified by arbitrary ID
`numbers and the only means for identifying them
`across different samples is through their aggregate
`common memberships, which we obtain by cross-
`tabulating clusters from successive pairs of samples.
`This is identical to a treating the entire sample as a
`network of clusters, in which links represent shared
`membership between clusters across different samples.
`For convenience, the cluster comparison network
`was imported in its entirety into Gephi as a directed
`network with no minimum value for a link. Using
`Gephi's modularity class, we assigned each of the
`clusters within samples to new cross-sample clusters.
`Clusters with substantial overlap or that regularly
`exchange members fall together into a single new
`cluster assignment; clusters whose membership largely
`excludes those of another cluster over time appear in
`distinct new clusters, thereby identifying clusters with
`stable yet evolving membership across time. The
`success of this approach depends on the suitability of
`the threshold for the initial samples, the size of the
`moving sample window, the frequency of the samples,
`and the stability of class membership over time, so
`changes in these values would yield different results.
`Gephi also provides network layouts; a suitable
`layout for this data should be able to find a linear
`structure or structures, showing the evolution and
`relative closeness of different content clusters. We
`used two force-directed layouts: the Yifan Hu layout
`for rapidly finding the global structure, and Force Atlas
`to verify that the observed structures were not peculiar
`to Yifan Hu.
`The resulting layout appears as Figure 1, in which
`we find a single linear structure whose two large bends
`and single sharp elbow correspond to gradual and
`sharp changes in cluster membership, respectively. The
`layout has been rotated so that clusters from the earliest
`samples are on the left, and tracing along the main
`connected path takes one through more recent samples,
`to the final sample on the far right. Nodes in Figure 1
`represent the sample modularity classes, with color
`indicating the cluster a node belongs to and size its
`number of members.
`The largest 24 of the clusters (out of 14978 total)
`account for 75.9% of the network’s nodes, with the
`next largest containing only 0.1%. Individual clusters
`are rendered in Figure 2, so that their lifespans can be
`more readily recognized, alongside their relative sizes
`and general type of content (this indicates in which
`subsection it will be discussed below). The largest
`nodes group around a central path, with fine filaments
`
`Page 2634
`
`

`

`
`
`representing the paths through the smaller classes and
`clusters extending outward from it on either side,
`including smaller clusters not shown in Figure 2, as the
`full layout exceeds the margins of the image. To clarify
`the cluster timelines, we produced Figure 3, in which
`each cluster is represented by a horizontal bar spanning
`the x-axis from its beginning point to its endpoint.
`Scanning vertically in Figure 3 indicates which clusters
`overlap at specific times.
`To facilitate cluster interpretation, we created a web
`interface that provided summaries of the number of
`videos in each cluster for each month of the sample,
`along with a listing of ten videos from each of the 100
`most connected channels in the cluster and active links
`to the videos and channels on YouTube. All three co-
`authors explored the full complement of clusters
`through this interface, meeting together to discuss and
`reconsider their interpretations.
`
`4. Interpretation of the network
`
`
`A few observations can be made from Figures 1
`and 2 directly. First, there is a single central core to
`YouTube’s network with varied content, as reported in
`[2]; it is stable over YouTube’s history, although its
`composition changes. Many filaments diverge from the
`core, carrying channels toward or away from it, but
`
`these account for only a quarter of the observed
`network. In other words, YouTube's content is not
`strongly segmented due to, e.g., language markets,
`political polarization or content, as might have been
`expected. Such a pattern would appear as multiple,
`disentangled paths in the network, arising from clusters
`whose exchange of members with the core clusters is
`less frequent. We turn now to the specific patterns of
`content within the clusters that can be observed.
`
`4.1 The Early Years of YouTube
`
`
`Clusters 387, 629, and 909 represent the first stages
`of the development of YouTube's content. Cluster 387
`arises in October 2005, just 8 months after YouTube
`became public;
`it contains mostly music-based
`channels, typically songs made by YouTube users or
`remixes of popular songs. This cluster also contains
`viral videos (for example, “Charlie bit my finger -
`again,” “Evolution of Dance,” and “Sneezing Baby
`Panda”), indicating their importance in YouTube’s
`early history (they are otherwise infrequent). Cluster
`387’s
`content
`is predominantly
`entertainment,
`suggesting that the platform served a limited function
`in its early phase. Clusters 629 and 909 branch off
`from 387, maintaining continuity in both having music
`channels.
`
`Figure 1. Final layout of network of shared membership in modularity classes of 141 sample networks,
`based on 3-month samples of YouTube channel-to-channel likes and favorites spaced at overlapping one-
`month intervals.
`
`
`
`Page 2635
`
`

`

`
`
`Figure 2. Clusters of modularity classes in the network of Figure 1, grouped in columns by type of
`content. Size of each cluster as percentage of modularity classes of the cluster comparison network is
`given in the lower right corner of each panel.
`
`
`Figure 3. Temporal relationships of clusters in Figure 1, grouped by type of content. The left-right location
`and extent of a bar indicates the time period occupied by a cluster; clusters that overlap vertically occur at
`the same time.
`
`Page 2636
`
`

`

`
`
`Cluster 629 features more channels with hip-hop
`and non-pop music genres, alongside other early news
`channels such as the BBC News channel with material
`from its regular broadcasts, and The Young Turks, a
`news/commentary channel created specifically for
`Internet-based consumption. Both 387 and 629 end
`while increasingly accruing content related to the 2008
`US presidential election. The official channel for the
`Barack Obama campaign especially gains prominence
`in summer 2008 in cluster 629, but both clusters
`feature electoral content.
`Cluster 909 features music, comedy (especially for
`young males), and early YouTube-specific channels
`like Machinima, College Humor, and Rooster Teeth. It
`persists later than 387 and 629, possibly owing to its
`lack of political content from the 2008 election. Like
`cluster 629, broadcast media sources appear in 909, but
`while the media in 629 was political, 909 contains
`entertainment, such as the BBC cars and comedy show
`Top Gear. Cluster 909’s last months overlap with the
`launch of Vevo channels, which likely served to
`disrupt the music channels present in this cluster.
`
`4.2 The Two Main Streams
`
`The differences in 629 and 909 become amplified
`after this point, yielding two distinct streams of
`subsequent clusters lasting through the rest of the
`timeline: one with primarily entertainment content and
`the other with political content. The entertainment
`stream, originating from cluster 909, is larger and more
`fluid, with some members separating into specific
`subculture
`topic clusters and
`rejoining popular
`entertainment clusters
`later. The politics stream,
`partially descended from 629, is more stable, and while
`it possesses multiple clusters within it, they have strong
`temporal continuity from cluster 629 onwards. Clusters
`outside these two streams represent language markets
`outside the dominant English-language market on
`YouTube.
`
`4.3 The Entertainment Stream
`
`The Entertainment stream is broad, with the three
`largest clusters in the network. Its size and diversity
`make it somewhat difficult to characterize, but several
`consistent elements recur: music, video games, and
`comedy. This stream begins in late 2009 with clusters
`3327 and 2615, around the time Viacom’s settlement
`with Google that resulted in the Content ID system.
`Google simultaneously changed its rating system from
`a five-star rating system to merely reporting likes. Both
`changes arguably affect
`the connectivity among
`
`channels, and hence may play some role in explaining
`the emergence of two distinct clusters. Cluster 2615 is
`dominated by copyright-protected popular music in the
`form of Vevo channels, i.e., predominantly pop and
`country music. There are occasional parkour and
`skateboarding channels, whose videos often contain
`musical backing similar to Vevo's pop music, though
`they are clearly not professionally mixed. Cluster 3327
`contains music and content created within
`the
`YouTube platform itself, outside of any arrangement
`with Vevo. Its music comes predominantly from
`electronic genres such as dubstep and techno. Cluster
`3327 also shows some of the early video bloggers
`(“vloggers”).
`Cluster 4898 wraps around the elbow in our data
`representation, and both clusters 2615 and 3327 feed
`into it, though not at the same time. This period
`appears to correspond to the shift in YouTube’s
`definition of a view to include watch time in March,
`2012. Prior to the elbow and the sudden decline of
`cluster 3327, 4898 is characterized by a mixture of
`video games and music content. The gaming content is
`not centered clearly on a specific game or genre, and
`the music content is partially Vevo channels formerly
`in cluster 2615 and partially variations of pop music.
`Lindsey Stirling's channel exemplifies this music; at
`this time she produced violin pieces inspired by video
`game music. After channels from cluster 3327 merge
`into 4898, it becomes dominated by personality-driven
`YouTubers:
`GennaMarbles,
`PewDiePie,
`CaptainSparkles, among others. These channels
`combine some contemporary topics, lifestyle, gaming,
`fashion, etc., with comedy focusing on the host's
`personality, and become central to the entertainment
`stream from this period onwards.
`Rather than remaining a new large cluster, 4898
`splinters into multiple clusters: 6445, 9833, and 8778.
`These show similar divisions as 2615 and 3327 did in
`that while all feature popular culture, 8778, 9833, and
`2615 feature content produced for non-YouTube
`audiences (e.g., broadcast and cable television), while
`6445 and 3327 feature content produced specifically
`for YouTube. Cluster 6445, like 4898 before it,
`features personality-driven YouTubers, primarily in
`gaming
`and
`fashion. Cluster
`8778
`features
`predominantly Vevo and other music channels,
`alongside late night comedy show clips. Cluster 9833
`also features late night comedy channels like The
`Tonight Show Starring Jimmy Fallon, although much
`of the cluster focuses on reviews and critique of
`movies. Sketch comedy channels are also present. The
`presence of musical guests on late night comedy shows
`
`Page 2637
`
`

`

`
`
`helps to explain the presence of these shows in both
`clusters.
`These three clusters feed into the two clusters of
`11844 and 14486 at the end of our sample period.
`Much like cluster 4898, which straddled a major
`change in our network, 11844 begins with general
`gaming content, primarily of Let’s Plays of varied
`games, though Minecraft makes a strong and consistent
`showing. As cluster 6445 ends, many of the channels
`from that cluster shift into 11844, leading to the
`personality-driven channels dominating it. Cluster
`14486 began as a combination of fitness and lifestyle
`channels; at this time it also contains many popular
`videos in Italian, German, and Czech, though often
`with parts of their titles in English or subtitles in
`English. Later, language-specific clusters form around
`some of these channels.
`By mid-2015, after clusters 9833 and 8778 end and
`their constituent channels merge into 11844 and 14486,
`it becomes increasingly difficult to accurately describe
`11844 and 14486, with both clusters becoming so
`broad in scope that, while they are still clearly
`entertainment-focused,
`further characterization
`is
`difficult. This trend increases following the end of
`cluster 11844 in mid-2016 as most of the entertainment
`channel fall into 14486. One possible explanation for
`the difficulties in the period after mid-2015 is the
`candidacy announcements of the 2016 US Presidential
`Election, whose use of late-night comedy TV as a
`platform may have destabilized the entertainment
`clusters. A second is that the implementation of
`YouTube's paid subscription service YouTube Red
`may have disrupted some clusters by putting
`interactions behind a paywall where they are no longer
`observable. A third potential explanation is that the
`recency effect in likes and favorites distorts the
`network connectivity at the end of our sample period,
`closer to the point of our actual observations. The
`precise reason cannot be known right now, but we note
`that the clustering may be lower quality toward the end
`of our sample period.
`
`4.4 Subcultural Entertainment
`
`In addition to the main entertainment structure,
`there are several offshoot clusters of entertainment
`videos throughout the network. These clusters usually
`exist for brief periods of time, after which the channels
`involved rejoin one of the larger clusters of the
`entertainment stream. This is not universally true, and
`some channels
`in subcultural clusters
`regularly
`fluctuate between the main cluster and the subcultural
`
`cluster. Some are also more naturally associated with
`the politics stream rather than entertainment.
`Cluster 7047 is one such subculture cluster, and
`lasts for an unusually long time for this network where
`the constituent channels join either clusters 11844 or
`14486. Originally having short skateboarding and
`parkour videos, longer videos made using GoPro or
`other active point-of-view cameras start to show up in
`7047, allowing an “urban spelunking” genre in which
`the content creators explore rooftops, climb
`tall
`buildings, and generally trespass, often without safety
`equipment. Skateboarding’s
`importance diminishes
`after the emergence of the GoPro videos.
`Another subcultural cluster, 5876, features plane
`flights and train operation as its primary material. Its
`focus is clearly technical, especially the inner workings
`of machines. At different points, this cluster attracts
`channels from the Maker movement (electronic gadget
`and computer hackers) and car tinkerers. It transitions
`into cluster 11709 with relatively little difference in the
`channels of interest, although at that point drones
`emerge as a video platform. Cluster 11709 remains
`distinct from the rest of the entertainment cluster until
`the end of our data collection rather than joining cluster
`11844 or 14486.
`Cluster 7953 is dominated by hyper-masculine
`channels aimed at high school and college-aged males.
`The videos
`in
`this cluster
`include sports, male
`heteronormative dating advice, pranks, and videos of
`gameplay in Call of Duty. More so than among the
`other clusters, videos of these channels tend to feature
`video thumbnails with attractive, scantily-clad women
`on them. This cluster could be characterized as
`appealing to “Bro” culture. A different male-audience
`cluster is 9126, featuring cars, guns, and gadgets,
`which partially overlaps with transportation clusters
`5876 and 11709, although 9126 is favors experiences
`of speed and danger with cars more than their technical
`workings. This cluster also includes gun enthusiasts
`and maker channels. Many videos in 9126 demonstrate
`some technology a channel is dedicated to, thereby
`serving an advertising function. Members of this
`cluster arrive from both the entertainment and political
`(see below) streams, and some channels display salient
`political attitudes alongside the subjects of their
`technical interests, a stance less common in other
`entertainment clusters.
`One subcultural cluster in our network is not
`predominantly masculine in character, namely cluster
`10741. This cluster features relaxing content, primarily
`soft music, nature sounds, and videos of animals. As
`the content appears to be based more on a mood than
`on a particular language, this cluster also has some
`
`Page 2638
`
`

`

`television news media
`Broadcast and cable
`YouTube channels appear in this stream, including
`cable news channels ABC, CBC, CBS, CNN, Fox
`News, MSNBC, and NBC, several local affiliate
`channels of these networks. State-supported networks
`BBC and CBC also appear. However, most of these
`television media outlets arrive in this stream later and
`are not as well integrated into the network as the media
`entities discussed previously. While they spend more
`time in the clusters of the politics stream than
`elsewhere, they often cross over into other clusters
`more focused on entertainment. Despite the popularity
`of
`their
`content outside of YouTube,
`their
`likes/favorites pale in comparison to those of RT. It is
`likely that this represents a deliberate strategy of
`linking on the part of RT (and a reciprocal lack of that
`strategy on the part of other news organizations) rather
`than an absolute index of popularity. In other words,
`different organizations differ in their employment of
`search engine optimization
`techniques, and
`this
`influences the likes/favorites we have observed.
`
`4.6 Specific Language Markets
`
`In addition to these two main streams, we identify
`several major clusters of YouTube channels specific to
`different language markets. Clusters 11151, 11841, and
`12760 represent Spanish, Brazilian Portuguese, and
`Russian
`language channels respectively, and are
`remarkably similar in their positions in the network.
`All three start in mid-2011 and persist until the end of
`our collection in 2016; these three are the longest-lived
`clusters in our network. While the Brazilian Portuguese
`and Russian clusters are heavily identified with their
`country of origin, the Spanish cluster, 11151, is
`somewhat more complex: many of
`the popular
`channels in the cluster are from Spain, but channels
`from Latin American countries are also present.
`The Spanish and Brazilian Portuguese clusters
`appear to be most similar to the entertainment stream
`of the larger English-language YouTube environment:
`video gaming and music content dominate the large
`nodes in this cluster, with interspersed news and
`political content. The Russian cluster 12760,
`in
`contrast, appears closer to the political stream in
`content. While there are large channels producing
`gaming and music content, the frequency of political
`content is higher, buoyed by the constant presence of
`the Russian-language RT.
`The specific language market clusters are not
`totally isolated from English language YouTube, as
`seen during February 2013,
`the month of
`the
`Chelyabinsk meteor. Images of the meteor were
`
`
`
`languages, especially
`
`crossover with non-English
`Polish and German.
`
`4.5 The Politics Stream
`
` A
`
` second major stream of channels is one in which
`the content is primarily political. This stream is
`relatively consistent across the timeline, with four
`contiguous clusters spanning the period from the end
`of cluster 629 to the end of our sample. Cluster 629,
`while containing more news and political content than
`909, still contained elements of popular culture like
`hip-hop music. Cluster 1442 does not share these
`elements, and is defined by its juxtaposition of news
`and conspiracy theory channels. These conspiracy
`theory channels promulgate a wide variety of different
`types of conspiracy theories (e.g. those involving extra-
`terrestrials,
`Illuminati/Freemasons,
`anti-Semitic
`themes, and Christian apocalyptic end-times), though
`they tend toward the conservative end of the political
`spectrum. Channels supporting atheism also appear in
`this stream.
`Clusters 1442, 4099, 8430, and 10737 form a
`continuous chain, with clean breaks between the
`clusters. The

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket