As used herein, the term “Consumer Generated Media” (hereinafter CGM) is a phrase that describes a wide variety of Internet web pages or sites, which are sometimes individually labeled as web logs or “blogs”, mobile phone blogs or “mo-blogs”, video hosting blogs or “vlogs” or “vblogs”, forums, electronic discussion messages, Usenet, message boards, BBS emulating services, product review and discussion web sites, online retail sites that support customer comments, social networks, media repositories, audio and video sharing sites/networks and digital libraries. Private non-Internet information systems can host CGM content as well, via environments like SharePoint, Wiki, Jira, CRM systems, ERP systems, and advertising systems. Other acronyms that describe this space are CCC (consumer created content), WSM (weblogs and social media), WOMM (Word of Mouth Media) or OWOM (online word of mouth), and many others.
As used herein, the term “Keyphrase” refers to a word, string of words, or groups of words with Boolean modifiers that are used as models for discovering CGM content that might be relevant to a given topic. A Keyphrase could also be an example image, audio file, or video file whose characteristics are used for content discovery and matching.
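The text form of a Keyphrase can be illustrated with a minimal sketch. The names used here (Keyphrase, all_of, any_of, none_of, contains, matches) are hypothetical and do not come from this specification; the sketch assumes the Boolean modifiers reduce to AND, OR, and NOT over whole-word phrase matches, and it does not cover image, audio, or video keyphrases.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Keyphrase:
    """Hypothetical keyphrase model: phrases combined with AND / OR / NOT."""
    all_of: List[str] = field(default_factory=list)   # every phrase must appear (AND)
    any_of: List[str] = field(default_factory=list)   # at least one must appear (OR)
    none_of: List[str] = field(default_factory=list)  # no phrase may appear (NOT)


def contains(text: str, phrase: str) -> bool:
    """Case-insensitive, whole-word match of a word or string of words."""
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return re.search(pattern, text.lower()) is not None


def matches(text: str, kp: Keyphrase) -> bool:
    """Return True if the post text satisfies the keyphrase's Boolean modifiers."""
    if not all(contains(text, p) for p in kp.all_of):
        return False
    if kp.any_of and not any(contains(text, p) for p in kp.any_of):
        return False
    return not any(contains(text, p) for p in kp.none_of)


# Illustrative keyphrase: posts about an "acme phone" that mention the
# battery or the screen, excluding giveaway promotions.
example = Keyphrase(
    all_of=["acme phone"],
    any_of=["battery", "screen"],
    none_of=["giveaway"],
)
```

Under these assumptions, matches("My Acme Phone battery died after a week.", example) is true, while a post that also contains the word "giveaway" is rejected.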
As used herein, the term “Post” refers to a single piece of CGM content. This might be a literal weblog posting, a comment, a forum reply, a product review, or any other single element of CGM content.
As used herein, the term “Site” refers to an Internet site which contains CGM content.
As used herein, the term “Blog” refers to an Internet site which contains CGM content.
As used herein, the term “Content” refers to media that resides on CGM sites. CGM is often text, but includes audio files and streams (podcasts, mp3, streamcasts, Internet radio, etc.), video files and streams, animations (Flash, Java) and other forms of multimedia.
As used herein, the term “UI” refers to a User Interface, through which users interact with computer software, perform work, and review results.
As used herein, the term “IM” refers to an Instant Messenger, which is a class of software applications that allow direct text based communication between known peers.
As used herein, the term “Thread” refers to an “original” post and all of the comments connected to it, present on a blog or forum. A discussion thread also holds the display order of its content, that is, which message came first, which message followed it, and so on.
As used herein, the term “Permalink” refers to a URL which persistently points to an individual CGM thread.
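The relationship among the terms Post, Thread, and Permalink can be pictured as a small data structure. This is a minimal sketch only; the class and field names (Post, Thread, author, body, permalink, add_comment, in_display_order) and the example URL are assumptions made for illustration and are not defined by this specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Post:
    """A single piece of CGM content (blog posting, comment, review, ...)."""
    author: str
    body: str


@dataclass
class Thread:
    """An original post plus its connected comments, kept in display order."""
    permalink: str                 # URL that persistently points to this thread
    original: Post
    comments: List[Post] = field(default_factory=list)

    def add_comment(self, post: Post) -> None:
        # Appending preserves the order in which comments are displayed.
        self.comments.append(post)

    def in_display_order(self) -> List[Post]:
        # The original post comes first, followed by each comment in turn.
        return [self.original] + self.comments


# Illustrative thread with a placeholder permalink.
thread = Thread(
    permalink="http://example.com/blog/2007/03/product-review",
    original=Post(author="alice", body="My review of the product."),
)
thread.add_comment(Post(author="bob", body="I had the same experience."))
thread.add_comment(Post(author="carol", body="Mine worked fine."))
```

Here thread.in_display_order() yields the posts by alice, bob, and carol, in that order.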
The Internet and other computer networks are communication systems. The sophistication of this communication has improved, and its primary modes have differentiated, over time and with technological progress. Each primary mode of online communication varies based on a combination of three basic values: privacy, persistence, and control. Email as a communications medium is private (communications are initially exchanged only between named recipients), persistent (saved in inboxes or mail servers) but lacks control (once you send the message, you can't take it back, or edit it, or limit re-use of it). Instant messaging is private, typically not persistent (some newer clients are now allowing users to save history, so this mode is changing) and lacks control. Message boards are public (typically all members, and often all Internet users, can access your message), persistent, but lack control (they are typically moderated by a central owner of the board). Chat rooms are public (again, some are membership based), typically not persistent, and lack control.
Mode                 Privacy    Persistence    Author control
Chat Rooms/IRC       no         no             no
Instant Messaging    yes        no             no
Forums               no         yes            no
Email                yes        yes            no
Blogs                no         yes            yes
Social Networks      yes/no     yes            yes
Second Life          yes        yes            yes+
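The comparison above can also be held as a simple lookup structure and queried. The following is a minimal sketch; the names MODES and modes_with_author_control are hypothetical, the cell values are copied as strings from the comparison (including "yes/no" and "yes+"), and any value beginning with "yes" is assumed to count as permitting author control.

```python
# Each entry maps a communication mode to (privacy, persistence, author control).
MODES = {
    "Chat Rooms/IRC":    ("no",     "no",  "no"),
    "Instant Messaging": ("yes",    "no",  "no"),
    "Forums":            ("no",     "yes", "no"),
    "Email":             ("yes",    "yes", "no"),
    "Blogs":             ("no",     "yes", "yes"),
    "Social Networks":   ("yes/no", "yes", "yes"),
    "Second Life":       ("yes",    "yes", "yes+"),
}


def modes_with_author_control():
    """Return the modes whose author-control value begins with 'yes'."""
    return [
        name
        for name, (_privacy, _persistence, control) in MODES.items()
        if control.startswith("yes")
    ]


controlled = modes_with_author_control()
```

Under these assumptions, controlled contains Blogs, Social Networks, and Second Life.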
Blogs and Social Networks are the predominant communications mediums that permit author control. By reducing the cost, technical sophistication, and experience required to create and administer a web site, blogs and other persistent online communication have given an unprecedented amount of editorial control to millions of online authors. This has created a unique new environment for creative expression, commentary, discourse, and criticism without the historical limits of editorial control, cost, technical expertise, or distribution/exposure.
There is significant value in the information contained within this public media. Because the opinions, topics of discussion, brands and celebrities mentioned and relationships evinced are typically totally unsolicited, the information presented, if well studied, represents an amazing new source of social insight, consumer feedback, opinion measurement, popularity analysis and messaging data. It also represents a fully exposed, granular network of peer and hierarchical relationships rich with authority and influence. The marketing, advertising, and PR value of this information is unprecedented.
This new medium represents a significant challenge for interested parties to comprehensively understand and interact with. As of Q1 2007, estimates for the number of active, unique online CGM sites (forums, blogs, social networks, etc.) range from 50 to 71 million, with growth rates in the hundreds of thousands of new sites per day. Compared to the typical mediums that PR, Advertising and Marketing businesses and divisions interact with (<1000 TV channels, <1000 radio stations, <1000 major news publications, <10-20 major pundits on any given subject, etc.) this represents a nearly 10,000-fold increase in the number of potential targets for interaction.
Businesses and other motivated communicators have come to depend on software that performs Business Intelligence, Customer Relationship Management, and Enterprise Resource Planning tasks to facilitate accelerated, organized, prioritized, tracked and analyzed interaction with customers and other target groups (voters, consumers, pundits, opinion leaders, analysts, reporters, etc.). These systems have been extended to facilitate IM, E-mail, and telephone interactions. These media have been successfully integrated because of standards (Jabber, POP3, SMTP, POTS, IMAP) that require that all participant applications conform to a set data format that allows interaction with this data in a predictable way.
Blogs and other CGM generate business value for their owners, both on private sites that use custom or open source software to manage their communications, and for massive public hosts. Because these sites can generate advertising revenue, there is a drive by author/owners to protect the content on these sites, so readers/subscribers/peers have to visit the site, and become exposed to revenue generating advertising, in order to participate in/observe the communication. Because of this financial disincentive, there is no unifying standard for blogs which contains complete data. RSS and Atom feeds allow structured communication of some portion of the communication on sites, but are often very incomplete representations of the data available on a given site. Sites also protect their content from being “stolen” by automated systems with an array of CAPTCHAs (“Completely Automated Public Turing test to tell Computers and Humans Apart”), email verification, mobile phone text message verification, password authentication, cookie tracking, Uniform Resource Locator (URL) obfuscation, timeouts and Internet Protocol (IP) address tracking.
The result is a massively diverse community that would be very valuable to understand and interact with, but which resists aggregation and unified interaction by way of significant technical diversity, resistance to complete data standards, and tests that attempt to require one-to-one human interaction with content.