With the rapid development of Internet technologies and applications and the growth in the number of network users, content distribution and sharing characterized by pan-media and massive scale is becoming a mainstream direction of Internet application development, and the big data trend of cyber content is becoming increasingly prominent. The convenience and ubiquity of content distribution cause content big data in Cyberspace to be complex, heterogeneous, uneven and disordered, which makes it difficult to handle. Countries all over the world are actively searching for effective techniques to respond to the severe challenges brought by pan-media and content big data. The Second World Internet Conference, convened in Wuzhen, proposed the development concept of “An Interconnected World Shared and Governed by All”, stressing the promotion of global change in the Internet through sharing and governance. The core goal of the sharing and governance of the Interconnected World is the sharing and governance of content big data on the Internet. Because the continuously updated content big data on the Internet is abundant, unstructured or semi-structured, and highly heterogeneous, the key to sharing and governing it is an innovative content metadata identification method.
Traditionally, the Internet mainly employs the Uniform Resource Locator (URL) to identify resources, but a URL typically represents only the location of a content resource. It cannot describe the rich semantics of the content, which leaves content resources confused, disordered, and difficult to search and govern. Other content identification methods, such as the Digital Object Identifier and the content identifiers proposed by Information-Centric Networking, including hierarchical content identifiers (e.g., TRIAD, CCN and NDN), flat content identifiers (e.g., DONA, PSIRP and NetInf) and property information-based content identifiers (e.g., CBCB), generally have only a weak ability to describe the semantics and management information of the content, so the requirements for sharing and governance of content big data in Cyberspace are hard to meet. Additionally, the Dublin Core Metadata Element Set (Dublin Core) has become influential worldwide in recent years and has developed into a universal content metadata standard associated with the Uniform Resource Identifier (URI). However, the 15 core metadata elements in Dublin Core originated from the digital “library card catalog” and are insufficient for the sharing and governance of content big data in Cyberspace. It is therefore urgent to devise an innovative content metadata identification method capable of supporting the sharing and governance of content big data in a big data and pan-media environment, and accordingly to propose an application method that effectively supports efficient sharing and governance of the content big data.