Peer-to-Peer (P2P) networks are a particular kind of environment created at the application layer by a local software application, which is adapted to communicate with other users in the network that run the same software. This creates an overlay network at the application layer wherein every end user shares his/her own content and resources with the peers of the whole overlay.
The “cooperation” aspect represents the main virtue of P2P systems because it allows the community to download/upload content in a mutual cooperation mode and to grow indefinitely without the need for any relatively powerful/dedicated servers. This feature is commonly called network scalability.
There has long been interest in exploiting the functions of P2P networks in a media context, and more particularly in audio and video signal streaming for web TV applications, which are encouraged by the growing spread of broadband distribution networks. This is true both for the delivery of professional content and for the possible distribution of User Generated Content, where each user can be at the same time a producer and a consumer of content, becoming in this way a “prosumer” (producer/consumer).
FIG. 1 shows, in the form of a functional block diagram, the main components of a P2P system. Specifically, in the diagram of FIG. 1, the references P0 and P1 schematically show the (fixed/mobile) terminals of two end users who cooperate in a P2P system by making use of a tracker T, of a terminal operating as a client C0, and of a web server WS.
The file sharing in P2P systems is based upon the running of programs, which are used to create and maintain a network enabling the transmission of files among users. Users can therefore both download files from other users of the P2P network and specify file sets in the file system of their own terminal, which are adapted to be shared with others, i.e. to be made available to other users of the P2P network.
A file sharing protocol currently used in Peer-to-Peer networks and adapted to distribute large amounts of data is known as BitTorrent. Besides the original client version, this protocol is available in several substantially analogous implementations, for example, aria2, ABC, BitComet, BitTornado, Deluge, Shareaza, Transmission, μTorrent, and Vuze (formerly Azureus). At present, several clients are available for a variety of computing platforms, which are capable of preparing, requesting, and transmitting any type of file over a network by making use of a protocol of this kind.
Substantially, the system schematically shown in FIG. 1 is based on the use of .torrent files, which include metadata information about the original file to be shared by the P2P network users, and on the tracker T, which keeps track of the peers sharing the content. The tracker T plays the role of a central entity with which peers, such as P0 and P1, communicate periodically (substantially through a mechanism of periodical registration), so as to be aware of one another. The tracker T sends out and receives peer information and also maintains peer statistics.
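The periodical registration mechanism described above can be illustrated with a minimal in-memory sketch. It is not the BitTorrent tracker wire protocol: the class name Tracker, the method announce, the content identifier, the peer addresses, and the 1800-second interval are all hypothetical and chosen only for illustration.

```python
import time


class Tracker:
    """Hypothetical in-memory tracker: maps a content id to its active peers."""

    def __init__(self, interval=1800):
        # Seconds between expected re-registrations (assumed value).
        self.interval = interval
        # content_id -> {peer_address: timestamp of last registration}
        self.swarms = {}

    def announce(self, content_id, peer_address, now=None):
        """Register or refresh a peer; return the other peers sharing the content."""
        now = time.time() if now is None else now
        swarm = self.swarms.setdefault(content_id, {})
        # Forget peers that have missed two consecutive registration intervals.
        stale = [addr for addr, seen in swarm.items() if now - seen > 2 * self.interval]
        for addr in stale:
            del swarm[addr]
        swarm[peer_address] = now
        return sorted(addr for addr in swarm if addr != peer_address)


tracker = Tracker(interval=1800)
tracker.announce("content-1", ("10.0.0.1", 6881), now=0)                   # P0 registers
peers_for_p1 = tracker.announce("content-1", ("10.0.0.2", 6881), now=10)   # P1 learns of P0
```

In this sketch a peer that stops registering simply ages out of the list, which is one plausible way for a tracker to keep its peer information current.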
It will, however, be appreciated that while the diagram of FIG. 1 refers to a tracker T of a centralized kind, it is also possible to resort to distributed approaches (such as a Distributed Hash Table (DHT), for example) to keep track of the peers who are sharing a certain content at a certain point in time. It will be appreciated that what will be described in the following part of the present disclosure applies irrespective of whether a centralized approach or a distributed approach is resorted to. As for the BitTorrent clients sharing and downloading the content, at least one (i.e. the client denoted by C0 in FIG. 1) accesses the whole file made available by the web server WS for downloading. According to the current approach, the server WS is what the end user sees first at the moment of choosing the .torrent file. The web server WS is therefore one of the complementary actors to be taken into consideration while implementing a P2P network. To be more precise, the server WS is one of the possible entities that distribute the metadata, i.e. the .torrent file. Every peer in the network retrieves such a file to be able to access the media content itself. The way in which the metadata is retrieved is not necessarily defined in advance. The typical case is when the peers download it from the WS (or from another equivalent web server) through a normal client/server protocol. It is, however, possible to retrieve the metadata in other ways (via chat, Facebook, email, USB key, etc.).
In any case, the metadata may be distributed outside the P2P network. The overall structure of a torrent file (e.g. MyFile.torrent) includes the URL of the tracker T and a dictionary or look-up table (info) including the following keys. One key is name, which is a suggested name under which to save the entity. If the entity is a single file, this key may represent a file name. If the entity comprises several files, this key may map to a directory name. Another key is piece length, which is the size of each piece of the entity. A further key is a string (named “Pieces(*)”), which may include the concatenation of the SHA1 hashes of each piece of the entity.
A length key includes the length of the file in bytes. If this key is present, the entity is a single file. Otherwise, i.e. if the entity to be downloaded is a directory of multiple files, the “files” key will be present instead of a length, with the related list of files and their metadata information.
The files key includes a list of files and directories with the following keys: length: the length of the file in bytes; and path: a list of strings containing sub-directory names, the last string being the file name. In the case of a set of files, the directory name will be present. The .torrent files with this structure are metadata files that are created before the file or the files (i.e. the “entity”) are shared.
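As a rough illustration of the structure just described, the sketch below builds the metadata dictionary for a single-file entity and for a multi-file entity and serializes both with a minimal bencoding routine (bencoding being the serialization used by .torrent files). The tracker URL, file names, and sizes are hypothetical, and the hash strings are zero-filled placeholders rather than real SHA1 values.

```python
def bencode(value):
    """Minimal bencoder for ints, str/bytes, lists and dicts (illustrative only)."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


# Single-file entity: "length" is present, "files" is absent.
single = {
    "announce": "http://tracker.example/announce",  # hypothetical tracker URL
    "info": {
        "name": "MyFile.avi",        # suggested file name
        "piece length": 262144,      # 256 KiB pieces
        "pieces": b"\x00" * 20,      # placeholder: one 20-byte hash
        "length": 100000,            # file length in bytes
    },
}

# Multi-file entity: "files" replaces "length" and "name" is a directory name.
multi = {
    "announce": "http://tracker.example/announce",  # hypothetical tracker URL
    "info": {
        "name": "MyDir",
        "piece length": 262144,
        "pieces": b"\x00" * 40,      # placeholder: two 20-byte hashes
        "files": [
            {"length": 300000, "path": ["video", "clip.avi"]},
            {"length": 1200, "path": ["storyboard.txt"]},
        ],
    },
}

single_torrent = bencode(single)
multi_torrent = bencode(multi)
```

Writing single_torrent or multi_torrent to a file would give something shaped like MyFile.torrent, although a real client would of course compute the actual piece hashes first.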
Although they do not constitute the entity itself, .torrent files include the metadata that allow a BitTorrent client to download an entity (e.g., as already mentioned, the tracker URL, the filename, the number of pieces, etc. of the content). An advantage of .torrent files is that they are far smaller than the original entity, which in the case, e.g. of media content with high resolution, may reach a size in the order of Gigabytes. The peers wishing to download an entity must therefore first obtain a corresponding .torrent file and connect to the specified tracker. The latter tells them which of the other peers they can download the file pieces from. The users browse the web to find a torrent of interest, download it, and open it with a BitTorrent client. The client connects to the tracker or trackers specified in the .torrent file, from which it receives a list of peers currently transferring pieces of the file(s) specified in the torrent. The actual downloading process can now start, with each peer sharing his upload resources and his content in the overlay network, by exchanging blocks or “chunks” of the file.
The peer distributing a file (be it data or a file representing multimedia content) treats the file as a series of identically-sized pieces. The peer creates a checksum for each piece by using any suitable checksum algorithm, such as the SHA1 hashing algorithm, and records it in the metadata .torrent file. The size is the same for each piece, and may be configured by the user when he decides to create the metadata file. In the case of a relatively large payload, it may be possible to reduce the size of the metadata file by resorting to larger pieces, for example, larger than 512 Kbytes, but this may reduce the protocol efficiency.
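The piece-wise checksum computation and the trade-off between piece size and metadata size can be sketched as follows. The function names piece_hashes and pieces_field_size, the sample payload, and the 4 GiB figure are illustrative assumptions. Note that in practice the last piece may be shorter than the others when the payload length is not a multiple of the piece size.

```python
import hashlib

SHA1_LEN = 20  # a SHA1 digest is 20 bytes long


def piece_hashes(data, piece_length):
    """Concatenate the SHA1 digest of each fixed-size piece of data."""
    return b"".join(
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    )


def pieces_field_size(total_length, piece_length):
    """Bytes needed to store one SHA1 digest per piece."""
    number_of_pieces = -(-total_length // piece_length)  # ceiling division
    return SHA1_LEN * number_of_pieces


# Hypothetical 1000-byte payload cut into 256-byte pieces: 4 pieces, the last one shorter.
payload = bytes(range(250)) * 4
hashes = piece_hashes(payload, 256)

# Metadata cost for a 4 GiB entity at two piece sizes.
size_256k = pieces_field_size(4 * 2**30, 256 * 2**10)   # 16384 pieces
size_1m = pieces_field_size(4 * 2**30, 2**20)           # 4096 pieces
```

With these figures the hash string takes 327,680 bytes at 256 KiB per piece and 81,920 bytes at 1 MiB per piece, which shows how larger pieces shrink the metadata file.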
When another peer later receives a particular piece, the checksum of the received piece is compared with the recorded checksum, to check that the piece is error-free. In the case of the BitTorrent protocol, the output produced by the SHA1 algorithm for each piece is 20 bytes long and is listed in the torrent file in the field “Pieces,” so that this field is responsible for verification of the data pieces' integrity, and therefore of the integrity of the content itself.
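A minimal sketch of this integrity check is given below, assuming the hash string is a plain concatenation of 20-byte SHA1 digests indexed by piece number. The function name verify_piece and the sample data are hypothetical.

```python
import hashlib

HASH_LEN = 20  # a SHA1 digest is 20 bytes long


def verify_piece(index, piece_data, pieces_field):
    """Return True if piece_data matches the digest recorded for piece number index."""
    expected = pieces_field[index * HASH_LEN:(index + 1) * HASH_LEN]
    # An index beyond the recorded hashes yields a short slice and fails the check.
    return len(expected) == HASH_LEN and hashlib.sha1(piece_data).digest() == expected


# Hash string as the distributing peer would have recorded it (illustrative data).
good_pieces = [b"first piece", b"second piece"]
pieces_field = b"".join(hashlib.sha1(p).digest() for p in good_pieces)

ok = verify_piece(1, b"second piece", pieces_field)         # intact piece
corrupted = verify_piece(1, b"sec0nd piece", pieces_field)  # altered piece
```

A receiving peer would typically discard a piece for which the check fails and request it again, possibly from a different peer.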
In several contexts that involve the use of media content (computer vision, speech recognition, and information retrieval) and also in emerging standards such as MPEG-7 and P/META, it may be known how to resort to techniques for semantic feature extraction. This may facilitate the identification of a certain media content, for example, with the aim of its classification and archiving, and with possible advantages on the end-user side.
In the case of multimedia content, the so-called “storyboard” is an example of semantic feature extraction, which may be used to improve video browsing. Semantic content may be presented as text lines or “thumbnails,” as is the case, for example, in the YouTube® interface.
In the case of video files (for example, films or videoclips) it may be known how to resort to various techniques of motion estimation, labelling (or tagging), color histogram analysis, edge or shape detection, audio analysis, speech-to-text, speaker recognition, etc. as semantic feature extraction techniques useful for identifying key frames that, shown to the user, may give a sufficiently accurate indication of the content of a particular file. This may happen, for example, by showing a storyboard of a video file as a sequence of frames. An example of an application of these techniques is the VideoBAY® browser.
The display of a storyboard is a technique presently used in P2P networks, which may help the user to retrieve the semantic content before starting the downloading process of a file, particularly if the file is relatively large. The storyboard check tends to become a nearly constant habit, also to reduce the pollution caused by the dissemination of undesirable content. As a consequence, several .torrent files may be created from compressed archive files that include both the video and “semantic” data, such as a storyboard, the included songs, or various other types of semantic information related to the content. The semantic data may be archived in a compressed package with the original movie as additional and separate files.
This approach forces the user, in order to retrieve the semantic data, to access the P2P network and to download (at least partially) the package and at least a part of its content. The presence of a storyboard also may not necessarily offer a protection against malicious behavior, such as the intervention of somebody who may create a fake archive including a different content/storyboard association and create a new corresponding .torrent file.
In the case of the BitTorrent client known as Vuze (formerly Azureus), the centralized WS provides, besides the .torrent metadata file, information about the content, which may be retrieved through the BitTorrent network. The client application may embed a browser that may access the central server to retrieve a preview of available content. A click on the content preview allows the client to download the associated .torrent file, which is then used to access the P2P network. In this case, the association between the content and the related metadata may be guaranteed by a central database provided by the official Vuze web server.
This approach has two major drawbacks. In the first place, it centralizes the distribution of content and of the associated P2P metadata, forcing the client to connect to a specific web server to retrieve the metadata in question, which narrows the range of possible metadata distribution channels. Secondly, this approach is not applicable to lightweight devices (for example, mobile terminals), whose browsing capabilities are necessarily limited. It is to be noted that the traditional BitTorrent approach does not prevent the distribution of .torrent files via other means, such as email, USB keys, FTP, chat, Facebook, etc., which may be useful with respect to user generated content when the producer of the content wishes to control and limit its distribution. As already explained, the metadata distribution may take place outside the P2P network, and may be either public (for example, on a web server without limitations) or restricted to a community of peers. For example, if a user wishes to share the film of his/her wedding through the P2P network, he/she will generally not distribute the metadata (.torrent file) representing this content publicly, but only to relatives, possibly sending them an email with the related .torrent file enclosed.
U.S. Patent Application Publication No. 2007/033170 describes performing a query of content through a query of the associated metadata, also in a P2P environment, without providing any indication related to processes for associating and protecting semantic information linked to the content itself. Similarly, U.S. Patent Application Publication No. 2007/0038612 refers to the use of a multimedia bookmark, including information concerning a segment corresponding to an intermediate point within a method for indexing, query, identification, and processing of portions of a certain content. No reference is made therein to possible applications in a P2P context.
Moreover, U.S. Pat. No. 5,646,997 discloses an approach applicable to digital distribution systems. Information relating to a certain media content is obtained and formatted by using a portion of a data structure, so as to create a media package, with subsequent ciphering and storing of the same for a subsequent transfer onto a record medium. The aim of this is to allow a subsequent interpretation of this information to create information for the use of such a medium. The patent therefore deals with “watermarking” approaches for the protection of multimedia content, without taking into consideration a “Peer-to-Peer” scenario.