1. Field of the Disclosure
The present disclosure relates to a player-independent medium for storing digital multimedia content.
2. Related Art
Digital multimedia content includes digital images, digital video and digital audio. Each of these is formed of basic components that can be represented digitally. Pixels are the basic components of digital still images and digital video. The basic components of digital audio, for example, include samples of an analog audio waveform taken over time. Each of the basic components requires a certain amount of memory. Accordingly, the memory required for storing multimedia content increases with increased playback or presentation resolution and/or sampling rate of digital multimedia. Thus, to increase the quality of the image or sound, one must sacrifice memory space.
A digital image is represented by an array of pixels, each pixel being a discrete dot that appears on a display screen. In digital video, a series of images or pictures is displayed in rapid succession. There are many competing standards which are used to define a pixel's content (e.g., RGB, YUV, CMYK, HSI, HSV, and CIE). Regardless of which standard is used to define the content of each pixel, the amount of memory required to store all of the information for the digital images or pictures in a movie is substantial. For example, a typical feature-length film is 118 minutes with 24 frames or pictures per second. Each frame or picture on a current DVD has 720 pixels horizontally and 480 pixels vertically, with each pixel using two bytes for its color. Stored without compression, the entire film requires approximately 117.4 GB, the equivalent of multiple DVDs. To address this problem, image compression is used to remove or change some details that cannot readily be perceived by a viewer, so that creation of the image requires less information and the image can be stored using less memory. Video compression further reduces the memory required for storing video data by primarily storing and taking into account differences between successive frames or images.
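The storage figure in the example above follows from simple multiplication. A minimal sketch of that arithmetic, using only the numbers given in the text (the function name is illustrative and not part of the disclosure):

```python
def uncompressed_video_bytes(minutes, fps, width, height, bytes_per_pixel):
    """Bytes needed to store every frame of a film with no compression."""
    frames = minutes * 60 * fps
    return frames * width * height * bytes_per_pixel


# 118-minute film, 24 frames per second, 720 x 480 pixels, 2 bytes per pixel
total = uncompressed_video_bytes(118, 24, 720, 480, 2)

print(total)                        # 117448704000 bytes
print(f"{total / 1e9:.1f} GB")      # 117.4 GB (decimal gigabytes)
print(f"{total / 2**30:.1f} GiB")   # 109.4 GiB (binary gigabytes)
```

The result depends on whether a gigabyte is taken as 10^9 or 2^30 bytes; either way the film is far larger than a single DVD can hold.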
Just as there are multiple standards for defining the content of pixels, there are also many different standards for image compression and video compression such as, for example, MJPEG, MPEG-2, MPEG-4, WMV, and RealVideo. One of the reasons for this is that there are many different types of devices which can reproduce videos with a wide range of capabilities. Products with small screens, limited memory, or limited communications bandwidth require highly compressed videos. On the opposite end of the spectrum, products used in movie production require very high resolution or quality source material and thus use only a minimum of compression while sacrificing memory space to maintain the high quality of the image.
Digital audio exhibits the same limitations with respect to memory. That is, a high sampling rate is necessary to provide high quality audio, and a high sampling rate also requires a large number of samples to be stored. Accordingly, digital audio compression techniques are commonly used to reduce the amount of memory required to store the audio data. There is also a multiplicity of standards for audio compression such as, for example, the MP3, WMA, and AAC standards.
Because of the many different standards that are variously used for compressing audio and video data, not all devices can play back all bit streams of stored or streamed multimedia data. Thus, multimedia data that is compressed and stored on a memory card for playback on a computer or television, for example, will typically not be playable on a smaller or lower-resolution device, such as a cell phone.
An example of a compression standard is MPEG-4, which is the latest compression standard developed by the Moving Picture Experts Group (MPEG). MPEG-4 is used in a wide variety of devices including, but not limited to, cell phones, TVs, computers, set top boxes (cable and satellite), movie cameras, still cameras, and security systems. To satisfy such a broad range of equipment, MPEG-4 includes a group of several profiles or layers for accommodating various device capabilities. Because of all the different standards, and the different profiles within the MPEG-4 standard, a particular device that plays MPEG-4 files may not be able to play back all MPEG-4 bit streams.
A typical MPEG-4 encoder 100 is illustrated in FIG. 1. Uncompressed video data is input at, for example, 30 frames per second (or 24 frames per second for film movies). The uncompressed data is first converted from the spatial domain (i.e., pixel representation) into the frequency domain by a Discrete Cosine Transform (DCT) 102. After the transformation, the data is represented differently, but is still the same size as the spatial domain data. Representing the data in the frequency domain facilitates removal of those parts of the video that are difficult to perceive because fine details in the spatial domain are represented as high frequency components in the frequency domain. Image compression is effected by a quantizer 104 which is used to remove the high frequency components in the transformed data. Video compression is also effected by a motion detection/motion compression block 106 which determines where blocks of an image have moved or otherwise changed from one frame to the next. This helps to compress the data even further because less data is required to instruct vertical and/or horizontal movement of a part of a previous image or frame than the data required to store or stream the complete image or frame. The quantized/motion compressed data is fed to an entropy compression block 108 which performs a lossless compression of the remaining data, which further reduces the quantized/motion compressed data by a factor of 2 to 4.
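The transform and quantization steps described above can be illustrated on a single row of eight pixel values. The following is a minimal sketch only: it uses a one-dimensional orthonormal DCT rather than the two-dimensional block transform of a real encoder, and the quantizer step sizes are hypothetical values chosen for illustration, not values taken from the MPEG-4 standard or from FIG. 1.

```python
import math


def dct(samples):
    """Orthonormal 1-D DCT-II: spatial samples -> frequency coefficients."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(samples))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        coeffs.append(scale * s)
    return coeffs


def idct(coeffs):
    """Inverse of dct(): frequency coefficients -> spatial samples."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1 / n)
        s += sum(c * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k, c in enumerate(coeffs) if k > 0)
        samples.append(s)
    return samples


def quantize(coeffs, steps):
    """Divide each coefficient by its step size and round (the lossy step)."""
    return [round(c / q) for c, q in zip(coeffs, steps)]


def dequantize(levels, steps):
    """Scale quantized levels back to approximate coefficients."""
    return [level * q for level, q in zip(levels, steps)]


# One row of eight pixel intensities (a smooth gradient).
row = [10, 12, 14, 16, 18, 20, 22, 24]
# Hypothetical step sizes: coarser steps for higher frequencies.
steps = [1, 2, 4, 8, 16, 16, 32, 32]

levels = quantize(dct(row), steps)
restored = idct(dequantize(levels, steps))

print("quantized levels:", levels)
print("restored row:    ", [round(v, 1) for v in restored])
```

The transform output has as many values as its input, consistent with the observation that the data is the same size after the transformation. The saving comes from quantization: with coarse steps, small high-frequency coefficients round to zero, and runs of zeros are inexpensive for a later lossless stage to encode.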
The audio data is similarly compressed in an audio encode block 110 in which sampled audio data is transformed into the frequency domain, and filters and algorithms are applied to remove details of the audio information which cannot or would not be noticed by most people. This process is referred to as psycho-acoustic modeling.
The compressed video and audio streams are combined into a final stream by a multiplexer 112. Timing information is inserted into the stream so that the audio and video streams are synchronized when played back.
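The role of the timing information can be sketched as follows. This is an illustrative model only: the packet tuple layout, the microsecond timestamps, and the 20 ms audio packet interval are assumptions made for the example and do not reflect the actual MPEG-4 stream syntax.

```python
def multiplex(video_packets, audio_packets):
    """Interleave two packet lists into one stream ordered by timestamp.

    Each packet is a (timestamp_us, kind, payload) tuple. sorted() is
    stable, so a video packet precedes an audio packet with the same
    timestamp because the video list is placed first.
    """
    return sorted(video_packets + audio_packets, key=lambda p: p[0])


def demultiplex(stream):
    """Split a combined stream back into separate video and audio lists."""
    video = [p for p in stream if p[1] == "video"]
    audio = [p for p in stream if p[1] == "audio"]
    return video, audio


# Three video frames at 30 frames per second, timestamps in microseconds.
video = [(i * 1_000_000 // 30, "video", f"frame{i}") for i in range(3)]
# Four audio packets, assumed here to be spaced 20 ms apart.
audio = [(i * 20_000, "audio", f"audio{i}") for i in range(4)]

stream = multiplex(video, audio)
for timestamp, kind, payload in stream:
    print(f"{timestamp:>6} us  {kind:<5}  {payload}")
```

Because every packet carries its own timestamp, a player that separates the stream again can present each audio packet alongside the video frame nearest to it in time.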
To view the original video data which was compressed, the compressed data must be decoded using an MPEG decoder 200, for example, as shown in FIG. 2. When the video is to be played back, the MPEG bit stream which was previously encoded by the encoder 100 is fed to an MPEG stream demultiplexer 212 which separates the compressed audio and video streams. An entropy block 208 restores the data, and an inverse quantizer 204 produces DCT data. An inverse DCT (IDCT) block 202 transforms the data from the frequency domain into the spatial domain as pixels. A motion detection/motion compensation block 206 takes the instructions from the compressed stream and replaces those instructions with pixel data for each image, which is sent to the IDCT 202. The compressed audio stream is similarly decoded in an audio decoder to produce an audio stream.
The multimedia data files that are discussed above are stored and organized on a non-volatile memory using a file system which is then used by a host device to retrieve the file from the memory device. The file system is a method of keeping track of where files are located and also provides directories and/or folders to provide a hierarchical arrangement of the files. The file system also retains metadata about the files being stored.
The file system typically divides the memory into portions which may, for example, be referred to as sectors. A sector may be any number of bytes in size. However, a common size is 512 bytes. Groups of contiguous sectors may be managed as blocks or clusters which makes it easier and faster to manage the files. The determination of how big a block or cluster to manage is based on balancing time and space. Larger blocks or clusters result in wasted space, especially if small files (1-2 KB) are being stored. Multimedia files are large (1 MB or more) and therefore there is little wasted space even if 32 KB blocks (64 sectors of 512 bytes each) are used. As stated above, the file system keeps track of where files are located in the memory. The location of files is indicated by the starting address of blocks. Once a host device selects a file, the file system maps the request to a specific location in the memory. Thus, the file system obviates the requirement that a user or host device know where physically in the memory the requested file is saved.
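The trade-off between cluster size and wasted space can be made concrete with a short calculation. The sketch below assumes 512-byte sectors and that a file always occupies a whole number of clusters; the function names and the example file sizes are illustrative.

```python
SECTOR_BYTES = 512


def allocated_bytes(file_size, sectors_per_cluster):
    """Space a file occupies when storage is handed out in whole clusters."""
    cluster_bytes = SECTOR_BYTES * sectors_per_cluster
    clusters = -(-file_size // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes


def wasted_bytes(file_size, sectors_per_cluster):
    """Unused space at the end of the file's last cluster."""
    return allocated_bytes(file_size, sectors_per_cluster) - file_size


# A 1.5 KB file and a 1,000,000-byte file stored in 32 KB clusters
# (64 sectors of 512 bytes each).
for size in (1536, 1_000_000):
    waste = wasted_bytes(size, 64)
    share = waste / allocated_bytes(size, 64)
    print(f"{size:>9} bytes: {waste:>6} bytes wasted ({share:.1%} of allocation)")
```

The small file wastes 31,232 of its 32,768 allocated bytes, whereas the large file wastes 15,808 bytes, under two percent of its allocation.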
Flash memories typically use Windows FAT16 and FAT32 file systems. FIG. 3a shows that a 2 GB memory device using the FAT16 file system includes a Boot Sector, File Allocation Tables FAT#1 and FAT#2, a Root Directory, and Files and Subdirectories. When a host device accesses any memory device, the host device initially reads the first block, i.e., address 0000, to determine what is stored on the memory device. For FAT16 and FAT32 file systems, the first block contains the Boot Sector which includes information on where the other information is stored on the media. FIG. 3b illustrates the typical contents of a Boot Sector for a FAT16 file system. Based on the information in the Boot Sector, the host device then knows where to look for the root directory and requests to read the root directory to determine what files are in the memory device. FIGS. 4a and 4b show the contents of a memory device and Boot Sector, respectively, for a FAT32 file system. FIG. 5 is an example of metadata which is stored for each file.
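How a host device might locate the root directory from the Boot Sector can be sketched as below. The field offsets used here follow the commonly documented FAT16 BIOS Parameter Block layout and are an assumption for this example, since the contents of FIG. 3b are not reproduced in this text. The boot sector is synthetic and its values are hypothetical.

```python
import struct


def parse_fat16_boot_sector(sector):
    """Extract the layout fields a host needs from a 512-byte boot sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid boot sector")
    # Eight little-endian fields starting at byte offset 11.
    (bytes_per_sector, sectors_per_cluster, reserved_sectors, num_fats,
     root_entries, _total_sectors_16, _media, sectors_per_fat) = \
        struct.unpack_from("<HBHBHHBH", sector, 11)
    # The root directory follows the reserved area and all FAT copies.
    root_dir_sector = reserved_sectors + num_fats * sectors_per_fat
    # Each directory entry is 32 bytes; round up to whole sectors.
    root_dir_sectors = -(-root_entries * 32 // bytes_per_sector)
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "num_fats": num_fats,
        "root_dir_sector": root_dir_sector,
        "first_data_sector": root_dir_sector + root_dir_sectors,
    }


# Build a synthetic boot sector with hypothetical values for demonstration.
boot = bytearray(512)
struct.pack_into("<HBHBHHBH", boot, 11,
                 512,    # bytes per sector
                 64,     # sectors per cluster (32 KB clusters)
                 1,      # reserved sectors (the boot sector itself)
                 2,      # number of FAT copies (FAT#1 and FAT#2)
                 512,    # root directory entries
                 0,      # 16-bit total sector count (unused here)
                 0xF8,   # media descriptor
                 256)    # sectors per FAT
boot[510:512] = b"\x55\xaa"  # boot sector signature

layout = parse_fat16_boot_sector(bytes(boot))
print(layout)
```

With these values the root directory begins at sector 513 and file data begins at sector 545, so a host that has read only the first block knows where to issue its next read.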
FIG. 6 shows an example of a memory having two files stored using a FAT16 or FAT32 file system. The root directory shows two files stored. A pointer for each file which indicates where the files are stored is determined from the File Allocation Tables FAT#1 and FAT#2. Accordingly, when a host device requests to read one of the files, the request is directed to the address associated with the requested file.
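Resolving a file's location through the File Allocation Table can be sketched as follows. The sketch assumes the conventional FAT16 arrangement, in which the directory entry records a file's first cluster, each table entry holds the number of the next cluster, and values of 0xFFF8 or above mark the end of a file. The table contents, starting clusters, and layout constants are hypothetical and are not taken from FIG. 6.

```python
END_OF_CHAIN = 0xFFF8  # FAT16 entries at or above this value end a file


def cluster_chain(fat, first_cluster):
    """Follow table entries from a file's first cluster to its last."""
    chain = []
    cluster = first_cluster
    while cluster < END_OF_CHAIN:
        if cluster < 2 or cluster >= len(fat) or cluster in chain:
            raise ValueError(f"corrupt chain at cluster {cluster}")
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


def cluster_to_sector(cluster, first_data_sector, sectors_per_cluster):
    """Map a cluster number to its first sector (data clusters start at 2)."""
    return first_data_sector + (cluster - 2) * sectors_per_cluster


# Hypothetical table: entries 0 and 1 are reserved, entry 6 is free.
fat = [0xFFF8, 0xFFFF, 3, 4, 0xFFFF, 7, 0, 0xFFFF]

file_a = cluster_chain(fat, 2)  # contiguous file
file_b = cluster_chain(fat, 5)  # file split around the free cluster
print("file A clusters:", file_a)
print("file B clusters:", file_b)
print("file A sectors: ", [cluster_to_sector(c, 545, 64) for c in file_a])
```

The host supplies only the file's starting cluster; the table provides every subsequent location, so the user or host never needs to know the physical layout of the file.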