A multimedia file is a file that contains multimedia data; data such as video, audio, and subtitles are encapsulated in one multimedia file according to the relevant specifications. The format of a multimedia file therefore has two aspects: an outer file format and an inner encoding format. The file format refers to the manner in which the multimedia data is encapsulated, for example, AVI, MOV, MKV, RMVB, or MP4. The encoding format refers to the compression algorithm applied to the multimedia data: H.264, VP8, MPEG-4, and WMV9 are common video encoding formats, and MP3, AAC, AC-3, and AMR are common audio encoding formats. Because the file format and the encoding format are only loosely coupled, they are not necessarily in a one-to-one correspondence. For example, a file format such as MP4 or MKV may contain data in different encoding formats. However, a few file formats, such as RMVB and WebM, are used together with a uniquely specified video or audio encoding format.
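The loose coupling between file format and encoding format described above can be sketched as a small lookup table. This is only an illustrative sketch: the table `CONTAINER_CODECS`, the class `MediaFormat`, and the function `is_valid_combination` are hypothetical names, the codec lists are deliberately incomplete, and Vorbis is assumed here as the audio codec paired with WebM.

```python
from dataclasses import dataclass

# Illustrative and incomplete: which encoding formats each file format
# (container) may carry. MP4 and MKV accept several codecs, while WebM is
# modeled as tied to one video codec and one audio codec (assumption).
CONTAINER_CODECS = {
    "MP4": {"video": {"H.264", "MPEG-4"}, "audio": {"AAC", "MP3", "AC-3"}},
    "MKV": {"video": {"H.264", "VP8", "MPEG-4"}, "audio": {"AAC", "MP3", "AC-3"}},
    "WebM": {"video": {"VP8"}, "audio": {"Vorbis"}},
}


@dataclass(frozen=True)
class MediaFormat:
    """The two aspects of a multimedia file's format."""
    container: str    # outer file format, e.g. "MP4"
    video_codec: str  # inner video encoding format, e.g. "H.264"
    audio_codec: str  # inner audio encoding format, e.g. "AAC"


def is_valid_combination(fmt: MediaFormat) -> bool:
    """Return True if the container may carry both codecs (per the table)."""
    allowed = CONTAINER_CODECS.get(fmt.container)
    if allowed is None:
        # Containers missing from this sketch's table are simply rejected.
        return False
    return (fmt.video_codec in allowed["video"]
            and fmt.audio_codec in allowed["audio"])
```

Under these assumptions, `MediaFormat("MP4", "H.264", "AAC")` and `MediaFormat("MKV", "VP8", "MP3")` are both accepted, whereas `MediaFormat("WebM", "H.264", "AAC")` is rejected because the table ties WebM to a single codec pair.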
Given the foregoing multimedia file formats, playing a video file mainly involves three phases: media separation, media decoding, and media rendering. Media separation is to identify the file format of a multimedia file by using an internal separator, separate out the encoded data such as video, audio, and subtitles, and deliver the encoded data to a decoder. Media decoding refers to decoding of the video data and the audio data by the decoder, to obtain restored, uncompressed video data frames and audio data frames. Media rendering is to present the decoded video data frames as images on a display device and to output the decoded audio data frames. Of the three phases, media decoding generally consumes the most computation; video decoding in particular requires far more computation than separation or rendering, and also more than processing of the audio data. However, the processing capability of the central processing unit (CPU) of a device such as a mobile phone is limited, and therefore playback is often not smooth when a user plays a high-resolution video on such a device. To make up for the inadequate video playback capability of the CPU, most smartphones have a built-in graphics processing unit (GPU) chip that supports decoding of multimedia files in common formats, and a multimedia file can be played on the GPU by calling the system's multimedia playback function.
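The three phases can be sketched as a chain of small functions. This is a toy model under stated assumptions, not a real player: the names `Packet`, `Frame`, `separate`, `decode`, `render`, and `play` are hypothetical, the "container" is just a list of (stream, payload) pairs, and "decoding" merely reverses the bytes as a stand-in for decompression.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Packet:
    stream: str     # "video", "audio", or "subtitle"
    payload: bytes  # still-encoded (compressed) data


@dataclass
class Frame:
    stream: str
    data: bytes     # stand-in for uncompressed frame data


def separate(container: Iterable[Tuple[str, bytes]]) -> Iterator[Packet]:
    """Phase 1, media separation: split the file into per-stream packets."""
    for stream, payload in container:
        yield Packet(stream, payload)


def decode(packets: Iterable[Packet]) -> Iterator[Frame]:
    """Phase 2, media decoding: toy 'decoder' that reverses the payload."""
    for packet in packets:
        yield Frame(packet.stream, packet.payload[::-1])


def render(frames: Iterable[Frame]) -> List[str]:
    """Phase 3, media rendering: record what would be shown or played."""
    return [f"{frame.stream}:{frame.data.decode()}" for frame in frames]


def play(container: Iterable[Tuple[str, bytes]]) -> List[str]:
    """Run the three phases in order: separate, decode, render."""
    return render(decode(separate(container)))
```

In a real player the `decode` step is where most of the computation is spent, which is why it is the phase typically handed to dedicated hardware.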
In the existing technology, a GPU acquires a multimedia file in a specified format, uses a separator to separate out the encoded data of the multimedia file, decodes the encoded data by using a decoder, and plays the data frames obtained through decoding.
During implementation of the present disclosure, the inventors found that the existing technology has at least the following problem:
The file format, the video encoding format, and the audio encoding format of a multimedia file played by the GPU must all match the formats specified by the GPU, and the file format is fixed; as a result, the range of multimedia files that the GPU can play is limited.
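The limitation can be illustrated with a minimal capability check. The class `GpuCapabilities`, the function `gpu_can_play`, and the example capability set are hypothetical and do not describe any particular chip; the sketch only shows that all three formats must match at once.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class GpuCapabilities:
    """Formats a (hypothetical) GPU playback path accepts."""
    containers: FrozenSet[str]
    video_codecs: FrozenSet[str]
    audio_codecs: FrozenSet[str]


def gpu_can_play(container: str, video_codec: str, audio_codec: str,
                 caps: GpuCapabilities) -> bool:
    """True only if file format, video codec, and audio codec ALL match."""
    return (container in caps.containers
            and video_codec in caps.video_codecs
            and audio_codec in caps.audio_codecs)


# Assumed example: a GPU path that accepts only MP4 with H.264 and AAC.
EXAMPLE_CAPS = GpuCapabilities(
    containers=frozenset({"MP4"}),
    video_codecs=frozenset({"H.264"}),
    audio_codecs=frozenset({"AAC"}),
)
```

With `EXAMPLE_CAPS`, an MKV file carrying H.264 video and AAC audio is rejected even though both encoding formats are supported, because the file format itself does not match.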