Currently, when videos generated by an Android operating system and an Internetworking Operating System (“iOS” for short) system are joined into a single continuously playable video, the separately generated videos first need to be decoded into video frames by their respective decoding programs on their corresponding native operating systems. For example, videos generated by an Android system carry a sequence parameter set (“sps” for short) and a picture parameter set (“pps” for short) of the Android system, and an Android video decoding program uses the sps and pps parameters in those video files to convert each file into a respective sequence of image frames. Similarly, videos generated by an iOS system carry an sps parameter and a pps parameter of the iOS system, and an iOS video decoding program uses the sps and pps parameters in those video files to convert each file into a respective sequence of image frames. For videos produced on either operating system, the video bit streams in a video file are decoded into pictures frame by frame, and the pictures are displayed on a user interface for viewing. When the video files generated by the Android system and the iOS system are played in a mixed sequence as a continuous stream, a problem of incompatibility occurs, because decoding based on either the Android decoding program or the iOS decoding program renders the video files produced on the other operating system unplayable. For example, if decoding is performed by using the sps and pps parameters of the Android system, display of the iOS video is abnormal, and if decoding is performed by using the sps and pps parameters of the iOS system, display of the Android video is abnormal.
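The parameter sets described above are carried inside the bit stream itself: in an H.264 Annex B byte stream, for instance, the sps and pps are NAL units of type 7 and 8, preceding the coded picture slices. The following is a minimal illustrative sketch (the helper name and the synthetic byte strings are hypothetical, not taken from any system described above) of scanning a stream for those units:

```python
# Illustrative sketch: scan an H.264 Annex B byte stream for NAL units
# and report their types. Type 7 is the sequence parameter set (sps)
# and type 8 is the picture parameter set (pps) that a decoder needs
# before it can decode any frame. The helper name and the synthetic
# bytes below are hypothetical stand-ins for real video files.

def find_nal_types(stream: bytes):
    """Return the NAL unit types found after each start code."""
    types = []
    i = 0
    while i < len(stream) - 3:
        # A 3-byte start code is 0x000001; the 4-byte form 0x00000001
        # ends in the same pattern, so this match catches both.
        if stream[i:i + 3] == b"\x00\x00\x01":
            nal_type = stream[i + 3] & 0x1F  # low 5 bits of the NAL header
            types.append(nal_type)
            i += 3
        else:
            i += 1
    return types

# A synthetic stream standing in for one generated file: it carries its
# own sps (0x67 -> type 7) and pps (0x68 -> type 8) before an IDR
# slice (0x65 -> type 5).
sample = b"\x00\x00\x01\x67" + b"\x00\x00\x01\x68" + b"\x00\x00\x01\x65"
print(find_nal_types(sample))  # [7, 8, 5]
```

Because each file embeds its own sps/pps in this way, a decoder configured with one file's parameter sets cannot correctly interpret slices coded under another file's parameter sets, which is the incompatibility noted above.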
Therefore, when videos of an Android system and an iOS system are joined into a single continuously playing video stream, an existing technical solution first reads a video file of the Android system and decodes its video bit streams into pictures frame by frame by using the sps and pps parameters in the Android video file; then reads a video file of the iOS system and decodes its video bit streams into pictures frame by frame by using the sps and pps parameters in the iOS video file; and further provides all the decoded picture frames to a video encoder for uniform compression and encoding, generating new video bit streams and a set of new uniform sps and pps parameters, thereby producing a target video file that can be played continuously as a single image stream. However, when the video files generated by the Android system and the iOS system are joined by using the foregoing processing method, the decoding and encoding processes are time-consuming and cumbersome. Consequently, a large amount of time is consumed during the video joining process, which is disadvantageous to the real-time viewing experience of the user.
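The existing flow above can be modeled abstractly as follows. This is a toy sketch, not an actual implementation: a "file" is represented as a hypothetical (sps, pps, coded_frames) triple, and decoding and re-encoding are simulated with string operations. It only illustrates why the method is costly: every frame of both inputs must pass through a full decode and a full re-encode.

```python
# Toy model of the existing joining method: decode each file with its
# own sps/pps, then re-encode every decoded picture under one new,
# uniform sps/pps pair. All names and values are illustrative.

def decode(video_file):
    """Decode a file's bit stream frame by frame using its own sps/pps."""
    sps, pps, coded_frames = video_file
    return [f"picture({sps},{pps},{c})" for c in coded_frames]

def join_by_transcoding(android_file, ios_file):
    # Step 1: decode each file with the parameter sets it carries.
    pictures = decode(android_file) + decode(ios_file)
    # Step 2: feed every decoded picture to a single encoder, which
    # emits new bit streams governed by one uniform sps/pps pair.
    uniform_sps, uniform_pps = "sps_new", "pps_new"
    new_bitstream = [f"recoded({p})" for p in pictures]
    return (uniform_sps, uniform_pps, new_bitstream)

android = ("sps_a", "pps_a", ["a0", "a1"])
ios = ("sps_i", "pps_i", ["i0"])
sps, pps, stream = join_by_transcoding(android, ios)
print(len(stream))  # 3: every input frame was decoded and re-encoded
```

Note that the work grows with the total frame count of all inputs, since no frame can be copied through without transcoding; this is the source of the delay described above.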
Moreover, with the rise of lip-syncing joint performances, video clips from two terminals running different operating systems need to be joined accurately and rapidly, yet rapid joining of the video clips cannot be achieved by using the foregoing method. In addition, when video joining is performed by using the foregoing method, the time length of a video clip recorded by a user is often too long or too short, so that the positions of the user's lips do not accurately correspond to the audio they are supposed to match. Consequently, the integrity and seamlessness of the final product are greatly reduced.
For the foregoing problem, no effective solution has been provided at present.