FIG. 5 shows a schematic diagram of a structure of an image frame 500 of a current 3D stereo video signal. In the image frame 500, P1 represents an image object closest to human eyes, i.e., a stereo video object having the lowest 3D depth; P2 represents an image object farthest from the human eyes, i.e., a stereo video object having the deepest 3D depth; and P3 indicates a position of an additive object (e.g., a subtitle object). Taking a current stereo film signal as an example, its subtitle object is located at a fixed position (e.g., a fixed 3D depth) in different frames, and the fixed 3D depth is normally very low. However, the 3D depth of a primary image object or a primary scene changes dynamically across different frames of the stereo film signal. When the 3D depth of a subtitle (e.g., P3) of a certain frame is significantly different from that of the primary image object or the primary scene of that frame (e.g., P2), human eyes cannot focus on the subtitle and the primary image object at the same time, so the focal length needs to be continuously changed between the two. In such a situation, observing such images for a long time easily fatigues human eyes, thus reducing the quality and pleasure of observing the stereo video. Accordingly, there is a need for a technique for dynamically detecting the 3D depth of a stereo image and dynamically adjusting the 3D depth of an additive object (e.g., a subtitle) according to the 3D depth of the stereo image.
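For illustration only, the need described above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the disclosed technique: the function name `adjust_subtitle_depth`, the parameter `max_gap`, and the representation of 3D depth as a single number per frame (a larger value meaning farther from the human eyes) are all hypothetical. The sketch keeps the fixed subtitle depth when it is close enough to the depth of the primary image object, and otherwise moves the subtitle toward the primary image object until the gap equals `max_gap`.

```python
def adjust_subtitle_depth(primary_depths, subtitle_depth, max_gap):
    """Return one subtitle depth per frame, kept within max_gap of the
    primary image object's depth in that frame.

    All names and the depth scale are hypothetical (larger = farther away).
    """
    adjusted = []
    for primary in primary_depths:
        gap = subtitle_depth - primary
        if abs(gap) <= max_gap:
            # Fixed subtitle depth is already close to the primary object.
            adjusted.append(subtitle_depth)
        elif gap < 0:
            # Subtitle is too far in front of the primary object: push it back.
            adjusted.append(primary - max_gap)
        else:
            # Subtitle is too far behind the primary object: pull it forward.
            adjusted.append(primary + max_gap)
    return adjusted


# Hypothetical example: a subtitle fixed at a very low depth (like P3) while
# the primary image object moves deeper from frame to frame (toward P2).
per_frame = adjust_subtitle_depth([2, 10, 30], subtitle_depth=1, max_gap=5)
```

In this example the subtitle stays at its fixed depth in the first frame and follows the primary image object in the later frames, so the difference in 3D depth that the human eyes must bridge never exceeds `max_gap`.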