In digital video systems, such as network camera monitoring systems, video sequences are compressed before transmission using various video encoding methods. In many digital video encoding systems, two main modes are used for compressing the frames of a video sequence: intra mode and inter mode. In the intra mode, the luminance and chrominance channels are encoded by exploiting the spatial redundancy of the pixels in a given channel of a single frame via prediction, transform, and entropy coding. Thus, a macroblock of pixels may be encoded with reference to another, similar macroblock in the same frame. Frames encoded in this way are called intra-frames, and may also be referred to as I-frames. The inter mode instead exploits temporal redundancy between separate frames, and relies on a motion-compensation prediction technique that predicts parts of a frame from one or more other frames by encoding the motion in pixels from one frame to another for selected blocks of pixels. Thus, a macroblock of pixels may be encoded with reference to another, similar macroblock in another, previously decoded frame. Frames encoded in this way are called inter-frames, and may be referred to as P-frames (forward-predicted frames), which can refer to previous frames in decoding order, or B-frames (bi-directionally predicted frames), which can have any arbitrary display-order relationship of the frames used for the prediction, and which can refer to two or more previously decoded frames. Further, the encoded frames are arranged in groups of pictures, or GOPs, where each group of pictures starts with an I-frame, and the following frames are P-frames or B-frames. The structure of the groups of pictures may be called a GOP structure. One aspect of the GOP structure is the number of frames in a group of pictures, which is generally referred to as the GOP length. 
GOP lengths may vary from 1, meaning that there is just an intra-frame, and no inter-frames, in a group of pictures, to, e.g., 61,440, meaning that there is one intra-frame followed by 61,439 inter-frames in a group of pictures. Since intra-frames generally require more bits for representation of an image than inter-frames, motion video having longer GOP lengths will generally produce a lower output bit rate than motion video having shorter GOP lengths.
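The relationship between GOP length and output bit rate described above can be illustrated with a small calculation. The following is only a sketch under simplifying assumptions: every intra-frame is taken to have one fixed size and every inter-frame another, and the frame sizes and frame rate are hypothetical values chosen for illustration, not figures from any particular encoder.

```python
def average_bit_rate(gop_length, fps, intra_bits, inter_bits):
    """Average bit rate (bits/s) for a stream in which each group of pictures
    holds one intra-frame followed by gop_length - 1 inter-frames."""
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    bits_per_gop = intra_bits + (gop_length - 1) * inter_bits
    return fps * bits_per_gop / gop_length


# Hypothetical values, for illustration only.
INTRA_BITS = 400_000   # assumed size of one intra-frame
INTER_BITS = 40_000    # assumed size of one inter-frame
FPS = 30               # assumed frame rate

rates = {n: average_bit_rate(n, FPS, INTRA_BITS, INTER_BITS) for n in (1, 30, 300)}
for gop_length, rate in rates.items():
    print(f"GOP length {gop_length:>3}: {rate / 1e6:.3f} Mbit/s")
```

Under these assumptions the average bit rate falls as the GOP length grows, because the cost of the single intra-frame is spread over more frames. The gain flattens out for long GOPs, where the rate approaches that of a stream consisting only of inter-frames.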
At the site of reception of the encoded video sequence, the encoded frames are decoded. A concern in network camera monitoring systems is the available bandwidth for transmission of encoded video. This is particularly true in systems employing a large number of cameras. Further, this concern is especially important in situations where available bandwidth is low, such as when the video sequence is to be transmitted to a mobile device, e.g., a mobile phone, a PDA, or a tablet computer. An analogous problem occurs regarding storage of images, for instance when storing images on an on-board SD card in the camera. A compromise has to be made, where available bandwidth or storage is balanced against the interest of high-quality video images. A number of methods and systems have been used for controlling the encoding in order to reduce the bit rate of transmissions from the cameras. These known methods and systems generally apply a bit rate limit, and control the encoding such that the output bit rate from the cameras is always below the bit rate limit. In this way, it may be ensured that the available bandwidth is sufficient, such that all cameras in the system may transmit their video sequences to the site of reception, e.g., a control center, where an operator may monitor video from the cameras of the system, and where video may be recorded for later use. However, applying a bit rate limit to all cameras may lead to undesirably low image quality at times, since the bit rate limit may require severe compression of images containing a lot of detail, regardless of what is happening in the monitored scene. Recently, it has been proposed to use various schemes that alter the GOP structure in order to control the output bit rate. 
For instance, the GOP length may be varied, such that a longer GOP length is used when there is little or no motion in the monitored scene, thereby reducing the output bit rate, and a shorter GOP length is used when there is motion in the scene, allowing higher quality images at the price of a higher bit rate.
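One way to picture such a motion-dependent GOP length is the sketch below. It is not taken from any specific scheme: the normalized `motion_level` input, the linear mapping, and the default limits `min_gop` and `max_gop` are all assumptions made for illustration.

```python
def choose_gop_length(motion_level, min_gop=30, max_gop=300):
    """Map a motion level in [0.0, 1.0] to a GOP length.

    No motion gives max_gop (lower bit rate); full motion gives min_gop
    (higher image quality). The linear mapping is an illustrative assumption.
    """
    if not 0.0 <= motion_level <= 1.0:
        raise ValueError("motion_level must be between 0.0 and 1.0")
    if min_gop < 1 or max_gop < min_gop:
        raise ValueError("require 1 <= min_gop <= max_gop")
    return round(max_gop - motion_level * (max_gop - min_gop))


static_scene = choose_gop_length(0.0)   # little or no motion
some_motion = choose_gop_length(0.5)
busy_scene = choose_gop_length(1.0)     # a lot of motion
```

A real encoder would more likely decide frame by frame whether to insert an intra-frame than pick a length in advance, but the effect is the same: the distance between intra-frames changes over time.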
Recording of video sequences, particularly in monitoring or surveillance applications, may be based on one or more event triggers, e.g., a motion detection event trigger. In this manner, recording may be initiated when an event occurs, such as when movement occurs in a previously static scene. When recording is based on event triggers, it is often useful to also record a pre-event video sequence. For instance, if recording is triggered by a person moving in a region of interest representing a part of the monitored scene, it may be of interest to also record a video sequence showing how the person moved into that part of the scene. Similarly, it is generally useful to also record a post-event sequence capturing what happens after the person has moved out of the region of interest. In order to be able to record a pre-event video sequence when an event occurs, image frames may be continuously buffered in a first-in-first-out buffer, also referred to as a FIFO buffer. When an event occurs, image frames are retrieved from the buffer, such that they may be recorded preceding a video sequence that starts at the event. Recording may then be continued for a predetermined time after the event has passed. The length of time of the pre-event and post-event sequences may be set by a user.
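The pre-event buffering described above can be sketched as follows. This is a minimal illustration, not a description of any particular camera firmware: the `Frame` and `PreEventBuffer` names, the millisecond timestamps, and the simulated stream are hypothetical, and the encoded frame data itself is omitted.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Frame:
    timestamp_ms: int   # capture time in milliseconds
    is_intra: bool      # True for an intra-frame, False for an inter-frame


class PreEventBuffer:
    """FIFO buffer that keeps only the most recent pre_event_ms of frames."""

    def __init__(self, pre_event_ms):
        self.pre_event_ms = pre_event_ms
        self._frames = deque()

    def push(self, frame):
        self._frames.append(frame)
        # First in, first out: discard frames older than the pre-event window.
        while frame.timestamp_ms - self._frames[0].timestamp_ms > self.pre_event_ms:
            self._frames.popleft()

    def drain(self):
        """Return the buffered frames, oldest first, and empty the buffer."""
        frames = list(self._frames)
        self._frames.clear()
        return frames


# Hypothetical stream: 10 frames per second, event trigger after 100 frames,
# user-set pre-event time of 3 seconds, one intra-frame every 50 frames.
buffer = PreEventBuffer(pre_event_ms=3000)
for i in range(100):
    buffer.push(Frame(timestamp_ms=i * 100, is_intra=(i % 50 == 0)))

pre_event_frames = buffer.drain()   # recorded ahead of the event sequence
```

When the event trigger fires, the drained frames are written first and recording then continues with live frames until the post-event time has elapsed.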
However, if the GOP length is varied in order to control bit rate, it is not possible to ensure that the pre-event sequence is viewable. This is because decoding has to start at an intra-frame. Should the first image in the pre-event buffer be an inter-frame, the previous frame to which that inter-frame refers has been lost due to the FIFO principle used for the buffer. Depending on the GOP length used and the time set for the pre-event sequence, there may be an intra-frame in the pre-event buffer, but from a point in time closer to the event than the user desired. The likelihood of an intra-frame from sufficiently long before the event being present in the pre-event buffer may be increased by adding a predetermined safety period to the pre-event time set by the user, so that a few more seconds are actually stored in the pre-event buffer than the user has set as pre-event recording time. Still, this requires a large buffer, which will be unnecessarily large when a short GOP length is used. Further, if the GOP length is long, there may still not be room for a sufficient number of frames to ensure that there is an intra-frame in the buffer from before the set pre-event time. Hence, it may be seen that a need remains for an improved method of generating an event video sequence.
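The problem can be made concrete with a small simulation. The sketch below rests on simplifying assumptions that are not part of the text above: the GOP length is constant within each run, every inter-frame refers only to earlier frames in the same group of pictures, and the frame rate, event position, and pre-event time are hypothetical. The helper names `buffered_frames` and `decodable_part` are likewise invented for illustration.

```python
def buffered_frames(fps, gop_length, event_frame, pre_event_seconds):
    """Frames a FIFO buffer would hold when the event occurs, as
    (timestamp_ms, is_intra) tuples covering the last pre_event_seconds."""
    first = max(0, event_frame - pre_event_seconds * fps)
    return [(i * 1000 // fps, i % gop_length == 0) for i in range(first, event_frame)]


def decodable_part(frames):
    """Return the frames from the first intra-frame onward.

    Inter-frames ahead of the first intra-frame refer to frames that have
    already left the buffer, so they cannot be decoded.
    """
    for index, (_, is_intra) in enumerate(frames):
        if is_intra:
            return frames[index:]
    return []


# Hypothetical setup: 10 frames per second, event at frame 100,
# user-set pre-event time of 3 seconds (30 buffered frames).
decodable_counts = {
    gop: len(decodable_part(buffered_frames(10, gop, 100, 3)))
    for gop in (10, 20, 50)
}
for gop, count in decodable_counts.items():
    print(f"GOP length {gop}: {count} of 30 buffered frames are decodable")
```

In this run a GOP length of 10 leaves the whole pre-event sequence viewable, a GOP length of 20 leaves only the last 2 seconds viewable, and a GOP length of 50 leaves nothing viewable, since no intra-frame falls within the buffered 3 seconds.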