Field
Embodiments of the present invention generally relate to object detection in images and videos, and more specifically, to a framework for incrementally expanding an object detector so that it better detects objects in image instances that differ substantially from the instances seen during initial training of the detector.
Description of the Related Art
Traditional image-based object detection systems identify objects in a video using a model trained on a source domain of images. Doing so allows the detection system to, e.g., localize an object and determine its bounding box for further use. The image-based model also allows the object detection system to distinguish objects from one another and to classify objects into predefined types. However, systems that use image-based detection models often fail to generalize from examples in the source (training) domain of images to the target (test) domain of videos. That is, an object detection system trained on one set of image data often fails to accurately detect objects in other data sets.
To address this issue, some image-based object detection systems perform domain adaptation. An example domain adaptation approach involves transferring learned information (e.g., labeled sample images) from the source domain to the target domain. However, a target domain of objects (e.g., in videos) may constantly evolve, e.g., through changes in object appearance and lighting. Further, as the object detection system adapts to a new target domain, the system disregards information learned from the previous target domain. Thus, if presented with the previous target domain again, the object detection system must adapt to that domain anew.