The present invention relates generally to the field of computerized comparison of images (for example, video images, still images) to detect anomalies and more particularly to computerized comparison of images (for example, video images, still images) to inspect components of power transmission towers and/or power transmission lines (collectively herein referred to as “tower components”).
Inspection of utilities such as electric transmission towers is a regulated activity (that is, companies and/or government entities are required to perform this activity). The inspection should detect parts that are broken, that may have rust, and so on. Conventionally, this inspection is performed using cranes that lift a cage. Person(s) in the cage visually inspect the different objects that make up the tower and/or power line structure. In other conventional tower component inspections, a helicopter is used. In still other conventional tower component inspections, unmanned aerial vehicles (UAVs) have been used. More specifically, the UAV is equipped with a video camera that takes video images of the tower components for inspection purposes. In the case of inspection with video from a UAV, one or more persons are required to visually inspect the video from the UAV to detect problems, such as cracks, dirt, rust, etc. in tower components.
There exists software for comparing two images of the same object, captured at two different times, in order to detect changes in the status of the object captured in the two images.
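By way of illustration only, one simple form of such a comparison is pixel-wise differencing of two grayscale images of the same object. The following is a minimal sketch and does not represent any particular existing software product. It assumes that the two images have already been aligned to each other, have identical dimensions, and are represented as rows of 0 to 255 intensity values. The function names `changed_pixels` and `changed_fraction` and the threshold value of 30 are hypothetical choices made for this example.

```python
def changed_pixels(before, after, threshold=30):
    """Return (row, col) positions whose intensity differs by more than threshold.

    Assumption: both images are already aligned and have identical dimensions.
    """
    same_shape = len(before) == len(after) and all(
        len(a) == len(b) for a, b in zip(before, after)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(before, after))
        for c, (a, b) in enumerate(zip(row_a, row_b))
        if abs(a - b) > threshold
    ]


def changed_fraction(before, after, threshold=30):
    """Return the share of pixels flagged as changed (0.0 for empty images)."""
    total = sum(len(row) for row in before)
    if total == 0:
        return 0.0
    return len(changed_pixels(before, after, threshold)) / total


# Toy 3x3 images of the same object captured at two different times.
earlier = [[200, 200, 200], [200, 200, 200], [200, 200, 200]]
later = [[200, 200, 200], [200, 90, 200], [200, 205, 200]]

print(changed_pixels(earlier, later))  # positions exceeding the threshold
```

In this toy example, only the center pixel differs by more than the threshold, while the small 5-level variation in the bottom row is ignored. A practical comparison would additionally need to handle image registration, lighting differences, and noise, none of which are addressed by this sketch.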
It is known to create 3D models of objects that appear in multiple videos for the purpose of matching 3D models of an identical object that appears in multiple videos. For example, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004 (“Rothganger et al.”) states as follows: “Abstract: This paper presents a novel representation for dynamic scenes composed of multiple rigid objects that may undergo different motions and be observed by a moving camera. Multi-view constraints associated with groups of affine-invariant scene patches and a normalized description of their appearance are used to segment a scene into its rigid parts, construct three-dimensional projective, affine, and Euclidean models of these parts, and match instances of models recovered from different image sequences. The proposed approach has been implemented, and it is applied to the detection and recognition of moving objects in video sequences and the identification of shots that depict the same scene in a video clip (shot matching).”