This invention relates to representing an image.
The process of finding and retrieving images stored on electronic media (e.g., a computer, the Internet) has become increasingly difficult for a variety of reasons. For instance, with the explosive growth of the Internet, the number of searchable images available on the Internet has dramatically increased. With the increased number of images, the abilities of conventional systems, methods, and computer programs to perform searching and retrieval functions in an efficient, useful, and timely manner have been challenged.
The ability of conventional systems, methods, and computer programs to efficiently find and retrieve desired images in a database has been hampered by poor organization and inconsistent formatting of the images being searched and/or retrieved. Similar problems also may be experienced by other electronic applications involving a large quantity of images that may be searched for and retrieved. These problems may be compounded when the desired search result includes multiple formats (e.g., images and text).
In one general aspect, representing an image includes extracting information reflecting one or more characteristics of the image, and using the information to compute a first histogram vector. The first histogram vector includes one or more vector elements, each representing information for a different characteristic extracted about the image. Multiple subsets are identified within at least one of the vector elements included in the first histogram vector and a second histogram vector is created. The second histogram vector includes a vector element for each of the subsets identified within the vector elements included in the first histogram vector. Data within the vector elements of the second histogram vector represent the extracted image characteristics.
Implementations may include one or more of the following features. For example, the first histogram vector may be computed based on the extracted information. The first histogram vector may include one or more vector elements, each representing information for a different combination of characteristics extracted about the image. Representing an image may further include applying a weighting factor to the first histogram vector.
In another general aspect, representing an image includes extracting information about the image, computing a histogram based on the extracted information, and calculating a posterized histogram based on the computed histogram.
Implementations may include one or more of the following features. For example, the extracted information may include more than one type of information about the image. The multiple types of information may be used to compute a joint histogram, from which a posterized joint histogram may be calculated.
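As a minimal sketch of how two types of per-pixel information might be combined into a joint histogram, the following Python counts the pixels that fall into each combination of two quantized features. The function name `joint_histogram`, the choice of color and edge density as the two features, and the assumption that each feature has already been quantized to a small integer bin index are illustrative only and are not taken from this description.

```python
from collections import Counter


def joint_histogram(feature_a, feature_b, bins_a, bins_b):
    """Count pixels for each (a, b) combination of two quantized features.

    feature_a and feature_b are equal-length sequences of integer bin
    indices, one entry per pixel (an assumed input format).
    Returns a flat vector with bins_a * bins_b elements, ordered so that
    element a * bins_b + b holds the count for combination (a, b).
    """
    if len(feature_a) != len(feature_b):
        raise ValueError("both features must describe the same pixels")
    counts = Counter(zip(feature_a, feature_b))
    return [counts[(a, b)] for a in range(bins_a) for b in range(bins_b)]


# Hypothetical 4-pixel image: quantized color bin and edge-density bin.
color = [0, 0, 1, 1]
edge_density = [0, 1, 1, 1]
hist = joint_histogram(color, edge_density, bins_a=2, bins_b=2)
```

In this sketch each vector element corresponds to one combination of features, so a two-bin color feature and a two-bin edge-density feature yield a four-element vector.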
Representing an image also may include applying a weighting factor to a selected vector element value in the joint histogram based on at least one type of information extracted about the image before the posterized joint histogram is calculated, such that the joint histogram represents a weighted joint histogram. The posterized joint histogram may be calculated based on the weighted joint histogram. The weighting factor also may be applied to the joint histogram based on more than one type of information extracted about the image.
Applying the weighting factor may include extracting one or more additional types of information about the image that differ from the types of information used to compute the histogram. The additional types of information may be used to determine whether to modify a vector element value in the joint histogram, which includes one or more vector elements. The additional types of information may include a centeredness feature for each pixel in the image, which may be used to determine whether to modify the one or more vector element values in the joint histogram.
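One way a centeredness feature could drive the weighting is sketched below in Python. Everything specific here is an assumption made for illustration: the definition of a centered pixel as one lying in the middle `fraction` of each image dimension, the weighting factor of 2.0, the choice to apply the weight to each pixel's contribution rather than to a finished element value, and the names `is_centered` and `weighted_joint_histogram`.

```python
def is_centered(row, col, height, width, fraction=0.5):
    """Return True if the pixel lies in the central region of the image.

    The central region is assumed to span `fraction` of each dimension,
    centered in the image; this definition is an illustrative assumption.
    """
    row_margin = height * (1 - fraction) / 2
    col_margin = width * (1 - fraction) / 2
    return (row_margin <= row < height - row_margin
            and col_margin <= col < width - col_margin)


def weighted_joint_histogram(color, edge, bins_color, bins_edge,
                             center_weight=2.0):
    """Build a joint histogram in which centered pixels count for more.

    color and edge are 2-D lists (rows of integer bin indices) with the
    same shape. Each pixel adds 1.0 to its (color, edge) element, or
    center_weight if the pixel is centered.
    """
    height, width = len(color), len(color[0])
    hist = [0.0] * (bins_color * bins_edge)
    for r in range(height):
        for c in range(width):
            weight = center_weight if is_centered(r, c, height, width) else 1.0
            hist[color[r][c] * bins_edge + edge[r][c]] += weight
    return hist


# Hypothetical 4x4 image: one color bin everywhere, edge bin 1 in the middle.
color = [[0] * 4 for _ in range(4)]
edge = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
weighted = weighted_joint_histogram(color, edge, bins_color=1, bins_edge=2)
```

In this example the four central pixels each contribute 2.0 and the twelve border pixels each contribute 1.0, so the element for the centered feature combination is emphasized relative to a plain count.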
Additionally or alternatively, representing an image may include, before calculating the posterized joint histogram, applying a weighting factor to a selected vector element value in the joint histogram. The weighting factor may be based on at least one type of information extracted about the image that differs from the types of information used to compute the joint histogram. The joint histogram then represents a weighted joint histogram that reflects application of the weighting factor, and the posterized joint histogram may be calculated based on the weighted joint histogram. The weighting factor also may be based on more than one type of information extracted about the image that differs from the types of information used to compute the joint histogram.
The information extracted from the image may include at least one of color, edge density, texturedness, gradient magnitude, and rank features about pixels in the image. Additionally or alternatively, the information extracted from the image may include more than one of the above-listed features.
Computing the histogram may include computing a first histogram vector based on information extracted about the image. The first histogram vector may include one or more vector elements, each representing information for a different feature extracted about the image. Additionally or alternatively, the first histogram vector may include one or more vector elements each representing information for a different combination of features extracted about the image.
Calculating a posterized histogram may include identifying multiple subsets within at least one of the vector elements included in the first histogram vector and creating a second histogram vector that includes a vector element for each of the subsets identified within the vector elements included in the first histogram vector. The data within the vector elements of the second histogram vector represent the extracted image features.
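The expansion of a first histogram vector into a second, posterized vector can be sketched as follows. This is a hedged illustration in Python, not a definitive implementation: the function name `posterize`, the half-open range convention, and the particular range boundaries are assumptions chosen for the example, and the binary encoding follows the implementation in which element data are binary.

```python
def posterize(histogram, boundaries):
    """Expand each histogram element into one binary element per range.

    boundaries is an ascending list of lower bounds; range i covers
    boundaries[i] <= value < boundaries[i + 1], and the last range is
    open-ended. Values below boundaries[0] fall in no range.
    """
    uppers = boundaries[1:] + [float("inf")]
    posterized = []
    for value in histogram:
        for lower, upper in zip(boundaries, uppers):
            posterized.append(1 if lower <= value < upper else 0)
    return posterized


# Hypothetical normalized first histogram vector with three elements.
first_vector = [0.05, 0.30, 0.65]
# Assumed ranges (subsets): [0.0, 0.1), [0.1, 0.5), [0.5, inf)
second_vector = posterize(first_vector, boundaries=[0.0, 0.1, 0.5])
```

Here each of the three elements of the first vector is split into three ranges, so the second vector has nine elements, and each element indicates whether the corresponding weight falls within its range.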
In yet another general aspect, a representation of at least a portion of an image includes a first element that includes data indicating whether the image includes a range of weights corresponding to a first feature. The representation also includes a second element, different from the first element, that includes data indicating whether the image includes a range of weights corresponding to a second feature.
Implementations may include one or more of the following features. For example, the data included in the first and second elements may be binary.
In still another general aspect, a representation of at least a portion of an image includes a first element that includes data indicating whether the image includes a range of weights corresponding to a first combination of features. The representation also includes a second element, different from the first element, that includes data indicating whether the image includes a range of weights corresponding to a second combination of features.
Implementations may include one or more of the following features. For example, the data included in the first and second elements may be binary.
These general and specific aspects may be implemented using a system, a method, or a computer program, or any combination of systems, methods, and computer programs.
Other features and advantages will be apparent from the description and drawings, and from the claims.