1. Technical Field
The present disclosure relates generally to identifying objects in digital visual media. More specifically, one or more embodiments of the present disclosure relate to systems and methods that utilize deep learning techniques to automatically identify objects in digital images.
2. Background and Relevant Art
Recent years have seen a rapid proliferation in the use of digital media, such as digital photography. Digital photography has several advantages that draw individuals and businesses to increasingly utilize digital photography. One significant advantage of digital photography is the ability for a user to edit or otherwise customize a digital image for a particular purpose. Although there are numerous tools used to edit a digital image, one tool that users often use is a segmentation tool that is able to identify and select a specific portion of a digital image during the editing process. For example, users routinely desire to select, segregate, and/or modify a digital representation of an object (e.g., a person) in a digital image separately from a background in the digital image (e.g., to replace the background or otherwise modify the individual portrayed in the digital image). Accordingly, there is an increasing demand for systems that can distinguish pixels that correspond to an object in a digital image from pixels that correspond to a background of the digital image.
Some conventional digital image editing systems assist users in segmenting an image to distinguish an object portrayed in a digital image from the background of the digital image. These conventional systems, however, have a number of disadvantages. For example, conventional systems do not calculate or generate a cohesive boundary between the pixels that correspond to an object portrayed in the digital image and the pixels that correspond to a background. In particular, many conventional systems use a segmentation process that ignores, or often degrades, the quality of boundaries between the object portrayed in the digital image and the background. Accordingly, conventional systems frequently produce results that are unsatisfying and require significant manual labor to correct.
Specifically, conventional systems often produce false positive pixel identification where pixels that correspond to the background are incorrectly identified as pixels that correspond to the object. The false positive pixel identification produces results where several portions of the background are incorrectly selected, which ultimately provides a flawed segmentation.
In addition, conventional systems produce false negative pixel identification where pixels that correspond to the object are incorrectly identified as background pixels. In the case of false negative pixel identifications, the resulting selection of pixels produces an incomplete capturing of the object portrayed in the image. For example, a portion, or in many cases several portions, of the object portrayed in the digital image appear to be cut off in the results of the segmentation process. Therefore, based on the false negative pixel identification, conventional systems often produce an incomplete segmentation of the image.
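For purposes of illustration only, the two kinds of pixel misidentification described above can be expressed as counts over a pair of binary masks. The following is a minimal sketch, assuming that a segmentation result and a correct selection are each available as a grid in which 1 marks an object pixel and 0 marks a background pixel. The function name pixel_errors and the example masks are hypothetical and do not describe any particular conventional system.

```python
def pixel_errors(predicted, truth):
    """Count false positive and false negative pixels between two binary masks.

    Both masks are lists of rows of equal size; 1 marks an object pixel
    and 0 marks a background pixel (an assumption made for this sketch).
    """
    false_pos = 0  # background pixels incorrectly identified as object
    false_neg = 0  # object pixels incorrectly identified as background
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p == 1 and t == 0:
                false_pos += 1
            elif p == 0 and t == 1:
                false_neg += 1
    return false_pos, false_neg


# Hypothetical 3x4 example: the object occupies the two middle columns.
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
# A flawed segmentation: one background pixel selected, two object pixels missed.
predicted = [
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

fp, fn = pixel_errors(predicted, truth)
print(fp, fn)  # prints "1 2"
```

In this example, the single false positive corresponds to an incorrectly selected portion of the background, and the two false negatives correspond to a portion of the object that appears cut off.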
Moreover, many conventional systems produce false negative pixel identification based on conventional object detectors used in conventional systems. In general, object detectors in conventional systems attempt to detect an object within a digital image, and then crop out a portion of the digital image that includes the object to obtain a smaller portion of the image in the hopes of simplifying a segmentation process. Conventional object detectors, however, often cause more harm than good when used as part of a conventional segmentation process. In particular, conventional object detectors often fail to detect the entirety of an object, and as such, conventional object detectors often crop out one or more portions of an object prior to segmentation. As such, conventional systems often produce a segmentation that completely fails to properly identify large portions of an object.
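By way of a hedged illustration, the effect of an object detector that crops an image too tightly can be sketched as a count of object pixels that fall outside the crop. Any such pixel is unavailable to a subsequent segmentation process and therefore necessarily becomes a false negative. The function name object_pixels_lost_by_crop, the box convention (top, left, bottom, right, with bottom and right exclusive), and the example values are assumptions made for this sketch only.

```python
def object_pixels_lost_by_crop(truth, box):
    """Count object pixels that lie outside a crop box.

    truth is a binary mask (list of rows; 1 marks an object pixel).
    box is (top, left, bottom, right), with bottom and right exclusive;
    this convention is an assumption made for the sketch.
    """
    top, left, bottom, right = box
    lost = 0
    for r, row in enumerate(truth):
        for c, value in enumerate(row):
            inside = top <= r < bottom and left <= c < right
            if value == 1 and not inside:
                lost += 1
    return lost


# Hypothetical 4x4 mask: the object fills the two middle columns.
object_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
# A detector box that misses the bottom row of the object.
tight_box = (0, 1, 3, 3)

lost = object_pixels_lost_by_crop(object_mask, tight_box)
print(lost)  # prints "2"
```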
Unfortunately, the process for a user to manually fix an incorrectly segmented image resulting from a conventional system is often time intensive and technically difficult because of the irregular shapes that can exist in an incorrectly segmented image. In fact, although the process to manually select an object portrayed in a digital image is difficult and time intensive, manually segmenting an image is often faster and easier for a user compared to having to fix or adjust an incorrectly segmented image produced using conventional systems. Thus, many users become frustrated with the segmentation capabilities of conventional systems and choose to continue to simply use a manual segmentation process.
These and other problems exist with regard to identifying objects in digital visual media.