Recent years have seen a precipitous rise in the use of digital visual media on client computing devices. Indeed, individuals and businesses increasingly utilize laptops, tablets, smartphones, handheld devices, and other mobile technology for a variety of tasks involving digital visual media. For example, individuals and businesses increasingly utilize smartphones to capture, view, and modify digital visual media such as portrait images, “selfies,” or digital videos.
Although conventional digital visual media systems allow users to capture and modify digital visual media, they also have a number of significant shortcomings. For example, conventional digital visual media systems can utilize cameras to capture digital visual media, but cannot easily, quickly, or efficiently select or segregate individual objects from other pixels portrayed in the digital visual media.
Some conventional digital visual media systems assist users in segregating an object portrayed in a digital image by having the user manually trace a boundary line around the object. Conventional systems that rely on manual tracing, however, have significant drawbacks in terms of accuracy, speed, and efficiency. Indeed, applying such conventional systems generally requires a significant amount of time and still results in inaccurate object segmentation.
Other conventional digital image editing systems select an object in a digital image by applying a machine learning classification model. Specifically, conventional digital image editing systems can apply a classification model that categorizes an object portrayed in a digital image into one of a plurality of object categories and then segments the object based on the determined object category. Unfortunately, these conventional tools also have a number of shortcomings.
As an initial matter, conventional systems that utilize classification models are rigid and limited in applicability. For example, conventional systems that utilize classification models generally utilize a limited number (e.g., 20 or 80) of classification categories. Such limited numbers are far from sufficient to cover the variety of objects that individuals or businesses routinely encounter in digital visual media. In addition, conventional digital visual media systems that utilize classification models have high computational performance requirements that make them infeasible to operate on mobile devices. Indeed, applying such classification models on mobile devices requires far more memory and processing power than typical mobile devices can afford. Furthermore, conventional digital visual media systems that utilize classification models cannot operate in real-time across a plurality of digital images. For example, conventional digital visual media systems cannot segment objects portrayed in a real-time digital visual media feed (e.g., a live video feed from a smartphone camera).
These and other problems exist with regard to identifying objects in digital visual media.