Various systems allow users to generate depth maps or 3-dimensional (3D) representations of structures using image frames of videos or still images. Typically, such systems require an immense amount of computing power, large numbers of images captured from different locations, or images captured under special conditions in a laboratory. However, these systems are generally unable to create depth maps from still images or from images that differ very little from one another.