Various systems allow users to generate depth maps or 3-dimensional (3D) representations of structures using video frames or still images. Typically, such systems require an immense amount of computing power or a large number of images captured from different locations. Moreover, these systems are generally unable to create depth maps from still images or from images that differ very little from one another.