Autonomous navigation of robots requires robust estimation of the robot's pose (position and orientation) as well as the associated 3D scene structure. Conventional algorithms and pipelines for visual Simultaneous Localization and Mapping (SLAM) require point correspondences between images or frames to estimate both the camera (robot) position and the scene structure. Feature-based methods for visual SLAM try to find these point correspondences using the scale-invariant feature transform (SIFT), speeded-up robust features (SURF), or Oriented FAST and Rotated BRIEF (ORB). Visual SLAM methods known in the art use these features to estimate the camera pose and the structure by minimizing the re-projection error through incremental bundle adjustment; however, they fail when the extracted points are too few or erroneous, especially when the scene contains inadequate texture. This may lead to partial reconstruction and to camera tracking stopping when there are too few 3D-2D correspondences, whether because of insufficient feature correspondences or insufficient 3D points from bundle adjustment. In contrast, direct SLAM methods do not depend on feature extraction; however, they are prone to erroneous camera pose estimation because the photometric error is estimated incorrectly when the lighting or the view changes. Direct SLAM methods may also fail to provide good camera estimates in the absence of a well-textured environment.
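The two error terms mentioned above can be illustrated with a minimal sketch. It is not taken from any particular SLAM system: the function names, the simple pinhole model with a single focal length, and the toy values are all assumptions chosen for illustration only. The re-projection error compares observed feature locations with projected 3D points, while the photometric error compares pixel intensities directly.

```python
def project(point_3d, rotation, translation, focal, principal):
    """Project a 3D world point to pixel coordinates (assumed pinhole model)."""
    # Transform into the camera frame: X_c = R * X_w + t
    xc = [sum(rotation[i][j] * point_3d[j] for j in range(3)) + translation[i]
          for i in range(3)]
    # Perspective division, then apply focal length and principal point
    u = focal * xc[0] / xc[2] + principal[0]
    v = focal * xc[1] / xc[2] + principal[1]
    return (u, v)


def reprojection_error(points_3d, observations, rotation, translation,
                       focal, principal):
    """Sum of squared pixel distances between observed and projected features."""
    total = 0.0
    for point, (u_obs, v_obs) in zip(points_3d, observations):
        u, v = project(point, rotation, translation, focal, principal)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total


def photometric_error(ref_intensities, cur_intensities):
    """Sum of squared intensity differences at corresponding pixels."""
    return sum((a - b) ** 2 for a, b in zip(ref_intensities, cur_intensities))


# Hypothetical toy data: identity pose, two 3D points, one feature off by a pixel.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
observed = [(320.0, 240.0), (571.0, 240.0)]
feature_cost = reprojection_error(points, observed, IDENTITY,
                                  (0.0, 0.0, 0.0), 500.0, (320.0, 240.0))

# The same two pixels after a uniform brightness increase of 10 grey levels.
direct_cost = photometric_error([100, 120], [110, 130])
```

In this toy case the photometric cost is non-zero even though the camera has not moved, which mirrors the sensitivity of direct methods to lighting changes described above. Likewise, the re-projection cost is only defined over the points that were actually matched, so a scene with little texture leaves few terms to minimize.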