Field
The present disclosure is directed generally to robot systems, and more specifically, to systems and methods of remotely controlling and managing robot systems.
Related Art
In the related art, manual robot teleoperation (e.g., control) with a video feed via keyboard or joystick can be a frustrating and tedious task. Such related art implementations may also be error prone. For example, latency in the reception of the video, even in small amounts, can lead to collisions between the robot and objects, since the user sends commands based on out-of-date information.
The related art methods for autonomous localization and navigation include generating a map through simultaneous localization and mapping techniques, and localizing within the map with the use of laser scanners or beacons. However, these related art techniques require an investment of time to set up and need to be updated as the environment changes.
Human vision provides the sense of a single comprehensive view of the surroundings. For a fixed position of each eye, humans have a horizontal field of view of about 120 degrees, of which only a small central foveal region has “high resolution”. The fovea sees only about the central two degrees of the visual field, with a “resolution” of about 31.46 arc seconds. The fovea takes up about 1 percent of the retina, but over 50 percent of the visual cortex in the brain. Together, with both eyes looking forward, humans have about a 180 degree forward facing field of view. Using eye motion only (and not moving the head), humans have a field of view of about 270 degrees. The human brain integrates these views so well that people are normally not aware of how low their resolution is outside of the small foveal region, nor of the need for saccades (e.g., quick, simultaneous movements of both eyes in the same direction between two phases of fixation, which may be associated with a shift in frequency of an emitted signal or a movement of a body part or device) to produce an overall visual sense of the space, nor even of head motions.
By contrast, when watching a view from a remote camera in the related art, there is a consistent resolution within the view, while nothing is seen outside of the view. This gives a sense of “tunnel vision” and a feeling of being removed from the remote space. Even if the camera is steerable or is on a robot or remote vehicle that can move around and turn, effort is needed to move the camera, and making sense of the overall scene moves from the “perceptual” to the “cognitive” level of mental processing. For highly immersive experiences, a user can be provided with a head-mounted display having a field of view (FOV) close to that of the eye, and a head tracker so that head motion provides essentially the same view as if the user were at the remote location. Alternatively, the user may sit in a “CAVE” with video shown on the walls or other surrounding surfaces. For many related art applications, however, these are not practical or are otherwise undesirable. Providing such views requires high bandwidth, low latency networking, or more elaborate hardware, or requires the user to wear an unnatural device and be essentially disconnected from their own local environment. A user may prefer, for example, to watch the view on a large display, in a web page, or on a tablet.