Conventional IoT devices have pre-defined gestures for controlling their operation. For example, a user can control a device by using the set of gestures associated with it as input commands (for example, a smart TV can be controlled with gestures associated with different operations). In some cases, the same pre-defined gestures may be configured for multiple IoT devices, leading to confusion as to which device the user intends to control. Conventional IoT devices fail to seamlessly differentiate between such gestures to determine which IoT device the user intends to control.
Additionally, the user must be present in the line of sight of the IoT device to control it. Further, existing IoT devices cannot be controlled using multi-modal commands. Conventional IoT devices fail to switch between the modalities of such commands, for example, voice commands and gesture commands. Existing IoT devices cannot determine when to use voice commands, when to use gesture commands, and when to use both. Thus, the user must explicitly call out the IoT device that the user intends to control, as there is no query-back mechanism to obtain clarification from the user when it is ambiguous which IoT device is intended.
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of described systems with some aspects of the present disclosure, as set forth in the remainder of the present application and with reference to the drawings.