In the field of artificial intelligence, how to enable smart wearable devices or robots to perform recognition and identification in a natural, interactive way has become a core problem of current research, in which creating natural human-computer interaction is particularly important. Artificial intelligence devices and robots have been widely applied in various aspects of human life, and machine vision and recognition involving human intervention should likewise become more convenient and efficient by means of new technologies. A more natural way to perform machine recognition and image identification is therefore required.
Currently, the input for image identification and machine vision recognition is provided by first taking pictures and then determining a target object. This process is often limited by the complexity of the shot content, so that many steps are needed and the learning cost is high. Moreover, the pictures taken need to be subjected to human intervention; for example, the pictures need to be delimited manually. In addition, the machine cannot accurately obtain the content to be identified, which results in low identification accuracy; irregular objects are difficult to identify, the operation is inconvenient, and the user experience is poor.