In the above technical field, Patent Literature 1 discloses a technique that integrates manipulation instruction candidates based on a user's gesture captured by a camera with manipulation instruction candidates based on the user's voice collected by a microphone, and outputs the one manipulation instruction intended by the user.
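The way the cited literature actually integrates the two candidate sets is not described here. Purely as an illustration of the general idea, the following minimal sketch assumes that each candidate carries a confidence score and that scores for the same instruction are summed across the two modalities; the function name `integrate_candidates`, the score values, and the instruction labels are all hypothetical.

```python
from collections import defaultdict


def integrate_candidates(gesture_candidates, voice_candidates):
    """Merge two lists of (instruction, score) pairs and return one instruction.

    Assumption for illustration only: scores are additive across modalities.
    Returns None when both candidate lists are empty.
    """
    combined = defaultdict(float)
    # Accumulate scores from the camera-based (gesture) candidates.
    for instruction, score in gesture_candidates:
        combined[instruction] += score
    # Accumulate scores from the microphone-based (voice) candidates.
    for instruction, score in voice_candidates:
        combined[instruction] += score
    if not combined:
        return None
    # Output the single instruction with the highest combined score.
    return max(combined, key=combined.get)


# Hypothetical candidates: both modalities partially agree on "volume_up".
gesture = [("volume_up", 0.6), ("next_channel", 0.4)]
voice = [("volume_up", 0.5), ("power_off", 0.5)]
selected = integrate_candidates(gesture, voice)
```

In this sketch, an instruction supported by both the gesture and the voice accumulates a higher combined score than one supported by a single modality, so it is the one selected for output.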