Cognitive collaboration uses cognitive processing applications to control collaboration applications/services on behalf of users, in a way that is natural to humans. The cognitive processing applications receive user-generated input, such as voice, video, or text, and convert that input into actions that interact with web-based and non-web-based services on behalf of the users. Conventional cognitive processing applications are limited in several ways. First, they restrict the input a user may provide to text only or to voice only. Second, they are unable to concurrently process multimodal input, e.g., voice, text, and video, associated with a given user to produce a decision as to user intent. Third, they include cognitive processing functions that are tightly bound to each other and are therefore inflexible. Such inflexible cognitive processing functions do not scale easily and are not readily accessible from many geographical locations.