Field
Systems and methods are provided that relate to detecting and visualizing user interactions during remote collaborative meetings, and more specifically, to detecting, classifying and indexing user interactions in a live document stream for live searching and visualization of interaction-based meeting content.
Related Art
Online users who are remote from one another may collaborate in a shared environment using a web-based tool, such as a WebRTC browser-based system. WebRTC (Web Real-Time Communication) is an application programming interface (API) definition drafted by the World Wide Web Consortium (W3C) that supports browser-to-browser applications for voice calling, video chat, and peer-to-peer (P2P) file sharing without the need for internal or external plugins.
For example, remote users may share their screens during live online meetings, so as to show websites, edit presentation slides, or edit text in code editors. During the online meeting, the remote users may refer to the previously shared content. Further, the previously shared content may be the subject of future discussion or review.
However, a shared screen may include a large volume of information. Thus, one related approach is to index each frame, or one or more key frames using optical character recognition (OCR), so as to permit retrieval via text entry.
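The key-frame indexing approach described above can be sketched as an inverted index from words to frame timestamps. This is a minimal illustration only, not the method of any particular related art system: it assumes the OCR step has already produced plain text for each key frame, and the names `build_index` and `search` are hypothetical.

```python
import re
from collections import defaultdict

def build_index(frame_texts):
    """Map each lowercase word to the sorted timestamps of key frames containing it.

    frame_texts: dict of {timestamp_in_seconds: OCR text}. The OCR output is
    assumed to be given; no OCR engine is invoked here.
    """
    index = defaultdict(set)
    for timestamp, text in frame_texts.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(timestamp)
    return {word: sorted(stamps) for word, stamps in index.items()}

def search(index, query):
    """Return timestamps of key frames that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)
```

For instance, with key frames at 0.0 s ("Quarterly sales report") and 30.0 s ("Sales by region"), a query for "sales" would return both timestamps, while "sales report" would return only the first.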
An alternative approach is to automatically detect actions taken by remote users in the live streams of each of the users. This automatic detection can be based on text editing and/or cursor (e.g., mouse cursor) motion. The output of the automatic detection includes screen-sharing videos (live or recorded).
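One simple way to locate where a user acted between two consecutive frames is frame differencing. The text above does not specify a detection technique, so this is an assumed stand-in: frames are taken to be equally sized 2D lists of grayscale values, and `changed_bbox` and its `threshold` parameter are hypothetical names chosen for illustration.

```python
def changed_bbox(prev, curr, threshold=10):
    """Return (top, left, bottom, right) bounding the pixels that changed.

    prev, curr: equally sized 2D lists of grayscale intensities (0-255).
    A pixel counts as changed when its absolute difference exceeds threshold.
    Returns None when no pixel changed.
    """
    ys, xs = [], []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (p, q) in enumerate(zip(row_prev, row_curr)):
            if abs(p - q) > threshold:
                ys.append(y)
                xs.append(x)
    if not ys:
        return None
    return (min(ys), min(xs), max(ys), max(xs))
```

A small changed region that moves from frame to frame might suggest cursor motion, while a region that grows along a line might suggest text editing; any such classification rule would be a further assumption beyond this sketch.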
One or more of the users may wish to retrieve the screen-sharing videos, either live or after the meeting. Because the screen-sharing videos contain text, a text-based search approach is one manner of providing the user with a retrieval mechanism.
However, such a related art approach may have various problems and disadvantages. For example, but not by way of limitation, the large amount of data in the stream of frames (e.g., 30 “pages” per second) makes it impossible to provide real-time retrieval.
Related art application of users' actions to improve document skimming and retrieval includes video indexing that uses motion found in videos to segment the video into clips based on topics, allowing users to more easily browse clips or retrieve objects (e.g., “show me videos containing a cat”). This related art is directed to videos such as television footage or casual user-generated videos. However, this related art does not include extracting motion from screen sharing sessions for use in retrieval and presentation.
On web pages, related art mouse and keyboard tracking is used to monitor users' actions in order to design better web sites, detect whether a search query was useful, or infer the emotional state of the user. However, unlike with video documents, JavaScript code can be injected into web pages to collect mouse and keyboard actions. Accordingly, the related art does not include indexing of the pages being interacted with.
Additionally, related art personal bookmarks may be represented as an enhanced web page thumbnail, where keywords searched for are overlaid. However, this related art does not disclose how to extract mouse and text actions, and only uses color and text size to generate the enhanced thumbnails; moreover, the thumbnails are static.
Therefore, it may be desirable to develop systems and methods which may improve the ability to view relevant shared content during a remote collaboration meeting.