Meetings involve multiple participants who interact with one another in different modes. It may be of interest to make a record of at least some of the interactions that take place in a meeting. Meeting transcripts or meeting minutes have commonly been used to record the verbal aspects of communication in a meeting. Traditionally, a transcript of a meeting may be made by a person present in the meeting. Currently, transcripts may be generated by recording the conversation in a meeting and converting it to text using speech recognition technology. In some cases, meeting videos may also be recorded for future reference.

Annotations can be used to contextualize or complement the text in a transcript. Annotations may be, for example, indicators of emphasis, of speech directed towards a particular person, of requests, of orders, etc. Annotation of text transcribed using speech recognition, if supported, may either be done manually or be based upon verbal cues from a speaker. Manual annotation may include a person inputting or selecting an annotation using an input device. Verbal cue based annotation may include speech recognition of a verbal cue, where a verbal cue may be associated with a particular annotation. In the event a verbal cue is detected, transcribed text corresponding to a time period around the verbal cue may be annotated with the corresponding annotation.
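
By way of illustration only, verbal cue based annotation of the kind described above might be sketched as follows. The sketch assumes the speech recognizer emits time-stamped text segments. The `Segment` type, the cue phrases, the annotation labels, and the `window` parameter are all hypothetical names chosen for this example and are not drawn from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One time-stamped piece of transcribed speech (hypothetical structure)."""
    start: float          # start time in seconds
    end: float            # end time in seconds
    speaker: str
    text: str
    annotations: list = field(default_factory=list)


# Hypothetical mapping from a spoken cue phrase to the annotation it triggers.
CUE_TO_ANNOTATION = {
    "this is important": "EMPHASIS",
    "action item": "REQUEST",
}


def annotate_by_verbal_cue(segments, window=5.0):
    """Annotate every segment that overlaps a time window around a detected cue.

    `window` is the number of seconds before and after the cue segment
    that is treated as "around" the cue (an assumption of this sketch).
    """
    for cue_seg in segments:
        lowered = cue_seg.text.lower()
        for cue, label in CUE_TO_ANNOTATION.items():
            if cue not in lowered:
                continue
            lo, hi = cue_seg.start - window, cue_seg.end + window
            for seg in segments:
                overlaps = seg.end >= lo and seg.start <= hi
                if overlaps and label not in seg.annotations:
                    seg.annotations.append(label)
    return segments


if __name__ == "__main__":
    transcript = [
        Segment(0.0, 4.0, "A", "Let us review the budget."),
        Segment(4.0, 6.0, "A", "This is important."),
        Segment(6.0, 10.0, "A", "We are over by ten percent."),
        Segment(30.0, 34.0, "B", "Any other business?"),
    ]
    for seg in annotate_by_verbal_cue(transcript, window=3.0):
        print(f"[{seg.start:5.1f}-{seg.end:5.1f}] {seg.speaker}: {seg.text} {seg.annotations}")
```

Matching cues by substring on already-transcribed text is the simplest possible choice here. A practical system may instead detect cues directly in the audio or at word-level timestamps, which would allow a tighter time period around the cue.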