1. Technical Field
The technical field relates generally to transcription of content and, more particularly, to systems and methods that automatically generate captions from transcripts.
2. Background Discussion
Conventional approaches for creating captions of content integrate computing technology with manual effort. For instance, in some approaches, computer systems accept media files and use automatic speech recognition to create draft transcripts of the media files. The draft transcripts may then be edited by human transcriptionists and subsequently processed to create caption frames that are embedded in the media files.
Presently, this caption post-processing is performed either manually or by simplistic computer processes. For instance, where the type of content requires special expertise to create accurate captions (e.g., a university mathematics lecture including specialized terms), manual post-processing may be utilized. In other instances, computer processes may build captions by iterating through the words included in the transcript and placing the words within caption frames according to the capacity of the caption frame.
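For illustration only, the simplistic capacity-based process described above might resemble the following sketch. The function name `build_caption_frames`, the `max_chars` parameter, and the choice to measure frame capacity in characters are assumptions made for this example and are not taken from any particular system.

```python
from typing import List


def build_caption_frames(words: List[str], max_chars: int = 32) -> List[str]:
    """Greedily pack words into caption frames of at most max_chars characters.

    Hypothetical sketch: capacity is assumed to be a character count, and
    timing information is ignored entirely.
    """
    frames: List[str] = []
    current: List[str] = []
    current_len = 0

    for word in words:
        # Cost of appending this word, including a separating space
        # when the frame already holds at least one word.
        added = len(word) + (1 if current else 0)

        if current and current_len + added > max_chars:
            # The frame is full: emit it and start a new frame with this word.
            frames.append(" ".join(current))
            current = [word]
            current_len = len(word)
        else:
            # A word longer than max_chars still gets a frame of its own.
            current.append(word)
            current_len += added

    if current:
        frames.append(" ".join(current))
    return frames


transcript = "the quick brown fox jumps over the lazy dog".split()
frames = build_caption_frames(transcript, max_chars=15)
```

Because this sketch breaks frames purely on capacity, it pays no attention to sentence or phrase boundaries, which is what makes such a process simplistic.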
If a downstream customer finds problems in the transcripts or captions, editors or customers can edit the transcripts. To propagate these edits to the captions, the edited transcript may be subjected to the same post-processing performed on the previous version, thereby creating captions that reflect the edits.