Figure 2. An example of the raw input collected from automatic speech recognition (ASR) and from five non-expert human captionists. The captionists collectively capture all of the words, but combining their input on the fly would be burdensome for users, so Scribe merges the workers' partial captions into a single final stream that can be shown to users.
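To make the idea of merging partial captions concrete, the following is a minimal sketch and not Scribe's actual merging algorithm, which the caption does not specify. It assumes each worker's words carry a timestamp, and it drops a word as a duplicate when the same word was already emitted within a short time window. The names `TimedWord`, `merge_partial_captions`, and `window` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedWord:
    word: str
    time: float  # seconds from the start of the audio (assumed to be available)


def merge_partial_captions(streams, window=0.5):
    """Merge several partial word streams into one time-ordered word list.

    Illustrative heuristic only: words are sorted by timestamp, and a word is
    skipped when the same word (case-insensitive) was already emitted within
    `window` seconds.
    """
    all_words = sorted(
        (w for stream in streams for w in stream), key=lambda w: w.time
    )
    merged = []
    last_emitted = {}  # normalized word -> time it was last emitted
    for w in all_words:
        key = w.word.lower()
        if key in last_emitted and w.time - last_emitted[key] <= window:
            continue  # treat as the same spoken word typed by another worker
        last_emitted[key] = w.time
        merged.append(w.word)
    return merged


# Three hypothetical workers, each capturing only part of the sentence.
worker_a = [TimedWord("the", 0.0), TimedWord("quick", 0.4)]
worker_b = [TimedWord("quick", 0.5), TimedWord("brown", 0.9), TimedWord("fox", 1.3)]
worker_c = [TimedWord("The", 0.1), TimedWord("fox", 1.4)]

final_stream = merge_partial_captions([worker_a, worker_b, worker_c])
print(" ".join(final_stream))
```

No single worker in this toy input typed the whole phrase, yet the merged stream contains each word once. A timestamp window is a deliberately simple choice here; it would not handle typos or large differences in worker latency.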