[grc] Dissertation talk: Tools for Creating Audio Stories

Brian Shiratsuki settled at gmail.com
Mon Oct 5 19:18:56 PDT 2015


[on the cutting edge? razor blades not mentioned. the talk at UC
berkeley computer sciences is likely inconvenient for most to attend,
but a non-interactive version appears to be available at
<http://vis.berkeley.edu/papers/audiostories/>]

Title: Tools for Creating Audio Stories

Speaker: Steve Rubin
Advisor: Maneesh Agrawala

Date: Friday, October 9
Time: 4-5 pm
Room: 510 Soda Hall (Visual Computing Lab)

Abstract
Audio stories are an engaging form of communication that combines
speech and music into compelling narratives. One common production
pipeline for creating audio stories involves three main steps:
recording speech, editing speech, and editing music. Existing audio
recording and editing tools force the story producer to manipulate
speech and music tracks via tedious, low-level waveform editing. In
contrast, we present tools for each phase of the production pipeline
that analyze the audio content of speech and music and thereby allow
the producer to work at a higher semantic level.

We present Narration Coach, an interface that assists novice users in
recording scripted narrations. As a user records her narration, our
system synchronizes the takes to her script, provides text feedback
about how well she is meeting the expert voiceover guidelines, and
resynthesizes her recordings to help her hear how she can speak
better. Next, we present a speech editing interface that addresses the
challenges of logging, navigating, and editing recorded speech. Key
features include a transcript-based speech editing tool that
automatically propagates edits in the transcript text to the
corresponding speech track, and tools that help the producer maintain
natural speech cadences by manipulating breaths and pauses. Finally,
we present an algorithmic framework based on music analysis and
dynamic programming optimization that enables several methods for
adding music to audio stories: looping, musical underlays, and
emotionally relevant scores. Combined, our tools augment the
traditional audio story production pipeline by allowing the producer
to create stories using high-level rather than low-level operations on
audio clips. Ultimately, we hope that our tools enable the producer to
devote more time to storytelling and less time to tedious audio
recording and editing.



