Recognizing structure in report transcripts

Jeremy Jancsary
MSc thesis, Faculty of Informatics, Vienna University of Technology
Advisor: Harald Trost
Vienna, February 2008

Typically, the output of Automatic Speech Recognition (ASR) is a mere sequence of words. This view may be sufficient for some tasks, but others require a more structured representation. This thesis presents a framework for identifying the deep, underlying structure of report dictations. Identifying structural elements, such as headings, sections and enumerations, is an important step towards automatic post-processing of dictated speech. The contributions of this thesis include a generic approach that integrates seamlessly with existing ASR solutions and provides structured output, as well as a freely available Conditional Random Field (CRF) toolkit that forms the basis of this approach and may also be applicable to numerous other problems.

Note: The CRF software package developed during the course of the thesis (VieCRF) is available on the Software page.

Please cite as:

@MASTERSTHESIS{Jancsary2008a,
title = {Recognizing structure in report transcripts},
author = {Jeremy Jancsary},
school = {Faculty of Informatics, Vienna University of Technology},
year = {2008}
}