Identifying segment topics in medical dictations

Johannes Matiasek, Jeremy Jancsary, Alexandra Klein, and Harald Trost
EACL 2009 Workshop on Semantic Representation of Spoken Language
March 2009, Athens, Greece

In this paper, we describe the use of lexical and semantic features for topic classification in dictated medical reports. First, we employ SVM classification to assign whole reports to coarse work-type categories. Next, text segments and their topics are identified in the output of automatic speech recognition: each word is assigned a work-type-specific topic label based on features extracted from a sliding context window, again using SVM classification with semantic features. Finally, classifier stacking is used for a posteriori error correction, yielding a further improvement in classification accuracy.
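
The per-word labelling and stacking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the window size, the feature set, or how the semantic features are derived, so the radius, the padding token, the feature names, and the example topic labels below are all assumptions. The sketch builds only the feature dictionaries; vectorizing them and training an SVM on them is left out.

```python
from typing import Dict, List

PAD = "<pad>"  # assumed padding symbol for positions outside the report


def window_features(tokens: List[str], i: int, radius: int = 2) -> Dict[str, str]:
    """Lexical features for token i, taken from a symmetric context window."""
    feats: Dict[str, str] = {}
    for offset in range(-radius, radius + 1):
        j = i + offset
        word = tokens[j].lower() if 0 <= j < len(tokens) else PAD
        feats[f"w[{offset:+d}]"] = word
    return feats


def stacked_features(
    tokens: List[str], first_pass_labels: List[str], i: int, radius: int = 2
) -> Dict[str, str]:
    """Second-stage features: the window words plus the first-stage topic
    labels predicted for the same window, so that a second classifier can
    correct isolated first-stage errors."""
    feats = window_features(tokens, i, radius)
    for offset in range(-radius, radius + 1):
        j = i + offset
        label = first_pass_labels[j] if 0 <= j < len(first_pass_labels) else PAD
        feats[f"label[{offset:+d}]"] = label
    return feats


# Hypothetical ASR output and first-stage predictions (labels are invented).
tokens = ["Patient", "denies", "chest", "pain"]
first_pass = ["history", "history", "exam", "history"]

first_stage_input = [window_features(tokens, i, radius=1) for i in range(len(tokens))]
second_stage_input = [
    stacked_features(tokens, first_pass, i, radius=1) for i in range(len(tokens))
]
```

In this toy input, the isolated "exam" label at position 2 is surrounded by "history" labels. Exposing those neighbouring predictions to a second-stage classifier is what gives it a chance to smooth out such an error.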



Please cite as:

title = {Identifying segment topics in medical dictations},
author = {Johannes Matiasek and Jeremy Jancsary and Alexandra Klein and Harald Trost},
booktitle = {EACL 2009 Workshop on Semantic Representation of Spoken Language},
year = {2009}