Tongue Contour Tracking in Ultrasound Images with Spatiotemporal LSTM Networks
Tarih
Yazarlar
Dergi Başlığı
Dergi ISSN
Cilt Başlığı
Yayıncı
Erişim Hakkı
Özet
Analysis of ultrasound images of the human tongue has many applications such as tongue modeling, speech therapy, language education and speech disorder diagnosis. In this paper we propose a novel ultrasound tongue contour tracker that enforces constraints of ultrasound imaging of the tongue such as spatial and temporal smoothness of the tongue contours. We use 3 different LSTM networks in sequence to satisfy these constraints. The first network uses only spatial image information from each video frame separately. The second and third networks add temporal information to the results of the first spatial network. Our networks are designed by considering the ultrasound image formation process of the human tongue. We use polar Brightness-Mode of the ultrasound images, which makes it possible to assume that each column of the image can contain at most one contour position. We tested our system on a dataset that we collected from 4 volunteers while they read written text. The final accuracy results are very promising and they exceed the state of the art results while keeping the run times at very reasonable levels (several frames per second). We provide the complete results of our system as supplementary material.








