Sign Language Translation with Transformers
Abstract:
This thesis explores a Neural Machine Translation (NMT) based approach to automatic Sign Language Translation (SLT). The communication barrier between the Deaf community and hearing people prevents those with hearing loss from interacting easily with a predominantly hearing society, and presents numerous challenges in their daily lives. SLT is an interesting research challenge, and a working SLT system could facilitate communication between Deaf and hearing people.
We begin by addressing the challenges of sign language processing and examining the different steps in SLT. Existing SLT systems first use a Sign Language Recognition (SLR) system to extract sign language glosses from videos. A translation system then generates spoken language translations from those glosses. Our survey of previous work shows that although SLT has attracted growing interest recently, the translation system itself has received little study. This thesis focuses on enhancing translation from sign language glosses to spoken language.
The recent success of Transformers for NMT between spoken languages inspires us to adopt this architecture. We study Transformers for SLT in various setups, including techniques from spoken language processing that have not yet been applied to sign language. Our experiments on RWTH-PHOENIX-Weather 2014T, a challenging SLT benchmark dataset of German Sign Language, and ASLG-PC12, a dataset involving American Sign Language (ASL), confirm our hypothesis that Transformers achieve better SLT results than previous RNN-based architectures.
Our methodology improves on the current state of the art in BLEU-4 score by over 5 points on ground truth glosses and over 7 points on predicted glosses of RWTH-PHOENIX-Weather 2014T. On ASLG-PC12, we report an improvement of over 16 points. Our findings also demonstrate that end-to-end translation from videos provides even better results than translation of ground truth glosses. This shows potential for further improvement in SLT by either jointly training the SLR and translation systems or by revising the annotation system of sign language videos.
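For readers unfamiliar with the metric, the following is a minimal sketch of how a BLEU-4 score can be computed, so that the point differences quoted above are easier to interpret. It is not the evaluation script used in the thesis. It assumes whitespace-tokenized text, a single reference per hypothesis, and no smoothing, and the function names `ngrams` and `corpus_bleu4` as well as the example sentences are invented for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(hypotheses, references):
    """Corpus-level BLEU-4 with one reference per hypothesis, scaled to 0-100.

    Simplifying assumption: no smoothing, so the score is 0 whenever
    any n-gram order (1 to 4) has no matches at all.
    """
    matches = [0] * 4   # clipped n-gram matches for n = 1..4
    totals = [0] * 4    # hypothesis n-gram counts for n = 1..4
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, 5):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the four modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / 4
    # The brevity penalty punishes output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Invented example in the weather domain (not taken from either dataset).
reference = "tomorrow it will rain in the north".split()
hypothesis = "tomorrow it will rain in the south".split()
print(round(corpus_bleu4([hypothesis], [reference]), 2))
```

Because the score is a geometric mean over 1-gram to 4-gram precisions, a single wrong word lowers every n-gram order that spans it, which is why gains of several BLEU-4 points usually reflect a noticeable change in translation quality. Published results are normally computed with an established scoring tool rather than a hand-written function like this one.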