Student uses AI to translate sign language into English in real time

For many people, online calls have become an important part of their daily work routine. Software vendors have kept pace, adding useful features such as speaker highlighting to these platforms. But when a person uses sign language, the software offers no comparable recognition. Worse, these platforms can act as a barrier for people who depend on sign language; one engineering student set out to change that.

Image credit: GitHub.

To bridge this gap, Priyanjali Gupta, an engineering student at the Vellore Institute of Technology (VIT) in Tamil Nadu, created an AI model that translates American Sign Language (ASL) into English in real time. She shared her creation on LinkedIn, where it has drawn more than 60,000 likes from people impressed by the idea.

Gupta said the driving force behind the software was her mother, who encouraged her to do something different as an engineering student. “It got me thinking about what I could do with my knowledge and skills. The idea of inclusive technology struck me. It triggered a series of plans,” Gupta said in an interview with Interesting Engineering.

A big leap

The system developed by Gupta converts signs into English text by analyzing the movements of several body parts, such as the arms and fingers, using image-recognition technology. She digitized a few people's sign language to build the technology, as she explains in her GitHub post, which has gone viral on the code-hosting platform.
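
As a rough illustration of what "digitizing" signs with a webcam can involve, the sketch below collects a handful of labeled frames with OpenCV. The label names, folder layout, and frame counts are assumptions made for illustration, not details taken from Gupta's repository.

```python
import os
import time

import cv2  # OpenCV, used here for webcam capture

# Hypothetical label set and capture settings, chosen for illustration only.
LABELS = ["hello", "thank_you", "i_love_you", "yes", "no", "please"]
IMAGES_PER_LABEL = 15
DATASET_DIR = "dataset"

cap = cv2.VideoCapture(0)  # open the default webcam

for label in LABELS:
    os.makedirs(os.path.join(DATASET_DIR, label), exist_ok=True)
    print(f"Show the sign for '{label}' -- capture starts in 3 seconds")
    time.sleep(3)  # give the signer time to get into position

    for i in range(IMAGES_PER_LABEL):
        ok, frame = cap.read()
        if not ok:
            break
        # Save the raw frame; labels or bounding boxes are added later by hand.
        cv2.imwrite(os.path.join(DATASET_DIR, label, f"{label}_{i}.jpg"), frame)
        cv2.imshow("capture", frame)
        cv2.waitKey(500)  # roughly two frames per second

cap.release()
cv2.destroyAllWindows()
```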

The AI software offers a dynamic way to communicate with people who are deaf or hard of hearing because it works in real time. However, it is still in its infancy. So far it can translate only six signs into English: Yes, No, Please, Thank You, I Love You, and Hello. Far more sign-language data would be needed to build a reliable general model, but as a proof of concept it works.
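
To make the real-time aspect concrete, here is a minimal sketch of a per-frame inference loop that maps each webcam frame to one of those six words and overlays the result on the video feed. It assumes a hypothetical pre-trained Keras classifier saved as sign_model.h5 with a 224x224 input; the model file, input size, and preprocessing are assumptions, not Gupta's actual pipeline.

```python
import cv2
import numpy as np
import tensorflow as tf

LABELS = ["Yes", "No", "Please", "Thank you", "I love you", "Hello"]

# Hypothetical pre-trained classifier; the real project's model may differ.
model = tf.keras.models.load_model("sign_model.h5")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Resize and normalise the frame to the classifier's assumed input shape.
    img = cv2.resize(frame, (224, 224)).astype("float32") / 255.0
    probs = model.predict(img[np.newaxis, ...], verbose=0)[0]
    text = f"{LABELS[int(np.argmax(probs))]} ({probs.max():.2f})"

    # Overlay the predicted English word on the live video feed.
    cv2.putText(frame, text, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("ASL to English", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```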

Gupta said the dataset was created manually with a webcam and annotated by hand. The model is trained only on single frames, so it cannot yet handle video. Gupta said she is currently researching long short-term memory (LSTM) networks, a type of recurrent neural network, to let the software take sequences of frames into account.
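
The LSTM idea would replace single-frame predictions with a classifier over short sequences of frames, so motion itself becomes part of the signal. A minimal Keras sketch of such a model follows; the sequence length, per-frame feature size, and layer sizes are illustrative assumptions, not Gupta's design.

```python
import tensorflow as tf

SEQUENCE_LENGTH = 30       # assumed number of frames per sign clip
FEATURES_PER_FRAME = 258   # assumed size of a per-frame keypoint/feature vector
NUM_CLASSES = 6            # Yes, No, Please, Thank you, I love you, Hello

# A small LSTM classifier over frame sequences (illustrative only).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQUENCE_LENGTH, FEATURES_PER_FRAME)),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Each training example would then be a stack of per-frame features, such as hand or pose keypoints, rather than a single image.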

The model was built with the help of Nicholas Renotte, a machine-learning expert who runs a popular YouTube channel, Gupta said. While she acknowledges that building software for sign detection is complex, Gupta said she hopes the open-source community will soon help her find solutions that extend her work further.

American Sign Language is considered the third most used language in the United States, after English and Spanish. However, apps that translate ASL into other languages have a lot of catching up to do. The pandemic has brought this into the spotlight, and the work done by Gupta and others is a very good starting point.

Last year, Google researchers presented a sign-language detection model that could identify people signing in real time with up to 91% accuracy. While welcoming these developments, Gupta said the first step would be to standardize sign languages and other means of communication and to work on bridging the communication gap.
