AI Technology Helps Deaf Students Learn

ROCHESTER, NY – As students settle into their seats in a general biology class, the professor lectures on the general and special senses. A caption reading “Which senses can feel pain? All of them.” appears at the bottom of the PowerPoint presentation projected on the wall behind her. An interpreter stands a few feet from the professor, translating her words into American Sign Language (ASL), the primary language of the deaf community in the United States.

Aside from the real-time captions on the screen at the front of the room, this is how classes usually look at the Rochester Institute of Technology in New York. Around 1,500 deaf and hard-of-hearing students are an important part of life on a campus of 15,000 undergraduates. Nearly 700 of them take courses in the same classrooms as hearing students, including the general biology course that Sandra Connelly teaches to 250 students.

In her class, Connelly wears a headset connected to Microsoft Translator, an AI-based communication technology, to display captions on the screen behind her. The AI system uses trained speech recognition to turn everyday spoken language – filler words such as ‘um,’ halting speech, and so on – into text that reads cleanly, with proper punctuation. The technology strips out the informal filler and adds punctuation to produce higher-quality written output, and it can translate the result into more than 60 languages. Many in the deaf and hard-of-hearing community find the punctuated transcript an ideal tool for following language, alongside the ASL they already use.
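The cleanup step described above – dropping fillers, repairing stutters, adding punctuation – is learned by Microsoft’s models from data, but the idea can be sketched with a few hand-written rules. Everything below (the filler list and the `clean_transcript` function) is illustrative only, not Microsoft’s actual pipeline:

```python
import re

# Illustrative only: real caption models learn this cleanup from data.
# A hand-written filter shows the idea of turning raw, disfluent
# recognizer output into readable caption text.
FILLERS = {"um", "uh", "er", "hmm"}

def clean_transcript(raw: str) -> str:
    # Drop filler words such as "um" and "uh".
    words = [w for w in raw.split() if w.lower().strip(",.") not in FILLERS]
    text = " ".join(words)
    # Collapse immediate word repetitions ("the the" -> "the").
    text = re.sub(r"\b(\w+) \1\b", r"\1", text, flags=re.IGNORECASE)
    # Capitalize the sentence and close it with a period.
    text = text[:1].upper() + text[1:]
    if not text.endswith((".", "?", "!")):
        text += "."
    return text

print(clean_transcript("um so the the retina uh detects light"))
# → "So the retina detects light."
```

The real system also has to decide where sentences begin and end inside a continuous audio stream, which is far harder than this single-sentence sketch suggests.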

Microsoft is partnering with the National Technical Institute for the Deaf (NTID) at RIT, one of nine colleges at the university, to test how Microsoft’s AI-based speech and language technology can support deaf and hard-of-hearing students in the classroom.

“When I first saw the technology work, I was very excited; I thought, ‘Wow, I can get information at the same time as my hearing classmates,’” said Joseph Adjei, a first-year student from Ghana who lost his hearing seven years ago. When he arrived at RIT, he struggled with ASL. The captions displayed in real time on the screen behind Connelly during biology class make it easier for him to follow the lesson and help him understand scientific terms correctly.

Now, in his second semester of general biology, Adjei, who is still studying ASL, sits in the front row and shifts his gaze among the interpreter, the captions on the screen, and the transcript on his cellphone, which rests on his desk. The combination keeps him connected to the lecture. When he does not understand the ASL, he reads the captions to pick up information he missed from the interpreter.

The captions sometimes miss important distinctions in biology class, such as the difference between “I” and “eye.” “But this is better than getting nothing at all,” Adjei said. He even uses the Microsoft Translator app on his cellphone to communicate with hearing classmates outside the classroom.

“Sometimes when they talk, they talk too fast and I can’t read their lips. So I use my cellphone and we have the conversation that way, so I understand what is happening,” he said.

AI for Creating Captions

Jenny Lay-Flurrie, Microsoft’s Chief Accessibility Officer, who is herself deaf, said the pilot project with RIT shows the potential of AI to empower people with disabilities, especially deaf people. The captions produced by Microsoft Translator form a new layer of communication, alongside sign language, that helps people – including her – achieve more, she said.

At present, the project is in its early stages and is beginning to roll out in classes. Connelly’s general biology class is one of 10 classes already using the AI-based real-time captioning service, delivered through a Microsoft PowerPoint add-in called Presentation Translator. Students can run the Microsoft Translator app on their laptops, cellphones or tablets to receive real-time captions in the language of their choice. “Language is the driving force of human evolution. Language increases collaboration, communication and learning. By captioning classes at RIT, we help everyone learn and communicate better,” said Xuedong Huang, a technical fellow who leads the speech and language group at Microsoft AI and Research.
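The delivery model described above – one speaker, many devices, each showing captions in its own language – can be sketched as a simple fan-out. The `CaptionSession` class and its pre-translated lookup table below are hypothetical stand-ins for the real Microsoft Translator service, which performs translation over the network:

```python
# Hypothetical sketch of classroom caption fan-out: one speaker's
# transcript is delivered to each connected device in that student's
# chosen language. The `translations` table stands in for the real
# Microsoft Translator cloud service.
class CaptionSession:
    def __init__(self, translations):
        self.translations = translations  # {(text, lang): translated text}
        self.subscribers = []             # [(device_name, lang)]

    def join(self, device, lang):
        """A student's device joins the session with a chosen language."""
        self.subscribers.append((device, lang))

    def broadcast(self, text):
        """Send one caption line to every device, translated if needed."""
        return {
            device: self.translations.get((text, lang), text)
            for device, lang in self.subscribers
        }

session = CaptionSession(
    {("All senses can feel pain.", "es"): "Todos los sentidos pueden sentir dolor."}
)
session.join("laptop-1", "en")
session.join("phone-2", "es")
print(session.broadcast("All senses can feel pain."))
# → {'laptop-1': 'All senses can feel pain.', 'phone-2': 'Todos los sentidos pueden sentir dolor.'}
```

The real app works the same way at a higher level: students join a session hosted by the presenter and each device requests its own target language.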

Huang began working on automatic speech recognition in 1980 to help 1.3 billion people in his home country, China, avoid typing Mandarin on keyboards designed for Western languages. Advances in speech recognition over the past several years, he said, have produced conversational transcription as accurate as a human’s, a machine translation system that can translate sentences in news articles from Mandarin to English, and “the confidence to bring the technology into daily use for everyone.”

Growth in Demand for Access Services

When Gary Behm enrolled in 1974, he was one of 30 deaf and hard-of-hearing students taking classes at RIT. ASL interpreters translated professors’ words into sign language, as interpreters do on many campuses today. He graduated with a degree in mechanical engineering and pursued a career at IBM, working around the country and earning a graduate degree in mechanical engineering. He and his wife, who is also deaf, have three children, two of whom are deaf.

When his children were grown and his career established, he and his wife, whom he had first met at NTID, decided to return to campus. Behm, a mechanical engineer with strong computer skills, began working on access technology to support the growth of the NTID student body, which now numbers more than 1,500, nearly half of whom take classes at the eight other colleges at RIT.

“We are very pleased to see this growth, but there are limits to the access services we can provide for these students,” said Behm, now the interim director of academic relations at NTID and director of the Center on Access Technology (CAT), which conducts research and develops access technology.

The combination of interpreter services and real-time captioning technology helps deaf and hard-of-hearing students deal with the divided attention that classes demand. Hearing students, according to Behm, routinely split their attention in class: if a professor writes a formula on the board while talking, they can listen and copy the formula into their notebooks at the same time.

“For a deaf student, however, that is impossible. My learning process depends on the interpreter,” Behm said. “But when the professor says, ‘Look at the formula on the board,’ I have to look away from the interpreter, find the formula being discussed, read it, and then understand it.”

“By the time I look back at the interpreter to pick up the information again, it has already passed.”

To help address the problem, the university employs a full-time staff of 140 interpreters, who remain essential to communication, and more than 50 captionists. The captionists use a technology developed on campus called C-Print to provide real-time transcripts of lectures, displayed on the laptop or tablet of each deaf and hard-of-hearing student. In addition, designated note-takers record notes that can be shared, so that deaf and hard-of-hearing students can focus on interpreters and captions during class.

“The question now becomes, how can we improve our access services?” Behm said. With growing numbers of deaf and hard-of-hearing students enrolling in courses across RIT’s colleges, RIT and NTID remain committed to serving their students well. RIT already employs more interpreting and captioning staff than any other educational institution in the world, and the need for access services keeps increasing. That is why Behm began looking for other solutions, including automatic speech recognition (ASR) technology.

Automatic Speech Recognition

According to Brian Trager, an NTID alumnus and now an associate director at CAT, initial trials of ASR in the spring of 2016 did not go well. The first systems the researchers tested were inaccurate to the point of being unintelligible, especially for scientific and technical terms.

“I went back to being someone who could only nod along,” said Trager, who is deaf and spent his childhood struggling to read lips. He would often nod as if agreeing even when he did not understand the conversation going on around him.

“Not only that, the text displayed was difficult to read,” he continued. “For example, a teacher would talk about 9/11 and the system would spell out ‘n-i-n-e e-l-e-v-e-n,’ and it did the same with years and amounts of money. The data was very raw. My eyes got very tired. There were not even periods and commas. There was no room for comprehension.”

That summer, a student working in the CAT laboratory experimented with ASR technologies from various companies, and Microsoft’s looked promising. “Numbers like 9/11 actually displayed as 9 slash 11, as you would expect, and 2001 displayed as 2001. It had punctuation. That alone was very good, because it greatly increased readability. It was very different – very comfortable and easy to use,” Trager said.
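The improvement Trager describes – displaying “9/11” rather than “n-i-n-e e-l-e-v-e-n” – is the display step known as inverse text normalization. A toy version, with a hypothetical `render_date_phrase` helper, shows the idea:

```python
# Illustrative sketch of inverse text normalization (ITN): raw
# recognizers emit spelled-out number words, while better systems
# render dates and figures in their conventional written form.
WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10", "eleven": "11", "twelve": "12",
}

def render_date_phrase(words):
    """Render a two-word pattern like 'nine eleven' as '9/11';
    leave phrases that are not pure number words unchanged."""
    digits = [WORD_TO_DIGIT.get(w) for w in words]
    if len(digits) == 2 and all(digits):
        return "/".join(digits)
    return " ".join(words)

print(render_date_phrase(["nine", "eleven"]))  # → 9/11
print(render_date_phrase(["nine", "lives"]))   # → nine lives
```

Production ITN is much richer than this sketch – it must disambiguate dates from counts from currency using context – but the payoff Trager noticed is exactly this readability gain.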

CAT researchers at NTID then learned about a beta version of a Microsoft Cognitive Services offering called the Custom Speech Service, which improves automatic speech recognition by allowing developers to build language models tailored to specific vocabularies. The researchers volunteered for the beta program. Less than 24 hours later, they received an e-mail from Will Lewis, a technical program manager for machine translation in Microsoft’s research organization.

Language Model for Classes

Lewis and his team at Microsoft introduced the CAT researchers to Microsoft Translator, and in the fall of 2017 the teams worked together to build a language model tailored to lecture material and to deploy the technology in classrooms through the Presentation Translator add-in for PowerPoint.

To build the model, the researchers mined a university database of transcripts containing more than a decade of C-Print captions from specific professors’ lectures, along with the notes each professor had typed into their PowerPoint presentations. The AI technology in the Custom Speech Service uses this data to build a model of how specific words are used and pronounced. When a speaker uses those words, the system recognizes them and displays them in the real-time transcript.
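The vocabulary-mining step can be illustrated with a toy example: extract the most frequent course-specific terms from past transcripts to form a list a recognizer could be biased toward. The sample transcripts, stop-word set and `build_phrase_list` function below are all hypothetical; the real Custom Speech Service trains statistical language models from such data rather than taking simple frequency counts:

```python
from collections import Counter

# Hypothetical sketch: mine past lecture transcripts for the
# course-specific terms a speech recognizer should expect.
COMMON = {"the", "a", "of", "and", "is", "in", "to", "that", "it", "are"}

def build_phrase_list(transcripts, top_n=5):
    """Return the top_n most frequent non-stop-words across transcripts."""
    counts = Counter()
    for line in transcripts:
        for word in line.lower().split():
            word = word.strip(".,?!")
            if word and word not in COMMON:
                counts[word] += 1
    return [word for word, _ in counts.most_common(top_n)]

past_lectures = [
    "The photoreceptors in the retina detect light.",
    "Rod photoreceptors are sensitive to dim light.",
    "The retina sends signals along the optic nerve.",
]
print(build_phrase_list(past_lectures, top_n=3))
# → ['photoreceptors', 'retina', 'light']
```

The point of the exercise is the same as in the real system: words like “photoreceptors,” rare in everyday speech but common in this professor’s lectures, get boosted so the recognizer stops mishearing them.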

Chris Campbell is an NTID alumnus and now a research professor at CAT, where he leads the ASR development effort. In the fall of 2017 he taught a course on the basics of programming to students at NTID, teaching in American Sign Language (ASL).

“Sometimes at NTID we have students who are not fluent in sign language; they depend on English. So for my class, I applied to try ASR and see how it works alongside an interpreter,” he said.

An interpreter wears a headset and speaks into the microphone, voicing what Campbell signs. Microsoft Presentation Translator displays captions beneath his PowerPoint slides and on the personal devices of students running the Microsoft Translator app. As Campbell signed, he said, he watched his students’ eyes move from him, to the captions, to the interpreter. How much time they spent on each source of information depended on their comfort with ASL and their level of hearing.

“I was able to listen to the interpreter and read the captions on my laptop,” said Amanda Bui, a hard-of-hearing student in the class who is not fluent in ASL and had limited access services growing up in Fremont, California. “It makes it easier for me to learn the programming language.”

Accessibility for All

Connelly, the general biology professor, sees automatic captioning as augmenting the work of ASL interpreters, not replacing it. That is because ASL, which can convey several words in a single sign, is easier to follow than reading text. But used alongside interpreters, the technology broadens access, especially for students who are less fluent in ASL, such as Joseph Adjei, the student from Ghana.

Moreover, she said, Microsoft Translator lets students save lecture transcripts, which has changed how all of her students study the material.

“They have every funny word I say today,” she said. “A lecture is no longer just me standing there explaining; now they have me in their notes, in text form. It has changed what they say when they come to my office. They no longer come to say, ‘I missed this word’ or ‘I missed this definition.’ They now come to discuss, ‘I don’t understand why this is useful.’ This technology has shifted our focus.”

Hearing students, Connelly added, periodically check the captions in class to catch material they missed and save the transcripts as study aids. When the only deaf student in her evolutionary biology class, which piloted the ASR system during the fall semester, dropped the course, Connelly turned the captions off. Hearing students protested, and Presentation Translator stayed on through the end of the semester.

Jenny Lay-Flurrie said she liked stories like that because they reinforced the value of investing in accessibility technology.

“From a product design perspective,” she said, “when you design for accessibility, you design for everyone, including the more than 1 billion people in the world with disabilities.”
