Most of us take sight, and the technology built around it, for granted. For one group of internet users, though, the online experience is far from straightforward: the visually impaired need special assistance to experience the digital world. Low-vision aids are diverse, but they generally fall into two categories: those that translate visual information into another sense (sound or touch) and those that enhance the visual presentation to make it more visible. The harder problem, however, is helping people who are completely blind. Emerging assistive technology in this category uses image processing techniques to optimize the visual experience. Today we will look at how image recognition is revolutionizing the online experience for the visually impaired.

Blind Users Interacting with Visual Content

Let’s stop for a second to consider the online experience of a visually impaired user. What happens when a sighted person opens a webpage? They scan it, click links, or fill in forms. For the visually impaired, the experience is different: they use a screen reader, software that interprets the elements on the screen and reads them aloud. But listening to each page element narrated in a fixed order, and deciding what to skip, is not easy. There is often a wide gap between the visual page elements (buttons, banners, etc.) and the alt-text the screen reader actually reads. Social networking service (SNS) pages, with their unstructured visual elements, abundance of links, and content organized both horizontally and vertically, make listening to a screen reader even more confusing.

Interacting with Social Visual Content

SNSs make it easy to communicate through various types of visual content. To fully engage with images, visually impaired people need to overcome accessibility challenges associated with the visual content through workarounds or with outside help.

Advancements in artificial intelligence are allowing blind people to identify and understand visual content. These include image recognition, tactile graphics, and crowd-powered systems.

Facebook already generates useful and accurate descriptions of photos algorithmically, at scale and without adding latency to the user experience. It delivers each description as image alt-text, an HTML attribute designed to let content managers provide a text alternative for images.
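To make the mechanism concrete, here is a minimal sketch of how a generated description ends up in the alt attribute a screen reader speaks. The function name and the example description are hypothetical; only the alt-attribute mechanism itself is standard HTML.

```python
from html import escape

def img_with_alt(src: str, description: str) -> str:
    # Escape the values so quotes or angle brackets can't break the attribute,
    # then attach the generated description as the image's alt-text.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(description, quote=True)}">'

# A screen reader encountering this tag reads the alt-text aloud.
tag = img_with_alt("photo123.jpg", "Image may contain: two people, smiling, outdoors")
```

When the alt attribute is missing, a screen reader can only fall back to the file name, which is exactly the gap automatic descriptions fill.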

Web Accessibility Today

We might think of web accessibility as a universal given, but web designers do not always have the resources to devote to it, or do not see the value in making sites accessible. A 2-dimensional web page translated into a 1-dimensional speech stream is not easy to decipher. One of the most frustrating problems is that the majority of websites insufficiently label graphic content, concurrent events, dynamic elements, and infinitely scrolling pages (i.e. a stream of feeds). As a result, many websites remain inaccessible through screen readers, even ones intended for universal access: library websites, university websites, and SNSs.

The World Wide Web Consortium (W3C), an international community where member organizations and the public work together to develop Web standards, has created accessibility standards: the Web Content Accessibility Guidelines (WCAG). Led by Web inventor Tim Berners-Lee and CEO Jeffrey Jaffe, W3C’s mission is to lead the Web to its full potential.

Solutions Helping Visually Impaired Users

Aipoly
Aipoly is a free iPhone app that uses machine learning to identify objects for visually impaired people without an Internet connection, making it easier for them to recognize their surroundings. How does it work? You simply point the phone’s rear camera at whatever you want to identify, and the app speaks what it sees. It can identify one object after another as the user moves the phone around, and it doesn’t require taking pictures. The app can be helpful not only to people with impaired vision but also to anyone trying to learn a new language.

Aipoly cofounder Simon Edwardsson says the app recognizes images using deep learning, a machine-learning technique inspired by studies of the brain, and the same technology Facebook uses to recognize faces and Google uses to search images. The app breaks an image down into characteristics like lines, patterns, and curves, and uses them to estimate the likelihood that the image shows a specific object. It works fine for everyday objects around the office, and so far it can recognize around 1,000 of them.
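The final step of that pipeline, turning raw per-object scores into the "likelihood" of each label, is typically a softmax. The sketch below is not Aipoly's code; the scores are made-up numbers standing in for what a network might output for one camera frame.

```python
import math

def softmax(scores):
    # Convert raw per-class scores into probabilities that sum to 1.
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - m) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical scores for one frame; higher score = stronger evidence.
scores = {"coffee mug": 4.1, "flower pot": 1.3, "soda can": 0.2}
probs = softmax(scores)
best = max(probs, key=probs.get)  # the label the app would speak aloud
```

Speaking only the highest-probability label (optionally above some confidence cutoff) is what lets the app name one object after another as the camera moves.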

Banknote-reader (b-reader)
The banknote reader (b-note) is a device that helps the visually impaired recognize money. A banknote goes into the b-note holder for scanning and recognition (orientation doesn’t really matter), where it is photographed and sent securely to the cloud. There, a custom classifier trained with Imagga recognizes the nominal value and returns it to the b-note device, which then plays a pre-recorded .mp3 file announcing the value if it is recognized. The project is part of TOM (Tikkun Olam Makers), a global movement of communities connecting makers, designers, engineers, and developers with people with disabilities to develop technological solutions for everyday challenges. On the TOM web platform you can find the full specs of the b-note prototype, including building instructions and the camera code used for calling the Imagga API, so you can build a similar device for around 100 EUR or 115 USD.
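The last step of that flow, mapping the classifier's answer to a pre-recorded clip, can be sketched as below. The label names, file names, and confidence cutoff are all assumptions for illustration, not taken from the actual b-note firmware.

```python
# Hypothetical mapping from classifier labels to pre-recorded audio clips.
AUDIO_CLIPS = {
    "5_eur": "five_euro.mp3",
    "10_eur": "ten_euro.mp3",
    "20_eur": "twenty_euro.mp3",
}

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff; the real device may use another value

def clip_for(classifier_result):
    """Pick the .mp3 to play from a (label, confidence) classifier result."""
    label, confidence = classifier_result
    if confidence < CONFIDENCE_THRESHOLD:
        return "not_recognized.mp3"  # fall back when the note isn't recognized
    return AUDIO_CLIPS.get(label, "not_recognized.mp3")
```

Falling back to a "not recognized" clip matters here: a device for blind users must never stay silent or guess a denomination it is unsure about.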

LookTel
LookTel combines a smartphone with advanced “artificial vision” software to create a helpful electronic assistant for anyone who is visually impaired or blind. It can automatically scan and identify objects like money, packaged goods, DVDs, CDs, medication bottles, and even landmarks. All it takes is pointing the device’s video camera at the object, and the device pronounces its name quickly and clearly. It can be taught to identify the objects and landmarks around you, and with a little extra help, LookTel becomes a capable assistant. It also incorporates a text reader that gives users access to print media.

Seeing AI
Seeing AI is a smartphone app created by Microsoft that uses computer vision to describe the world. Once the app is downloaded, the user can point the camera at a person and it will announce who the person is and how they are feeling. The app also works with products. Much of this is done by artificial intelligence running locally on the phone. So far the app is available for free in the US for iOS; it is unclear when the rest of the world and Android users will be able to download it.

The app works well for recognizing familiar people and household products (by scanning barcodes). It can also read and scan documents and recognize US currency. This is no small feat, because dollar bills are basically the same size and color regardless of their value, so telling them apart can be difficult for the visually impaired. The app uses neural networks to identify objects, the same technology used in self-driving cars, drones, and elsewhere. The most basic functions run on the phone itself, but most features require a connection.
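The barcode part of product recognition leans on standard symbologies rather than machine learning. For example, every EAN-13 barcode carries a check digit that lets a scanner confirm it read the code correctly before looking the product up. This is a generic validation sketch, not Seeing AI's implementation:

```python
def ean13_is_valid(code: str) -> bool:
    """Validate an EAN-13 barcode using its check digit.

    The first 12 digits are weighted alternately 1, 3, 1, 3, ...;
    the 13th digit must bring the weighted sum up to a multiple of 10.
    """
    if len(code) != 13 or not code.isdigit():
        return False
    digits = [int(c) for c in code]
    weighted = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    check = (10 - weighted % 10) % 10
    return check == digits[12]
```

A single misread digit changes the weighted sum, so the check fails and the scanner retries instead of announcing the wrong product.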

Next Challenges for Full Adoption

Facebook users upload more than 350 million photos a day. Websites rely increasingly on images and less on text, and sharing visuals has become a major part of the online experience. Screen readers and screen magnifiers on mobile and desktop platforms help the visually impaired, but more effort needs to go into making the web accessible through design guidelines, designer awareness, and evaluation techniques.

The most difficult challenge ahead is evaluating the effectiveness of image processing, which ultimately needs to be held to the same standards as other clinical research in low vision. Image processing algorithms need to be tailored to specific disease entities and be available on a variety of displays, including tablets. This field of research has the potential to deliver great benefits to a large number of people in a short period of time.