Image Recognition Revolutionizes the Online Experience for the Visually Impaired

People tend to take sight, and the technology built around it, for granted. For a significant group of internet users, however, the online experience is not so straightforward: the visually impaired need special assistance to experience the digital world. Low-vision aids are diverse, but they generally fall into two categories: aids that translate visual information into alternative sensory information (sound or touch), and aids that adapt visual content to make it more visible. The harder problem remains how to help people who are blind. Emerging assistive technology in this category uses image processing techniques to optimize the visual experience. Today we will look at how image recognition is revolutionizing the online experience for the visually impaired.

Blind Users Interacting with Visual Content

Let’s stop for a second to consider the whole online experience of the visually impaired. What happens when a sighted person visits a webpage? They scan it, click links, or fill in forms. For the visually impaired, the experience is different. They use a screen reader: software that interprets the content on the screen and reads it to the user. However, narrating each page element in a fixed order, including elements a sighted user would simply skip, is not easy. Sometimes there is a vast difference between the visual page elements (buttons, banners, etc.) and the alt text read by the screen reader. SNS (social networking service) pages, with their unstructured visual elements, abundance of links, and horizontally and vertically organized content, make listening to the screen reader even more confusing.

Interacting with Social Visual Content

SNSs make it easy to communicate through various types of visual content. To fully engage with images, visually impaired people need to overcome accessibility challenges associated with the visual content through workarounds or with outside help.

Advancements in artificial intelligence are allowing blind people to identify and understand visual content. These advancements include image recognition, tactile graphics, and crowd-powered systems.

Facebook already generates useful and accurate descriptions of photos algorithmically, at scale and without added latency in the user experience. The descriptions are delivered as image alt text, an HTML attribute designed to let content managers provide a text alternative for images.

Web Accessibility Today

We might think of web accessibility as a universal given, but web designers do not always have the resources to devote to accessibility, or do not see the value in making sites accessible. A two-dimensional web page translated into a one-dimensional speech stream is not easy to decipher. One of the most frustrating problems is that the majority of websites provide insufficient text labeling for graphic content, concurrent events, dynamic elements, and infinitely scrolling pages (i.e., streams of feeds). Thus, many websites remain inaccessible through screen readers, even ones intended for universal access: library websites, university websites, and SNSs.

The World Wide Web Consortium (W3C), an international community where Member organizations and the public work together to develop Web standards, created accessibility standards. Led by Web inventor Tim Berners-Lee and CEO Jeffrey Jaffe, W3C's mission is to lead the Web to its full potential.

Solutions Helping Visually Impaired Users

Aipoly
There is a new iPhone app that uses machine learning to identify objects for visually impaired people without an Internet connection. The free image-recognition app, called Aipoly, makes it easier for people to recognize their surroundings. How does it work? You simply point the phone’s rear camera at whatever you want to identify, and the app speaks what it sees. It can identify one object after another as the user moves the phone around, and it doesn’t require taking pictures. The app can be helpful not only to people with impaired vision but also to those trying to learn a new language.

Aipoly cofounder Simon Edwardsson says the app recognizes images using deep learning, a machine-learning technique inspired by studies of the brain. It is the same technology Facebook uses to recognize faces and Google uses to search images. The app breaks an image down into characteristics such as lines, patterns, and curves, and uses them to estimate the likelihood that the image shows a specific object. The app works fine for objects around the office; so far it can recognize around 1,000 objects, which is more than enough for everyday use.

Banknote-reader (b-reader)
The banknote reader is a device that helps the visually impaired recognize money. A banknote goes into the b-note holder for scanning and recognition (orientation doesn’t really matter), where it is photographed and sent securely to the cloud. There, an Imagga-trained custom classifier recognizes the nominal value and returns the information to the b-note device, which plays a pre-recorded .mp3 file with the value if it is recognized. The project is part of TOM (Tikkun Olam Makers), a global movement of communities connecting makers, designers, engineers, and developers with people with disabilities to develop technological solutions for everyday challenges. On the web platform, you can find the full specs of the b-note prototype, including building instructions and the camera code used for calling Imagga's API, so you can build a device like it for around 100 Euro (about 115 USD).

LookTel
This solution combines a smartphone with advanced “artificial vision” software to create a helpful electronic assistant for anyone who is visually impaired or blind. It can automatically scan and identify objects like money, packaged goods, DVDs, CDs, medication bottles, and even landmarks. All it takes is pointing the device’s video camera at the object, and the device pronounces its name quickly and clearly. It can be taught to identify the objects and landmarks around you. With a little extra help, LookTel can be a helpful assistant. It also incorporates a text reader that gives users access to print media.

Seeing AI
This is a smartphone app created by Microsoft that uses computer vision to describe the world. Once the app is downloaded, the user can point the camera at a person, and it will announce who the person is and how they are feeling. The app also works with products. This is done by artificial intelligence running locally on the phone. So far the app is available for free in the US for iOS; it is unclear when the rest of the world and Android users will be able to download it.

The app works well for recognizing familiar people and household products (by scanning barcodes). It can also read and scan documents and recognize US currency. This is no small feat, because dollar bills are basically the same size and color regardless of their value, so spotting the difference is sometimes difficult for the visually impaired. The app uses neural networks to identify objects, the same technology used for self-driving cars, drones, and other applications. The most basic functions take place on the phone itself; however, most features require a connection.

Next Challenges for Full Adoption

Facebook users upload more than 350 million photos a day. Websites rely increasingly on images and less on text, and sharing visuals has become a major part of the online experience. Screen readers and screen magnifiers on mobile and desktop platforms help the visually impaired, but more effort needs to be put into making the web accessible through design guidelines, designer awareness, and evaluation techniques.

The most difficult challenge ahead is evaluating the effectiveness of image processing. It ultimately needs to be held to the same standards as other clinical research in low vision. Image processing algorithms need to be tailored to specific disease entities and be available on a variety of displays, including tablets. This field of research has the potential to deliver great benefits to a large number of people in a short period of time.

Batch Image Processing From Local Folder Using Imagga API

Batch Upload of Photos for Image Recognition

This blog post is part of our series of How-Tos for those of you who are less experienced and need a bit of help to set up and properly use our powerful image recognition APIs.

In this one we will help you batch process (using our Tagging or Color Extraction API) a whole folder of photos that reside on your local computer. To make that possible, we’ve written a short script in Python.

Feel free to reuse or modify it. Here’s a short explanation of what it does. The script requires the Python package requests, which you can install using this guide.

It uses requests’ HTTPBasicAuth to initialize the Basic authentication used by Imagga’s API, from a given API_KEY and API_SECRET which you have to set manually in the first lines of the script.

There are three main functions in the script: upload_image, tag_image, and extract_colors.

  • upload_image(image_path) - uploads your file to our API using the content endpoint; the argument image_path is the path to the file on your local file system. The function returns the content id associated with the image.
  • tag_image(image, content_id=False, verbose=False, language='en') - tags a given image using Imagga’s Tagging API. You can pass either an image URL or a content id (from upload_image) as the image argument, but for a content id you also have to set content_id=True. By setting the verbose argument to True, the returned tags will also contain their origin (whether they come from machine learning recognition or from additional analysis). The last parameter is language, in case you want your output tags translated into one of Imagga’s 50 (+1) supported languages. You can find the supported languages here.
  • extract_colors(image, content_id=False) - using this function you can extract colors from your image with our Color Extraction API. Just like tag_image, you can provide an image URL or a content id (by also setting the content_id argument to True).

Script usage:

Note: You need to install the Python package requests in order to use the script. You can find installation notes here.

You have to manually set the API_KEY and API_SECRET variables found in the first lines of the script by replacing YOUR_API_KEY and YOUR_API_SECRET with your API key and secret.

Usage (in your terminal or CMD):

python <input_folder> <output_folder> --language=<language> --verbose=<verbose> --merged-output=<merged_output> --include-colors=<include_colors>

The script has two required arguments - <input_folder> and <output_folder> - and four optional ones: <language>, <verbose>, <merged_output>, <include_colors>.

  • <input_folder> - required, the input folder containing the images you would like to tag.
  • <output_folder> - required, the output folder where the tagging JSON responses will be saved.
  • <language> - optional, default: en; the output tags will be translated into the given language (a list of supported languages can be found here).
  • <verbose> - optional, default: False; if True, the output tags will contain an origin key (whether the tag comes from machine learning recognition or from additional analysis).
  • <include_colors> - optional, default: False; if True, the output will also contain color extraction results for each image.
  • <merged_output> - optional, default: False; if True, the output will be merged into a single JSON file; otherwise, separate JSON files are written for each image.

Imagga partners with Blitline to jointly offer Smart Cropping

We are excited to announce that starting today we are partnering with Blitline, one of the best, most lightweight, and easiest to integrate image processing services in the cloud!

Blitline & Imagga

Our cooperation starts with the opportunity for Blitline users to take advantage of our smart cropping, as applied in our cropping API and cropping tool. The guys at Blitline have produced a very easy to understand example of why Imagga’s smart cropping is better than plain center cropping:


Imagga Smart Cropping as explained by Blitline

Funnily enough, the idea for the cooperation came a month ago from an existing Blitline user who approached us and asked if we could provide the same seamless integration that Blitline offers for regular cropping, but with our smart cropping, or, ideally, whether we could offer it via Blitline. No sooner said than done! We got in contact and things simply worked out.

Now Blitline users who subscribe to the special Blitline + Imagga plan have access to the imagga_smart_crop function in their image processing pipeline. The pricing sticks to the pay-as-you-go model (per cropped image), plus a very low monthly subscription fee.
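For Blitline users, invoking it should look like any other function in a Blitline job. The following is a hypothetical sketch only: the overall job shape follows Blitline's general JSON job format, but the exact parameter names for imagga_smart_crop are assumptions, so consult Blitline's job documentation before using it:

```json
{
  "application_id": "YOUR_BLITLINE_APP_ID",
  "src": "http://example.com/photo.jpg",
  "functions": [
    {
      "name": "imagga_smart_crop",
      "params": { "width": 300, "height": 300 },
      "save": { "image_identifier": "photo_smart_cropped" }
    }
  ]
}
```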


Blitline + Imagga Pricing Plan

We do believe this is going to be a great example of co-opetition!