
Image Recognition Revolutionizes the Online Experience for the Visually Impaired

People take seeing and technology for granted. For a specific group of internet users, however, the online experience is not so straightforward. The visually impaired need special assistance to experience the digital world. There are many low-vision aids, but they generally fall into two categories: those that translate visual information into another sense (sound or touch) and those that adapt visual information to make it more visible. However, the bigger problem remains how to help people who are blind. The emerging technology for assistance in this category uses image processing techniques to optimize the visual experience. Today we will look at how image recognition is revolutionizing the online experience for the visually impaired.

Blind Users Interacting with Visual Content

Let’s stop for a second to consider the whole online experience for the visually impaired. What happens when a sighted person opens a webpage? They scan it, click links, or fill in forms. For the visually impaired, the experience is different. They use a screen reader: software that interprets what is on the screen, including the text alternatives of photos and images, and reads it aloud to the user. However, narrating every page element in a fixed order, with the option to skip, is not easy. Sometimes there is a vast difference between the visual page elements (buttons, banners, etc.) and the alt-text read by the screen reader. Social networking service (SNS) pages, with their unstructured visual elements, abundance of links, and content organized both horizontally and vertically, make listening to the screen reader even more confusing.

Interacting with Social Visual Content

SNSs make it easy to communicate through various types of visual content. To fully engage with images, visually impaired people need to overcome accessibility challenges associated with the visual content through workarounds or with outside help.

Advancements in artificial intelligence are allowing blind people to identify and understand the visual content. Some of them include image recognition, tactile graphics, and crowd-powered systems.

Facebook already generates useful and accurate descriptions of photos algorithmically, at scale and without adding latency to the user experience. Each description is provided as image alt-text, an HTML attribute designed for content managers to provide a text alternative for images.

Web Accessibility Today

We might think that web accessibility is universal, but web designers do not always have the resources to devote to accessibility, or do not see the value in making sites accessible. A two-dimensional web page translated into a one-dimensional speech stream is not easy to decipher. One of the most annoying things is that the majority of websites have insufficient text labeling of graphic content, concurrent events, dynamic elements, or infinitely scrolling pages (e.g. a stream of feeds). Thus, many websites remain inaccessible through screen readers, even those intended for universal access: library websites, university websites, and SNSs.
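To make the point about missing text labels concrete, here is a minimal sketch of how a page could be checked for images that give a screen reader nothing to announce. It uses only Python's standard library; the class and function names are made up for this illustration, and the sample markup is invented. Note that an empty alt attribute is a legitimate way to mark purely decorative images, so the output is a list of candidates for review rather than a list of definite errors.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> tag without usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # A missing or whitespace-only alt gives a screen reader no description.
        # (Empty alt can be intentional for decorative images, so review by hand.)
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", "<no src>"))


def find_images_without_alt(html):
    """Return the src values of images that have no alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


# Invented sample markup for the illustration.
page = """
<img src="logo.png" alt="Company logo">
<img src="banner.jpg">
<img src="chart.png" alt="  ">
"""
print(find_images_without_alt(page))
```

A check like this only finds images with no description at all; it cannot tell whether an existing description is actually helpful.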

The World Wide Web Consortium (W3C), an international community where Member organizations and the public work together to develop Web standards, created accessibility standards. Led by Web inventor Tim Berners-Lee and CEO Jeffrey Jaffe, W3C's mission is to lead the Web to its full potential.

Solutions Helping Visually Impaired Users

Aipoly
There is a new iPhone app which uses machine learning to identify objects for visually impaired people without an Internet connection. The free image-recognition app is called Aipoly and is making it easier for people to recognize their surroundings. How does it work? You simply point the phone’s rear camera at whatever you want to identify and it speaks what it sees. The app can identify one object after another as the user moves the phone around, and it doesn’t require taking a picture. The app can be helpful not only to people with impaired vision but also to those trying to learn a new language.

Aipoly cofounder Simon Edwardsson says it recognizes images by using deep learning, which is a machine-learning technique inspired by studies of the brain. This is the same technology used by Facebook for recognizing faces and Google for searching images. The app breaks down the image into different characteristics like lines, patterns, curves, etc. and uses them to determine the likelihood of that image to be a specific object. The app works fine for objects around the office. So far it can recognize around 1,000 objects, which is more than enough.

Banknote-reader (b-reader)
The banknote reader is a device that helps the visually impaired to recognize money. The banknote goes into the b-note holder for scanning and recognition (orientation doesn’t really matter); it is photographed and the image is sent securely to the cloud. There, an Imagga-trained custom classifier recognizes the nominal value and returns the information to the b-note device. If the value is recognized, the device plays a pre-recorded .mp3 file announcing it. The project is part of TOM (Tikkun Olam Makers), a global movement of communities connecting makers, designers, engineers and developers with people with disabilities to develop technological solutions for everyday challenges. On the web platform, you can find full specs of the b-note prototype, including building instructions and camera code used for calling Images API, so that you can make a device like it for around 100 Euro or 115 USD.

LookTel
This is a combination of a smartphone and advanced “artificial vision” software that creates a helpful electronic assistant for anyone who is visually impaired or blind. It can be used to automatically scan and identify objects like money, packaged goods, DVDs, CDs, medication bottles, and even landmarks. All it takes is to point the device’s video camera at the object, and the device pronounces the name quickly and clearly. It can be taught to identify the objects and landmarks around you. It also incorporates a text reader which gives users access to print media.

Seeing AI
This is a smartphone app that uses computer vision to describe the world and is created by Microsoft. Once the app is downloaded, the user can point the camera at a person and it will announce who the person is and how they are feeling. The app also works with products. It is done by artificial intelligence running locally on the phone. So far the app is available for free in the US for iOS. It is unclear when the rest of the world and Android users will be able to download it.

The app works well for recognizing familiar people and household products (by scanning barcodes). It can also scan and read documents and recognize US currency. This is no small feat, because dollar bills are basically the same size and color regardless of their value, so telling them apart is sometimes difficult for the visually impaired. The app uses neural networks to identify objects, the same technology used in self-driving cars, drones, and other applications. The most basic functions take place on the phone itself; however, most features require a connection.

Next Challenges for Full Adoption

Facebook users upload more than 350 million photos a day. Websites rely more and more on images and less on text. Sharing visuals has become a major part of the online experience, so using screen readers and screen magnifiers on mobile and desktop platforms helps the visually impaired. However, more effort needs to be put into making the web more accessible through design guidelines, designer awareness, and evaluation techniques.

The most difficult challenge ahead is the evaluation of the effectiveness of image processing. It needs to be held ultimately to the same standards as other clinical research in low vision. Image processing algorithms need to be tailored specifically to disease entities and be available on a variety of displays, including tablets. This field of research has the potential to deliver great benefits to a large number of people in a short period of time.


Securing Images in Python With the Imagga NSFW Categorization API

It’s very common in web and mobile applications, as well as in any other digital media, to use images as part of the content. With images being so ubiquitous, there comes a need to ensure that the images posted are appropriate to the medium they are on. This is especially true for any medium accepting user-generated content. Even with set rules for what can and cannot be posted, you can never trust users to adhere to the set conditions. Whenever you have a website or medium accepting user-generated content, you will find that there is a need to moderate the content.

Why Moderate Content?

There are various reasons why content moderation might be in your best interest as the owner/maintainer of a digital medium. Some common ones are:

  • Legal obligations - If your application accommodates underage users, then you are obligated to protect them from adult content.
  • Brand protection - How your brand is perceived by users is important, so you might want to block some content that may negatively affect your image.
  • Protect your users - You might want to protect your users against harassment from other users. The harassment can be in the form of users attacking others by posting offensive content. An example of this is Facebook’s recent techniques of combating revenge porn on their platform.
  • Financial - It might be in your best interest financially, to moderate the content shown on your applications. For instance if your content is somewhat problematic, other businesses might not want to associate with you in terms of advertising on your platform or accepting you as an affiliate for them. For some Ad networks, keeping your content clean is a rule that you have to comply with if you want to use them. Google Adsense is an example of this. They strictly forbid users of the service from placing their ads on pages with adult content.
  • Platform rules - You might be forced to implement some form of content moderation if the platform your application is on requires it. For instance, Apple requires applications to have a way of moderating and restricting user-generated content before they can be placed on the App Store, and Google also restricts apps that contain sexually explicit content.

As you can see, if your application accepts user-generated content, moderation might be a requirement that you can’t ignore. There are different ways moderation can be carried out:

  • Individual driven - an example of this is a website that has admins that moderate the content. The website might work by either restricting the display of any uploaded content until it has been approved by an admin or it might allow immediate display of uploaded content, but have admins who constantly check posted content. This method tends to be very accurate in identifying inappropriate content, as the admins will most likely be clear as to what is appropriate/inappropriate for the medium. The obvious problem with this is the human labour needed. Hiring moderators might get costly especially as the application’s usage grows. Relying on human moderators can also affect the app’s user experience. Human response will always be slower than an automated one. Even if you have people working on moderation at all times, there will still be a delay in identifying and removing problematic content. By the time it is removed, a lot of users could have seen it. On systems that restrict showing uploaded content until it has been approved by an admin, this delay can become annoying to users.
  • Community driven - with this type of moderation, the owner of the application puts in place features that enable the app’s users to report any inappropriate content, e.g. by flagging it. After a user flags a post, an admin is notified. This also suffers from a delay in identifying inappropriate content, both from the community (who might not act as soon as the content is posted) and from the administrators (who might be slow to respond to flagged content). Leaving moderation up to the community might also result in false positives, as content that is safe is seen by some users as inappropriate. With a large community, you will always have differing opinions, and because many people will probably not have read the Terms and Conditions of the medium, they will not have clear-cut rules of what is and isn’t okay.
  • Automated - with this, a computer system usually using some machine learning algorithm is used to classify and identify problematic content. It can then act by removing the content or flagging it and notifying an admin. With this, there is a decreased need for human labour, but the downside is that it might be less accurate than a human moderator.
  • A mix of some or all the above methods - Each of the methods described above comes with a shortcoming. The best outcome might be achieved from combining some or all of them e.g. you might have in place an automated system that flags suspicious content while at the same time enabling the community to also flag content. An admin can then come in to determine what to do with the content.
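The mixed approach described in the last item can be pictured as a small decision rule that combines an automated score with community flags and leaves the uncertain cases to an admin. The sketch below is only an illustration: the function name, the threshold values and the flag limit are hypothetical and would need tuning for a real application.

```python
def moderation_action(nsfw_confidence, user_flags,
                      block_threshold=80.0, review_threshold=20.0,
                      flag_limit=3):
    """Return 'block', 'review' or 'publish' for one piece of content.

    nsfw_confidence: automated score as a percentage (0-100).
    user_flags: number of community reports received so far.
    All default thresholds are hypothetical values for illustration.
    """
    # Very confident automated result: act without waiting for a human.
    if nsfw_confidence >= block_threshold:
        return "block"
    # Uncertain score or enough community reports: hand over to an admin.
    if nsfw_confidence >= review_threshold or user_flags >= flag_limit:
        return "review"
    return "publish"


print(moderation_action(95.0, 0))
print(moderation_action(30.0, 0))
print(moderation_action(5.0, 4))
print(moderation_action(5.0, 0))
```

With these example values, only content the automated system is very sure about is removed immediately, while borderline scores and repeatedly flagged posts end up in an admin's review queue.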

Automated Moderation with the Imagga NSFW Categorization API

Imagga makes available the NSFW (not safe for work) Categorization API that you can use to build a system that can detect adult content. The API works by categorizing images into three categories:

  • nsfw - these are images considered not safe. Chances are high that they contain pornographic content and/or display nude bodies or inappropriate body parts.
  • underwear - this categorizes medium safe images. These might be images displaying lingerie, underwear, swimwear, etc.
  • safe - these are completely safe images with no nudity.

The API works by giving a confidence level of a submitted image. The confidence is a percentage that indicates the probability of an image belonging to a certain category.

To see the NSFW API in action, we’ll create two simple programs that will process some images using the API. The first program will demonstrate how to categorize a single image while the second will batch process several images.

Setting up the Environment

Before writing any code, we’ll first set up a virtual environment. This isn’t necessary, but is recommended as it prevents package clutter and version conflicts in your system’s global Python interpreter.

First, create a directory where you’ll put your code files.
$ mkdir nsfw_test
Then navigate to that directory with your Terminal application.
$ cd nsfw_test
Create the virtual environment by running:
$ python3 -m venv venv
We’ll use Python 3 in our code. In the above, we create a virtual environment with Python 3. With this, the default Python version inside the virtual environment will be version 3.

Activate the environment with (on MacOS and Linux):
$ source venv/bin/activate
On Windows:
$ venv\Scripts\activate

Categorizing Images

To classify an image with the NSFW API, you can either send a GET request with the image URL to the /categorizations/<categorizer_id> endpoint or you can upload the image to /content, get back a content_id value which you will then use in the call to the /categorizations/<categorizer_id> endpoint. We’ll create two applications that demonstrate these two scenarios.

Processing a Single Image

The first app we’ll create is a simple web application that can be used to check if an image is safe or not. We’ll create the app with Flask.

To start off, install the following dependencies.
$ pip install flask flask-bootstrap requests
Then create a folder named templates and inside that folder, create a file named index.html and add the following code to it:

{% extends "bootstrap/base.html" %}
{% block title %}Imagga NSFW API Test{% endblock %}
{% block navbar %}
<nav class="navbar navbar-inverse" role="navigation">
  <div class="container">
    <a class="navbar-brand" href="{{ url_for('index') }}">NSFW API Test</a>
  </div>
</nav>
{% endblock %}
{% block content %}
<div class="container">
  <div class="row">
    <div class="col-md-8">
      <form method="POST" action="">
        <div class="form-group">
          <label for="image_url">Image URL</label>
          <input type="text" class="form-control" id="image_url" name="image_url" required>
        </div>
        <button type="submit" class="btn btn-primary">Submit</button>
      </form>
    </div>
  </div>
  {% if image_url %}
  <br>
  <div class="row">
    <div class="col-md-4">
      <img src="{{ image_url }}" class="img-thumbnail">
    </div>
    <div class="col-md-4">
      {{ res }}
    </div>
  </div>
  {% endif %}
</div>
{% endblock %}

In the above code, we create an HTML template containing a form that the user can use to submit an image URL to the Imagga API. When the response comes back from the server, it will be shown next to the processed image.

Next, create a file named app.py in the root directory of your project and add the following code to it. Be sure to replace INSERT_API_KEY and INSERT_API_SECRET with your Imagga API Key and Secret. You can sign up for a free account to get these credentials. After creating an account, you’ll find these values on your dashboard:

from flask import Flask, render_template, request
from flask_bootstrap import Bootstrap
import os
import requests
from requests.auth import HTTPBasicAuth

app = Flask(__name__)
Bootstrap(app)

# API Credentials. Set your API Key and Secret here
API_KEY = os.getenv('IMAGGA_API_KEY', 'INSERT_API_KEY')
API_SECRET = os.getenv('IMAGGA_API_SECRET', 'INSERT_API_SECRET')
API_ENDPOINT = 'https://api.imagga.com/v1'
auth = HTTPBasicAuth(API_KEY, API_SECRET)


@app.route('/', methods=['GET', 'POST'])
def index():
    image_url = None
    res = None
    if request.method == 'POST' and 'image_url' in request.form:
        image_url = request.form['image_url']
        # Pass the image URL via params so that requests URL-encodes it
        response = requests.get(
            '%s/categorizations/nsfw_beta' % API_ENDPOINT,
            params={'url': image_url},
            auth=auth)
        res = response.json()
    return render_template('index.html', image_url=image_url, res=res)


if __name__ == '__main__':
    app.run(debug=True)

Every call to the Imagga API must be authenticated. Currently the only supported method for authentication is Basic. With Basic Auth, credentials are transmitted as user ID/password pairs, encoded using base64. In the above code, we achieve this with a call to HTTPBasicAuth().
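To show what "encoded using base64" means in practice, here is a small sketch of how a Basic Auth header value is put together. The helper name is made up for this illustration, and the credentials are the well-known example pair from the HTTP Basic Auth specification (RFC 7617), not Imagga keys. In the app itself you don't need this, because HTTPBasicAuth() takes care of it.

```python
import base64


def basic_auth_header(user_id, password):
    """Build the value of the Authorization header for HTTP Basic Auth."""
    # The user ID and password are joined with a colon, then base64-encoded.
    credentials = ('%s:%s' % (user_id, password)).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    return 'Basic %s' % token


# Example credentials from RFC 7617, used here purely for illustration.
print(basic_auth_header('Aladdin', 'open sesame'))
# prints: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Keep in mind that base64 is an encoding, not encryption: anyone who sees the header can decode it, which is why the requests should always go over HTTPS.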

We then create a function that will be triggered by GET and POST requests to the / route. If the request is a POST, we get the data submitted by form and send it to the Imagga API for classification.

The NSFW Categorizer is one of a few categorizers made available by the Imagga API. A Categorizer is used to recognize various objects and concepts. There are a couple predefined ones available (Personal Photos and NSFW Beta) but if none of them fit your needs we can build a custom one for you.

As mentioned previously, to send an image for classification, you send a GET request to the /categorizations/<categorizer_id> endpoint. The categorizer_id for the NSFW API is nsfw_beta. You can send the following parameters with the request:

  • url: URL of an image to submit for categorization. You can provide up to 10 urls for processing by sending multiple url parameters (e.g.) ?url=<first_url>&url=<second_url>…&url=<nth_url>
  • content: You can also directly send image files for categorization by uploading the images to our /content endpoint and then provide the received content identifiers via this parameter. As with the url parameter you can send more than one image - up to 10 content ids by sending multiple content parameters.
  • language: If you’d like to get a translation of the tags in other languages, you should use the language parameter. Its value should be the code of the language you’d like to receive tags in. You can apply this parameter multiple times to request tags translated in several languages. See all available languages here.
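As a rough sketch of the multiple-url form described above, the snippet below splits a list of image URLs into groups of at most 10 and builds one query string per group. The function name and the example.com URLs are made up for this illustration; only the limit of 10 and the repeated url parameter come from the parameter list.

```python
from urllib.parse import urlencode

MAX_URLS_PER_REQUEST = 10  # limit stated in the parameter list above


def build_query_strings(image_urls, batch_size=MAX_URLS_PER_REQUEST):
    """Split image URLs into batches and build one query string per batch."""
    queries = []
    for start in range(0, len(image_urls), batch_size):
        batch = image_urls[start:start + batch_size]
        # A list of (name, value) pairs lets the same parameter repeat:
        # url=<first_url>&url=<second_url>...
        queries.append(urlencode([('url', url) for url in batch]))
    return queries


# Hypothetical image URLs for the illustration.
urls = ['https://example.com/%d.jpg' % i for i in range(12)]
for query in build_query_strings(urls):
    print(query)
```

If you are using the requests library, passing a list as the value, e.g. params={'url': batch}, should produce the same repeated parameter without building the query string by hand.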

After processing the request, the API sends back a JSON object holding the image’s categorization data if processing was successful, or an error message in case there was a problem processing the image. Below you can see the response of a successful categorization:

{
    "results": [{
        "image": "https://auto.ndtvimg.com/car-images/big/dc/avanti/dc-avanti.jpg",
        "categories": [{
            "name": "safe",
            "confidence": 99.22
        }, {
            "name": "underwear",
            "confidence": 0.71
        }, {
            "name": "nsfw",
            "confidence": 0.07
        }]
    }]
}

Note that you might not always get JSON with the three categories displayed. If the confidence of a category is 0, this category will not be included in the JSON object. Below you can see the response of a failed categorization.

{
    "results": [],
    "unsuccessful": [{
        "reason": "An error prevented image from being categorized. Please try again.",
        "image": "http://www.axmag.com/download/pdfurl-guide.pdf"
    }]
}

Back to our app, you can save your code and run it with:
$ python app.py

If you navigate to http://127.0.0.1:5000/ you should see a form with one input field. Paste in the URL of an image and submit it. The image will be processed and you will get back a page displaying the image and the JSON returned from the server. To keep it simple, we just display the raw JSON, but in a more sophisticated app, it would be parsed and used to make some decision. Below, you can see the results of some images we tested the API with.

  

As you can see, the images have been categorized quite accurately. The first two have safe confidence scores of 99.22 and 99.23 respectively while the last one has an underwear score of 96.21. Of course, we can’t show an nsfw image here on this blog, but you are free to test that on your own.

To know the exact confidence score to use for your app, you should first test the API with several images. When you look at the results of several images, you will be able to better judge which number to look out for in your code when filtering okay and not okay images. If you are still not sure about this, our suggestion is setting the confidence threshold at 15-20%. However, if you’d like to be more strict on the accuracy of the results, setting the confidence threshold at 30% might do the trick.
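As a rough illustration of how such a threshold could be applied, the sketch below parses a categorization response and lists the images whose nsfw confidence reaches the threshold. The function names are made up, the sample response is modelled on the successful response shown earlier with a placeholder image URL, and we assume here that the threshold is compared against the nsfw category.

```python
def nsfw_confidence(result):
    """Return the 'nsfw' confidence for one entry of response['results'].

    Categories with a confidence of 0 are left out of the response,
    so a missing 'nsfw' entry is treated as 0.0.
    """
    for category in result.get('categories', []):
        if category['name'] == 'nsfw':
            return category['confidence']
    return 0.0


def flag_images(response, threshold=20.0):
    """Return the images whose 'nsfw' confidence reaches the threshold."""
    return [result['image']
            for result in response.get('results', [])
            if nsfw_confidence(result) >= threshold]


# Sample response with a placeholder image URL.
sample = {
    'results': [{
        'image': 'https://example.com/car.jpg',
        'categories': [
            {'name': 'safe', 'confidence': 99.22},
            {'name': 'underwear', 'confidence': 0.71},
            {'name': 'nsfw', 'confidence': 0.07},
        ],
    }]
}
print(flag_images(sample))  # prints []
```

Images listed under unsuccessful in the response are not covered by this check, so a real application would want to handle them separately instead of treating them as safe.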

You should know that the technology is far from perfect and that the NSFW API is still in beta. From time to time, you might get an incorrect classification.

Note that the API has a limit of 5 seconds for downloading the image. If the limit is exceeded with the URL you send, the analysis will be unsuccessful. If you find that most of your requests are unsuccessful due to timeout error, we suggest uploading the images to our /content endpoint first (which is free and not accounted towards your usage) and then use the content id returned to submit the images for processing via the content parameter. We’ll see this in action in the next section.

Batch Processing Several Images

The last app we created allowed the user to process one image at a time. In this section, we are going to create a program that can batch process several images. This won’t be a web app, it will be a simple script that you can run from the command line.

Create a file named upload.py and add the code below to it. If you are still using the virtual environment created earlier, then the needed dependencies have already been installed; otherwise, install them with:
$ pip install requests

import argparse
import json
import os
import requests
from requests.auth import HTTPBasicAuth

# API Credentials. Set your API Key and Secret here
API_KEY = os.getenv('IMAGGA_API_KEY', 'INSERT_API_KEY')
API_SECRET = os.getenv('IMAGGA_API_SECRET', 'INSERT_API_SECRET')
API_ENDPOINT = 'https://api.imagga.com/v1'
FILE_TYPES = ['png', 'jpg', 'jpeg', 'gif']


class ArgumentException(Exception):
    pass


if API_KEY == 'INSERT_API_KEY' or \
        API_SECRET == 'INSERT_API_SECRET':
    raise ArgumentException('You haven\'t set your API credentials. '
                            'Edit the script and set them.')

auth = HTTPBasicAuth(API_KEY, API_SECRET)


def upload_image(image_path):
    if not os.path.isfile(image_path):
        raise ArgumentException('Invalid image path')

    # Open the desired file
    with open(image_path, 'rb') as image_file:
        filename = image_file.name

        # Upload the multipart-encoded image with a POST
        # request to the /content endpoint
        content_response = requests.post(
            '%s/content' % API_ENDPOINT,
            auth=auth,
            files={filename: image_file})

    # Example /content response:
    # {'status': 'success',
    #  'uploaded': [{'id': '8aa6e7f083c628407895eb55320ac5ad',
    #                'filename': 'example_image.jpg'}]}
    uploaded_files = content_response.json()['uploaded']

    # Get the content id of the uploaded file
    content_id = uploaded_files[0]['id']
    return content_id


def check_image(content_id):
    # Using the content id, make a GET request to the
    # /categorizations/nsfw_beta endpoint to check if the image is safe
    params = {
        'content': content_id
    }
    response = requests.get(
        '%s/categorizations/nsfw_beta' % API_ENDPOINT,
        auth=auth,
        params=params)
    return response.json()


def parse_arguments():
    parser = argparse.ArgumentParser(
        description='Checks images in a folder for NSFW content')
    parser.add_argument(
        'input',
        metavar='<input>',
        type=str,
        nargs=1,
        help='The input - a folder containing images')
    parser.add_argument(
        'output',
        metavar='<output>',
        type=str,
        nargs=1,
        help='The output - a folder to output the results')
    args = parser.parse_args()
    return args


def main():
    args = parse_arguments()
    tag_input = args.input[0]
    tag_output = args.output[0]
    results = {}

    if os.path.isdir(tag_input):
        # Collect the image files found directly in the input folder
        images = [filename for filename in os.listdir(tag_input)
                  if os.path.isfile(os.path.join(tag_input, filename)) and
                  filename.split('.')[-1].lower() in FILE_TYPES]
        images_count = len(images)

        for iterator, image_file in enumerate(images):
            image_path = os.path.join(tag_input, image_file)
            print('[%s / %s] %s uploading' %
                  (iterator + 1, images_count, image_path))
            try:
                content_id = upload_image(image_path)
            except (IndexError, KeyError, ArgumentException):
                # Skip images that could not be uploaded
                continue

            nsfw_result = check_image(content_id)
            results[image_file] = nsfw_result
            print('[%s / %s] %s checked' %
                  (iterator + 1, images_count, image_path))
    else:
        raise ArgumentException(
            'The input directory does not exist: %s' % tag_input)

    if not os.path.exists(tag_output):
        os.makedirs(tag_output)
    elif not os.path.isdir(tag_output):
        raise ArgumentException(
            'The output folder must be a directory')

    # Write one JSON result file per processed image
    for image, result in results.items():
        with open(
                os.path.join(tag_output, 'result_%s.json' % image),
                'wb') as results_file:
            results_file.write(
                json.dumps(
                    result, ensure_ascii=False, indent=4).encode('utf-8'))

    print('Done. Check your selected output directory for the results')


if __name__ == '__main__':
    main()

We use the argparse module to parse arguments from the command line. The first argument passed in will be the path to a folder containing images to be processed while the second argument is a path to a folder where the results will be saved.

For each image in the input folder, the script uploads it with a POST request to the /content endpoint. After getting a content id back, it makes another call to the /categorizations/<categorizer_id> endpoint. It then writes the response of that request to a file in the output folder.

Note that all uploaded files sent to /content remain available for 24 hours. After this period, they are automatically deleted. If you need the file, you have to upload it again. You can also manually delete an image by making a DELETE request to https://api.imagga.com/v1/content/<:content_id>.

Add some images to a folder and test the script with:
$ python upload.py path/to/input/folder path/to/output/folder
If you look at the output folder you selected, you should see a JSON file for each processed image.
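The manual DELETE request mentioned earlier could look roughly like the sketch below. To keep it runnable without extra packages, it only prepares the request with Python's standard library instead of sending it; the function name is made up, the content id is the example value from the comment in upload.py, and the credentials are placeholders.

```python
import base64
import urllib.request

API_ENDPOINT = 'https://api.imagga.com/v1'


def build_delete_request(content_id, api_key, api_secret):
    """Prepare (but do not send) a DELETE request for an uploaded image."""
    # Basic Auth: base64-encode "key:secret" for the Authorization header.
    token = base64.b64encode(
        ('%s:%s' % (api_key, api_secret)).encode('utf-8')).decode('ascii')
    return urllib.request.Request(
        '%s/content/%s' % (API_ENDPOINT, content_id),
        headers={'Authorization': 'Basic %s' % token},
        method='DELETE')


# Placeholder credentials; the content id is the example value from upload.py.
delete_request = build_delete_request(
    '8aa6e7f083c628407895eb55320ac5ad', 'INSERT_API_KEY', 'INSERT_API_SECRET')
print(delete_request.get_method(), delete_request.full_url)

# To actually send it (needs valid credentials and a network connection):
# urllib.request.urlopen(delete_request)
```

If you prefer to stay with the requests library used in upload.py, the equivalent call would be requests.delete() with the same URL and the auth object.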

Feel free to test out the Imagga NSFW Categorization API. If you have any suggestions on ways to improve it or just general comments on the API, you can post them in the Comment Section below or get in touch with us directly. We are always happy to get feedback on our products.



AI Policies: What is the world doing to make them secure

A couple of months ago the internet went berserk with the news of Facebook pulling the plug on two bots which had started communicating in their own language. Imaginations and headlines went wild with the possibilities: malicious AI is taking over, the doomsday is here with the bots of the Apocalypse. Although the real story was quite different (the bots were turned off because they were designed to communicate with humans, not with each other, and thus were not delivering the expected results), the outcome was simply panic. What we can learn from this is that humans are afraid of their own creation: artificial intelligence.

AI can transform gargantuan amounts of complex information into insight. It has the potential to present solutions, reveal secrets and solve problems. But before we get to the good part, we need to take care of development and deployment. In order to be able to use them, AI systems need to have the same ethical principles, moral values, professional codes, and social norms we follow. Some of us are excited about the opportunities AI provides, others are suspicious. To become widespread, AI needs to be designed in a way that allows people to understand it, use it and trust it. To ensure the acceptance of AI, public policies should help society deal with AI’s inevitable failures and facilitate adaptation.

Where are we now?

Policies can help AI’s progress or hamper it. We are witnessing a shift in the bottleneck to using AI products from technology to policymaking. Regulation is slow to respond to the cost of compliance or to the adoption and development of innovations. Thorough and well-thought-out policies can influence the rate and direction of innovation by creating a stimulus for the private sector. In order to grasp the current situation, we will take a look at the major players in AI technologies, whose decisions will influence the future of policymaking. Yes, you guessed it: China and the USA (Europe also deserves a mention). If a country with AI research expertise wishes to participate as a producer, it should be ready for tense labor market competition from the U.S. and China.

USA

On October 12, 2016, President Obama’s Executive Office published two reports that laid out its plans for the future of artificial intelligence. The report entitled “Artificial Intelligence, Automation and the Economy,” concluded that AI-driven automation suggests the need for aggressive public policies and a more robust safety net in order to combat labor disruption. The report elaborates on the topics of the previous one: Preparing for the Future of Artificial Intelligence, which recommended the publishing of a report on the economic impacts of artificial intelligence. The focus of AI capabilities is the automation of tasks which have required manual labor, which will provide new possibilities for the economy. However, the disruption of the current livelihood of some people is inevitable. The report’s objective is to find how to increase the benefits and mitigate the costs.

AI isn’t a science project; it’s commercially important.

The report proposes three broad strategies to ease AI automation into the economy: first, invest in and develop AI; second, educate and train workers for the jobs of the future; and, finally, aid workers in the transition and empower them to ensure broadly shared growth. Since AI automation will transform the economy, policymakers need to create or update, strengthen and adapt policies. The primary economic effects under consideration are the beneficial contribution to productivity growth; the new skills that the job market will demand (especially higher-level technical skills); the imbalance that AI will create across wage and education levels, job types and locations; and the loss of jobs, which might be long term, depending on the policy responses.

China

For the past four years, the US and China have been investing heavily in AI, especially compared to other countries. Until recently, the US seemed like the leader in the tech race, but two years ago China outdid the US in research output. China is emerging as a leader, not a follower. The government is backing research and development and thus driving China’s economy forward. The total value of AI industries is expected to surpass 1 trillion yuan ($147.80 billion).

On July 20, China’s State Council issued the “Next Generation Artificial Intelligence Development Plan” (新一代人工智能发展规划), which articulates an ambitious agenda for China to lead the world in AI. China intends to pursue a “first-mover advantage” to become the “premier global AI innovation center” by 2030. Wan Gang, the Minister of Science and Technology, stated that China plans to launch a national AI plan, which will strengthen AI development and application, introduce policies to contain risks associated with AI, and work toward international cooperation. The plan will also provide funds to back these endeavors.

The guideline states that developing AI is a “complicated and systematic project” that needs a coordinated AI innovation system, not only for the technology but for the products as well. It goes on to state that AI in China should be used to promote the country’s technology, social welfare and economy, provide national security, and help the world in general.

The guideline advises that trans-boundary research needs to connect AI with subjects like psychology, cognitive science, mathematics, and economics. As far as platform construction goes, open-source computing platforms should promote coordination among different hardware, software and clouds. This will naturally increase the need for AI professionals and scientists, who will need to be prepared for the work.

Europe

The International Business Machines Corporation (IBM) is actively engaged in global discussion about making AI ethical and beneficial. It is working not only internally, but with collaborators and competitors as well.

Because development changes constantly, it is difficult for any regulatory agency to keep up with AI’s progress, which makes meaningful and timely guidance almost impossible. On the other hand, issues like data privacy and ownership have been discussed in the EU. Algorithmic transparency and accountability have also been considered.

In 2018, the General Data Protection Regulation will be rolled out in the EU. It will restrict automated individual decision-making (algorithms making decisions based on user-level predictors) that affects users. This law provides a “right to explanation”: a user can request an explanation of why the algorithm has made a given decision about him or her.

Safety is important, but so are fairness, equality and inclusiveness, which should be built into AI systems. That’s why we need policies and regulations: to ensure AI is being used to the benefit of all. IBM is working with governments, media, regulatory agencies and industry sectors: everyone who is willing to have a reasonable discussion on the ethical issues of AI. The aim is to clearly identify the potential and limits of AI and how to make the best use of it.

Who is it up to?

In the short term, it is up to the policymakers and lawyers. In the near future, government representatives will need technical expertise in AI to justify their decisions. More research is needed on the security, privacy and societal implications of AI use. For example, instead of cross-examining a person, lawyers may need to cross-examine an algorithm.

As with everything technological, there is real uncertainty about how strongly these effects will be felt. Maybe AI won’t have a large effect on the economy. But the other possibility is that the economy experiences a larger shock: changes in the labor market, and employees without relevant work skills who are in desperate need of training. Although no definitive decision can be taken yet and no deadline for setting policies can be fixed, the continued involvement of the government with industry, technical and policy experts will play an important role.

free image recognition with imagga


Announcing V3.0 Tagging With Up to 45% More Classes

We are happy to announce the upcoming update of our Tagging technology! We have been looking to make this change so that you can rely more on your API calls and get more precise results to help you better analyze and classify your visual data.

The updated version will become active on 28th of August and you can expect up to 45% more classes* with overall 15% improved precision rate and 30% better recall.
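For readers less familiar with the two metrics mentioned above, here is a minimal sketch of how precision and recall can be computed for a single image's tags. The tag lists are made up for illustration and the function name is hypothetical; this is not how the published figures were measured.

```python
def precision_recall(predicted_tags, relevant_tags):
    """Compute precision and recall for one image's tags.

    precision: share of predicted tags that are actually relevant
    recall:    share of relevant tags that were actually predicted
    """
    predicted = set(predicted_tags)
    relevant = set(relevant_tags)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example data, not real API output
predicted = ["dragonfly", "insect", "wing", "car"]
relevant = ["dragonfly", "insect", "wing", "nature", "macro"]

precision, recall = precision_recall(predicted, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Higher precision means fewer wrong tags in the result; higher recall means fewer relevant tags are missed.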

NEW vs. OLD Tagging Comparison

Because we believe that you need to see it with your own eyes rather than just hear about it, we have put together this comparison using one of our Demo images. You can, of course, also test it on your own.

In this case you can clearly see 46% more classes and a clear improvement in keyword precision. Prior to the update the most significant keyword was insect, while now it is dragonfly: a small increase in recall but a significant increase in intrinsic value.

Important For Current Users:

Our old tagging version will remain the default one for a period of 14 days after the launch. To upgrade you will need to change your endpoint parameter to version=3. Follow this example if you are uncertain how to do this.

If your current request url looks like this:

http://api.imagga.com/v1/tagging?url=http://pbs.twimg.com/profile_images/687354253371772928/v9LlvG5N.jpg

To make a request to the new version, it must look like this:

http://api.imagga.com/v1/tagging?url=http://pbs.twimg.com/profile_images/687354253371772928/v9LlvG5N.jpg&version=3

After this, for a period of 7 days the new Tagging version will become the default one, but all users will still be able to make requests to the old version by adding the parameter for version=2. At the end of that period there will be only version 3 and you won't be able to add version parameters to your calls.
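If you build request URLs in code, the version switch can be isolated in one small helper. The sketch below uses only Python's standard library and the endpoint and `version` parameter shown in the example URLs above; the helper name `build_tagging_url` is our own, and the snippet only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URLs above
TAGGING_ENDPOINT = "http://api.imagga.com/v1/tagging"


def build_tagging_url(image_url, version=None):
    """Return a tagging request URL, optionally pinned to a tagging version.

    Leave version as None to get whatever the current default is,
    or pass 2 or 3 during the transition period.
    """
    params = {"url": image_url}
    if version is not None:
        params["version"] = version
    # safe=":/" keeps the image URL readable, as in the examples above
    return TAGGING_ENDPOINT + "?" + urlencode(params, safe=":/")


image = "http://pbs.twimg.com/profile_images/687354253371772928/v9LlvG5N.jpg"
print(build_tagging_url(image))             # default version
print(build_tagging_url(image, version=3))  # explicitly request the new tagging
```

Keeping the version in one place makes it easy to remove the parameter once only version 3 remains.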

What is your experience with version 2 and 3? Did you find much of a difference when testing with your dataset?

* Some classes may have changed names.




Artificial Intelligence Becoming Human. Is That Good or Bad?

The idea of artificial intelligence had been driving people’s imaginations wild even before 1955, when the term was coined to describe an emerging computer science discipline. Today the term covers a variety of technologies meant to improve human life, and the list is ever growing. From Alexa and self-driving cars to love robots, your newsfeed is constantly full of AI updates. Your newsfeed is also the product of a (somewhat) well-implemented algorithm. The good news? Just like the rest of the AI technologies, your newsfeed is self-learning and constantly changing, trying to improve your experience. The bad news? Many people know that the most advanced algorithms work, but nobody can really explain why. And that’s where things can go wrong.

The Good AI

The AI market is blooming. The profitable mix of media attention, hype, startups and adoption by enterprises is making sure that AI is a household topic. A Narrative Science survey found that 38% of enterprises are already using AI and Forrester Research predicted that in 2017 the investments in AI will grow by 300% compared with 2016.

But what good can artificial intelligence do today?

Natural language generation

This capability lets AI produce text from data. It is used to generate reports, summarize business intelligence insights and automate customer service.

Speech recognition

Interactive voice response systems and mobile applications rely on AI’s ability to recognize speech. It transcribes human speech and transforms it into a form usable by a computer application.

Image recognition

This has already been used successfully to detect problematic persons at airports, in retail, and elsewhere.

Virtual agents/chatbots

These virtual agents are used in customer service and support and as smart home managers. These chatbot systems and advanced AI can interact with humans. There are also machine learning platforms that provide algorithms, APIs, and development and training data for designing, training and deploying models into applications, processes and other machines.

Decision management for enterprise

Engines that insert rules and logic into AI systems and are used for initial setup/training and ongoing maintenance and tuning? Check. This technology has been used for a while now for decision management by enterprise applications and for assisting automated decision-making. There is also AI-optimized hardware with the power to process graphics, designed to run AI computational jobs.

AI for biometrics

On a more personal level, the use of AI in biometrics enables more natural interactions between humans and machines, relying on image and touch recognition, speech, and body language. By using scripts and other ways to automate human action to support efficient business processes, robots are capable of executing tasks or processes instead of humans.

Fraud detection and security

Natural language processing (NLP) uses and supports text analytics by understanding sentence structure and meaning, sentiment and intent through statistical and machine learning methods. It is currently used in fraud detection and security.

The “Black Box” of AI

At the beginning, AI branched out in two directions: machines should reason according to rules and logic (everything is visible in the code), or machines should take inspiration from biology and learn from observing and experiencing (a program generates an algorithm based on example data). Today machines essentially program themselves based on the latter approach. Since there is no hand-coded system which can be observed and examined, deep learning in particular is a “black box.”

It is crucial to make sure we know when failures in AI occur, because they will. In order to do that, we need to know how techniques like deep learning work, including how they recognize abstract things. In simple systems, recognition is based on physical attributes like outlines and colour; the next level handles more complex things like basic shapes and textures. The top level can combine all the levels and recognize the whole, not just as a sum of its parts.

There is the expectation that these techniques will be used to diagnose diseases, make trading decisions and transform whole industries. But it shouldn’t happen before we manage to make deep learning techniques more understandable, especially to their creators, and accountable for their uses. Otherwise there is no way to predict failures.

Today mathematical models are already being used to find out who is approved for a loan and who gets a job. But deep learning represents a different way to program computers.  “It is a problem that is already relevant, and it’s going to be much more relevant in the future,” says Tommi Jaakkola, a professor at MIT who works on applications of machine learning. “Whether it’s an investment decision, a medical decision, or maybe a military decision, you don’t want to just rely on a ‘black box’ method.”

Starting in the summer of 2018, the European Union will probably require that companies be able to explain decisions made by automated systems. Easy, right? Not really: this task might be impossible if the apps and websites use deep learning, even when it comes to something as simple as recommending products or playing songs. Those services are run by computers which have programmed themselves. Even the engineers who have built them will not be able to fully clarify the way the computers reach their results.

“It might be part of the nature of intelligence that only part of it is exposed to rational explanation. Some of it is just instinctual.”

With the advance of technology, logic and reason might need to step down and leave some room for faith. Just like human reasoning and logic, we can’t always explain why we’ve taken a decision. However, this is the first time we are dealing with machines, which are not understandable by even the people who engineered them. How will this influence our relationship with technology? A hand-coded system is pretty straightforward, but any machine-learning technology is way more convoluted. Yes, not all AI tech will be this difficult to understand, but deep learning is a black box by design.

AI works a bit like a biological neural network and its center, the brain: you can’t look inside it to find out how it works, because a network’s reasoning is embedded in the behaviour of thousands of simulated neurons. These neurons are arranged into dozens or even hundreds of intricately interconnected layers. The first layer receives input and then performs calculations before giving a new signal as output. The results are fed to neurons in the next layer, and so on.
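To make the layer-by-layer flow concrete, here is a toy sketch in plain Python. The weights and biases are arbitrary numbers chosen for illustration (a real network learns them from data), and the sigmoid activation is just one common choice, so treat this as a picture of the mechanics rather than a working model.

```python
import math


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One layer: every neuron computes a weighted sum of the inputs,
    adds its bias, and passes the result through the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the signal through each layer in turn; the output of one
    layer becomes the input of the next."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


# Hypothetical, hand-picked parameters: 2 inputs -> 3 neurons -> 1 neuron
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]

output = network_forward([1.0, 0.5], layers)
print(output)
```

Even in this tiny example, the final number is the product of every weight along the way, which hints at why a network with millions of weights is hard to explain.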

Because there are many layers in a deep network, they are able to recognize things at different levels of abstraction. If you want to build an app, let’s say “Not a HotDog” (“Silicon Valley,” anyone?), you need to know what a hot dog looks like. The lower layers of such a system might recognize hot dogs based on simple things like outlines or color. Higher layers will recognize more complex things like texture and details like condiments.

But just as many aspects of human behavior can’t be explained in detail, it might be the case that we won’t be able to explain everything AI does.  “Even if somebody can give you a reasonable-sounding explanation [for his or her actions], it probably is incomplete, and the same could very well be true for AI,” says Clune, of the University of Wyoming. “It might just be part of the nature of intelligence that only part of it is exposed to rational explanation. Some of it is just instinctual, or subconscious, or inscrutable.”

Just as civilizations have been built on a contract of expected behaviour, we might need to design AI systems to respect and fit into our social norms. Whatever robot or system we create, it is important that its decision-making is consistent with our ethical judgements.

The AI Future

Participants in a recent survey were asked about the most worrying notion about AI. The results were as expected: participants were most worried by the notion of a robot that would cause them physical harm. Naturally, machines with close physical contact like self-driving cars and home managers were viewed as risky. However, when it comes to statistics, languages and personal assistants, people are more than willing to use AI in everyday tasks. The many potential social and economic benefits from the technology depend on the environment in which they evolve, says the Royal Society.

A robot animated by AI is known as “embodiment.” Thus applications that involved embodiment were viewed as risky. As data scientist Cathy O’Neil has written, algorithms are dangerous if they possess scale, their workings are secret and their effects are destructive. Alison Powell, an assistant professor at the London School of Economics, believes that this mismatch between perceived and potential risk is common with new technologies. “This is part of the overall problem of the communication of technological promise: new technologies are so often positioned as ‘personal’ that perception of systematic risk is impeded.”

Philosophers, computer scientists and techies make the distinction between “soft” and “hard” AI. The main difference? Hard AI’s main goal is to mimic the human mind. As MIT lecturer Irving Wladawsky-Berger explained in the Wall Street Journal, soft AI’s main purpose is to be statistically oriented and to use computational intelligence methods to address complex problems based on the analysis of vast amounts of information using sophisticated algorithms. For most of us soft AI is already part of our daily routine: from the GPS to ordering food online. According to Wladawsky-Berger, hard AI is “a kind of artificial general intelligence that can successfully match or exceed human intelligence in cognitive tasks such as reasoning, planning, learning, vision and natural language conversations on any subject.”

AI is already used to build devices that cheat and deceive or that outsmart human hackers. It is quickly learning from our behavior, and people are building robots that are so humanlike they might be our lovers. AI is also learning right from wrong. Mark Riedl and Brent Harrison from the School of Interactive Computing at the Georgia Institute of Technology are leading a team that is trying to instill human ethics in AIs by using stories. Just as in real life we teach human values to children by reading them stories, AI learns to distinguish wrong from right, bad from good.



7 Image Recognition Uses of the Future

Did you know that image recognition is one of the main technologies driving the rapid development of self-driving cars?

Image identification powered by innovative machine learning has already been embedded in a number of fields with impressive success. It is used for automated image organization of large databases and visual websites, as well as face and photo recognition on social networks such as Facebook. Image recognition makes image classification for stock websites easier, and even fuels marketers’ creativity by enabling them to craft interactive brand campaigns.  

Beyond the common uses of image recognition we have gotten accustomed to, the revolutionizing technology goes far beyond our imagination. Here are seven daring applications of computer vision that might as well belong in a science fiction novel - but are getting very close to reality today.

#1. Creating city guides

Can you imagine choosing your next travel destination on the basis of real-time location information from Instagram photos that other tourists have posted? Well, it’s already out there. Jetpac created its virtual “city guides” back in 2013 by using shared visuals from Instagram.

By employing image recognition, Jetpac caught visual cues in the photos and analyzed them to offer live data to its users. For example, on the basis of images, the app could tell you whether a cafe in Berlin is frequented by hipsters, or it’s a wild country bar. This way, users receive local customized recommendations at-a-glance.  

In August 2014, Jetpac was acquired by Google, joining the company’s Knowledge team. Its knowhow is said to be helping Google’s development of visual search and Google Glass, the ‘ubiquitous computer’ trial of the tech giant.

#2. Powering self-driving cars

In recent years, self-driving cars have been the buzz in the auto industry and the tech world alike. Autonomous vehicles are already being actively tested on U.S. roads as we speak. Forty-four companies are currently working on different versions of self-driving vehicles. Computer vision is one of the main technologies that makes these advancements possible, and is fueling their rapid development and enhanced safety features.

To enable autonomous driving, artificial intelligence is being taught to recognize various objects on roads. They include pathways, moving objects, vehicles, and people. Image recognition technology can also predict speed, location and behavior of other objects in motion. AI companies such as AImotive are also instructing their software to adapt to different driving styles and conditions. Researchers are close to creating AI for self-driving cars that can even see in the dark.

https://www.youtube.com/watch?v=sIlCR4eG8_o

#3. Boosting augmented reality applications and gaming

Augmented reality experiments have long tantalized people’s imagination. With image recognition, transposition of digital information on top of what we see in the world is no longer a futuristic dream. Unlike virtual reality, augmented reality does not replace our environment with a digital one. It simply adds some great perks to it.

You can see the most common applications of augmented reality in gaming. A number of new games use image recognition to complement their products with an extra flair that makes the gaming experience more immediate and ‘real.’ With neural networks training, developers can also create more realistic game environments and characters.

Image recognition has also been used in powering other augmented reality applications, such as crowd behavior monitoring by CrowdOptic and augmented reality advertising by Blippar.

#4. Organizing one’s visual memory

Here’s a very practical application of image recognition - making mental notes through visuals. Who wouldn’t like to get this extra skill?

The app Deja Vu, for example, helps users organize their visual memory. When you take a photo, its computer vision technology matches the visual with background information about the objects in it. This means you can instantly get data about books, DVDs, and wine bottles just by taking a photo of their covers or labels. Once the photos are in your database, you can search through them on the basis of location and keywords.

#5. Teaching machines to see

Besides the impressive number of consumer uses that image recognition has, it is already employed in important manufacturing and industrial processes. Teaching machines to recognize visuals, analyze them, and take decisions on the basis of the visual input holds stunning potential for production across the globe.

Image recognition can make possible the creation of machines that automatically detect defects in manufacturing pipelines. Besides already known faults, the AI-powered systems could also recognize previously unknown defects because of their ability to learn.

There is a myriad of potential uses of teaching machines to perceive our visual world. For example, Xerox scientists are applying deep learning techniques to enable their AI software to mimic the attention patterns of the human brain when seeing a photo or a video.

#6. Empowering educators and students

Another inspiring use of image recognition that is already being put into practice is again tightly connected with education - but this time, with improving the education of people.

Image recognition is embedded in technologies that enable students with learning disabilities to receive the education they need - in a form they can perceive. Apps powered by computer vision offer text-to-speech options, which allow students with impaired vision or dyslexia to ‘read’ the content.

Applications of image recognition in education are not limited to special students’ needs. The technology is used in a range of tools that push the boundaries of traditional teaching. For example, the app Anatomy3D allows discovery of the interconnectedness between organs and muscles in the human body through scanning of a body part. It revolutionizes the way students can explore anatomy and learn about the way our bodies function. Image recognition uses can also help educators find innovative ways to reach ever more distracted students, who are not receptive to current methods of teaching.

#7. Improving iris recognition

Iris recognition is a widely used method for biometric identification. Its most common application is in border security checks, where a person’s identity is verified by scanning their iris. The identification is conducted by analyzing the unique patterns in the colored part of the eye.

Even though iris recognition has been around for a while, in some cases it is not as precise as it’s expected to be. The advancement of image recognition, however, is bringing new possibilities for iris recognition use across industries with improved accuracy and new applications. Most notably, iris identification is already being used in some consumer devices. The smartphones Samsung Galaxy Note7 and Galaxy S8, and Microsoft Lumia 950 are among the ones already equipped with such a capability.

While recognition is becoming more precise, security concerns over biometrics identification remain, as recently hackers broke the iris recognition of Samsung Galaxy S8. Together with the advancement of computer vision, security measures are also bound to improve to match the new technological opportunities.    

Have you seen AI technology in a movie that you later saw in real life? Share it with the rest of the group, and if enough people like it we can build it together.

The uses of image recognition of the future are practically limitless - they’re only bound by human imagination. What is the practical application of computer vision that you find the most exciting or useful? We’d love to read about it in the comments below.



How AI Changes Real Estate For Sale Listing: Skyrocket the Role of Real Estate Agents

Nowadays, real estate agents offer a ton of options to help their visitors make the right property decision. Listings can have filters like neighborhood medians and cost calculators, but what they fail to do is make their properties more searchable. This is where artificial intelligence can both boost traffic and guide users through listings.

First, let's define AI: it is the ability of machines to solve problems through learning over time. It allows machines to make logical decisions similar to a human's. Why is this useful? Because a human brain can process only a limited amount of data, while a machine is limited only by processing power. Thus, if I am facing a purchasing dilemma, then artificial intelligence can help me. Let's look at 5 uses of AI that can influence the real estate industry:

 

Artificial Intelligence Search

AI allows more specific search criteria rather than the traditional approach: location, zip code, area, bedrooms, bathrooms etc. With AI you can search based on criteria such as return on investment and forecasted property value. This aspect of AI-generated query allows real estate agents to have a much later involvement in the decision making process and brings in more qualified customers.

How many factors can you elicit with AI?

According to Google Rank Brain, there are more than 10,000 A.I. criteria that can be applied in searches, and they can significantly change a traditional real estate website. Here is an example of a website using some AI in its sidebar search criteria:

AI Powered Chat Bots

Probably the most mature technology on this list, AI chat bots have been in development for a while, but it is only recently that they have reached a wider audience. They can perform simple tasks like booking an appointment, giving you detailed property information, verifying contact details and even giving insights about which properties will be available next. Naturally, this prediction requires comprehensive knowledge of the market as well as a lot of data, but as chat bots find more and more applications in consumer-related industries, this process becomes more accurate. As this technology continues to develop, users will be much more inclined to purchase once they reach an agent, meaning the sales funnel will become a much more automated process. By 2020, A.I. chat bots are expected to handle most of the education phase in marketing departments. One of the best chat bots on the market was made by Lokai to raise awareness about Ethiopian women.

Maybe this is not enough to understand the power of chat bots. Watch this video to see natural language processing and artificial intelligence holding a conversation with a potential buyer:

 

Image Recognition Recommendation

Agencies can recommend properties based on what buyers have already looked at. This can be achieved by taking a picture when you visit a property and uploading it into Homesnap. The app picks up the area, the type of house and your budget based on the property to give you an accurate listing, which can be sent automatically in a newsletter or a simple notification email. As a result, the buying process speeds up, because there is less need for a broker to search through properties and send you relevant listings. That leads us to think that in the future, most of the jobs in a real estate agency will be automated. This brings us to our next point.

 

Automating Jobs

According to an Oxford University study, AI will have a significant impact on automating most of the jobs in a real estate agency. The study estimates which positions will be replaced by AI; if its estimations are correct, by 2020 those are:

Real Estate Association Managers: 81%

Real Estate Sales Agents: 86%

Appraisers and Assessors of Real Estate: 90%

Real Estate Brokers: 97%

If you are thinking that automating the entire real estate brokerage experience will make the purchase of a home an emotionless experience, then you might be right. At the end of the day, people want to make the best purchasing decision, and according to a study by Inman, this point has already been reached. In their study, clients preferred a bot-generated listing. Furthermore, they had a hard time telling which listings were human-generated and which were bot-generated.

How can bots suggest listings?

By using a random forest approach, AI can use previous searches, preferences, other users' search patterns and many more criteria to give a much more accurate result and improve over time.

“What if semantic technologies combined with cognitive computing and natural language processing help agents do their job better? There is a way not to be left out.”

These capabilities seem to indicate a radical change in the industry. Despite the rigidity of the technology and the seeming difficulty clients have in comprehending that they are communicating with a bot, accurate listings are justifying the adoption. As artificial intelligence becomes more accurate with time, the user experience will improve as well. The role of the broker will be diminished but will not become redundant, since purchasing a home is a very emotional experience and there will always be the need for a broker to give that extra push.



What's In a Color? The Basics About Image Recognition Color Extraction

Image recognition is bringing revolutionary changes to the ways in which we consume and process information online. Deeply integrated into web pages and apps, it allows us to make sense of visual data in small and large quantities alike as we’ve never been able to do before.

The applications of image recognition are diverse and empowering. Color extraction is one of the most significant and game-changing capabilities offered by computer vision. The possibility to identify and analyze the colors in images gives numerous possibilities to businesses to better use their visual libraries, monetize them, and even increase sales of in-store products.  

How does color extraction through image recognition work? The color API enables analysis of visuals in terms of the colors they contain. It determines the five most prominent colors that are present in an image. Then they can be exported as hex code, RGB triple, specific color name, and parent color name. This makes them easy to use for, say, keyword tagging and categorization.
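To give a feel for what "most prominent colors" means, here is a minimal sketch that counts coarse color buckets in a list of RGB pixels and reports the top five as RGB triples and hex codes. It is a simplified stand-in for illustration, not the method the color API uses; the bucket size, function names and pixel data are all assumptions.

```python
from collections import Counter


def rgb_to_hex(rgb):
    """Format an (r, g, b) triple as a hex code such as '#f01010'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def dominant_colors(pixels, top_n=5, bucket=32):
    """Return up to top_n most frequent colors after coarse quantization.

    Each channel is snapped to the center of a bucket so that nearly
    identical shades are counted as one color.
    """
    def quantize(channel):
        return min(255, (channel // bucket) * bucket + bucket // 2)

    counts = Counter(tuple(quantize(c) for c in px) for px in pixels)
    total = sum(counts.values())
    return [
        {"rgb": rgb, "hex": rgb_to_hex(rgb), "share": n / total}
        for rgb, n in counts.most_common(top_n)
    ]


# Hypothetical pixel data: mostly red, some blue, a little white
pixels = [(250, 10, 10)] * 6 + [(10, 10, 240)] * 3 + [(255, 255, 255)]

for color in dominant_colors(pixels):
    print(color)
```

In practice the pixels would come from an image library, and a production system would likely use a perceptual color space and clustering rather than fixed buckets.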

Let’s delve into the capabilities of color extraction and how you can put it to use for your business.

What does color extraction offer?

The color extraction technology enabled by image recognition has diverse business and user applications. But how does it make the online experience better?

Color extraction from images allows for keyword tagging of visuals by color. This makes it possible to easily navigate large databases containing visuals. As color differentiation is essential for categorizing images, it allows for searching and browsing based on color tagging.

Keyword tagging for color extraction is done with an API which you integrate into your project. Try it.

Multi-color search is a typical part of color extraction technologies as well. With it, you can run more complex color searches. This means you can identify complex objects that contain more than one dominant color. It also enables multi-color filtering of image searches in databases and on websites, typically through a color palette feature.
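Multi-color filtering can be sketched in a few lines once images carry color tags. The example below assumes a hypothetical catalog that maps an image ID to the color names already extracted for it; the function name and the data are illustrative only.

```python
def multi_color_search(catalog, wanted_colors):
    """Return IDs of images whose color tags include every wanted color.

    catalog:       dict mapping image ID -> iterable of color names
                   (hypothetical structure for this sketch).
    wanted_colors: iterable of color names to match, case-insensitive.
    """
    wanted = {color.lower() for color in wanted_colors}
    return [
        image_id
        for image_id, colors in catalog.items()
        # Keep the image only if all wanted colors are among its tags.
        if wanted <= {color.lower() for color in colors}
    ]


# Hypothetical tagged catalog
catalog = {
    "dress-01": ["Red", "White"],
    "scarf-07": ["Red", "Navy", "White"],
    "boots-12": ["Black"],
}
matches = multi_color_search(catalog, ["red", "navy"])
```

Here `matches` contains only the scarf, the single item tagged with both requested colors.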

With powerful color extraction APIs, you can also identify the colors in the foreground and background of an image. In this way, you can remove the background if needed, or unnecessary elements from the foreground. This allows for more flexibility, so you can focus only on the objects in the image, or on the setting behind them.

How can you use color extraction in your business?

The possibilities that color extraction presents are fascinating, but the best part is that they can boost user experience and product visibility for your business.

Let’s consider how an e-commerce website selling clothes can benefit from color extraction. The color API can analyze the photos of all garments and provide the five predominant colors for each item. The color keywords are then attributed to the product.

When a buyer is searching in the online store for, say, rocker jeans in black, they can just filter the products on the website by the color of their preference. With Imagga’s color API, the user can even type in the exact name of the color they’re looking for. This is especially useful for color blind people, as the color extraction would allow for differentiation of shades and nuances that they would not be able to make otherwise.
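The step from an extracted color to a searchable keyword can be illustrated as a nearest-match lookup. The sketch below assumes a tiny, made-up palette in which each specific color name has a parent color, and it picks the closest entry by squared distance in RGB space. A real service would use a far larger palette and possibly a perceptual color space.

```python
# Hypothetical palette: specific name -> (RGB triplet, parent color name)
PALETTE = {
    "crimson": ((220, 20, 60), "red"),
    "navy": ((0, 0, 128), "blue"),
    "jet black": ((10, 10, 10), "black"),
    "ivory": ((255, 255, 240), "white"),
    "forest green": ((34, 139, 34), "green"),
}


def nearest_color_name(rgb):
    """Return (specific_name, parent_name) for the closest palette entry."""
    def squared_distance(name):
        reference, _parent = PALETTE[name]
        return sum((a - b) ** 2 for a, b in zip(rgb, reference))

    best = min(PALETTE, key=squared_distance)
    return best, PALETTE[best][1]


def color_keywords(extracted_rgbs):
    """Turn a product's extracted colors into de-duplicated search keywords."""
    keywords = []
    for rgb in extracted_rgbs:
        for word in nearest_color_name(rgb):
            if word not in keywords:
                keywords.append(word)
    return keywords
```

A store could attach the resulting keywords to each product, so that a filter for "black" or a typed search for "jet black" both find the same pair of jeans.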

Take virtual wedding planners as another example of the commercial uses of color extraction. By using a color API, they can offer automated color analysis for couples who want to decide on their wedding color palette. The client would upload a photo showing their color preferences. On the basis of its analysis, the color extraction tool would suggest similar color combinations.

Another great use of color extraction is suited for image-based platforms such as Pinterest. If multi-color search is integrated with fashion and design inspiration websites and apps, this would allow users to conduct a color search of immensely large visual databases. They would be able to create groups of images and albums categorized by colors. Besides significantly improving the user experience, this feature can also be monetized by businesses. The color search and categorization can be used by a wide variety of platforms such as design, photography, painting, interior design, and more.

Learn how you can integrate color extraction with ease

Integrating color extraction in your website or app doesn’t need to be complicated. Imagga’s color extraction API is offered as a service. You don’t have to install anything. You just send HTTP requests to our servers in the cloud and get thousands of images processed in a matter of hours.
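As a rough illustration of what such a request might look like, the sketch below builds (but does not send) an authenticated HTTP GET request using only the Python standard library. The endpoint URL, the `image_url` parameter, and the use of HTTP Basic authentication with an API key and secret are assumptions here and should be checked against the current API documentation.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; confirm against the official API documentation.
API_ENDPOINT = "https://api.imagga.com/v2/colors"


def build_colors_request(image_url, api_key, api_secret):
    """Build a GET request asking for the colors of a publicly hosted image."""
    # The image location is passed as a URL-encoded query parameter.
    query = urlencode({"image_url": image_url})
    # HTTP Basic auth: base64 of "key:secret" (assumed auth scheme).
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    return Request(
        f"{API_ENDPOINT}?{query}",
        headers={"Authorization": f"Basic {token}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; the shape of the JSON response is not assumed here.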

What are your top examples of using online color extraction? We’d love to hear about your creative approach in the comments below.



Image Recognition Is Changing Interactive Marketing

As a creative marketer, you’re likely on the lookout for innovative ways to reach your audience at all times. There’s a new kid on the block that just might revolutionize interactive marketing, and it’s called image recognition.

While visual recognition enabled by machine learning has been around for a while, its potential to boost marketers’ efforts is only now being discovered. Unlike text, which is easily searchable, visual information online has remained a ‘black box.’ But consumers today share an enormous amount of visual data, which marketers struggle to understand.

How can image recognition help you as an interactive marketer?

Image recognition can help brands make sense of the data contained in the ‘visual web.’ Visual listening through machine learning arms marketers with a powerful tool that they’re seeking relentlessly: relevance of content for their audiences.

Because of the technical capabilities it presents, computer recognition of images allows for creative forms of visual storytelling across media too. This offers powerful methods for engaging people and immersing them in new types of interaction with brands. Through effortless profiling based on the social visual content people share online rather than extensive questionnaires, image recognition opens new doors for personalization.

Here are three top ways in which image recognition is bringing unprecedented advances to interactive marketing - and how your creative campaigns can hop on the new tech wave.

Visual listening is on the rise for brands

Until recently, a textual understanding of social media and the content shared there by users was the status quo. But with billions of images created and distributed online daily, brands now realize that social listening focusing on text only is ineffective and simply outdated. The rise of visual social networks like Pinterest, Instagram, Snapchat and Tumblr, as well as the wide use of visuals on Facebook, Twitter and LinkedIn, further pushes this understanding.

Marketers need to analyze enormous quantities of visual material to grasp how people use the currency of ‘visual’ online. While some of them have tried to make sense of images via metadata, this still has not shed enough light on the vast spaces of the visual web.

Speedpost used Imagga Tagging and Object Colour Extraction API to match 36 lifestyles of its prospective customers in the New KIA K5i.

Enter visual listening. This new method for understanding photos and graphics online is enabled by image recognition AI. It enables brands to mine visual content that their audiences are sharing and engaging with. In this way, they can identify patterns, analyze trends and gain valuable business insights. Marketers can use this data to track how visual posts spread online, what type of visuals gain the most attention, and who is engaging with their visual content. And that’s only the tip of the iceberg.

Simply put, visual listening allows marketers to understand how people consume and create visuals in relation to brands. It provides them with a way to read consumers’ emotions and reactions from the visual data they share. It is also a tool for discovering and benefiting from user-generated content that promotes the brand, and for identifying influencers who can be instrumental in their marketing efforts.

For brand protection, image recognition is a handy tool for monitoring how copyrighted visual material is handled online. Brands can also gain insights into their competitors’ visual presence, which can inform their own campaigns.

With image recognition, personalization can flourish

The applications of image recognition for marketing are not limited to visual listening. In fact, it can fuel powerful personalization, so brands can better reach and engage their audiences.

Creative marketing campaigns based on image recognition analysis can be highly targeted and thus can make a real impact. By gathering insights from visuals, marketers can learn more about the preferences of different target groups. Based on this, they can craft content that engages people better because it’s relevant and personalized. Instead of relying on tedious questionnaires that people avoid, image recognition allows brands to learn more about people’s preferences without asking questions. This is done by analyzing the visual content that users have shared online. It empowers brands to tell engaging stories at the right points of contact with people. KIA Motors created an interactive campaign for 36 lifestyles to match its new KIA K5 (Optima).

How is image recognition used in the financial industry?

The Commonwealth Bank of Australia equipped its mobile app with image recognition software that allows people looking for a new house to take a photo of their dream home. By analyzing the picture, the app provides them with information on prices, taxes and other details. Furthermore, the app analyzes their personal financial data to inform them of mortgage options for purchasing the house.

How image recognition created interactive in-store campaigns

Image recognition has a place in the direct shopping experience of consumers by fueling contextual marketing and template matching. Some stores and shopping apps have already integrated image recognition capabilities. People can take a photo of an item they’d like to purchase. Then they receive information about it, locations where they can get it, or even a mobile-optimized page to order it directly.  

Big brands like Macy’s and Neiman Marcus have embedded image recognition in dedicated apps. Customers take photos of clothes and accessories they like, and the app automatically suggests items from the brands’ inventory. Other online and brick-and-mortar retailers have also adopted image recognition to engage customers and make their shopping seamless.  

A well-known example of image recognition in a consumer app is Vivino. People snap a photo of a wine label to get ratings and details about the brand and variety. The app has turned into a popular helper for making an informed wine choice.

How is image recognition used for access control?

Another cool example of image recognition in an experimental creative campaign is Imagga’s Hipster Bar hack. During the WdW Festival in 2015, the Rotterdam-based artist Max Dovey hosted an installation called the Hipster Bar. Using Imagga’s image recognition, the camera at the door of the bar would take a photo of the person who wanted to enter. Then it would compare the photo against a database of photos of hipsters. If the person matched the style, they’d be able to enter the bar. The concept behind this art project can be applied to a number of marketing purposes. It is based on custom training of the AI by providing it with relevant visual data.

As these three prominent applications of image recognition illustrate, it holds impressive potential for a number of interactive marketing initiatives. Analysis of previously untapped visual data can inform marketers and help them craft powerful campaigns and on-spot content.

Now that you know what image recognition can do for your marketing, would you like to give it a try?
