Create Autotagging Uploads with NodeJS using Image Recognition API

Applications these days are visual. There’s no denying it. Applications these days are also social. Combine those two and, inevitably, you are going to need to enable your users to upload images and share them, either on their own or as part of larger creations.
With the sharing of images comes the desire to organize and tag them. Tagging images comes with a litany of benefits, including:

  • Allowing other users to search for specific categories of images
  • Automatically labeling images for screen-readers and other accessibility tools
  • Easy sorting and organization of images into folders or trees

What we’re building today is a widget that you could easily embed within your own applications. It will take an image selected by the user, upload it to the Imagga servers and tag it, then automatically recommend the top tags for use within your app. The user will still be able to edit and customize the tags for an image, though their selections will be limited to the larger list suggested by Imagga to prevent spamming.

In your own development, you might implement a widget like this as part of a social networking platform, such as a forum or image sharing service. Anytime users are able to upload and categorize their own images, it’s important to both streamline that process for them as well as put restrictions around those categorizations so that they are accurate and not misleading to your other users.

Source code

If you’re merely interested in an overview of how an auto-tagging system like this can be built with Imagga, feel free to skip to the next section. If you’d like to follow along more closely, however, you can download the code from the sidebar to the right and use git tags to match the code with each stage. For example, if the stage is index-and-api-routes, just type git checkout index-and-api-routes into your terminal to see the code at that stage. If you get stuck or make changes, you can always use git reset HEAD --hard to return to the base code for that stage or git checkout master to return to the completed code.

If you choose to also build and run the code yourself, you will need to get an Imagga API key and secret and add them to a .env file at the root of the code, like so:


You’ll also need to run an npm install within the repo to get the dependencies we need to run the application.

How does the tagging API work

Git tag: index-and-api-routes

Before we get started, we need to understand how to use the Imagga tagging API. The API takes a GET request with a specified image, and then returns an array of tags, sorted by confidence. For example, if we ask it to analyze this gorgeous mountain vista:


We end up with a list of tags looking like this:

    {
        "result": {
            "tags": [
                {
                    "confidence": 76.4135513305664,
                    "tag": {
                        "en": "mountain"
                    }
                },
                {
                    "confidence": 69.7975997924805,
                    "tag": {
                        "en": "highland"
                    }
                },
                {
                    "confidence": 54.8374099731445,
                    "tag": {
                        "en": "mountains"
                    }
                },
                {
                    "confidence": 54.5085144042969,
                    "tag": {
                        "en": "landscape"
                    }
                },
                {
                    "confidence": 38.0158271789551,
                    "tag": {
                        "en": "sky"
                    }
                }
                /* ... More tags here ... */
            ]
        },
        "status": {
            "text": "",
            "type": "success"
        }
    }

The tagging API accepts one of two methods for identifying the image you want tagged: image_url and image_upload_id. If we use image_url, we can point the API to any image hosted on a publicly accessible web address and tag it, which is exactly what we did for the mountain image above. This is most helpful if the user is attaching an image already uploaded onto the internet somewhere.
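
As a rough illustration of the image_url approach, the sketch below assembles a GET request for a publicly hosted image. The endpoint https://api.imagga.com/v2/tags and the use of HTTP Basic authentication with the API key and secret are assumptions taken from Imagga's public v2 documentation rather than from this article, and buildTagRequest is a hypothetical helper name, so check the current docs before relying on either.

```javascript
// Assumed endpoint (not stated in this article); verify against Imagga's docs.
const TAGS_ENDPOINT = 'https://api.imagga.com/v2/tags';

// Hypothetical helper: builds the URL and headers for a GET tagging request.
function buildTagRequest(apiKey, apiSecret, imageUrl) {
  const url = new URL(TAGS_ENDPOINT);
  // image_url points the API at any publicly accessible image
  url.searchParams.set('image_url', imageUrl);

  // HTTP Basic auth header: base64 of "key:secret" (assumed auth scheme)
  const credentials = Buffer.from(apiKey + ':' + apiSecret).toString('base64');

  return {
    url: url.toString(),
    headers: { Authorization: 'Basic ' + credentials }
  };
}

// Build (but do not send) a request for a placeholder image URL
const request = buildTagRequest('my-key', 'my-secret', 'https://example.com/mountain.jpg');
console.log(request.url);
```

The returned object can be handed to whichever HTTP client the project already uses. Keeping request construction separate from sending makes it easy to check without touching the network.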

For most applications, though, we want to allow the user to upload their own images, in which case we need to first understand another Imagga API, the uploads endpoint.

Take a look at a simple upload

Git Tag: upload-and-tag

There are two ways to handle tagging uploads with Imagga. In v1 of the API, it was a two-step process, requiring us first to upload the image, then use the returned upload_id to retrieve the tags. With v2, we now recommend you upload and tag the image all in one request, using the POST method to our tagging API. As such, that’s the method we’ll use here, but if you’re still using v1 and want to see how the two-step process might work, check out the git tag upload-and-tag-v1 for an example.

Important note: However you handle uploads, it’s important to know that Imagga does not permanently store the images you upload. For security purposes, they only remain on the Imagga server for 24 hours, so if you’re using the two-step approach you’ll want to ensure you download and store the tags for your uploads within that time period. If you need to have the images removed immediately, you can use the upload_id and a DELETE call to the API as well.

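For our sake, however, let’s just take a look at the simplified one-step approach. Our frontend is pretty generic (our demo uses FilePond for simplicity, but any file upload script will work), so let’s focus on the Node.JS backend:

router.post('/tag', function (req, res) {
  // Get the image field from the POST request
  let image = req.files && req.files.image;
  if (!image) {
    // 400 Bad Request: the client did not send an image
    res.writeHead(400, {'Content-type': 'text/javascript'});
    res.end(JSON.stringify({'status': 'failed', 'error': 'no image specified'}));
    return;
  }

  // NOTE: tagUpload is a placeholder name; the real helper lives in
  // api-request.js in the repo (git tag upload-and-tag)
  apiRequest.tagUpload(
    {
      image: image.data // Pass pure image buffer to API
    },
    function (tags) {
      let data = {
        tags: tags
      };
      res.writeHead(200, {'Content-type': 'text/javascript'});
      res.end(JSON.stringify(data));
    },
    function (err) {
      console.warn('Error getting tags', err);
      res.writeHead(500, {'Content-type': 'text/javascript'});
      res.end(JSON.stringify({'status': 'failed', 'error': 'could not retrieve tags'}));
    }
  );
});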

As a reminder, for the full file context, check out the repo and use the tag upload-and-tag.

So let’s walk through the code. First, we have an Express router that takes a POST request to /tag with an image file upload. Our middleware (express-fileupload) has made our files easily accessible on the req.files hash, so all we have to do is grab that data and pass it along to our tags handler. For cleanliness, we’ve abstracted the specifics of the HTTP requests to a separate file in api-request.js, but you can see the flow here.

Then, once our request returns with the tags, we simply pass that back on out to the frontend. As we noted above, the response format is exactly the same, so our frontend handler can be exactly the same regardless of whether this is a direct upload or an image url.

Connecting to an auto-tag dropdown

Git tag: preview-and-autotag

Having a raw display of tags is helpful for debugging and development, but it is hardly the kind of user experience we want for our actual applications, so let’s take a look at connecting these tags to something more user-friendly.

For this, we’re going to utilize a tagging library called tagify which turns our text input into a tag field. The flow will be that the user uploads an image, the API passes that on to Imagga and retrieves the tags, and then our widget will allow the user to accept or change the suggested tags. Any tag marked with 60% confidence or higher will be auto-suggested (or the top 3 tags, whichever is greater), and the user will be allowed to add more, but only from the list that Imagga returns. This ensures that while the user can correct or edit the tagging, they cannot add any completely inaccurate tags that might mislead other users on our site.

Our code for the API remains the same as before, so let’s take a look at the frontend code that handles the response:

var tagify
  , imgFile;
var showResults = function (tags) {
  var container = document.getElementById('output')
    , preview = document.getElementById('preview')
    , tagEl = document.getElementById('tags-json')
    , tagInput = document.getElementById('tags')
    , tagList = getTagList(tags);

  // Render the raw JSON response for debugging
  tagEl.innerHTML = JSON.stringify(tags, null, 2);

  if (tagify) {
    // Have already initialized tagify, just update with new tags
    tagify.settings.whitelist = tagList.all;
  } else {
    // Initialize new tagify and set whitelist
    tagify = new Tagify(tagInput, {
      whitelist: tagList.all,
      enforceWhitelist: true
    });
  }

  // Pre-fill the input with the auto-suggested tags (reconstructed step:
  // this part of the listing was lost; see the repo for the exact code)
  tagify.removeAllTags();
  tagify.addTags(tagList.auto);

  if (imgFile) {
    // Preview the image the user uploaded
    preview.src = URL.createObjectURL(imgFile.file);
  }
};

var threshold = 60;
var getTagList = function (tags) {
  var list = {
    auto: [],
    all: []
  };
  for (var i = 0, ii = tags.length; i < ii; i++) {
    var tag = tags[i]
      , t = tag.tag.en;

    // Add first three tags to 'auto-suggest' array, along with any
    // others over confidence threshold
    if (list.auto.length < 3 || tag.confidence > threshold) {
      list.auto.push(t);
    }
    // Every tag goes into the full list used for the suggestions dropdown
    list.all.push(t);
  }
  return list;
};

Let’s walk through this from the top. showResults receives the tags from the Imagga API and immediately outputs the raw results into our pre tag for easy viewing. Again, we obviously wouldn’t include this in a final application, but it can be helpful during development, especially when we want to test our auto-suggest dropdown here shortly.

Along with outputting the raw results, our next task before creating the auto-suggest input is to convert the list from Imagga into something our tagging plugin can understand. For this, we use getTagList. There we loop through all the suggested tags, creating two arrays: an auto array with the first three tags along with any others that are above the confidence threshold (60% in this case) and an all array which we’ll use for the suggestions dropdown.

From there, we simply pass those along to tagify, either updating an existing input (if this isn’t the first image we’ve uploaded) or creating a new one. Finally, for a better user experience, we render the image the user uploaded in the preview element so they can be assured everything was processed correctly.

Go ahead and upload the mountain image we used before (you can download it here), and take a look at how this all comes together. Once it’s uploaded, you should see it previewed on the right, along with three recommended tags: mountain, highland, and landscape.

Now try adding some tags that are included farther down in the suggestions, such as “travel” or “peaceful.” Both of these should auto-complete as you type; just hit tab to add them to the list. Remember, though, we want to prevent users from adding tags that are completely inaccurate, so now try typing something not in the list, such as “ocean.” You’ll see that not only is it missing from the suggestion dropdown, but if you hit enter to try to add it, it will automatically be removed.
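
Tagify's enforceWhitelist setting only guards the browser, so a backend that stores the chosen tags may want to repeat the check. The following is a minimal sketch of that idea and is not part of the demo repo: filterToWhitelist is a hypothetical helper that keeps only the submitted tags that appear in the list Imagga returned for the image.

```javascript
// Hypothetical helper (not in the demo repo): keep only tags Imagga suggested.
function filterToWhitelist(submittedTags, allowedTags) {
  // Normalize case and whitespace so "Travel " matches "travel"
  const allowed = new Set(allowedTags.map((t) => t.trim().toLowerCase()));
  const seen = new Set();
  const result = [];

  submittedTags.forEach((tag) => {
    const normalized = String(tag).trim().toLowerCase();
    // Drop anything that is off-list, and drop duplicates
    if (allowed.has(normalized) && !seen.has(normalized)) {
      seen.add(normalized);
      result.push(normalized);
    }
  });
  return result;
}

const suggested = ['mountain', 'highland', 'landscape', 'travel', 'peaceful'];
console.log(filterToWhitelist(['Travel', 'ocean', 'peaceful', 'travel'], suggested));
// → [ 'travel', 'peaceful' ]
```

Here “ocean” is discarded because Imagga never suggested it, mirroring what the widget does on the client.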


Today we’ve covered how the Imagga tagging and upload APIs work and how you can use them together to implement an auto-tagging widget within your NodeJS projects. Hopefully you can see how quickly and easily you can integrate them to enhance your users’ experiences and streamline your applications. Got a question or a suggestion for what we should build next? Tell us in the comments, we’d love to hear from you!



How to efficiently detect sex chatbots in Kik Messenger and build it for other platforms

Ever since 2017, Kik Messenger has been plagued by irrelevant messages distributed by automated bots. In particular, its public groups have been assailed by large numbers of these bots, which flood them with spam. Jaap, a computer science and engineering master’s student annoyed by these relentless and unwelcome interruptions, decided to tackle the issue by engineering a counter-bot called Rage that would identify and remove these bots.

Rage uses proprietary algorithms to identify known spambot name patterns, and it also uses Imagga’s adult content filtering API (NSFW API) to scan profile pictures. The result worked so well that friends soon wanted it. By word of mouth alone, his bot was installed on over 2,000 groups in just three days. Now his bot is used in more than 3,000 chat rooms, containing over 100,000 people, and some 20,000 images are processed every month.

Major project issues - Neural networks are expensive: the cheapest option is the AWS g2.2 series, which costs $0.65/hour. For a student, that is a hefty sum to invest in GPU instances. Therefore, Jaap looked for a third-party company that could provide a more affordable out-of-the-box solution. While Google came up first in a search, he selected Imagga because of its already tested accuracy compared to other solutions on the market.

Putting it all together - Since spambots use racy profile pictures, Rage bot’s detection algorithm works a lot better with the Imagga NSFW API than it would with name matching alone. When someone joins a chat, his or her name is scanned for known spambot name patterns while Imagga’s NSFW content moderation analyzes his or her profile picture. If the safety confidence level is less than 40%, the user is considered a spambot and is removed from the chat.
Since Kik profile pictures are public, Rage bot only needs to send image links directly to the Imagga Content Moderation API, which returns its result:

{
    "confidence": 100,
    "name": "safe"
}

As of August 2018, Jaap and his bot had stopped over 20K spambots from plaguing Kik Messenger. Detected bots that fall under a certain level of “confidence” are removed automatically, but the Rage bot does not stop there. One of its most recent features is the 48 mode, which detects the number of people in a chat and removes inactive users.
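
The removal rule described above might be sketched as follows. This is only an illustration under stated assumptions: Rage's actual algorithm is proprietary, so the function names, the example name pattern, the flat { name, confidence } category shape, and the choice to flag a user when either signal fires are all hypothetical. Only the 40% threshold comes from the description.

```javascript
// Threshold from the description: below 40% "safe" confidence counts as a spambot.
const SAFE_THRESHOLD = 40;

// Returns the "safe" confidence from a list of { name, confidence } categories,
// or null if the moderation result has no "safe" entry (assumed response shape).
function getSafeConfidence(categories) {
  const safe = categories.find((c) => c.name === 'safe');
  return safe ? safe.confidence : null;
}

// Hypothetical combination rule: flag on a name-pattern hit OR an unsafe picture.
function isLikelySpambot(displayName, categories, namePatterns) {
  const nameHit = namePatterns.some((pattern) => pattern.test(displayName));
  const safeConfidence = getSafeConfidence(categories);
  const unsafePicture = safeConfidence !== null && safeConfidence < SAFE_THRESHOLD;
  return nameHit || unsafePicture;
}

// Made-up name pattern, for illustration only
const patterns = [/^hot_\w+\d{2}$/i];
console.log(isLikelySpambot('alice', [{ name: 'safe', confidence: 100 }], patterns)); // false
console.log(isLikelySpambot('alice', [{ name: 'safe', confidence: 12.5 }], patterns)); // true
```

A missing “safe” entry is treated as “no picture signal” rather than as unsafe, so a failed API call alone never removes a user. That is a design choice of this sketch, not something the article specifies.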

Building and deploying - Imagga's NSFW classification was set up and running in a day. Using the free demo plan of 2,000 API calls a month, Jaap was able to quickly implement, test and judge whether this was the right tool for him. As a content moderation solution, it can be installed and run the same day with no downtime and deliver accurate content moderation. Then, if you need more API calls, Imagga has very affordable pricing.

Why Using Image Recognition Software Can Save Your Cloud Platform a Ton of Resources

In recent years, we have seen significant growth in artificial intelligence technology and its use in different industries such as automotive, healthcare, e-commerce, gaming, etc. Image recognition, one of the flagship applications of AI, has had wide adoption across industries. It is estimated that the worldwide market for image recognition will grow to $29.98 billion by 2020.

A major factor in the growing demand for image recognition technology has been the increased use of the internet and the move of small and medium enterprises (SMEs) to the cloud. With this move, the businesses have benefited from some of the advantages a cloud platform offers such as widespread reach, scalability, flexible billing options, rapid deployment and constant availability. With the move to the cloud, businesses have found it necessary to adopt technology that helps them better navigate the smarter and more connected platform; and image recognition is one of those technologies.

Image recognition (sometimes called computer vision) is the ability of software to analyze an image or video and identify its content, e.g. people, objects, places and text. It is widely used across industries, e.g. in self-driving cars, facial and optical character recognition software, and disease diagnosis. For businesses that operate in the cloud, image recognition can offer numerous benefits, as outlined below.

Automating Tasks with Image Recognition Software Saves Time

Unlike other resources that you can create or acquire more of, time is finite, and to stay competitive you most likely can't afford to waste it.

Without a doubt, computers are faster than humans at some particular tasks, and so for those tasks, it makes sense to automate the job using software, leaving your employees free to work on other urgent tasks. Image recognition software can be used to automate such tasks as categorizing and tagging media content, content moderation and editing images (e.g. cropping or background removal).

Use of Image Recognition Software can Help Keep your Team Lean and Thus Save Costs

Use of image recognition software can reduce or eliminate required labour. Without image recognition, you would have to put people on the job to do such tasks as tagging and categorizing your digital assets, moderating user-generated content, individually editing images, etc. In some cases, such a feat might be annoying and frustrating at best, but in other cases, it might be outright impossible. Take, for instance, a firm that offers Digital Asset Management services. The firm might have several clients, each with millions of digital assets that need to be processed. It would be very difficult, if not impossible, to run such a service on manual labour alone. To keep its clients happy, the business has to keep its asset processing time to a minimum, which means it would have to keep a lot of people on board to do the work. With time, as its client list grows or as the content each client maintains grows, the business's labour costs will skyrocket. Running such a business on manual labour alone isn't sustainable. By automating some tasks with image recognition software, you can maintain a lean and cost-effective team.

Image Recognition can Reduce Human Error

“To err is human, to forgive divine,” so the saying goes; but when you are running a business that depends on the accuracy of its operations, you might not be so lax about errors that might occur.

Human labour is susceptible to errors. When people are tasked with entering a large amount of data, it is probable that some data will be recorded incorrectly. People also tire. When someone has to process thousands of images or videos, they might not be as sharp on the last few thousand as on the first. With exhaustion and waning focus, errors creep in here and there.

For some use cases, image recognition has been shown to give better results than humans. In the medical field, for instance, there is visual recognition software that has a higher success rate than humans in diagnosing a particular type of cancer. In the still infant field of self-driving cars, it has been said that driverless cars are safer than human drivers.

Image recognition can help eliminate or at least reduce the inaccuracies of human intervention. This will, in turn, save the business resources that would have been lost due to the errors, whether in the form of revenue, labour or time.

Image Recognition can Help you Innovate According to Market Trends

One advantage of running an online business is that a lot of your customers are also online. In this connected ecosystem, it is easier to monitor the market by observing what people share online. By analyzing visual content that is shared online, you might be able to recognize a trend that you can piggyback on when it comes to product release. With image recognition, you can also gain some insights into your competitors by detecting their online visual presence. You can observe how the market engages with the competitor's visual content and determine if their reaction to it is positive or not. This can, in turn, inform your product design decisions.

Instead of using tedious questionnaires and discovery processes to find out what users want, you can use data to determine this. You can determine what users gravitate towards online by observing what they share and how they react to different content. An example of this in use is Netflix which uses data to determine what shows to create. This can save you the effort and cost of creating something that won't be profitable once it hits the market.

Image Recognition can Improve your Marketing Efforts

Other than using image recognition to predict products that will be popular amongst your target market, you can also use it to determine how best to market the products to consumers. Using image recognition, you can mine your target market's visual content and monitor market trends in real time. In this way, you can gain insights on how visual posts spread online, what type of visuals get the most attention, the type of people engaging most with your content, the individual influencers driving most of the traffic and the best platform to post your content on. This can, in turn, help you launch marketing campaigns that are most likely to succeed. Your marketers don't have to waste their budget guessing at what will work; they can use data to decide on the way forward.

How something is presented can have a huge impact on the level of engagement people will have with it. Netflix discovered through consumer research that the artwork on their website was not only the biggest influence on a member's decision to watch content, but also accounted for over 82% of their focus while browsing. This is why they go through so much effort to determine the best artwork to display on their website, a feat that would be impossible without image recognition and machine learning. If you are running an online business, you should pay attention to how you present your product or service. In a world where consumers are spoilt for choice when searching for a product or service, you should ensure that your website communicates the value of what you are trying to sell in the best way possible.

Image Recognition can Help Online Marketplaces Fight Counterfeit Goods

According to the Organization for Economic Co-operation and Development (OECD), counterfeit products may cost the global economy up to $250 billion a year. Businesses running online platforms that allow sellers to sell goods always run the risk of having some sellers selling counterfeit products. This can damage the marketplace's reputation when consumers get products that are subpar to their genuine counterparts.

To counter this, marketplace websites have started turning to image recognition technology to help identify legit and counterfeit products. Using software, the platforms put uploaded product images through some checks to ensure their authenticity.

In General, Image Recognition Makes for Better Apps

Overall, incorporating image recognition improves the user experience of cloud applications and makes their operation effective and efficient. Using better apps is good for any business's bottom line as they reduce the overall overhead costs.

In the face of stiff competition, most companies compete primarily on the basis of customer experience. Poor user experience can lead to customer churn, and in an interconnected world, it is very easy for disgruntled customers to spread the word about the terrible service they had at your hands; so it is always in your best interest to employ any technology you can to produce the best possible product for your target market.

Do you use image recognition in your product? If yes, let us know how you use it and how it has improved your business. If you would like to find out more about the Imagga Image Recognition API, please contact us and we'll get back to you promptly.


Image Recognition Revolutionizes the Online Experience for the Visually Impaired

People take seeing and technology for granted. For a specific group of internet users, the online experience is not so straightforward. The visually impaired need special assistance to experience the digital world. There are a few diverse low-vision aids, but generally they can be divided into two categories: translating visual information into alternative sensory information (sound or touch) and adapting visual information to make it more visible. However, the bigger problem remains how to help people who are blind. The emerging technology for assistance in this category uses image processing techniques to optimize the visual experience. Today we will be looking at how image recognition is revolutionizing the online experience for the visually impaired.

Blind Users Interacting with Visual Content

Let’s stop for a second to consider the whole online experience for the visually impaired. What happens when a sighted person opens a webpage? They scan it, click links or fill in page information. For the visually impaired, the experience is different. They use a screen reader: software that interprets what is on the screen and reads it to the user. However, narrating each page element in a fixed order, with the option to skip, is not easy. Sometimes there is a vast difference between the visual page elements (buttons, banners, etc.) and the alt-text read by the screen reader. Social networking service (SNS) pages, with unstructured visual elements, an abundance of links, and content organized both horizontally and vertically, make listening to the screen reader even more confusing.

Interacting with Social Visual Content

SNSs make it easy to communicate through various types of visual content. To fully engage with images, visually impaired people need to overcome accessibility challenges associated with the visual content through workarounds or with outside help.

Advancements in artificial intelligence are allowing blind people to identify and understand the visual content. Some of them include image recognition, tactile graphics, and crowd-powered systems.

Facebook already generates useful and accurate descriptions of photos algorithmically, at scale and without adding latency to the user experience. It provides each visual with a description as image alt-text, an HTML attribute designed for content managers to provide a text alternative for images.
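
To make the idea concrete, here is a small sketch of turning tagger output into alt-text. It is not how Facebook's system works. It assumes tag objects shaped like Imagga's tagging response ({ confidence, tag: { en } }), and the buildAltText name, the 50% cut-off, the three-tag limit, and the wording of the description are all illustrative choices.

```javascript
// Hypothetical helper: builds a short alt-text string from tagger output.
function buildAltText(tags, minConfidence = 50, maxTags = 3) {
  const labels = tags
    .filter((t) => t.confidence >= minConfidence) // keep confident tags only
    .sort((a, b) => b.confidence - a.confidence)  // most confident first
    .slice(0, maxTags)
    .map((t) => t.tag.en);

  // No confident labels: return an empty string and let the caller decide
  if (labels.length === 0) return '';
  return 'Image may contain: ' + labels.join(', ');
}

const sample = [
  { confidence: 76.4, tag: { en: 'mountain' } },
  { confidence: 69.8, tag: { en: 'highland' } },
  { confidence: 38.0, tag: { en: 'sky' } }
];
console.log(buildAltText(sample)); // Image may contain: mountain, highland
```

In a browser, the result would be assigned to an image element's alt property so that a screen reader can announce it.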

Web Accessibility Today

We might think that web accessibility is a universal thing, but web designers do not always have the resources to devote to accessibility or do not see the value in making sites accessible. A 2-dimensional web page translated into a 1-dimensional speech stream is not easy to decipher. One of the most annoying things is that the majority of websites have insufficient text labeling of graphic content, concurrent events, dynamic elements, or infinitely scrolling pages (i.e. a stream of feeds). Thus, many websites continue to be inaccessible through screen readers, even the ones that are intended for universal access: library websites, university websites, and SNSs.

The World Wide Web Consortium (W3C), an international community where Member organizations and the public work together to develop Web standards, created accessibility standards.  Led by Web inventor Tim Berners-Lee and CEO Jeffrey Jaffe, W3C's mission is to lead the Web to its full potential.

Solutions Helping Visually Impaired Users

There is a new iPhone app which uses machine learning to identify objects for visually impaired people without an Internet connection. The free image-recognition app is called Aipoly and is making it easier for people to recognize their surroundings. How does it work? You simply point the phone’s rear camera at whatever you want to identify and it speaks what it sees. The app can identify one object after another as the user moves the phone around, and it doesn’t require taking a picture. The app can be helpful not only to people with impaired vision but also to those trying to learn a new language.

Aipoly cofounder Simon Edwardsson says it recognizes images by using deep learning, which is a machine-learning technique inspired by studies of the brain. This is the same technology used by Facebook for recognizing faces and Google for searching images. The app breaks down the image into different characteristics like lines, patterns, curves, etc. and uses them to determine the likelihood of that image to be a specific object. The app works fine for objects around the office. So far it can recognize around 1,000 objects, which is more than enough.

Banknote-reader (b-reader)
The banknote reader is a device that helps the visually impaired to recognize money. The banknote goes into the b-note holder for scanning and recognition (orientation doesn’t really matter), where it gets photographed and sent securely to the cloud. There an Imagga-trained custom classifier recognizes the nominal value and returns the information to the b-note device. The device then plays a pre-recorded .mp3 file with the value if it is recognized. The project is part of TOM (Tikkun Olam Makers), a global movement of communities connecting makers, designers, engineers and developers with people with disabilities to develop technological solutions for everyday challenges. On the web platform, you can find full specs of the b-note prototype, including building instructions and the camera code used for calling the Imagga API, so that you can make a device like it for around 100 Euro or 115 USD.

LookTel
LookTel is a combination of a smartphone and advanced “artificial vision” software that creates a helpful electronic assistant for anyone who is visually impaired or blind. It can be used to automatically scan and identify objects like money, packaged goods, DVDs, CDs, medication bottles, and even landmarks. All it takes is to point the device's video camera at the object, and the device pronounces the name quickly and clearly. It can be taught to identify all the objects and landmarks around you. With a little extra help, the LookTel can be a helpful assistant. It also incorporates a text reader which allows users to get access to print media.

Seeing AI
This is a smartphone app that uses computer vision to describe the world and is created by Microsoft. Once the app is downloaded, the user can point the camera at a person and it will announce who the person is and how they are feeling. The app also works with products. It is done by artificial intelligence running locally on the phone. So far the app is available for free in the US for iOS. It is unclear when the rest of the world and Android users will be able to download it.

The app works well for recognizing familiar people and household products (by scanning barcodes). It can also read and scan documents and recognize US currency. This is no small feat because dollar bills are basically the same size and color, regardless of their value, so spotting the difference is sometimes difficult for the visually impaired. The app uses neural networks to identify objects, which is the same technology used for self-driving cars, drones, and others. The most basic functions take place on the phone itself; however, most features require a connection.

Next Challenges for Full Adoption

Facebook users upload more than 350 million photos a day. Websites are relying mostly on images and less on text. Sharing visuals has become a major part of the online experience. Using screen readers and screen magnifiers on mobile and desktop platforms helps the visually impaired. However, more effort needs to be put into making the web more accessible through design guidelines, designer awareness, and evaluation techniques.

The most difficult challenge ahead is the evaluation of the effectiveness of image processing. It needs to be held ultimately to the same standards as other clinical research in low vision. Image processing algorithms need to be tailored specifically to disease entities and be available on a variety of displays, including tablets. This field of research has the potential to deliver great benefits to a large number of people in a short period of time.

Securing Images in Python With the Imagga NSFW Categorization API

In web and mobile applications, as well as any other digital media, the use of images as part of their content is very common. With images being so ubiquitous, there comes a need to ensure that the images posted are appropriate to the medium they are on. This is especially true for any medium accepting user-generated content. Even with set rules for what can and cannot be posted, you can never trust users to adhere to the set conditions. Whenever you have a website or medium accepting user-generated content, you will find that there is a need to moderate the content.

Why Moderate Content?

There are various reasons why content moderation might be in your best interest as the owner/maintainer of a digital medium. Some common ones are:

  • Legal obligations - If your application accommodates underaged users, then you are obligated to protect them from adult content.
  • Brand protection - How your brand is perceived by users is important, so you might want to block some content that may negatively affect your image.
  • Protect your users - You might want to protect your users against harassment from other users. The harassment can be in the form of users attacking others by posting offensive content. An example of this is Facebook’s recent techniques for combating revenge porn on their platform.
  • Financial - It might be in your best interest financially, to moderate the content shown on your applications. For instance, if your content is somewhat problematic, other businesses might not want to associate with you in terms of advertising on your platform or accepting you as an affiliate for them. For some Ad networks, keeping your content clean is a rule that you have to comply with if you want to use them. Google Adsense is an example of this. They strictly forbid users of the service from placing their ads on pages with adult content.
  • Platform rules - You might be forced to implement some form of content moderation if the platform your application is on requires it. For instance, Apple requires applications to have a way of moderating and restricting user-generated content before they can be placed on the App Store, and Google also restricts apps that contain sexually explicit content.

As you can see, if your application accepts user-generated content, moderation might be a requirement that you can’t ignore. There are different ways moderation can be carried out:

  • Individual driven - an example of this is a website that has admins that moderate the content. The website might work by either restricting the display of any uploaded content until it has been approved by an admin or it might allow immediate display of uploaded content, but have admins who constantly check posted content. This method tends to be very accurate in identifying inappropriate content, as the admins will most likely be clear as to what is appropriate/inappropriate for the medium. The obvious problem with this is the human labor needed. Hiring moderators might get costly especially as the application’s usage grows. Relying on human moderators can also affect the app’s user experience. The human response will always be slower than an automated one. Even if you have people working on moderation at all times, there will still be a delay in identifying and removing problematic content. By the time it is removed, a lot of users could have seen it. On systems that restrict showing uploaded content until it has been approved by an admin, this delay can become annoying to users.
  • Community driven - with this type of moderation, the owner of the application puts in place features that enable the app’s users to report any inappropriate content, e.g. by flagging the content. After a user flags a post, an admin will then be notified. This also suffers from a delay in identifying inappropriate content from both the community (who might not act immediately after the content is posted) and the administrators (who might be slow to respond to flagged content). Leaving moderation up to the community might also result in false positives, as content that is safe is seen by some users as inappropriate. With a large community, you will always have differing opinions, and because many people will probably not have read the Terms and Conditions of the medium, they will not have clear-cut rules of what is and isn’t okay.
  • Automated - with this, a computer system usually using some machine learning algorithm is used to classify and identify problematic content. It can then act by removing the content or flagging it and notifying an admin. With this, there is a decreased need for human labor, but the downside is that it might be less accurate than a human moderator.
  • A mix of some or all the above methods - Each of the methods described above comes with a shortcoming. The best outcome might be achieved by combining some or all of them e.g. you might have in place an automated system that flags suspicious content while at the same time enabling the community to also flag content. An admin can then come in to determine what to do with the content.

A Look at the Imagga NSFW Categorization API

Imagga makes available the NSFW (not safe for work) Categorization API that you can use to build a system that can detect adult content. The API works by categorizing images into three categories:

  • nsfw - these are images considered not safe. Chances are high that they contain pornographic content and/or display nude bodies or inappropriate body parts.
  • underwear - this categorizes medium safe images. These might be images displaying lingerie, underwear, swimwear, etc.
  • safe - these are completely safe images with no nudity.

The API works by giving a confidence level of a submitted image. The confidence is a percentage that indicates the probability of an image belonging to a certain category.

To see the NSFW API in action, we’ll create two simple programs that will process some images using the API. The first program will demonstrate how to categorize a single image while the second will batch process several images.

Setting up the Environment

Before writing any code, we’ll first set up a virtual environment. This isn’t necessary but is recommended as it prevents package clutter and version conflicts in your system’s global Python interpreter.

First, create a directory where you’ll put your code files.

[cc lang="bash"]$ mkdir nsfw_test[/cc]

Then navigate to that directory with your Terminal application.

[cc lang="bash"]$ cd nsfw_test[/cc]

Create the virtual environment by running:

[cc lang="bash"]$ python3 -m venv venv[/cc]

We’ll use Python 3 in our code. In the above, we create a virtual environment with Python 3. With this, the default Python version inside the virtual environment will be version 3.

Activate the environment with (on MacOS and Linux):

[cc lang="bash"]$ source venv/bin/activate[/cc]

On Windows:

[cc lang="bash"]$ venv\Scripts\activate[/cc]

Categorizing Images

To classify an image with the NSFW API, you can either send a GET request with the image URL to the [cci]/categorizations/[/cci] endpoint or you can upload the image to [cci]/content[/cci], get back a [cci]content_id[/cci] value which you will then use in the call to the [cci]/categorizations/[/cci] endpoint. We’ll create two applications that demonstrate these two scenarios.

Processing a Single Image

The first app we’ll create is a simple web application that can be used to check if an image is safe or not. We’ll create the app with Flask.

To start off, install the following dependencies.

[cc lang="bash"]$ pip install flask flask-bootstrap requests[/cc]

Then create a folder named [cci]templates[/cci] and inside that folder, create a file named [cci]index.html[/cci] and add the following code to it.

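[cc lang="python"]
{% extends "bootstrap/base.html" %}

{% block title %}Imagga NSFW API Test{% endblock %}

{% block navbar %}

{% endblock %}

{% block content %}
{# Minimal reconstructed markup: the field name matches request.form['image_url'] #}
<div class="container">
  <form method="POST">
    <input type="text" name="image_url" placeholder="Image URL">
    <button type="submit">Check</button>
  </form>

  {% if image_url %}
  <img src="{{ image_url }}" alt="Submitted image">
  <pre>{{ res }}</pre>
  {% endif %}
</div>
{% endblock %}
[/cc]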
In the above code, we create an HTML template containing a form that the user can use to submit an image URL to the Imagga API. When the response comes back from the server, it will be shown next to the processed image.

Next, create a file named [cci][/cci] in the root directory of your project and add the following code to it. Be sure to replace [cci]INSERT_API_KEY[/cci] and [cci]INSERT_API_SECRET[/cci] with your Imagga API Key and Secret. You can sign up for a free account to get these credentials. After creating an account, you’ll find these values on your dashboard:

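[cc lang="python"]
from flask import Flask, render_template, request
from flask_bootstrap import Bootstrap
import os
import requests
from requests.auth import HTTPBasicAuth

app = Flask(__name__)
Bootstrap(app)

# API Credentials. Set your API Key and Secret here
API_KEY = 'INSERT_API_KEY'
API_SECRET = 'INSERT_API_SECRET'
# Assumed base URL of the Imagga API; confirm it against the API documentation
API_ENDPOINT = 'https://api.imagga.com/v1'

auth = HTTPBasicAuth(API_KEY, API_SECRET)


@app.route('/', methods=['GET', 'POST'])
def index():
    image_url = None
    res = None
    if request.method == 'POST' and 'image_url' in request.form:
        image_url = request.form['image_url']

        # Pass the image URL via params so that requests URL-encodes it
        response = requests.get(
            '%s/categorizations/nsfw_beta' % API_ENDPOINT,
            params={'url': image_url},
            auth=auth)

        res = response.json()
    return render_template('index.html', image_url=image_url, res=res)

if __name__ == '__main__':
    app.run()
[/cc]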

Every call to the Imagga API must be authenticated. Currently, the only supported method for authentication is Basic. With Basic Auth, credentials are transmitted as user ID/password pairs, encoded using base64. In the above code, we achieve this with a call to [cci]HTTPBasicAuth()[/cci].
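To make the Basic Auth step less opaque, here is a minimal sketch of roughly what happens to the credentials before they are sent. The helper name [cci]basic_auth_header[/cci] is hypothetical and only for illustration; in the app itself, [cci]HTTPBasicAuth()[/cci] takes care of this for you.

```python
import base64


def basic_auth_header(api_key, api_secret):
    """Build the value of an HTTP Basic 'Authorization' header.

    Hypothetical helper for illustration only.
    """
    # Join the credentials as "user:password" and base64-encode the bytes
    credentials = ('%s:%s' % (api_key, api_secret)).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    return 'Basic %s' % token
```

Calling [cci]basic_auth_header('INSERT_API_KEY', 'INSERT_API_SECRET')[/cci] returns the string that would go in the [cci]Authorization[/cci] header. Note that base64 is an encoding, not encryption, so the credentials are only protected when the request goes over HTTPS.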

We then create a function that will be triggered by GET and POST requests to the [cci]/[/cci] route. If the request is a POST, we get the data submitted by the form and send it to the Imagga API for classification.

The NSFW Categorizer is one of a few categorizers made available by the Imagga API. A Categorizer is used to recognize various objects and concepts. There are a couple predefined ones available (Personal Photos and NSFW Beta) but if none of them fit your needs we can build a custom one for you.

As mentioned previously, to send an image for classification, you send a GET request to the [cci]/categorizations/[/cci] endpoint. The [cci]categorizer_id[/cci] for the NSFW API is [cci]nsfw_beta[/cci]. You can send the following parameters with the request:

  • url: URL of an image to submit for categorization. You can provide up to 10 URLs for processing by sending multiple url parameters (e.g. [cci]?url=&url=…&url=[/cci])
  • content: You can also directly send image files for categorization by uploading the images to our [cci]/content[/cci] endpoint and then providing the received content identifiers via this parameter. As with the url parameter, you can send more than one image (up to 10 content ids) by sending multiple [cci]content[/cci] parameters.
  • language: If you’d like to get a translation of the tags in other languages, you should use the language parameter. Its value should be the code of the language you’d like to receive tags in. You can apply this parameter multiple times to request tags translated in several languages. See all available languages here.
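As a rough sketch of how repeated parameters can be put together, the snippet below builds a query string with several [cci]url[/cci] values and an optional [cci]language[/cci]. The function name [cci]build_query[/cci] and the constant are hypothetical; the limit of 10 comes from the parameter list above.

```python
from urllib.parse import urlencode

# Limit taken from the parameter description above
MAX_IMAGES_PER_REQUEST = 10


def build_query(image_urls, language=None):
    """Return a query string with one 'url' parameter per image.

    Hypothetical helper for illustration only.
    """
    if len(image_urls) > MAX_IMAGES_PER_REQUEST:
        raise ValueError('At most %d images per request'
                         % MAX_IMAGES_PER_REQUEST)

    # A list of pairs lets the same key appear several times
    pairs = [('url', image_url) for image_url in image_urls]
    if language:
        pairs.append(('language', language))
    return urlencode(pairs)
```

The same list of pairs can be passed directly as the [cci]params[/cci] argument of [cci]requests.get()[/cci], which encodes repeated keys in the same way.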

After processing the request, the API sends back a JSON object holding the image’s categorization data in case of successful processing, and an error message in case there was a problem processing the image.

Below you can see the response of a successful categorization:

[cc lang="javascript"]
{
    'results': [{
        'image': '',
        'categories': [{
            'name': 'safe',
            'confidence': 99.22
        }, {
            'name': 'underwear',
            'confidence': 0.71
        }, {
            'name': 'nsfw',
            'confidence': 0.07
        }]
    }]
}
[/cc]

Note that you might not always get JSON with the three categories displayed. If the confidence of a category is [cci]0[/cci], this category will not be included in the JSON object.
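Because a category can be missing from the response, code that reads the result should not assume all three keys are present. A minimal sketch, assuming the response shape shown above (the helper name [cci]category_confidences[/cci] is hypothetical):

```python
CATEGORIES = ('nsfw', 'underwear', 'safe')


def category_confidences(response_json):
    """Map every known category to its confidence, defaulting to 0.

    Returns None when the response holds no successful result.
    Hypothetical helper; assumes the response shape shown above.
    """
    results = response_json.get('results', [])
    if not results:
        return None

    # Start from 0 so that omitted categories are still present
    scores = {name: 0.0 for name in CATEGORIES}
    for category in results[0].get('categories', []):
        scores[category['name']] = category['confidence']
    return scores
```

With this, [cci]scores['nsfw'][/cci] can be read safely even when the API left that category out of the JSON.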

Below you can see the response of a failed categorization.

[cc lang="javascript"]
{
    'results': [],
    'unsuccessful': [{
        'reason': 'An error prevented image from being categorized. Please try again.',
        'image': ''
    }]
}
[/cc]

Back to our app, you can save your code and run it with:

[cc lang="bash"]
$ python
[/cc]

If you navigate to the app’s address in your browser, you should see a form with one input field. Paste in the URL of an image and submit it. The image will be processed and you will get back a page displaying the image and the JSON returned from the server. To keep it simple, we just display the raw JSON, but in a more sophisticated app, it would be parsed and used to make some decision.

Below, you can see the results of some images we tested the API with.

As you can see, the images have been categorized quite accurately. The first two have [cci]safe[/cci] confidence scores of [cci]99.22[/cci] and [cci]99.23[/cci] respectively while the last one has an [cci]underwear[/cci] score of [cci]96.21[/cci]. Of course, we can’t show an [cci]nsfw[/cci] image here on this blog, but you are free to test that on your own.

To know the exact confidence score to use for your app, you should first test the API with several images. When you look at the results of several images, you will be able to better judge which number to look out for in your code when filtering okay and not okay images. If you are still not sure about this, our suggestion is setting the confidence threshold at 15-20%. However, if you’d like to be more strict on the accuracy of the results, setting the confidence threshold at 30% might do the trick.
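One way to turn such a threshold into a decision is sketched below. The function name [cci]is_acceptable[/cci] is hypothetical, and the default of 20 simply follows the suggestion above; pick your own value after testing with your images.

```python
def is_acceptable(categories, nsfw_threshold=20.0):
    """Return True if the 'nsfw' confidence is below the threshold.

    `categories` is the list found under 'categories' in a successful
    response. Hypothetical helper for illustration only.
    """
    nsfw_confidence = 0.0  # a missing category means a confidence of 0
    for category in categories:
        if category['name'] == 'nsfw':
            nsfw_confidence = category['confidence']
    return nsfw_confidence < nsfw_threshold
```

For example, an image whose [cci]nsfw[/cci] confidence is 25 would be rejected with the default threshold but accepted with [cci]nsfw_threshold=30.0[/cci]. You could apply the same idea to the [cci]underwear[/cci] category if your medium needs to filter that content too.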

You should know that the technology is far from perfect and that the NSFW API is still in beta. From time to time, you might get an incorrect classification.

Note that the API has a limit of 5 seconds for downloading the image. If the limit is exceeded with the URL you send, the analysis will be unsuccessful. If you find that most of your requests are unsuccessful due to a timeout error, we suggest uploading the images to our [cci]/content[/cci] endpoint first (which is free and not counted towards your usage) and then using the content id returned to submit the images for processing via the [cci]content[/cci] parameter. We’ll see this in action in the next section.

Batch Processing Several Images

The last app we created allowed the user to process one image at a time. In this section, we are going to create a program that can batch process several images. This won’t be a web app; it will be a simple script that you can run from the command line.

Create a file named [cci][/cci] and add the code below to it. If you are still using the virtual environment created earlier, then the needed dependencies have already been installed, otherwise, install them with [cci]pip install requests[/cci].

[cc lang="python"]
import os
import requests
from requests.auth import HTTPBasicAuth

# API Credentials. Set your API Key and Secret here
API_KEY = 'YOUR_API_KEY'
API_SECRET = 'YOUR_API_SECRET'
# Assumed base URL of the Imagga API; confirm it against the API documentation
API_ENDPOINT = 'https://api.imagga.com/v1'

auth = HTTPBasicAuth(API_KEY, API_SECRET)

FILE_TYPES = ['png', 'jpg', 'jpeg', 'gif']


class ArgumentException(Exception):
    pass


if API_KEY == 'YOUR_API_KEY' or \
        API_SECRET == 'YOUR_API_SECRET':
    raise ArgumentException('You haven\'t set your API credentials. '
                            'Edit the script and set them.')


def upload_image(image_path):
    if not os.path.isfile(image_path):
        raise ArgumentException('Invalid image path')

    # Open the desired file
    with open(image_path, 'rb') as image_file:
        filename = os.path.basename(image_path)

        # Upload the multipart-encoded image with a POST
        # request to the /content endpoint
        content_response = requests.post(
            '%s/content' % API_ENDPOINT,
            auth=auth,
            files={filename: image_file})

        # Example /content response:
        # {'status': 'success',
        #  'uploaded': [{'id': '8aa6e7f083c628407895eb55320ac5ad',
        #                'filename': 'example_image.jpg'}]}
        uploaded_files = content_response.json()['uploaded']

        # Get the content id of the uploaded file
        content_id = uploaded_files[0]['id']

    return content_id


def check_image(content_id):
    # Using the content id, make a GET request to the
    # /categorizations/nsfw_beta endpoint to check if the image is safe
    params = {
        'content': content_id
    }
    response = requests.get(
        '%s/categorizations/nsfw_beta' % API_ENDPOINT,
        auth=auth,
        params=params)

    return response.json()


def parse_arguments():
    import argparse
    parser = argparse.ArgumentParser(
        description='Tags images in a folder')

    parser.add_argument(
        'input', metavar='<input>', type=str, nargs=1,
        help='The input - a folder containing images')

    parser.add_argument(
        'output', metavar='<output>', type=str, nargs=1,
        help='The output - a folder to output the results')

    args = parser.parse_args()
    return args


def main():
    import json
    args = parse_arguments()

    tag_input = args.input[0]
    tag_output = args.output[0]

    results = {}
    if os.path.isdir(tag_input):
        images = [filename for filename in os.listdir(tag_input)
                  if os.path.isfile(os.path.join(tag_input, filename)) and
                  filename.split('.')[-1].lower() in FILE_TYPES]

        images_count = len(images)
        for iterator, image_file in enumerate(images):
            image_path = os.path.join(tag_input, image_file)
            print('[%s / %s] %s uploading' %
                  (iterator + 1, images_count, image_path))
            # Skip images that could not be uploaded
            try:
                content_id = upload_image(image_path)
            except (IndexError, KeyError, ArgumentException):
                continue

            nsfw_result = check_image(content_id)
            results[image_file] = nsfw_result
            print('[%s / %s] %s checked' %
                  (iterator + 1, images_count, image_path))
    else:
        raise ArgumentException(
            'The input directory does not exist: %s' % tag_input)

    if not os.path.exists(tag_output):
        os.makedirs(tag_output)
    elif not os.path.isdir(tag_output):
        raise ArgumentException(
            'The output folder must be a directory')

    # Write one JSON file per processed image
    for image, result in results.items():
        with open(
                os.path.join(tag_output, 'result_%s.json' % image),
                'wb') as results_file:
            results_file.write(json.dumps(
                result, ensure_ascii=False, indent=4).encode('utf-8'))

    print('Done. Check your selected output directory for the results')


if __name__ == '__main__':
    main()
[/cc]

We use the [cci]argparse[/cci] module to parse arguments from the command line. The first argument passed in will be the path to a folder containing images to be processed while the second argument is a path to a folder where the results will be saved.

For each image in the input folder, the script uploads it with a POST request to the [cci]/content[/cci] endpoint. After getting a content id back, it makes another call to the [cci]/categorizations/[/cci] endpoint. It then writes the response of that request to a file in the output folder.

Note that all uploaded files sent to [cci]/content[/cci] remain available for 24 hours. After this period, they are automatically deleted. If you need the file, you have to upload it again. You can also manually delete an image by making a DELETE request to [cci][/cci].

Add some images to a folder and test the script with:

[cc lang="bash"]$ python path/to/input/folder path/to/output/folder[/cc]

If you look at the output folder you selected, you should see a JSON file for each processed image.
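If you want to work with those files afterwards, a small sketch like the one below can load them back into Python. It assumes the [cci]result_<image name>.json[/cci] naming used by the script above; the function name [cci]load_results[/cci] is hypothetical.

```python
import json
import os


def load_results(output_dir):
    """Load every result_*.json file in a folder into a dict.

    Keys are file names, values are the parsed JSON responses.
    Hypothetical helper for illustration only.
    """
    results = {}
    for name in sorted(os.listdir(output_dir)):
        # Ignore anything the batch script did not write
        if name.startswith('result_') and name.endswith('.json'):
            path = os.path.join(output_dir, name)
            with open(path, encoding='utf-8') as results_file:
                results[name] = json.load(results_file)
    return results
```

From there you can loop over the returned dictionary and apply whatever confidence threshold suits your application.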

Feel free to test out the Imagga NSFW Categorization API. If you have any suggestions on ways to improve it or just general comments on the API, you can post them in the Comment Section below or get in touch with us directly. We are always happy to get feedback on our products.

AI Policies: What is the world doing to make them secure

A couple of months ago the internet went berserk with the news of Facebook pulling the plug on two bots that had started communicating in their own language. Imaginations and headlines went wild with the possibilities: malicious AI is taking over, the doomsday is here with the bots of the Apocalypse. Although the real story was quite different (the bots were turned off because they were designed to communicate with humans, not with each other, thus they were not delivering the expected results), the outcome was simply panic. What we can learn from this is that humans are afraid of their own creation: artificial intelligence.

AI can transform gargantuan amounts of complex information into insight. It has the potential to present solutions, reveal secrets and solve problems. But before we get to the good part, we need to take care of development and deployment. In order to be able to use them, AI systems need to have the same ethical principles, moral values, professional codes, and social norms we follow. Some of us are excited about the opportunities AI provides, others are suspicious. To become widespread, AI needs to be designed in a way that allows people to understand it, use it and trust it. To ensure the acceptance of AI, public policies should help society deal with AI’s inevitable failures and facilitate adaptation.

Where are we now?

Policies can help AI’s progress or hamper it. We are witnessing a shift in the bottleneck to using AI products from technology to policymaking. Regulation is slow to respond to the cost of compliance or the adoption and development of innovations. Thorough and well-thought policies can influence the rate and direction of innovation by creating stimulus for the private sector. In order to grasp the current situation, we will take a look at the major players in AI technologies whose decisions will be influencing the future of policy making. Yes, you guessed it: China and USA (Europe also deserves a mention). If a country with AI research expertise wishes to participate as a producer it should be ready for tense labor market competition from the U.S. and China.


On October 12, 2016, President Obama’s Executive Office published two reports that laid out its plans for the future of artificial intelligence. The report entitled “Artificial Intelligence, Automation and the Economy,” concluded that AI-driven automation suggests the need for aggressive public policies and a more robust safety net in order to combat labor disruption. The report elaborates on the topics of the previous one: Preparing for the Future of Artificial Intelligence, which recommended the publishing of a report on the economic impacts of artificial intelligence. The focus of AI capabilities is the automation of tasks which have required manual labor, which will provide new possibilities for the economy. However, the disruption of the current livelihood of some people is inevitable. The report’s objective is to find how to increase the benefits and mitigate the costs.

AI isn’t a science project; it’s commercially important.

The report proposes three broad strategies to ease AI automation into the economy: first, invest in and develop AI; second, educate and train workers for the jobs of the future; and, finally, aid workers in the transition and empower them to ensure broadly shared growth. Since AI automation will transform the economy, policymakers need to create or update, strengthen and adapt policies. The primary economic effects under consideration are the beneficial contribution to productivity growth; the new skills that the job market will demand (especially higher-level technical skills); the imbalance that AI will create in wage and education levels, job types and locations; and the loss of jobs, which might be long term, depending on the policy responses.


For the past four years, the US and China have been investing heavily in AI, especially compared to other countries. Until recently, the US seemed like the leader in the tech race, but two years ago China outdid the US in research output. China is emerging as a leader, not a follower. The government is backing research and development and thus driving China’s economy forward. The total value of AI industries will surpass 1 trillion yuan ($147.80 billion).

On July 20, China’s State Council issued the “Next Generation Artificial Intelligence Development Plan” (新一代人工智能发展规划), which articulates an ambitious agenda for China to lead the world in AI. China intends to pursue a “first-mover advantage” to become the “premier global AI innovation center” by 2030. And Wan Gang, the Minister of Science and Technology, stated that China plans to launch a national AI plan, which will strengthen AI development and application, introduce policies to contain risks associated with AI, and work toward international cooperation. The plan will also provide funds to back these endeavors up.

The guideline states that developing AI is a “complicated and systematic project” and needs a coordinated AI innovation system, not only for the technology but for the products as well. It goes on to state that AI in China should be used to promote the country’s technology, social welfare and economy, provide national security, and help the world in general.

The guideline advises that trans-boundary research needs to connect AI with subjects like psychology, cognitive science, mathematics, and economics. As far as platform construction goes, open-source computing platforms should promote coordination among different hardware, software and clouds. This will naturally increase the need for AI professionals and scientists, who will need to be prepared for the work.


The International Business Machines Corporation (IBM) is actively engaged in global discussion about making AI ethical and beneficial. It is working not only internally, but with collaborators and competitors as well.

Because of the constant change in development, AI is making it difficult for any regulatory agency to keep up with the progress. This is making meaningful and timely guidance almost impossible. On the other hand, issues like data privacy and ownership have been discussed in the EU. An algorithm for transparency and accountability has also been considered.

In 2018, the General Data Protection Regulation will be rolled out in the EU. It will restrict automated individual decision-making (algorithms making decisions on user-level predictors) that affects users. This law provides the “right to explanation”: a user can request an explanation of why the algorithm has chosen him/her.

Safety is important, but so are fairness, equality and inclusiveness, which should be included in the AI systems. That’s why we need policies and regulations: to ensure AI is being used to the benefit of all. IBM is working with governments, media, regulatory agencies and industry sectors: everyone, who is willing to have a reasonable discussion on the ethical issues of AI. The aim is to clearly identify the potential and limits of AI and how to make the best use of it.

Who is it up to?

In the short term, it is up to the policymakers and lawyers. In the near future, government representatives will need technical expertise in AI to justify decisions. More research is needed on the security, privacy and societal implications of AI use. For example, instead of cross-examining a person, lawyers may need to cross-examine an algorithm.

As with everything technological, there is real uncertainty about how strongly these effects will be felt. Maybe AI won’t have a large effect on the economy. But the other option is for the economy to experience a larger shock: changes in the labor market, and employees without relevant work skills who are in desperate need of training. Although no definitive decision can be taken and no deadline can be set for policies yet, continued involvement of the government with industry, technical and policy experts will play an important role.


Announcing V3.0 Tagging With Up to 45% More Classes

We are happy to announce the upcoming update of our Tagging technology! We have been looking to make this change so that you can rely more on your API calls and get more precise results to help you better analyze and classify your visual data.

The updated version will become active on 28th of August and you can expect up to 45% more classes* with overall 15% improved precision rate and 30% better recall.

NEW vs. OLD Tagging Comparison

Because we believe that you should see the difference with your own eyes rather than just hear about it and have to test it on your own, we have made this comparison with one of our Demo images.

Clearly in this case you can notice 46% more classes and a clear improvement in keyword precision. Prior to the update the most significant keyword was insect, while now it is dragonfly: a small increase in recall but a significant increase in intrinsic value.

Important For Current Users:

Our old tagging version will remain the default one for a period of 14 days after the launch. To upgrade you will need to change your endpoint parameter to version=3. Follow this example if you are uncertain how to do this.

If your current request url looks like this:

To make a request to the new version, it must look like this:

After this, for a period of 7 days the new Tagging version will become the default one, but all users will still be able to make requests to the old version by adding the parameter for version=2. At the end of that period there will be only version 3 and you won't be able to add version parameters to your calls.
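If you build your request URLs in code, the version switch described above can be sketched as follows. The helper name [cci]with_version[/cci] is hypothetical, and [cci]api.example.com[/cci] is a placeholder rather than the real endpoint; only the [cci]version[/cci] parameter itself comes from this announcement.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(request_url, version):
    """Return request_url with its 'version' query parameter set.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(request_url)
    # Keep every existing parameter except a previous 'version'
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != 'version']
    query.append(('version', str(version)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For instance, [cci]with_version('https://api.example.com/tagging?url=cat.jpg', 3)[/cci] appends [cci]version=3[/cci], and calling it with [cci]2[/cci] instead would target the old version during the transition period.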

What is your experience with version 2 and 3? Did you find much of a difference when testing with your dataset?

* Some classes may have changed names.


Artificial Intelligence Becoming Human. Is That Good or Bad?

The term “artificial intelligence” has been driving people’s imaginations wild since even before 1955, when it was coined to describe an emerging computer science discipline. Today the term covers a variety of technologies meant to improve human life, and the list is ever growing. From Alexa and self-driving cars to love robots, your newsfeed is constantly full of AI updates. Your newsfeed is also a product of a (somewhat) well-implemented algorithm. The good news? Just like the rest of the AI technologies, your newsfeed is self-learning and constantly changing, trying to improve your experience. The bad news? A lot of people know that the most advanced algorithms work, but nobody can really explain why. And that’s where things can go wrong.

The Good AI

The AI market is blooming. The profitable mix of media attention, hype, startups and adoption by enterprises is making sure that AI is a household topic. A Narrative Science survey found that 38% of enterprises are already using AI and Forrester Research predicted that in 2017 the investments in AI will grow by 300% compared with 2016.

But what good can artificial intelligence do today?

Natural language generation

Used to generate reports, summarize business intelligence insights and automate customer service, this capability lets AI produce text from data.

Speech recognition

Interactive voice response systems and mobile applications rely on AI’s ability to recognize speech. It transcribes and transforms human speech into a form usable by a computer application.

Image recognition

This has been already successfully used to detect problematic persons at airports, for retail, etc.

Virtual agents/chatbots

These virtual agents are used in customer service and support and as smart home managers. These chatbot systems and advanced AI can interact with humans. There are machine learning platforms which can design, train and deploy models into applications, processes and other machines by providing algorithms, APIs, development and training data.

Decision management for enterprise

Engines that insert rules and logic into AI systems and are used for initial setup/training and ongoing maintenance and tuning? Check. This technology has been used for a while now for decision management by enterprise applications and for assisting automated decision-making. There is also AI-optimized hardware with the power to process graphics, designed to run AI computational jobs.

AI for biometrics

On a more personal level, the use of AI in biometrics enables more natural interactions between humans and machines, relying on image and touch recognition, speech, and body language. By using scripts and other ways to automate human action to support efficient business processes, robots are capable of executing tasks or processes instead of humans.

Fraud detection and security

Natural language processing (NLP) uses and supports text analytics by understanding sentence structure and meaning, sentiment and intent through statistical and machine learning methods. It is currently used in fraud detection and security.

The “Black Box” of AI

At the beginning, AI branched out in two directions: machines should reason according to rules and logic (everything is visible in the code), or machines should use biology and learn from observing and experiencing (a program generates an algorithm based on example data). Today machines ultimately program themselves based on the latter approach. Since there is no hand-coded system which can be observed and examined, deep learning in particular is a “black box.”

It is crucial to make sure we know when failures in the AI occur, because they will. In order to do that, we need to know how techniques like deep learning work, for example when recognizing abstract things. In simple systems, recognition is based on physical attributes like outlines and colour; on the next level, on more complex things like basic shapes, textures, etc. The top level can recognize all the levels and the whole, not just as a sum of its parts.

There is the expectation that these techniques will be used to diagnose diseases, make trading decisions and transform whole industries. But it shouldn’t happen before we manage to make deep learning more understandable, especially to its creators, and more accountable for how it is used. Otherwise there is no way to predict failures.

Today mathematical models are already being used to find out who is approved for a loan and who gets a job. But deep learning represents a different way to program computers.  “It is a problem that is already relevant, and it’s going to be much more relevant in the future,” says Tommi Jaakkola, a professor at MIT who works on applications of machine learning. “Whether it’s an investment decision, a medical decision, or maybe a military decision, you don’t want to just rely on a ‘black box’ method.”

Starting in the summer of 2018, the European Union will probably require that companies be able to explain decisions made by automated systems. Easy, right? Not really: this task might be impossible if the apps and websites use deep learning, even when it comes to something as simple as recommending products or playing songs. Those services are run by computers that have programmed themselves. Even the engineers who have built them will not be able to fully explain how the computers reach their results.

With the advance of technology, logic and reason might need to step down and leave some room for faith. Just as with human reasoning, we can’t always explain why a decision has been made. However, this is the first time we are dealing with machines that are not understandable even to the people who engineered them. How will this influence our relationship with technology? A hand-coded system is pretty straightforward, but any machine-learning technology is way more convoluted. Not all AI tech will be this difficult to understand, but deep learning is a black box by design.

A deep neural network works a bit like the brain it is modelled on: you can’t look inside it to find out how it works, because the network’s reasoning is embedded in the behaviour of thousands of simulated neurons. These neurons are arranged into dozens or even hundreds of intricately interconnected layers. The neurons in the first layer each receive an input and perform a calculation before giving a new signal as output. The results are fed to neurons in the next layer, and so on.
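To make the layer-by-layer flow a little more concrete, here is a minimal sketch in JavaScript. It is not how any production system (Imagga's included) is implemented: the function names, the ReLU activation and the tiny hand-picked weights are all assumptions chosen purely for illustration. A real deep network learns millions of such weights from example data, which is exactly why its reasoning is so hard to inspect.

```javascript
// Hypothetical, hand-picked values: a real network learns its weights from data.

// Activation function (assumed here to be ReLU): negative signals become 0.
function relu(x) {
  return Math.max(0, x);
}

// One layer: every neuron computes a weighted sum of all inputs plus a bias,
// then passes the result through the activation function.
function layerForward(inputs, weights, biases) {
  return weights.map((neuronWeights, i) => {
    const sum = neuronWeights.reduce(
      (acc, w, j) => acc + w * inputs[j],
      biases[i]
    );
    return relu(sum);
  });
}

// The whole network: the output of each layer is fed to the next one.
function forward(inputs, layers) {
  return layers.reduce(
    (signal, layer) => layerForward(signal, layer.weights, layer.biases),
    inputs
  );
}

// A toy network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
const layers = [
  { weights: [[1, -1], [0.5, 0.5]], biases: [0, 0] },
  { weights: [[1, 1]], biases: [-1] },
];

const output = forward([2, 1], layers);
console.log(output); // [ 1.5 ]
```

Even in this toy version, the final number says nothing by itself about why the network produced it; the "explanation" is spread across every weight in every layer.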

Because there are many layers in a deep network, it is able to recognize things at different levels of abstraction. If you want to build an app, let’s say “Not a HotDog” (“Silicon Valley,” anyone?), the system needs to learn what a hot dog looks like. Lower layers might recognize hot dogs based on simple things like outlines or color. Higher layers will recognize more complex things like texture and details like condiments.

But just as many aspects of human behavior can’t be explained in detail, it might be the case that we won’t be able to explain everything AI does.  “Even if somebody can give you a reasonable-sounding explanation [for his or her actions], it probably is incomplete, and the same could very well be true for AI,” says Clune, of the University of Wyoming. “It might just be part of the nature of intelligence that only part of it is exposed to rational explanation. Some of it is just instinctual, or subconscious, or inscrutable.”

The AI Future

Participants in a recent survey were asked what worried them most about AI. The results were as expected: participants were most worried by the notion of a robot that would cause them physical harm. Naturally, machines with close physical contact, like self-driving cars and home managers, were viewed as risky. However, when it comes to statistics, languages and personal assistants, people are more than willing to use AI in everyday tasks. The many potential social and economic benefits of the technology depend on the environment in which it evolves, says the Royal Society.

Animating a robot with AI is known as “embodiment.” Thus, applications that involved embodiment were viewed as risky. As data scientist Cathy O’Neil has written, algorithms are dangerous if they possess scale, their workings are secret and their effects are destructive. Alison Powell, an assistant professor at the London School of Economics, believes that this mismatch between perceived and potential risk is common with new technologies. “This is part of the overall problem of the communication of technological promise: new technologies are so often positioned as ‘personal’ that perception of systematic risk is impeded.”

Philosophers, computer scientists and techies make the distinction between “soft” and “hard” AI. The main difference? Hard AI’s main goal is to mimic the human mind. As MIT lecturer Irving Wladawsky-Berger explained in the Wall Street Journal, soft AI is statistically oriented and uses computational intelligence methods to address complex problems by analyzing vast amounts of information with sophisticated algorithms. For most of us, soft AI is already part of our daily routine, from GPS to ordering food online. According to Wladawsky-Berger, hard AI is “a kind of artificial general intelligence that can successfully match or exceed human intelligence in cognitive tasks such as reasoning, planning, learning, vision and natural language conversations on any subject.”

AI is already used to build devices that cheat and deceive, or to outsmart human hackers. It is quickly learning from our behavior, and people are building robots so humanlike that they might become our lovers. AI is also learning right from wrong. Mark Riedl and Brent Harrison from the School of Interactive Computing at the Georgia Institute of Technology are leading a team that is trying to instill human ethics in AIs by using stories. Just as in real life we teach human values to children by reading them stories, AI learns to distinguish right from wrong and good from bad. Just as civilizations have been built on a contract of expected behaviour, we might need to design AI systems to respect and fit into our social norms. Whatever robot or system we create, it is important that its decision-making is consistent with our ethical judgements.

7 Image Recognition Uses of the Future

Did you know that image recognition is one of the main technologies driving the rapid development of self-driving cars?

Image identification powered by innovative machine learning has already been embedded in a number of fields with impressive success. It is used for automated image organization of large databases and visual websites, as well as face and photo recognition on social networks such as Facebook. Image recognition makes image classification for stock websites easier, and even fuels marketers’ creativity by enabling them to craft interactive brand campaigns.  

Beyond the common uses of image recognition we have gotten accustomed to, this revolutionary technology is heading in directions that stretch our imagination. Here are seven daring applications of computer vision that might as well belong in a science fiction novel - but are getting very close to reality today.

#1. Creating city guides

Can you imagine choosing your next travel destination on the basis of real-time location information from Instagram photos that other tourists have posted? Well, it’s already out there. Jetpac created its virtual “city guides” back in 2013 by using shared visuals from Instagram.

By employing image recognition, Jetpac caught visual cues in the photos and analyzed them to offer live data to its users. For example, on the basis of images, the app could tell you whether a cafe in Berlin is frequented by hipsters or whether a venue is a wild country bar. This way, users receive customized local recommendations at a glance.

In August 2014, Jetpac was acquired by Google, joining the company’s Knowledge team. Its knowhow is said to be helping Google’s development of visual search and Google Glass, the ‘ubiquitous computer’ trial of the tech giant.

#2. Powering self-driving cars

In recent years, self-driving cars have been the buzz in the auto industry and the tech world alike. Autonomous vehicles are already being actively tested on U.S. roads as we speak. Forty-four companies are currently working on different versions of self-driving vehicles. Computer vision is one of the main technologies that make these advancements possible, and it is fueling their rapid development and enhanced safety features.

To enable autonomous driving, artificial intelligence is being taught to recognize various objects on roads. They include pathways, moving objects, vehicles, and people. Image recognition technology can also predict speed, location and behavior of other objects in motion. AI companies such as AImotive are also instructing their software to adapt to different driving styles and conditions. Researchers are close to creating AI for self-driving cars that can even see in the dark.

#3. Boosting augmented reality applications and gaming

Augmented reality experiments have long tantalized people’s imagination. With image recognition, overlaying digital information on top of what we see in the world is no longer a futuristic dream. Unlike virtual reality, augmented reality does not replace our environment with a digital one. It simply adds some great perks to it.

You can see the most common applications of augmented reality in gaming. A number of new games use image recognition to give their products an extra flair that makes the gaming experience more immediate and ‘real.’ With neural network training, developers can also create more realistic game environments and characters.

Image recognition has also been used in powering other augmented reality applications, such as crowd behavior monitoring by CrowdOptic and augmented reality advertising by Blippar.

#4. Organizing one’s visual memory

Here’s a very practical application of image recognition - making mental notes through visuals. Who wouldn’t like to have this extra skill?

The app Deja Vu, for example, helps users organize their visual memory. When you take a photo, its computer vision technology matches the visual with background information about the objects in it. This means you can instantly get data about books, DVDs, and wine bottles just by taking a photo of their covers or labels. Once the photos are in your database, you can search through them on the basis of location and keywords.

#5. Teaching machines to see

Besides the impressive number of consumer uses that image recognition has, it is already employed in important manufacturing and industrial processes. Teaching machines to recognize visuals, analyze them, and take decisions on the basis of the visual input holds stunning potential for production across the globe.

Image recognition can make possible the creation of machines that automatically detect defects in manufacturing pipelines. Besides already known faults, the AI-powered systems could also recognize previously unknown defects because of their ability to learn.

There are myriad potential uses for teaching machines to perceive our visual world. For example, Xerox scientists are applying deep learning techniques to enable their AI software to mimic the attention patterns of the human brain when seeing a photo or a video.

#6. Empowering educators and students

Another inspiring use of image recognition that is already being put into practice is again tightly connected with education - but this time, with improving the education of people.

Image recognition is embedded in technologies that enable students with learning disabilities to receive the education they need - in a form they can perceive. Apps powered by computer vision offer text-to-speech options, which allow students with impaired vision or dyslexia to ‘read’ the content.

Applications of image recognition in education are not limited to students with special needs. The technology is used in a range of tools that push the boundaries of traditional teaching. For example, the app Anatomy3D allows students to discover the interconnectedness between organs and muscles in the human body by scanning a body part. It revolutionizes the way students can explore anatomy and learn about the way our bodies function. Image recognition can also help educators find innovative ways to reach ever more distracted students, who are not receptive to current methods of teaching.

#7. Improving iris recognition

Iris recognition is a widely used method for biometric identification. Its most common application is in border security checks, where a person’s identity is verified by scanning their iris. The identification is conducted by analyzing the unique patterns in the colored part of the eye.

Even though iris recognition has been around for a while, in some cases it is not as precise as it’s expected to be. The advancement of image recognition, however, is bringing new possibilities for iris recognition across industries, with improved accuracy and new applications. Most notably, iris identification is already being used in some consumer devices. The Samsung Galaxy Note7 and Galaxy S8 and the Microsoft Lumia 950 are among the smartphones already equipped with such a capability.

While recognition is becoming more precise, security concerns over biometric identification remain, as hackers recently broke the iris recognition of the Samsung Galaxy S8. Together with the advancement of computer vision, security measures are also bound to improve to match the new technological opportunities.

Have you seen an AI technology in a movie that, years later, you saw in real life? Share it with the rest of the group, and if enough people like it, we can build it together.

The uses of image recognition of the future are practically limitless - they’re only bound by human imagination. What is the practical application of computer vision that you find the most exciting or useful? We’d love to read about it in the comments below.
