In this article we’ll put the Imagga API for visual similarity to use with a demo that you can try yourself. The technology allows you to add a bunch of images to a visual search index, and then query it, using unlabeled pictures.

Use case: Say you want to do stock photo suggestions – the customer uploads an image that they want to find similar stock photos to. You’ll make an index with your stock photos, and query it with each new incoming photo. You’ll receive the most visually similar images from the API, filter them however you want and suggest those to the client.

In a few steps we’ll make a python script that creates a visual search index from a small set of images, and then queries it with a given image file to get the most similar pics from the initial set.

Prerequisites

To get started you need to have some basic coding knowledge and have python (>3.5) installed (download link).

You can download all of the code and test images, used in this tutorial (3.4MB zip).

Project structure:

create_index.py – contains the code to feed the images to the visual search index and train it.
search_index.py – takes a single image path as an argument, searches the index for it and prints the API response
delete_index.py – self-explanatory

The scripts use relative paths for the images, so you should call them from within the project folder.

test_images – a folder with several images for use with the search_index.py script
index_creation_set – a folder with images that the create_index.py script feeds to the index. Here’s a preview:

Creating the visual search index

Here’s the code of create_index.py, which we’ll examine to understand the API usage:

import os
import requests
import time


API_KEY = 'acc_xxxxxxxxxxxxxxx'
API_SECRET = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

API_ENDPOINT = 'https://api.imagga.com/v2'
CATEGORIZER = 'general_v3'  # The default general purpose categorizer
INDEX_NAME = 'similarity_tutorial_v1'

categorizer_endpoint = '%s/categories/%s' % (API_ENDPOINT, CATEGORIZER)
index_endpoint = ('%s/similar-images/categories/%s/%s' %
                  (API_ENDPOINT, CATEGORIZER, INDEX_NAME))
tickets_endpoint = '%s/tickets' % API_ENDPOINT


def feed_image(image_path):
    image_name, extension = os.path.splitext(os.path.basename(image_path))

    # Make the API call
    response = requests.post(
        categorizer_endpoint,
        auth=(API_KEY, API_SECRET),
        files={'image': open(image_path, 'rb')},
        params={'save_id': image_name, 'save_index': INDEX_NAME})

    # Check for errors
    try:
        status = response.json()['status']
        if status['type'] != 'success':
            print('API reported error:', status['text'])
            return False
    except Exception as e:
        print('Exception occured during handling the feeding response')
        print(e)
        return False

    return True


def train_index():
    ticket_id = ''

    # Make the API call
    response = requests.put(index_endpoint, auth=(API_KEY, API_SECRET))

    # Get the completion ticket
    try:
        ticket_id = response.json()['result']['ticket_id']
    except Exception as e:
        print('Exception occured when processing the train call response')
        print(e, response.content)

    return ticket_id


def is_resolved(ticket_id):
    resolved = False

    # Make the API call
    response = requests.get(
        '%s/%s' % (tickets_endpoint, ticket_id), auth=(API_KEY, API_SECRET))

    # Check for errors
    try:
        resolved = response.json()['result']['is_final']
    except Exception as e:
        print('Exception occured during the ticket status check')
        print(e)

    return resolved


def main():
    # Feed the visual search index with images
    successful_feeds = 0
    failed_feeds = 0

    for image in os.scandir('./index_creation_set'):
        successful = feed_image(image.path)
        if successful:
            successful_feeds += 1
        else:
            failed_feeds += 1

        print('At image %s: %s' % (successful_feeds, image.path))

    # Report
    if not successful_feeds:
        print('No images were fed successfully. Exiting')
        return

    print(
        '%s out of %s images fed successfully' %
        (successful_feeds, successful_feeds + failed_feeds))

    # Train the index
    ticket_id = train_index()

    if not ticket_id:
        print('No ticket id. Exiting')
        return

    # Wait for the training to complete
    time_started = time.time()
    while not is_resolved(ticket_id):
        t_passed = (time.time() - time_started) // 1000
        print(
            'Waiting for training to finish (time elapsed: %.1fs)' % t_passed)

        time.sleep(0.5)

    print('Training done.')


if __name__ == '__main__':
    main()

At the top of the file we have the API credentials (mock examples here):

API_KEY = 'acc_xxxxxxxxxxxxxxx'
API_SECRET = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

You can signup for a free account to get these credentials. After creating an account, you’ll find these values on your dashboard and you can replace them in the script. That’s all the manual setup you’ll need.

Then we prepare some names:

API_ENDPOINT = 'https://api.imagga.com/v2'
CATEGORIZER = 'general_v3'  # The default general purpose categorizer
INDEX_NAME = 'similarity_tutorial_v1'

API_ENDPOINT is the Imagga API address.
INDEX_NAME is the name I’ve chosen for our test index.
CATEGORIZER is the image categorizer to be used to create the visual search index. As mentioned in the code – the general_v3 categorizer is the default choice. The data with which the categorizer was trained determines the index behavior. Therefore if you want to do visual search for a specific dataset (with distinct visual features) the best choice will be to use a custom model as a categorizer.

Lastly we define the API endpoints as global variables (since it’s just a test script):

categorizer_endpoint = '%s/categories/%s' % (API_ENDPOINT, CATEGORIZER)
index_endpoint = ('%s/similar-images/categories/%s/%s' %
                  (API_ENDPOINT, CATEGORIZER, INDEX_NAME))
tickets_endpoint = '%s/tickets' % API_ENDPOINT

The file contains 4 functions:

main() – feeds all the images from the index_creation_set folder to the index, calls the index training and waits until the latter finishes
feed_image(image_path) – feeds a single image to the index
train_index() – calls the index training
is_resolved(ticket_id) – checks if the training is done

The script starts execution with the main() function:

def main():
    # Feed the visual search index with images
    successful_feeds = 0
    failed_feeds = 0

    for image in os.scandir('./index_creation_set'):
        print('At image %s: %s' % (successful_feeds, image.path))

        successful = feed_image(image.path)
        if successful:
            successful_feeds += 1
        else:
            failed_feeds += 1

    # Report
    if not successful_feeds:
        print('No images were fed successfully. Exiting')
        return

    print(
        '%s out of %s images fed successfully' %
        (successful_feeds, successful_feeds + failed_feeds))

    # Train the index
    ticket_id = train_index()

    if not ticket_id:
        print('No ticket id. Exiting')
        return

    # Wait for the training to complete
    time_started = time.time()
    while not is_resolved(ticket_id):
        t_passed = (time.time() - time_started) // 1000
        print(
            'Waiting for training to finish (time elapsed: %.1fs)' % t_passed)

        time.sleep(0.5)

    print('Training done.')

There are a few key lines of code here and the rest is basically made up of progress and error print statements.

Feed all the images to the index:

for image in os.scandir('./index_creation_set'):
    successful = feed_image(image.path)

Train the index.

ticket_id = train_index()

The API call that invokes the training returns a ticket_id with its response, which is used below to check if the training is completed. A check is made every 0.5 seconds whether the training is completed:

while not is_resolved(ticket_id):
    t_passed = (time.time() - time_started) // 1000
    print(
        'Waiting for training to finish (time elapsed: %.1fs)' % t_passed)

    time.sleep(0.5)

The function feed_image(image_path)

def feed_image(image_path):
    file_name = os.path.basename(image_path)
    image_name, extension = os.path.splitext(file_name)

    # Make the API call
    response = requests.post(
        categorizer_endpoint,
        auth=(API_KEY, API_SECRET),
        files={'image': open(image_path, 'rb')},
        params={'save_id': image_name, 'save_index': INDEX_NAME})

    # Check for errors
    try:
        status = response.json()['status']
        if status['type'] != 'success':
            print('API reported error:', status['text'])
            return False
    except Exception as e:
        print('Exception occured during handling the feeding response')
        print(e)
        return False

    return True

Here the main action is the API call, which is a POST request to the /categorizers/<categorizer_id> endpoint. As mentioned earlier we use the default categorizer, but you can check out the available categorizers with a GET request to /categorizers.

With the request parameters we specify the authentication credentials and the file to be uploaded. Again there – the save_id parameter specifies the ID which the image will be associated with in the index. The latter is specified by the save_index parameter. The ID of the image is important, because that’s what you’ll receive as a response when you query the index with a similar image. In this demo we use the image name as an ID.

The function train_index()

def train_index():
    ticket_id = ''

    # Make the API call
    response = requests.put(index_endpoint, auth=(API_KEY, API_SECRET))

    # Get the completion ticket
    try:
        ticket_id = response.json()['result']['ticket_id']
    except Exception as e:
        print('Exception occured when processing the train call response')
        print(e, response.content)

    return ticket_id

It’s a plain PUT request to the /similar-images/categories/<categorizer_id>/<index_id> endpoint. The response returns JSON content where a ticket_id is specified. The ticket API is used to check for task completion (and event resolution in general).

Here’s how the is_resolved(ticket_id) function polls for the training completion status

def is_resolved(ticket_id):
    resolved = False

    # Make the API call
    response = requests.get(
        '%s/%s' % (tickets_endpoint, ticket_id), auth=(API_KEY, API_SECRET))

    # Check for errors
    try:
        resolved = response.json()['result']['is_final']
    except Exception as e:
        print('Exception occured during the ticket status check')
        print(e)

    return resolved

A simple GET call to the /tickets/<ticket_id> endpoint. The response contains an is_final boolean, which, in this case, corresponds to the training status.

Using the visual search index

In search_index.py (snippet below) you can see that the function search_index(image_path) makes a POST request to the /similar-images/categories/<categorizer_id>/<index_id> endpoint to retrieve the IDs of the most visually similar images from the index. That’s our query call. The optional parameter distance is given to filter the results to those that are most similar. In this case the DISTANCE_THRESHOLD is set to 1.4, but you should choose a distance that fits you use case. In most cases looking at the results for several samples will give you a good idea for what might be suitable for your needs. You can choose not to filter the results by distance at all (just omit the parameter).

import os
import json
import requests
import argparse

from create_index import API_KEY, API_SECRET, index_endpoint

DISTANCE_THRESHOLD = 1.4


def search_index(image_path):
    if not os.path.exists(image_path):
        print('Path is invalid:', image_path)
        return None

    search_result = None
    with open(image_path, 'rb') as image_file:
        response = requests.post(
            index_endpoint,
            params={'distance': DISTANCE_THRESHOLD},
            files={'image': image_file},
            auth=(API_KEY, API_SECRET))

    try:
        search_result = response.json()['result']
    except Exception as e:
        print('Exception occured during reading the search response')
        print(e, response.content, response.status_code)

    return search_result


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('image_path', help='Image to search for')

    args = parser.parse_args()

    result = search_index(args.image_path)
    if not result:
        print('No result obtained')

    print('Search result:')
    print(json.dumps(result, indent=4))


if __name__ == '__main__':
    main()

Running the scripts

In the console we call the training:

python create_index.py

Then if the script reports success we can test the visual search:

python search_index.py query_test_images/beach_volleyball.JPEG

If you want to delete the index and start anew:

python delete_index.py

Here are some example results

The response JSON contains a result and a status field. The result field holds the categorizer labels for the image in the categories list, and the search index results are in the images list with their distance and id. Below you’ll find the visualizations for the top-3 results for each of the test images, and the JSONs for the corresponding API responses (with a distance filter of 1.4).

Response for the beach_volleyball.JPEG:

{
    "categories": [
        {
            "confidence": 78.8192367553711,
            "name": {
                "en": "volleyball.n.02"
            }
        },
        {
            "confidence": 10.889087677002,
            "name": {
                "en": "volleyball_net.n.01"
            }
        },
        {
            "confidence": 1.0372930765152,
            "name": {
                "en": "swimming_trunks.n.01"
            }
        }
    ],
    "count": 2,
    "images": [
        {
            "distance": 0.910576224327087,
            "id": "volleyball"
        },
        {
            "distance": 1.27113902568817,
            "id": "table_tennis"
        }
    ]
}

Response for the dogs_2.JPEG:

{
    "categories": [
        {
            "confidence": 62.049072265625,
            "name": {
                "en": "walker_hound.n.01"
            }
        },
        {
            "confidence": 33.7763175964355,
            "name": {
                "en": "english_foxhound.n.01"
            }
        },
        {
            "confidence": 3.91585564613342,
            "name": {
                "en": "beagle.n.01"
            }
        }
    ],
    "count": 3,
    "images": [
        {
            "distance": 1.25044810771942,
            "id": "dogs_1"
        },
        {
            "distance": 1.37776911258698,
            "id": "dog_1"
        },
        {
            "distance": 1.39786684513092,
            "id": "penguins"
        }
    ]
}

Response for the lesser_panda_3.JPEG:

{
    "categories": [
        {
            "confidence": 99.9105606079102,
            "name": {
                "en": "lesser_panda.n.01"
            }
        }
    ],
    "count": 3,
    "images": [
        {
            "distance": 0.376796960830688,
            "id": "lesser_panda_1"
        },
        {
            "distance": 0.62762051820755,
            "id": "lesser_panda_2"
        },
        {
            "distance": 1.33433437347412,
            "id": "bird_1"
        }
    ]
}

Response for the man_with_burger.JPEG:

{
    "categories": [
        {
            "confidence": 22.9423999786377,
            "name": {
                "en": "ladle.n.01"
            }
        },
        {
            "confidence": 8.98854541778564,
            "name": {
                "en": "wineglass.n.01"
            }
        },
        {
            "confidence": 5.89485549926758,
            "name": {
                "en": "maraca.n.01"
            }
        },
        {
            "confidence": 3.46249604225159,
            "name": {
                "en": "strainer.n.01"
            }
        },
        {
            "confidence": 3.08541393280029,
            "name": {
                "en": "goblet.n.01"
            }
        },
        {
            "confidence": 2.84574699401855,
            "name": {
                "en": "spatula.n.01"
            }
        },
        {
            "confidence": 2.77523231506348,
            "name": {
                "en": "alcoholic.n.01"
            }
        },
        {
            "confidence": 1.77907812595367,
            "name": {
                "en": "wooden_spoon.n.02"
            }
        },
        {
            "confidence": 1.404292345047,
            "name": {
                "en": "ping-pong_ball.n.01"
            }
        },
        {
            "confidence": 1.33869755268097,
            "name": {
                "en": "granadilla.n.04"
            }
        },
        {
            "confidence": 1.21686482429504,
            "name": {
                "en": "golfer.n.01"
            }
        }
    ],
    "count": 2,
    "images": [
        {
            "distance": 1.2667555809021,
            "id": "meal"
        },
        {
            "distance": 1.35286629199982,
            "id": "volleyball"
        }
    ]
}

How to use the Imagga visual similarity search