In this article we’ll put the Imagga API for visual similarity to use with a demo that you can try yourself. The technology allows you to add a bunch of images to a visual search index, and then query it, using unlabeled pictures.
Use case: Say you want to do stock photo suggestions – the customer uploads an image that they want to find similar stock photos to. You’ll make an index with your stock photos, and query it with each new incoming photo. You’ll receive the most visually similar images from the API, filter them however you want and suggest those to the client.
In a few steps we’ll make a python script that creates a visual search index from a small set of images, and then queries it with a given image file to get the most similar pics from the initial set.
Contents
Prerequisites
To get started you need to have some basic coding knowledge and have python (>3.5) installed (download link).
You can download all of the code and test images, used in this tutorial (3.4MB zip).
Project structure:
create_index.py
– contains the code to feed the images to the visual search index and train it.search_index.py
– takes a single image path as an argument, searches the index for it and prints the API responsedelete_index.py
– self-explanatory
The scripts use relative paths for the images, so you should call them from within the project folder.
test_images
– a folder with several images for use with thesearch_index.py
scriptindex_creation_set
– a folder with images that thecreate_index.py
script feeds to the index. Here’s a preview:
Creating the visual search index
Here’s the code of create_index.py
, which we’ll examine to understand the API usage:
import os import requests import time API_KEY = 'acc_xxxxxxxxxxxxxxx' API_SECRET = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' API_ENDPOINT = 'https://api.imagga.com/v2' CATEGORIZER = 'general_v3' # The default general purpose categorizer INDEX_NAME = 'similarity_tutorial_v1' categorizer_endpoint = '%s/categories/%s' % (API_ENDPOINT, CATEGORIZER) index_endpoint = ('%s/similar-images/categories/%s/%s' % (API_ENDPOINT, CATEGORIZER, INDEX_NAME)) tickets_endpoint = '%s/tickets' % API_ENDPOINT def feed_image(image_path): image_name, extension = os.path.splitext(os.path.basename(image_path)) # Make the API call response = requests.post( categorizer_endpoint, auth=(API_KEY, API_SECRET), files={'image': open(image_path, 'rb')}, params={'save_id': image_name, 'save_index': INDEX_NAME}) # Check for errors try: status = response.json()['status'] if status['type'] != 'success': print('API reported error:', status['text']) return False except Exception as e: print('Exception occured during handling the feeding response') print(e) return False return True def train_index(): ticket_id = '' # Make the API call response = requests.put(index_endpoint, auth=(API_KEY, API_SECRET)) # Get the completion ticket try: ticket_id = response.json()['result']['ticket_id'] except Exception as e: print('Exception occured when processing the train call response') print(e, response.content) return ticket_id def is_resolved(ticket_id): resolved = False # Make the API call response = requests.get( '%s/%s' % (tickets_endpoint, ticket_id), auth=(API_KEY, API_SECRET)) # Check for errors try: resolved = response.json()['result']['is_final'] except Exception as e: print('Exception occured during the ticket status check') print(e) return resolved def main(): # Feed the visual search index with images successful_feeds = 0 failed_feeds = 0 for image in os.scandir('./index_creation_set'): successful = feed_image(image.path) if successful: successful_feeds += 1 else: failed_feeds += 1 print('At image %s: %s' % (successful_feeds, image.path)) # Report if not successful_feeds: print('No images were fed successfully. Exiting') return print( '%s out of %s images fed successfully' % (successful_feeds, successful_feeds + failed_feeds)) # Train the index ticket_id = train_index() if not ticket_id: print('No ticket id. Exiting') return # Wait for the training to complete time_started = time.time() while not is_resolved(ticket_id): t_passed = (time.time() - time_started) // 1000 print( 'Waiting for training to finish (time elapsed: %.1fs)' % t_passed) time.sleep(0.5) print('Training done.') if __name__ == '__main__': main()
At the top of the file we have the API credentials (mock examples here):
API_KEY = 'acc_xxxxxxxxxxxxxxx' API_SECRET = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
You can signup for a free account to get these credentials. After creating an account, you’ll find these values on your dashboard and you can replace them in the script. That’s all the manual setup you’ll need.
Then we prepare some names:
API_ENDPOINT = 'https://api.imagga.com/v2' CATEGORIZER = 'general_v3' # The default general purpose categorizer INDEX_NAME = 'similarity_tutorial_v1'
API_ENDPOINT
is the Imagga API address.
INDEX_NAME
is the name I’ve chosen for our test index.
CATEGORIZER
is the image categorizer to be used to create the visual search index. As mentioned in the code – the general_v3
categorizer is the default choice. The data with which the categorizer was trained determines the index behavior. Therefore if you want to do visual search for a specific dataset (with distinct visual features) the best choice will be to use a custom model as a categorizer.
Lastly we define the API endpoints as global variables (since it’s just a test script):
categorizer_endpoint = '%s/categories/%s' % (API_ENDPOINT, CATEGORIZER) index_endpoint = ('%s/similar-images/categories/%s/%s' % (API_ENDPOINT, CATEGORIZER, INDEX_NAME)) tickets_endpoint = '%s/tickets' % API_ENDPOINT
The file contains 4 functions:
main()
– feeds all the images from the index_creation_set folder to the index, calls the index training and waits until the latter finishesfeed_image(image_path)
– feeds a single image to the indextrain_index()
– calls the index trainingis_resolved(ticket_id)
– checks if the training is done
The script starts execution with the main() function:
def main(): # Feed the visual search index with images successful_feeds = 0 failed_feeds = 0 for image in os.scandir('./index_creation_set'): print('At image %s: %s' % (successful_feeds, image.path)) successful = feed_image(image.path) if successful: successful_feeds += 1 else: failed_feeds += 1 # Report if not successful_feeds: print('No images were fed successfully. Exiting') return print( '%s out of %s images fed successfully' % (successful_feeds, successful_feeds + failed_feeds)) # Train the index ticket_id = train_index() if not ticket_id: print('No ticket id. Exiting') return # Wait for the training to complete time_started = time.time() while not is_resolved(ticket_id): t_passed = (time.time() - time_started) // 1000 print( 'Waiting for training to finish (time elapsed: %.1fs)' % t_passed) time.sleep(0.5) print('Training done.')
There are a few key lines of code here and the rest is basically made up of progress and error print statements.
Feed all the images to the index:
for image in os.scandir('./index_creation_set'): successful = feed_image(image.path)
Train the index.
ticket_id = train_index()
The API call that invokes the training returns a ticket_id with its response, which is used below to check if the training is completed. A check is made every 0.5 seconds whether the training is completed:
while not is_resolved(ticket_id): t_passed = (time.time() - time_started) // 1000 print( 'Waiting for training to finish (time elapsed: %.1fs)' % t_passed) time.sleep(0.5)
The function feed_image(image_path)
def feed_image(image_path): file_name = os.path.basename(image_path) image_name, extension = os.path.splitext(file_name) # Make the API call response = requests.post( categorizer_endpoint, auth=(API_KEY, API_SECRET), files={'image': open(image_path, 'rb')}, params={'save_id': image_name, 'save_index': INDEX_NAME}) # Check for errors try: status = response.json()['status'] if status['type'] != 'success': print('API reported error:', status['text']) return False except Exception as e: print('Exception occured during handling the feeding response') print(e) return False return True
Here the main action is the API call, which is a POST request to the /categorizers/<categorizer_id> endpoint. As mentioned earlier we use the default categorizer, but you can check out the available categorizers with a GET request to /categorizers.
With the request parameters we specify the authentication credentials and the file to be uploaded. Again there – the save_id
parameter specifies the ID which the image will be associated with in the index. The latter is specified by the save_index
parameter. The ID of the image is important, because that’s what you’ll receive as a response when you query the index with a similar image. In this demo we use the image name as an ID.
The function train_index()
def train_index(): ticket_id = '' # Make the API call response = requests.put(index_endpoint, auth=(API_KEY, API_SECRET)) # Get the completion ticket try: ticket_id = response.json()['result']['ticket_id'] except Exception as e: print('Exception occured when processing the train call response') print(e, response.content) return ticket_id
It’s a plain PUT request to the /similar-images/categories/<categorizer_id>/<index_id> endpoint. The response returns JSON content where a ticket_id
is specified. The ticket API is used to check for task completion (and event resolution in general).
Here’s how the is_resolved(ticket_id) function polls for the training completion status
def is_resolved(ticket_id): resolved = False # Make the API call response = requests.get( '%s/%s' % (tickets_endpoint, ticket_id), auth=(API_KEY, API_SECRET)) # Check for errors try: resolved = response.json()['result']['is_final'] except Exception as e: print('Exception occured during the ticket status check') print(e) return resolved
A simple GET call to the /tickets/<ticket_id> endpoint. The response contains an is_final
boolean, which, in this case, corresponds to the training status.
Using the visual search index
In search_index.py
(snippet below) you can see that the function search_index(image_path)
makes a POST request to the /similar-images/categories/<categorizer_id>/<index_id> endpoint to retrieve the IDs of the most visually similar images from the index. That’s our query call. The optional parameter distance
is given to filter the results to those that are most similar. In this case the DISTANCE_THRESHOLD
is set to 1.4, but you should choose a distance that fits you use case. In most cases looking at the results for several samples will give you a good idea for what might be suitable for your needs. You can choose not to filter the results by distance at all (just omit the parameter).
import os import json import requests import argparse from create_index import API_KEY, API_SECRET, index_endpoint DISTANCE_THRESHOLD = 1.4 def search_index(image_path): if not os.path.exists(image_path): print('Path is invalid:', image_path) return None search_result = None with open(image_path, 'rb') as image_file: response = requests.post( index_endpoint, params={'distance': DISTANCE_THRESHOLD}, files={'image': image_file}, auth=(API_KEY, API_SECRET)) try: search_result = response.json()['result'] except Exception as e: print('Exception occured during reading the search response') print(e, response.content, response.status_code) return search_result def main(): parser = argparse.ArgumentParser() parser.add_argument('image_path', help='Image to search for') args = parser.parse_args() result = search_index(args.image_path) if not result: print('No result obtained') print('Search result:') print(json.dumps(result, indent=4)) if __name__ == '__main__': main()
Running the scripts
In the console we call the training:
python create_index.py
Then if the script reports success we can test the visual search:
python search_index.py query_test_images/beach_volleyball.JPEG
If you want to delete the index and start anew:
python delete_index.py
Here are some example results
The response JSON contains a result
and a status
field. The result
field holds the categorizer labels for the image in the categories
list, and the search index results are in the images
list with their distance
and id
. Below you’ll find the visualizations for the top-3 results for each of the test images, and the JSONs for the corresponding API responses (with a distance filter of 1.4).
Response for the beach_volleyball.JPEG:
{ "categories": [ { "confidence": 78.8192367553711, "name": { "en": "volleyball.n.02" } }, { "confidence": 10.889087677002, "name": { "en": "volleyball_net.n.01" } }, { "confidence": 1.0372930765152, "name": { "en": "swimming_trunks.n.01" } } ], "count": 2, "images": [ { "distance": 0.910576224327087, "id": "volleyball" }, { "distance": 1.27113902568817, "id": "table_tennis" } ] }
Response for the dogs_2.JPEG:
{ "categories": [ { "confidence": 62.049072265625, "name": { "en": "walker_hound.n.01" } }, { "confidence": 33.7763175964355, "name": { "en": "english_foxhound.n.01" } }, { "confidence": 3.91585564613342, "name": { "en": "beagle.n.01" } } ], "count": 3, "images": [ { "distance": 1.25044810771942, "id": "dogs_1" }, { "distance": 1.37776911258698, "id": "dog_1" }, { "distance": 1.39786684513092, "id": "penguins" } ] }
Response for the lesser_panda_3.JPEG:
{ "categories": [ { "confidence": 99.9105606079102, "name": { "en": "lesser_panda.n.01" } } ], "count": 3, "images": [ { "distance": 0.376796960830688, "id": "lesser_panda_1" }, { "distance": 0.62762051820755, "id": "lesser_panda_2" }, { "distance": 1.33433437347412, "id": "bird_1" } ] }
Response for the man_with_burger.JPEG:
{ "categories": [ { "confidence": 22.9423999786377, "name": { "en": "ladle.n.01" } }, { "confidence": 8.98854541778564, "name": { "en": "wineglass.n.01" } }, { "confidence": 5.89485549926758, "name": { "en": "maraca.n.01" } }, { "confidence": 3.46249604225159, "name": { "en": "strainer.n.01" } }, { "confidence": 3.08541393280029, "name": { "en": "goblet.n.01" } }, { "confidence": 2.84574699401855, "name": { "en": "spatula.n.01" } }, { "confidence": 2.77523231506348, "name": { "en": "alcoholic.n.01" } }, { "confidence": 1.77907812595367, "name": { "en": "wooden_spoon.n.02" } }, { "confidence": 1.404292345047, "name": { "en": "ping-pong_ball.n.01" } }, { "confidence": 1.33869755268097, "name": { "en": "granadilla.n.04" } }, { "confidence": 1.21686482429504, "name": { "en": "golfer.n.01" } } ], "count": 2, "images": [ { "distance": 1.2667555809021, "id": "meal" }, { "distance": 1.35286629199982, "id": "volleyball" } ] }