Imagga @ Machine Learning Event at SAP Labs Bulgaria
Machine learning is getting lots of attention lately. It’s amazing that some 200 people showed up at the Hack Bulgaria event and stayed almost 3 hours to learn more about machine learning. Not to mention it was a Friday and the venue was not in the center of the city! It was a clear indication for us that lots of developers are getting curious about machine learning (ML), and that’s totally cool for companies like ours.
This is a short overview of our not-so-technical presentation about machine learning for images. The other lecturers covered different aspects of ML and its application in various cases and industries.
From the moment of their invention, convolutional networks were great for tasks such as face detection and handwriting recognition. Thanks to advances in GPU technology and the growing base of image data, convolutional networks have demonstrated far better results on complicated tasks such as visual classification of objects in images.
There are some specifics when it comes to image recognition using machine learning. Images are a matrix of pixels (raster data) and that’s why recognition is sensitive to lighting, contrast, saturation, blur, noise, geometric transformations (scaling, translation, rotation) and occlusion.
Conventional image recognition methods struggle to find the optimal set of filters (convolutions) to apply for each specific use-case.
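To make the idea of a filter concrete, here is a minimal sketch in plain Python of applying one hand-picked 3x3 kernel to a tiny grayscale image. The kernel (a Sobel-like vertical edge detector), the toy image and the function name are illustrative assumptions, not something from the presentation. Conventional pipelines pick such kernels by hand, while a convolutional network learns the kernel values from data.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products.

    Strictly speaking this is cross-correlation, which is what convnet
    layers compute; the image and kernel are lists of rows.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    acc += image[y + i][x + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hand-picked filter that responds to dark-to-bright vertical edges (assumption).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Toy 4x4 image: dark left half, bright right half.
IMAGE = [[0, 0, 9, 9] for _ in range(4)]

# Toy 4x4 image with no structure at all.
FLAT = [[5, 5, 5, 5] for _ in range(4)]

EDGES = convolve2d(IMAGE, SOBEL_X)
NO_EDGES = convolve2d(FLAT, SOBEL_X)
```

The filter responds strongly wherever the window straddles the dark-to-bright boundary and gives zero on the flat image.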
There are multiple levels and scales of interest, from low-level features such as texture to high-level features such as composition. On top of that there’s a need for data augmentation to compensate for this sensitivity (e.g. training with blurred, cropped, scaled and noised versions of the images).
In order to do proper image analysis you will need a huge (both deep and wide) architecture, which requires a massive amount of memory and processing power, made more accessible today by GPU-powered machines. It still takes a lot of time (up to 10 days) to train some large architectures.
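As a rough illustration of data augmentation, the sketch below produces a few variants of one image in plain Python: a horizontal flip, a random crop and a noised copy. All function names, the crop size and the noise amount are assumptions made for this example. Real training pipelines use image libraries and many more transformations.

```python
import random


def flip_horizontal(image):
    """Mirror each row of a grayscale image given as a list of rows."""
    return [list(reversed(row)) for row in image]


def crop(image, top, left, height, width):
    """Cut a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def add_noise(image, amount, rng):
    """Shift every pixel by a random offset and clamp to 0..255."""
    return [
        [min(255, max(0, pixel + rng.randint(-amount, amount))) for pixel in row]
        for row in image
    ]


def augment(image, rng):
    """Return the original plus three augmented variants."""
    height, width = len(image), len(image[0])
    crop_h, crop_w = max(1, height // 2), max(1, width // 2)  # assumed crop size
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return [
        image,
        flip_horizontal(image),
        crop(image, top, left, crop_h, crop_w),
        add_noise(image, 10, rng),  # assumed noise amount
    ]


SOURCE = [[10, 20, 30, 40],
          [50, 60, 70, 80],
          [90, 100, 110, 120],
          [130, 140, 150, 250]]

VARIANTS = augment(SOURCE, random.Random(0))
```

Each variant keeps the label of the source image, so the network sees the same object under slightly different conditions.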
There are a few implementations for convolutional neural networks.
- cuda-convnet - Python interface with a C++/CUDA implementation of convolutional neural networks; training uses the back-propagation method; a Fermi-generation NVIDIA GPU (GTX 4xx, GTX 5xx, or Tesla equivalent) is required; no multi-GPU support
- cuda-convnet2 - an upgrade to cuda-convnet, optimized for the newer Kepler-generation NVIDIA GPUs, with added multi-GPU support
- caffe - deep learning framework developed by the Berkeley Vision and Learning Center; big community of contributors; supports cuDNN, NVIDIA’s GPU-accelerated convnet library
- torch7 - Lua interface, with support for Python; wide support for machine learning algorithms; one of the fastest convnet implementations is a torch7 extension, fbcunn, from the Facebook Artificial Intelligence Research team; there are other extensions as well, plus support for cuDNN
- theano - a Python library, open-ended in terms of network architecture and transfer functions, slightly lower-level than the other implementations
We will be publishing a series of articles on our blog about how image recognition is changing the paradigm and will allow image-intensive businesses to finally better understand and monetize their image content.
Machine Learning Meetup in Sofia
Three years ago, when we talked publicly about machine learning, deep learning, convolutional neural networks and AI, not many people were getting it. It was hard to explain what all this is about. Things have changed, and for the better.
Last week we invited a bunch of people to a Machine Learning meetup, the first in Sofia. 60 people attended and it was awesome to see so many people interested in AI and machine learning. And they were getting it. We are sure machine learning will be widely adopted in many tech verticals in a year or so, and we are proud to be helping the Bulgarian AI/ML community exchange ideas and grow.
Judging by the number of people and the cases that were discussed, lots of startups are already exploring the power of machine learning in various industries - e-commerce, bitcoin lending, real estate, to mention a few. It’s still early days for the ML community in Sofia, so we started with some basics. Judging by the variety of questions after our short intro presentation, the next editions of the Sofia Machine Learning Meetup will be quite geeky and interesting.
Some More Image Recognition Tests, Some More Great Results
Yesterday we tested Imagga’s deep learning algorithms against 6 others. The results were quite good and we believe our tech is doing an amazing job recognizing everyday objects with impressive accuracy.
Looks like there’s another interesting test of various image recognition technologies, this time performed by Jack Clark for Bloomberg Business with several challenging images.
It’s quite fascinating to run the same images the author originally used through the Imagga image tagging demo to find out how we perform as well. Imagga’s tech definitely thinks Mark Zuckerberg is more than a cardigan, but still tags him with a few trivial words (see the results below). Maybe the subtitle of the article would have been different if Imagga’s image tagging had been considered, but the ‘Best answer’ section would be a bit boring ;)
Let’s have some fun now (feel free to use Imagga auto-tagging demo to test with your own images):
Imagga: food
Best answer: Fat (Clarifai)
Worst answer: Slug/Churros (MetaMind, demo version)
Imagga: business
Best answer: Cloth/Zuckerberg (Orbeus)
Worst answer: Cardigan (MetaMind, demo version)
Imagga: cat
Best answer: Cat—everyone got this. After all, the Internet is made of cats.
Imagga: mountain
Best answer: Mountain (Orbeus)
Worst answer: Vehicle/Scene (IBM Watson Visual Recognition, beta version)
Imagga: keyboard
Best answer: Technology (Clarifai)
Worst answer: Photo/Object (IBM Watson Visual Recognition, beta version)
Have an amazing idea but feel sceptical about image recognition? Give Imagga's super easy to use API a try and you'll change your mind :)
Imagga among the 40 winners of UN-based World Summit Award 2015
Imagga was selected as one of the winners of the World Summit Award, a global initiative in cooperation with the United Nations World Summit on the Information Society (WSIS), UNESCO, UNIDO and UN GAID. WSA is the only ICT event worldwide that reaches the mobile community in over 178 countries.
Imagga Image Recognition PaaS will be honored to receive the Award in e-Media & Journalism category in front of UN representatives, ICT ministries and the private sector at the World Summit Global Congress in Shenzhen, China, in February 2016.
“It’s extremely exciting to see two Bulgarian companies - Imagga (e-Media & Journalism category) and Bee Smart (e-Health and Environment category) as finalists of the World Summit Award 2015. Having two winning teams is a first since Bulgaria began participating in the prestigious award”, states Pavel Vurbanov, European Software Institute - Center Eastern Europe.
The 40 winners representing 24 countries were carefully selected from 386 nominations. The goal of the award is to showcase the world’s best practices in digital innovation - from Japan to Brazil and from Norway to Australia.
The WSA winners were selected by a jury of international ICT experts in two democratic rounds. Each UN Member State is eligible to nominate one product per category for the World Summit Award. This way any nomination results from a national pre-selection prior to the international WSA Jury.
Imagga at Hack The Visual
During the last couple of years we’ve been taking part in numerous hackathons and events. Hack The Visual was a perfect fit for what we do at Imagga. The goal of the event is to connect different types of visual data to each other and create new and interesting perspectives. All data is welcome - pictures, music, video, geo-data, even open data, you name it. In just 48 hours over 100 participants hacked on projects mashing up existing APIs and data sets to find a solution to a real-life visual problem.
The main challenge was to bring together photos, videos and other kinds of imagery with hardware, interfaces, platforms, apps & services in order to unlock the next step in visual culture.
Three main tracks were set, based on research by Imaging Mind (organizer of the event) into the future of imaging:
- meshed capture - connecting multiple camera sources to generate new experiences. Winner: Camera Crowd - combining multiple photos and their location data with a photo of the area you are in. A mesh of pictures from different sources blended into the space
- new perspectives/interpretation of images - accessing various image data sets to extract value from them outside of the image itself. Winner: Hear The Picture - by linking each coloured pixel to its individual sound, a photo could be ‘heard’ through its own distinctive soundtrack
- interactive visuals - reworking static images into a new interactive experience. Winner: Sharon - watch the same video source with multiple people, and allow synced manipulation of the video
The Grand Prize went to Splatmap, a web application that allows you to photograph buildings with your smartphone and plot the information into the application.
The special Imagga API prize went to Remember, an app that triggers your memories using your own photo collection. Re/Visit a place and Remember will remind you of pictures you or someone else snapped nearby. It can also search for relevant photos based on the topics in the photo (using Imagga’s image recognition tagging API), turning your photo library into a smart conversation starter wherever you might be.
Overall, great event! See you next time. And do not forget to give Imagga APIs a try!
Imagga and 6 alternative image recognition services
In a recent post on ImageIdentify, the newly introduced image identification component of the Wolfram Language, Jordan Novet of VentureBeat ran a quick test of ImageIdentify against 5 deep learning platforms for image recognition of his choosing. He selected “10 images from Flickr that seemed to clearly fall into the 1,000 categories used for the 2014 ImageNet visual recognition competition” and tagged them with ImageIdentify and the 5 alternative image recognition services.
As one of the very first platforms as a service to offer such functionality worldwide, we felt we should join this fun experiment and make our humble contribution by adding the tags Imagga’s image recognition technology generated for the same 10 photos. You can try it with your own photos using the Imagga online image recognition demo.
We’d better let the results speak for themselves. Please take all this with a grain of salt and don’t forget that these results were obtained for just 10 randomly selected photos :)
1. Coffee Mug
Imagga: cup, mug, coffee mug, drinking vessel, beverage, punch, container, coffee, drink, vessel
Wolfram ImageIdentify: tea
CamFind: white ceramic mug
Clarifai: coffee cup nobody tea mug cafe hot ceramic coffee cup cutout
MetaMind: Coffee mug
Orbeus: cup
AlchemyAPI: coffee
2. Mushroom
Imagga: vegetable, produce, mushroom, food, fungus, cap, organic, lush, moss, forest
Wolfram ImageIdentify: magic mushroom
CamFind: white mushroom
Clarifai: mushroom fungi fungus toadstool nature grass fall moss forest autumn
MetaMind: Mushroom
Orbeus: fungus
AlchemyAPI: mushroom
3. Spatula
Imagga: microphone, spatula, business, turner, black, device, knife, technology, hand, cooking utensil
Wolfram ImageIdentify: spatula
CamFind: black kitchen turner
Clarifai: steel wood knife handle iron fork equipment nobody tool chrome
MetaMind: spatula
Orbeus: tool
AlchemyAPI: knife
4. Scoreboard
Imagga: signboard, scoreboard, board
Wolfram ImageIdentify: scoreboard
CamFind: baseball scoreboard
Clarifai: scoreboard soccer stadium football game competition goal group north america match
MetaMind: Scoreboard
Orbeus: billboard
AlchemyAPI: sport
5. German Shepherd
Imagga: shepherd dog, german shepherd, dog, canine, domestic animal, kelpie, doberman, pinscher, pet, animal
Wolfram ImageIdentify: German shepherd
CamFind: black and brown German shepherd
Clarifai: dog canine cute puppy mammal loyalty grass sheepdog fur German shepherd
MetaMind: German Shepherd, German Shepherd Dog, German Police Dog, Alsatian
Orbeus: animal
AlchemyAPI: dog
6. Toucan
Imagga: volleyball, ball, people, man, black, racket, body, person, game equipment, equipment (nice try)
Wolfram ImageIdentify: tufted puffin
CamFind: toucan bird
Clarifai: bird one north america nobody animal people adult nature two outdoors
MetaMind: toucan
Orbeus: animal
AlchemyAPI: sport
7. Indian Cobra
Imagga: Indian cobra, cobra, snake, thunder snake
Wolfram ImageIdentify: black-necked cobra
CamFind: brown and beige cobra snake
Clarifai: snake nobody reptile cobra wildlife daytime sand rattlesnake north america desert
MetaMind: Indian cobra, Naja Naja
Orbeus: animal
AlchemyAPI: snake
8. Strawberry
Imagga: berry, strawberry, fruit, edible fruit, produce, food, strawberries, juicy, sweet, dessert
Wolfram ImageIdentify: strawberry
CamFind: red strawberry fruit
Clarifai: fruit sweet food strawberry ripe juicy berry healthy isolated delicious
MetaMind: strawberry
Orbeus: strawberry
AlchemyAPI: berry
9. Wok
Imagga: plate, pan, wok, china, porcelain, food, dinner, cooking utensil, utensil, delicious
Wolfram ImageIdentify: cooking pan
CamFind: gray steel frying pan
Clarifai: ball nobody pan cutout kitchenware north america tableware competition bowl glass
MetaMind: wok
Orbeus: frying pan
AlchemyAPI: (No tags)
10. Shoe store
Imagga: black, symbol, business, food, design, pattern, sign, art, traditional
Wolfram ImageIdentify: store
CamFind: black crocs
Clarifai: colour street people color car mall road fair architecture hotel
MetaMind: Shoe Shop, Shoe Store
Orbeus: shoe shop
AlchemyAPI: sport
The fun part aside, we would like to see a more comprehensive subjective and objective evaluation of all these services soon, including Imagga, with their pros and cons, on richer and more representative datasets, and taking into account how the tags will be used in different verticals and applications.
Competition is an important driver for every industry, so we are more than open to participating in such service comparisons and may even initiate one in the very near future.
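One very simple objective measure is whether the expected category appears among the returned tags. The sketch below is only an illustration of that idea: the function names are hypothetical, the matching rule (case-insensitive exact match) is a crude assumption, and the sample uses just two of the ten photos above with the Imagga tags copied from this post. It is not a benchmark of any service.

```python
def contains_label(expected, tags):
    """True if the expected label is one of the tags (case-insensitive)."""
    wanted = expected.strip().lower()
    return any(tag.strip().lower() == wanted for tag in tags)


def hit_rate(samples):
    """Fraction of (expected_label, tags) pairs where the label was found."""
    hits = sum(1 for expected, tags in samples if contains_label(expected, tags))
    return hits / len(samples)


# Two photos from the list above, with Imagga's tags as quoted in this post.
SAMPLES = [
    ("mushroom", ["vegetable", "produce", "mushroom", "food", "fungus",
                  "cap", "organic", "lush", "moss", "forest"]),
    ("toucan", ["volleyball", "ball", "people", "man", "black",
                "racket", "body", "person", "game equipment", "equipment"]),
]

RATE = hit_rate(SAMPLES)
```

A real evaluation would need many more photos, synonym handling (e.g. "fungus" for "mushroom") and a way to weigh the order and confidence of the tags.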
Batch Image Processing From Local Folder Using Imagga API
This blog post is part of a series of how-tos for those of you who are not very experienced and need a bit of help to set up and properly use our powerful image recognition APIs.
In this one we will help you batch process (using our Tagging or Color Extraction API) a whole folder of photos that reside on your local computer. To make that possible we’ve written a short Python script: https://bitbucket.org/snippets/imaggateam/LL6dd
Feel free to reuse or modify it. Here’s a short explanation of what it does. The script requires the Python package requests, which you can install using this guide.
It uses requests’ HTTPBasicAuth to set up the Basic authentication used by Imagga’s API from a given API_KEY and API_SECRET, which you have to set manually in the first lines of the script.
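For the curious, this is roughly what Basic authentication puts on the wire. The sketch uses only the standard library, and the function name is made up for this example. It is not part of the script: requests’ HTTPBasicAuth builds the equivalent header for you, so you never need to do this by hand.

```python
import base64


def basic_auth_header(api_key, api_secret):
    """Build the HTTP Basic Authorization header for a key/secret pair.

    The credentials are joined with a colon and base64-encoded; this is
    encoding, not encryption, so it should only travel over HTTPS.
    """
    raw = "{}:{}".format(api_key, api_secret).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": "Basic " + token}


# Placeholder credentials, not real Imagga keys.
EXAMPLE_HEADER = basic_auth_header("user", "pass")
```

With real credentials you would replace the placeholders with your own API key and secret.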
There are three main functions in the script - upload_image, tag_image, extract_colors.
- upload_image(image_path) - uploads your file to our API using the content endpoint, the argument image_path is the path to the file in your local file system. The function returns the content id associated with the image.
- tag_image(image, content_id=False, verbose=False, language='en') - the function tags a given image using Imagga’s Tagging API. You can pass either an image URL or a content id (from upload_image) as the ‘image’ argument; when passing a content id you also have to set content_id=True. By setting the verbose argument to True, the returned tags will also contain their origin (whether a tag comes from machine learning recognition or from additional analysis). The last parameter, ‘language’, is for when you want your output tags translated into one of Imagga’s 50 (+1) supported languages. You can find the supported languages here - http://docs.imagga.com/#auto-tagging
- extract_colors(image, content_id=False) - using this function you can extract colors from your image using our Color Extraction API. Just like the tag_image function, you can provide an image URL or a content id (by also setting content_id argument to True).
Script usage:
Note: You need to install the Python package requests in order to use the script. You can find installation notes here.
You have to manually set the API_KEY and API_SECRET variables found in the first lines of the script by replacing YOUR_API_KEY and YOUR_API_SECRET with your API key and secret.
Usage (in your terminal or CMD):
python tag_images.py <input_folder> <output_folder> --language=<language> --verbose=<verbose> --merged-output=<merged_output> --include-colors=<include_colors>
The script has two required arguments - <input_folder>, <output_folder> - and four optional arguments - <language>, <verbose>, <merged_output>, <include_colors>.
- <input_folder> - required, the input folder containing the images you would like to tag.
- <output_folder> - required, the output folder where the tagging JSON response will be saved.
- <language> - optional, default: en, the output tags will be translated in the given language (a list of supported languages can be found here: http://docs.imagga.com/#auto-tagging)
- <verbose> - optional, default: False, if True the output tags will contain an origin key (whether it is coming from machine learning recognition or from additional analysis)
- <include_colors> - optional, default: False, if True the output will also contain color extraction results for each image.
- <merged_output> - optional, default: False, if True the output will be merged into a single JSON file, otherwise a separate JSON file is written for each image.
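If you want to adapt the command line to your own needs, the arguments above can be modelled with argparse from the standard library. This is a hedged sketch of how such a parser might look, not a copy of the code in tag_images.py; the helper names and the rule for reading True/False values are assumptions.

```python
import argparse


def str_to_bool(value):
    """Interpret values such as 'True', 'true' or '1' as True (assumed rule)."""
    return str(value).strip().lower() in ("1", "true", "yes")


def build_parser():
    """Parser mirroring the documented usage line of the tagging script."""
    parser = argparse.ArgumentParser(
        description="Tag a local folder of images (illustrative sketch)")
    parser.add_argument("input_folder",
                        help="folder containing the images to tag")
    parser.add_argument("output_folder",
                        help="folder where the JSON responses are saved")
    parser.add_argument("--language", default="en",
                        help="language of the output tags")
    parser.add_argument("--verbose", type=str_to_bool, default=False,
                        help="include the origin of each tag")
    parser.add_argument("--merged-output", type=str_to_bool, default=False,
                        help="write one merged JSON file instead of one per image")
    parser.add_argument("--include-colors", type=str_to_bool, default=False,
                        help="also run color extraction for each image")
    return parser


# Example: parse the same style of arguments as in the usage line above.
ARGS = build_parser().parse_args(
    ["photos", "results", "--verbose=True", "--merged-output=1"])
```

argparse turns the dashes in option names into underscores, so the values are available as ARGS.merged_output and ARGS.include_colors.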
Update Of Imagga Pricing Plans
We are excited to announce some changes to our API pricing policy. We’ve got lots of feedback and requests for more affordable ways to access our APIs.
Today, we are announcing the Developer Plan for Imagga APIs, priced at $14/month, which allows the use of one of our APIs with up to 12 000 calls a month (3000/day, 2 requests/second). We believe this plan will bring flexibility to the table, along with the opportunity to apply our breakthrough technology at a more affordable price.
The Hacker plan remains free, but we are reducing the monthly calls to 2000 (200/day, 1 request per second), and it will be available, as before, just for the image tagging API.
We are eager to see how you are going to apply our technology in your projects! Send us feedback and any ideas you have regarding our technology offering in general, or any tip you want to share.
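If you call the API in a loop, a small client-side throttle helps you stay under the per-second limits mentioned above. The class below is a minimal sketch under that assumption; its name and interface are invented for this example and it is not part of any Imagga library. The clock and sleep functions are injectable so the behaviour can be checked without actually waiting.

```python
import time


class Throttle:
    """Space out calls so that at most `requests_per_second` are made."""

    def __init__(self, requests_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / requests_per_second
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


# Demonstration with a fake clock: two back-to-back calls at 2 requests/second.
_fake_time = [0.0]
SLEPT = []


def _fake_sleep(seconds):
    SLEPT.append(seconds)
    _fake_time[0] += seconds


_throttle = Throttle(2, clock=lambda: _fake_time[0], sleep=_fake_sleep)
_throttle.wait()  # first call goes through immediately
_throttle.wait()  # second call has to wait half a second
```

In a real script you would create Throttle(2) for the Developer Plan or Throttle(1) for the Hacker plan and call wait() before each request; the daily and monthly quotas still need separate bookkeeping.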
Imagga in 2014
2014 was quite exciting and challenging for Imagga. One of the most important things that happened is the significant improvement of our tagging technology. We’ve trained it to recognize new objects, so the tags the tech returns are more relevant than ever. This wouldn't have been possible without the committed efforts of our machine learning researchers and software engineers. We grew in numbers as well. We’ve also got a new website and a better business offering - see our current pricing plans.
What would the year be without great hacker events? We’ve attended and partnered with quite a lot - Photo Hack Day NYC, Seedhack Lifelogging London, LDV Vision New York, Photo Hack Day Japan, Telerik Hackathon. It’s always nice to meet excited developers eager to get their hands dirty with our APIs.
The end of the year brought us a nice surprise - an awesome reward from Trento ITC Labs. Besides the cash, we are excited to be able to leverage their research and business network and spend a couple of weeks in Berlin and London.
What does 2015 have in store?
We are getting ready for an exciting and quite intensive 2015.
- Several updates of the technology are pending - more concepts and objects to recognize
- great demo tools are in the works that we believe will help us better explain the power behind Imagga image tagging technology
- personal photo categorization app
- discovering new verticals and awesome application of image recognition
If you haven’t tried our APIs, sign up; our Hacker plan is free forever.
Can The Machine Beat Humans in Image Recognition
For far too long image understanding has been considered too complex for machines to deal with. It takes years of training for the human brain to build links between what it sees and concepts of shapes, colors and objects. Even though neural networks were invented a couple of decades ago and were considered a huge step towards machine AI, what was lacking was computing power. With the advance of GPU computing, new opportunities were discovered and algorithms were reinvented, so machine learning and deep learning are back on the table.
The machines are now powerful enough to grasp the world almost as well as a 3-year-old kid. A prerequisite for a neural network to work well is clear, representative data, which makes the results more precise and accurate. Huge efforts to collect and classify the images of the world have been undertaken in the last couple of years. Are the machines ready for a battle then?
At Imagga we take that challenge seriously by building intelligent image recognition technology that can teach the machine to understand basic daily life objects, comprehend concepts and eventually deal with complex pictures, where lots of background information needs to be taken into account in order to interpret them properly. It’s a challenging task but we love what we do.
With that stated, we are ready to set the stage for an epic battle, the battle of the century - machines vs humans. To some it might sound funny, unrealistic, pretentious, but it’s coming. At least for now, in the form of a cool game, done with love by Imagga and Algolia.
We’ve called it Human vs. Robot: Clash Of The Image Tags. You will take the central role of judging who tags better - the human or the machine. You will be presented with two sets of images for a given text tag and need to vote for the set that better represents the concept of the text tag. Like every good judge you will need to be unbiased and make up your mind only on the facts, so you will not know which set was tagged by humans and which by machines. You will get five rounds to decide and pronounce a winner. Of course you can play as many times as you wish, and even invite your friends to try it out and have fun.
The game is made possible by the joint efforts of Algolia and Imagga. Algolia is building powerful search technology for the exploration of large data sets. Algolia’s hosted search API delivers instant and relevant results as you type your search query. Imagga’s part is to provide the automated machine tagging of all the images you will be seeing in the search results.
It might be just a game, but the real idea is to demonstrate how powerful machine recognition is nowadays. It can really replace or at least greatly assist people in the process of tagging photos - it’s much faster, more cost effective, most of the time - more consistent and even more precise than human tagging. This empowers a lot of use-cases in stock photography, digital asset management, advertising, cloud storage and photo sharing that are otherwise not feasible or even not possible with human tagging.
Why don’t you play Clash Of The Image Tags and judge for yourself?