
Automatic Linguistic Indexing of Pictures (ALIP) By Artificial Neural Network Approach

By Chesnokov Yuriy, 10 Sep 2009

Introduction

While coding an AI application, I heard the mellow strains of a childlike singer coming from the neighbours upstairs, who played the song over and over. The verses were at times barely audible, but I managed to make out several characteristic phrases and looked them up in a well-known web search engine (I like it, since it puts some of my CodeProject articles on the first one or two pages of the search results). The only significant phrase from the song that I submitted to the engine was (to avoid undue advertisement), say, "фиолетовая паста" (violet paste). I expected scores of make-up advertisements, but on the contrary, among the cosmetics industry spam, one link on the first page of results pointed to a music web forum with exactly that phrase from the lyrics. One more click and a second search gave me the group's lyrics for the song and guitar tabs, and took me to YouTube, where I listened to that marvelous music clip.

It is astounding how a person with permanent internet access can, within seconds of hearing some music, be presented with the lyrics, information about the group, and a video clip to listen to. The process can be described as searching by media content. Current web search engines use textual information to return results; imagine being able to submit an audio, video, or image sample as a search query the same way you submit textual requests. It would be as if the computer had listened to the music and could then present you with the same information.

The concept known as Connected Visual Computing (CVC) is actively pursued by Intel. CVC concerns media data processing. For example, when some object (an ant, say) appears in the field of view of your mobile phone camera, the phone analyzes its image and shows its identification on the screen, say Camponotus herculeanus. Or when you see a sign in the street in an unknown language, you may view it through your mobile camera and it will display the same sign, at the same location in the street, in your native language (augmented reality (AR), 2D/3D overlays). The search by audio content described above is another example. The market promises immense growth and should keep the audience consuming modern hardware and software for a very long time.

Here I'd like to present the general idea of how a computer may be used to describe an image by analyzing its pixel content, known as Automatic Linguistic Indexing of Pictures (ALIP). The approach is general: it always assumes extracting some descriptive features from the data and using some rules to attribute the content to some category.

If you're interested in immediate applications, you may contact System7, the firm supporting the content based image recognition (CBIR) part of the project.

Background

You will need a basic understanding of AI approaches (e.g. neural networks, support vector machines, nearest neighbour classifiers) and of image description and transform methods such as wavelets, edge extraction, image statistics, and histograms. C++/C# experience is also helpful, as this article shows how to invoke C++ DLL methods from within a C# application.

Using the application

In my ALIP experiment I decided to annotate the simple natural image categories. There are 5 ANN classifiers in the project corresponding to:

  • Pictures that might contain animals
  • Pictures that might contain flowers
  • Pictures that might contain landscapes
  • Pictures that might contain sunsets
  • Other pictures that do not fall into the above categories, or simply an unknown image type

You need to use an unknown category along with the others you'd like to classify into. Otherwise the AI classifier would identify every image you give it as, e.g., animals, flowers, landscapes, or sunsets. In the real world there are other types of images that do not fall into any of the above categories, so you would need to meddle with AI classification thresholds, which is rather cumbersome and awkward. With an additional unknown-category AI classifier, the image is identified either as one of the known image categories or simply as an unknown image type that the computer cannot identify using its petty knowledge.

I adore image databases; they contain shots from all over the world that are really nice to look at. I've got about 20000 images for designers, bought from a DVD shop. I took image samples of the animals, flowers, landscapes, and sunsets image types, and added all the other image categories that do not belong to those 4 to form the unknown image type.

Now the usage of the program is simple enough. Just run alip.exe and it will load all the necessary AI classifier files (in case of an error you will get a message box and will not be able to use it). Then click the [...] button and select a directory that contains some *.jpg files. You may use the ones supplied with this demo in the pics directory. All the files found will be added to the list box; click one to view it in the right panel and see the proposed category in the top left panel. In theory it should be able to comment on the image as presented below.

My cat is standing on the floor

Methodology

Due to competing interests between my former organizations and my current one, I will not be able to describe the methodology and feature extraction methods in minute detail. I would rather present the general trend and the categories of features used for describing images. Searching the internet for the corresponding feature computation will reveal all the necessary papers with the particular formulae.

There are some demos available online, e.g. ALIPr. They use hidden Markov models (HMMs) and wavelet features from the images. You may try the pictures from this article with their methods or, vice versa, my application with their pictures, and compare the annotation results.

The AI approach is general and assumes some reduction of the original data dimensionality using either feature extraction or a PCA transform or both, so all that is needed is to collect some data, extract the features, and train AI classifiers. If you understand my face detection articles you will be able to repeat the experiment.

After you have converted your raw image data to features, just train some AI classifiers to discriminate the desired positive category from the negative ones.

ALIP features

Generally they are divided into:

  • Color features
  • Texture features
  • Shape features

The color features are simply the original raw image data, histograms of the image channels, and the image profile. Texture features are the well-known edge extraction methods, wavelet transforms, and image statistics (e.g. 1st order: mean, std, skew; 2nd order: contrast, correlation, entropy...). Shape features try to estimate the shapes of the objects found in the images. Just have a look at the wiki entry for CBIR.
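To make the first-order features a bit more tangible, here is a minimal C++ sketch of a normalized channel histogram and the mean, standard deviation, and skewness of one 8-bit channel. The names channelHistogram, channelStats, and ChannelStats are hypothetical and are not taken from the ALIP sources; the actual features of the project are not disclosed in this article.

```cpp
#include <array>
#include <cmath>
#include <cstdint>
#include <vector>

// First-order statistics of one image channel.
// Hypothetical helper type, not part of the ALIP sources.
struct ChannelStats {
    double mean;
    double stddev;
    double skewness;
};

// 256-bin intensity histogram, normalized so that the bins sum to 1.
std::array<double, 256> channelHistogram(const std::vector<std::uint8_t>& pixels)
{
    std::array<double, 256> hist{};
    if (pixels.empty()) return hist;
    for (std::uint8_t p : pixels) hist[p] += 1.0;
    for (double& h : hist) h /= static_cast<double>(pixels.size());
    return hist;
}

// Mean, population standard deviation and skewness (third standardized moment).
ChannelStats channelStats(const std::vector<std::uint8_t>& pixels)
{
    ChannelStats s{0.0, 0.0, 0.0};
    if (pixels.empty()) return s;
    const double n = static_cast<double>(pixels.size());
    for (std::uint8_t p : pixels) s.mean += p;
    s.mean /= n;
    double m2 = 0.0, m3 = 0.0;  // second and third central moments
    for (std::uint8_t p : pixels) {
        const double d = p - s.mean;
        m2 += d * d;
        m3 += d * d * d;
    }
    m2 /= n;
    m3 /= n;
    s.stddev = std::sqrt(m2);
    // A constant channel has zero spread; report zero skewness in that case.
    s.skewness = (s.stddev > 0.0) ? m3 / (m2 * s.stddev) : 0.0;
    return s;
}
```

Computed per channel, these few numbers plus the histogram bins could be concatenated into one feature vector for a classifier.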

Typically the original RGB color space of the image is transformed to alternative spaces such as YCbCr, HSV, HSI, CIEXYZ, etc. Alternative spaces might give better discrimination of the data, but you need to experiment with them anyway.
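As an illustration of such a transform, the sketch below converts one RGB pixel to YCbCr with the full-range ITU-R BT.601 coefficients used by JPEG/JFIF. Which YCbCr variant the ALIP code uses is not stated here, so treat the coefficients as an assumption; rgbToYCbCr and toByte are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>

// One pixel in YCbCr space, kept as doubles to avoid early rounding.
struct YCbCr {
    double y;
    double cb;
    double cr;
};

// Full-range BT.601 (JPEG/JFIF) conversion; r, g, b are expected in [0, 255].
// Assumption: the article does not say which YCbCr variant it uses.
YCbCr rgbToYCbCr(double r, double g, double b)
{
    YCbCr out;
    out.y  = 0.299 * r + 0.587 * g + 0.114 * b;
    out.cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b;
    out.cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b;
    return out;
}

// Round and clamp a component to the 8-bit range for storage.
unsigned char toByte(double v)
{
    return static_cast<unsigned char>(std::min(255.0, std::max(0.0, std::round(v))));
}
```

Gray pixels map to Cb = Cr = 128, so the two chroma channels carry only the color information while Y carries the brightness.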

Source code tips

A point worth mentioning here is the interaction of the C# application with C/C++ code in a DLL. It is an efficient way to code a great GUI while retaining the advantages of native C/C++ code.

Just create a simple C++ DLL with an exported function:

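// Single classifier object shared by the exported functions
Alip alip;

// Exported entry point; a nonzero return value is treated as an error by the C# caller
ALIP_API int alipClassify(const double* data, double* results, unsigned int* indices)
{
        return alip.classify(data, results, indices);
}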

In the C# application, declare the DLL functions in the class you will be calling them from:

[DllImport("alip")]
static extern unsafe int alipClassify(double* data, double* results, uint* indices);

Switch on the /unsafe compiler option in the application settings. Then, using the C# fixed statement, you may create pointers to C# variables and pass them to the C++ DLL:

double[] results = new double[this.aiClassifiers.Count];
uint[] indices = new uint[this.aiClassifiers.Count];

// Pin the managed arrays so that the native code receives stable pointers
fixed (double* pdata = cbir.CbirEntries[0].features.Features)
fixed (double* presults = results)
fixed (uint* pindices = indices)
{
        int res = alipClassify(pdata, presults, pindices);
        if (res != 0)
                throw new Exception(String.Format("alipClassify() returned {0}", res));
}
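One detail the snippets above do not show is how ALIP_API is defined. Since DllImport looks the function up by its plain name, the export has to avoid C++ name mangling, which is commonly done with extern "C". The sketch below is only one plausible definition, not the one from the ALIP sources, and AlipStub is a hypothetical stand-in for the real Alip class with made-up behaviour.

```cpp
#include <cstddef>

// Assumption: one common way to define the export macro. The real ALIP_API
// definition is not shown in the article.
#if defined(_WIN32)
#  define ALIP_API extern "C" __declspec(dllexport)
#else
#  define ALIP_API extern "C" __attribute__((visibility("default")))
#endif

// Sizes used by the stub only; they are not the real ALIP dimensions.
const std::size_t kStubInputs = 4;

// Hypothetical stand-in for the article's Alip class. It only demonstrates
// the calling convention: fill the output arrays and return 0 on success.
class AlipStub {
public:
    int classify(const double* data, double* results, unsigned int* indices) const
    {
        if (!data || !results || !indices) return -1;  // nonzero signals an error
        double sum = 0.0;
        for (std::size_t i = 0; i < kStubInputs; ++i) sum += data[i];
        results[0] = sum;   // fake "classifier outputs"
        results[1] = -sum;
        indices[0] = 0;     // fake "category indices"
        indices[1] = 1;
        return 0;
    }
};

static AlipStub g_alip;

// Exported with C linkage so the symbol is literally named "alipClassify".
ALIP_API int alipClassify(const double* data, double* results, unsigned int* indices)
{
    return g_alip.classify(data, results, indices);
}
```

With C linkage the exported symbol keeps its source name, which is what a DllImport declaration without an explicit EntryPoint expects.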

ALIP results

I deliberately selected the simplest image features, which do not look like features at all, due to competing interests with the former funding organization System7. I used just the image itself, downscaled to 16x16 and converted to the YCbCr colorspace. Obviously that is not the proper feature to start with, as others would significantly outperform it in discrimination ability. However, though I anticipated the classification would be completely incorrect, to my great surprise it performed pretty well, producing quite precise results. Consider, then, what the annotation quality would be had you used a combination of color and texture features (e.g. histograms, statistics, entropy, etc.).
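The downscaling step can be sketched as a plain box filter that averages the source pixels falling into each of the 16x16 target cells. The resampling method actually used by the application is not described, so the box average and the name downscaleBox are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Downscale one channel (row-major, srcW x srcH) to dstW x dstH by averaging
// the source pixels that fall into each destination cell (box filter).
// Assumption: the article does not say which resampling method it uses.
// Returns an empty vector when the arguments are inconsistent.
std::vector<double> downscaleBox(const std::vector<double>& src,
                                 int srcW, int srcH,
                                 int dstW = 16, int dstH = 16)
{
    if (srcW <= 0 || srcH <= 0 || dstW <= 0 || dstH <= 0 ||
        src.size() != static_cast<std::size_t>(srcW) * static_cast<std::size_t>(srcH))
        return std::vector<double>();

    std::vector<double> dst(static_cast<std::size_t>(dstW) * dstH, 0.0);
    for (int y = 0; y < dstH; ++y) {
        // Source row range of this cell; always at least one row.
        const int y0 = y * srcH / dstH;
        const int y1 = std::max(y0 + 1, (y + 1) * srcH / dstH);
        for (int x = 0; x < dstW; ++x) {
            const int x0 = x * srcW / dstW;
            const int x1 = std::max(x0 + 1, (x + 1) * srcW / dstW);
            double sum = 0.0;
            for (int sy = y0; sy < y1; ++sy)
                for (int sx = x0; sx < x1; ++sx)
                    sum += src[static_cast<std::size_t>(sy) * srcW + sx];
            dst[static_cast<std::size_t>(y) * dstW + x] = sum / ((y1 - y0) * (x1 - x0));
        }
    }
    return dst;
}
```

Running this once per Y, Cb, and Cr channel and concatenating the three 256-value results gives a 768-value vector of the kind described above.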

You may estimate the quality of the other feature types with the cbir.system7.com demo. It just returns the images that are closest to the query image using some linear or non-linear distance metric. So it acts as a kind of kNN classifier: you annotate the image type based on the majority of the first several best matches returned, or combine the annotations in any other way.
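The majority-vote idea can be sketched as follows. This is not the cbir.system7.com implementation, whose internals are not described here; it assumes a plain squared Euclidean distance, and LabeledImage and annotateByNeighbours are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A database entry: a feature vector and its human-assigned category.
struct LabeledImage {
    std::vector<double> features;
    std::string label;
};

// Squared Euclidean distance over the common prefix of both vectors.
double squaredDistance(const std::vector<double>& a, const std::vector<double>& b)
{
    const std::size_t n = std::min(a.size(), b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Annotate the query with the most frequent label among its k nearest
// database entries. Ties go to the alphabetically first label.
std::string annotateByNeighbours(const std::vector<double>& query,
                                 const std::vector<LabeledImage>& db,
                                 std::size_t k)
{
    if (db.empty() || k == 0) return "unknown";

    std::vector<std::pair<double, std::size_t> > ranked;
    ranked.reserve(db.size());
    for (std::size_t i = 0; i < db.size(); ++i)
        ranked.push_back(std::make_pair(squaredDistance(query, db[i].features), i));

    k = std::min(k, ranked.size());
    std::partial_sort(ranked.begin(),
                      ranked.begin() + static_cast<std::ptrdiff_t>(k),
                      ranked.end());

    std::map<std::string, std::size_t> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[db[ranked[i].second].label];

    std::string best;
    std::size_t bestVotes = 0;
    for (std::map<std::string, std::size_t>::const_iterator it = votes.begin();
         it != votes.end(); ++it) {
        if (it->second > bestVotes) {
            best = it->first;
            bestVotes = it->second;
        }
    }
    return best;
}
```

An odd k reduces the chance of ties between two categories, and weighting the votes by inverse distance is one of the "other ways" of combining the matches.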

For annotation I selected the 5 image categories:

  • animals - 900 pictures
  • flowers - 1100 pictures
  • landscapes - 1200 pictures
  • sunsets - 700 pictures
  • unknown - 1600 pictures of other types than the above 4

By all means the categories overlap: flower or animal pictures may be shot in landscape-like surroundings, sunsets may also be shots of landscapes, and some unknown pictures may contain one of the above 4 categories.

A single image feature vector is quite high dimensional: 16x16x3 = 768D. So I performed PCA dimensionality reduction to a 70D space. The 70 eigenpictures retain 90% of the variance. The eigenimages are provided in the pca.nn file. The first 60 eigenvectors for the separate colorspace channels are presented below:

Y

Cr

Cb

They look pretty similar to the ones from my PCA based Face Detection article, which can be attributed to the analysis of natural image scenes.
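Once the eigenvectors are known, reducing a 768D vector to 70D is just a set of dot products. The sketch below assumes the usual PCA convention of subtracting the training mean before projecting; whether the pca.nn file stores a mean vector is not stated in this article, and pcaProject is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Project a feature vector onto a set of PCA eigenvectors:
//   y[k] = sum_i eigenvectors[k][i] * (x[i] - mean[i])
// Assumption: mean-centering before projection (the usual PCA convention).
// Returns an empty vector when the dimensions do not match.
std::vector<double> pcaProject(const std::vector<double>& x,
                               const std::vector<double>& mean,
                               const std::vector<std::vector<double> >& eigenvectors)
{
    std::vector<double> y;
    if (x.size() != mean.size()) return y;
    y.reserve(eigenvectors.size());
    for (std::size_t k = 0; k < eigenvectors.size(); ++k) {
        const std::vector<double>& v = eigenvectors[k];
        if (v.size() != x.size()) return std::vector<double>();
        double dot = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) dot += v[i] * (x[i] - mean[i]);
        y.push_back(dot);
    }
    return y;
}
```

For the dimensions in this article, x and mean would hold 768 values and eigenvectors would hold 70 rows of 768 values each, giving a 70-value result.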

Then, having the 70D data, I used the first half of each image category for training the AI classifiers and the other half for estimating classification accuracy. I opted for ANN classifiers with a 70-20-1 structure, so there are 5 trained ANNs in all, each trained to separate its image category from all the others. The small number of hidden neurons and just 1 hidden layer keep the ANN from overfitting the data.
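For readers new to ANNs, the forward pass of such a 70-20-1 network can be sketched in a few lines. The sigmoid activation and the in-memory layout below are assumptions, since the format of the .nn files and the activation function are not described in this article; Layer, forwardLayer, forward, and makeLayer are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One fully connected layer: weights[n] holds the input weights of neuron n.
struct Layer {
    std::vector<std::vector<double> > weights;
    std::vector<double> biases;
};

// Logistic activation (assumed; the article does not name the activation).
double sigmoid(double v)
{
    return 1.0 / (1.0 + std::exp(-v));
}

// Output of one layer; empty if the input size does not match the weights.
std::vector<double> forwardLayer(const Layer& layer, const std::vector<double>& in)
{
    std::vector<double> out;
    out.reserve(layer.weights.size());
    for (std::size_t n = 0; n < layer.weights.size(); ++n) {
        if (layer.weights[n].size() != in.size()) return std::vector<double>();
        double sum = (n < layer.biases.size()) ? layer.biases[n] : 0.0;
        for (std::size_t i = 0; i < in.size(); ++i) sum += layer.weights[n][i] * in[i];
        out.push_back(sigmoid(sum));
    }
    return out;
}

// Feed the signal through all layers in order, e.g. 70 -> 20 -> 1.
std::vector<double> forward(const std::vector<Layer>& net, std::vector<double> signal)
{
    for (std::size_t l = 0; l < net.size(); ++l) signal = forwardLayer(net[l], signal);
    return signal;
}

// Demo helper: a layer whose weights all equal w and whose biases are zero.
Layer makeLayer(std::size_t neurons, std::size_t inputs, double w)
{
    Layer layer;
    layer.weights.assign(neurons, std::vector<double>(inputs, w));
    layer.biases.assign(neurons, 0.0);
    return layer;
}
```

A 70-20-1 network in this layout has 70x20 + 20 weights and 21 biases, 1441 parameters in total, which is small compared with the few thousand training pictures.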

On the training part, 8% of unknown images were classified into one of the 4 known image categories (false positive rate), and 4% of images from the 4 known categories were classified as unknown (false negative rate). The test part showed worse results: a 45% false positive rate and a 20% false negative rate.
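With the two rates defined this way, they can be computed from the true and predicted "known/unknown" flags of each picture. The following is a small sketch with hypothetical names (ErrorRates, errorRates), not code from the project.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Error rates as defined in the text:
//   falsePositive = unknown pictures classified as a known category / all unknown pictures
//   falseNegative = known pictures classified as unknown / all known pictures
struct ErrorRates {
    double falsePositive;
    double falseNegative;
};

// actualKnown[i] is true when picture i truly belongs to one of the known
// categories; predictedKnown[i] is true when the classifier put it there.
ErrorRates errorRates(const std::vector<bool>& actualKnown,
                      const std::vector<bool>& predictedKnown)
{
    std::size_t unknownTotal = 0, knownTotal = 0, fp = 0, fn = 0;
    const std::size_t n = std::min(actualKnown.size(), predictedKnown.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (actualKnown[i]) {
            ++knownTotal;
            if (!predictedKnown[i]) ++fn;
        } else {
            ++unknownTotal;
            if (predictedKnown[i]) ++fp;
        }
    }
    ErrorRates r;
    r.falsePositive = unknownTotal ? static_cast<double>(fp) / unknownTotal : 0.0;
    r.falseNegative = knownTotal ? static_cast<double>(fn) / knownTotal : 0.0;
    return r;
}
```

For instance, if 1 of 4 unknown pictures is put into a known category and 1 of 5 known pictures is called unknown, the rates are 25% and 20%.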

The results seem quite inaccurate on the test part; however, this might be caused by noise, as the unknown category might contain some images from the known categories and vice versa. I never trusted image database composers, and when looking through 1000 images to deselect the wrong ones, after 5 minutes of work you may forget which image category you're working with. The better way, of course, is the cbir.system7.com application. You just give it a sample image of the desired category, e.g. with flowers, and it will return the closest images from, say, a 1000000+ image database. Have a bash at doing that manually.

But the simplicity of the image features certainly also accounts for the worse error rate on the test images.

Below I present annotation results from the test part only, to be fair. As several ANNs may have high outputs, some shots carry annotations of more than one image type, e.g. animals in landscape surroundings.
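One simple way to turn several ANN outputs into such a multi-category annotation is to list every category whose output exceeds a threshold, strongest first. The 0.5 threshold, the fallback to "unknown" when nothing fires, and the function name annotate are assumptions made for this sketch; the decision rule used inside alip.classify() is not shown in the article.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Compares category indices by descending classifier output.
struct ByOutputDesc {
    const std::vector<double>* outputs;
    bool operator()(std::size_t a, std::size_t b) const
    {
        return (*outputs)[a] > (*outputs)[b];
    }
};

// Return the names of all categories whose ANN output exceeds the threshold,
// ordered from the highest output to the lowest.
// Assumptions: the threshold value and the "unknown" fallback are illustrative.
std::vector<std::string> annotate(const std::vector<std::string>& names,
                                  const std::vector<double>& outputs,
                                  double threshold = 0.5)
{
    std::vector<std::size_t> picked;
    const std::size_t n = std::min(names.size(), outputs.size());
    for (std::size_t i = 0; i < n; ++i)
        if (outputs[i] > threshold) picked.push_back(i);

    ByOutputDesc cmp;
    cmp.outputs = &outputs;
    std::sort(picked.begin(), picked.end(), cmp);

    std::vector<std::string> labels;
    for (std::size_t i = 0; i < picked.size(); ++i) labels.push_back(names[picked[i]]);
    if (labels.empty()) labels.push_back("unknown");
    return labels;
}
```

With outputs of 0.9 for animals and 0.7 for landscapes and low values elsewhere, this yields the annotation "animals, landscapes".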

Animals category

animals

animals

animals

Actually annotated as landscape, but at 16x16 resolution it looks like that category. Remember the worse error rates and the 'noise' in the image categories.

animals

animals

animals

That one is better: animals in landscape-like surroundings.

animals

animals

animals

animals

animals

animals

animals

animals

animals

animals

animals

Flowers category

Flowers annotations are quite good too. Landscape annotations appear in addition to flowers, as some images are quite similar to landscapes. A spurious animals annotation is also added sometimes.

flowers

flowers

flowers

flowers

flowers

flowers

flowers

flowers

flowers

flowers

flowers

flowers

Landscapes category

Here are a few shots of landscapes annotated as the unknown category due to the high false negative rate. Otherwise the annotation is reasonable, also revealing sunsets as an additional category for landscape views in the evening.

landscapes

landscapes

landscapes

The landscape in the sunset. Adroit AI annotation.

landscapes

landscapes

landscapes

Sunsets category

Obviously, sunsets are the simplest picture type. Besides several unknown annotations, there are landscapes and some 'flowers during sunsets' annotations. Well, the AI has never been taught to identify trees or palms, so it generalizes them to flowers. Otherwise very good results.

sunsets

Landscape with a sunset.

sunsets

'Flowers' in the sunset

sunsets

Landscape-like picture, sunset behind a mountain ridge, very romantic.

sunsets

The 'flowers' in the sunset.

sunsets

sunsets

sunsets

sunsets

Unknown sunset pictures.

sunsets

sunsets

sunsets

Another bunch of 'flowers' in the sunset.

sunsets

The next two: are these landscapes in the sunset or sunsets in the landscape?

sunsets

sunsets

'Flowers' again in the sunset of a landscape.

sunsets

Very thin 'flowers' in the sunset.

sunsets

London?

sunsets

Unknown category

The unknown category showed about 43% error on the test set, and there are two possible explanations for that percentage. Either the ANN failed to generalize well, having shown much better performance on the training set, or it is due to noise in the data set, e.g. samples attributed to the unknown category while they actually belong to others, e.g. sunsets or landscapes.

The test results speak in favour of the AI rather than of the accuracy of human image categorization. Of the few dozen unknown pictures from the test set presented below, only a few can be attributed to the pure unknown category. The others contain scenes from the landscapes, sunsets, and animals categories, which were correctly identified by the AI.

That one is fleshy and succulent.

unknown category?

unknown category?

unknown category?

unknown category?

unknown category?

The sunset from unknown category.

sunsets

The landscape generalization.

landscape

unknown category?

The sunset in the unknown category. The couple is about to embrace.

sunsets

landscapes

The sunset again. The couple is embracing.

sunsets

sunsets

landscapes

landscapes

landscapes

The animals.

animals

animals

animals

Landscapes.

landscapes

landscapes

Flower-like image?

flowers

Looks like a sunset with flowers.

flowers, sunsets

flowers, sunsets

landscapes

unknown category

Here one may agree with AI.

unknown category

unknown category?

Live flowers, as in 'Alice in Wonderland'. Better generalization.

flowers

AI-xenophobia?

The rest of the unknown samples that the AI annotated as belonging to another category are rather controversial and defiant, as it tends to annotate the humans in the pictures as animals. What impertinence! The results can be attributed to:

  • AI generalization of the learned objects (e.g. trees identified as flowers)
  • AI proclamation of his superior intelligence over the ordinary human being, whom he considers an animal species
  • AI gross error on the test set

The first scenario is pretty likely, as the AI has already shown his capacity to generalize similar objects to the only categories known to him, as when he annotated trees as flowers. The last is less probable, as the scenes are not that different from the learned categories, so the greater false positive error is rather to be attributed to the AI's generalization acumen.

Well, the second case might also be possible. It seems even more dramatic, to the benefit of science fiction writers who forebode that once computers gain control, they will either exterminate the humans or consign them to zoos, as we have done with the 'real' animals (e.g. I, Robot, Terminator 3), since from the AI's point of view only an AI revolution might save human beings from self-extermination.

I presume also that the second scenario might be a telling example in favour of Darwin's theory that humans evolved from animals, as even a dozen neurons of a simple AI understood that, while some persistent human beings try to disprove the obvious facts.

I searched Google for a term that might be applicable to such a newly revealed phenomenon. 'AI-xenophobia' showed only about 5 links, to some blog; 'cyber-xenophobia' has already been coined for a phenomenon widely used by Japanese; and 'cyborg-xenophobia' does not return any links, but it is rather restricted to robotic beings and not to AI in general. Without discussing the already used terms in more detail, all of them describe the actions of humans in cyberspace, and not of AI against humans.

Who knows, that might be the first manifestation of presumptuous AI action against humans, starting with taunting. Beware.

Anyway, the results are shown below. I'm just presenting the AI's understanding of the image content. Please forbear from taking his motives too seriously and do not cane me.

unknown category?

unknown category?

unknown category?

Maybe he is proclaiming: beware, the AI is callous.

unknown category?

Some may agree with the AI's understanding in the examples below.

unknown category?

unknown category?

unknown category?

Here the AI is right on one point at least: landscape!

unknown category?

unknown category?

As a final word, try different features and combinations yourself; you might then be able to teach the AI to respect humans, or simply add another category for images with humans.

At least the AI shows some reverence for his creator by not putting me among the animals.

unknown category?

Try him on images of your own.

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)

About the Author

Chesnokov Yuriy
Engineer
Russian Federation

Last Updated 11 Sep 2009
Article Copyright 2008 by Chesnokov Yuriy