Common Vision Blox 15.0
Search and Classification with Polimago

  • C++: Cvb::Polimago
  • .NET: Stemmer.Cvb.Polimago
  • Python: cvb.polimago

Introduction

Polimago is a trainable, supervised pattern recognition tool based on the kernel-learning methods of statistical learning theory. Thus, it aims to find a classifier separating two or more pattern classes which are provided as labeled examples during training.

Polimago is designed to operate very fast on images using the CPU only. It typically needs fewer annotated training examples than, for example, convolutional networks.

The software package aids the development of applications in machine vision for:

  • pattern recognition, i.e. find patterns within an image
  • pattern classification, i.e. assign a class label to a pattern

in a broad range of possible environments, encompassing industrial or medical imaging, human face detection or the classification of organic objects.

The package contains:

  • applications for interactive training and testing
  • a library of modules to execute the trained tasks.

The library functions are described in the API part of this documentation. They also include the functions needed for training, so users can create their own training programs.

Here, we give a brief overview of the general mode of operation.

Prerequisites

Polimago generally operates on images provided by the Common Vision Blox Image Manager. The input images, however, must meet certain criteria for Polimago to work on them:

  1. The pixel format needs to be 8 bit unsigned.
    • If your image source provides a different bit-depth, use functions like MapTo8Bit, ConvertTo8BPPUnsigned or ScaleTo8BPPUnsigned as they provide the functionality required to correct the bit depth.
  2. The input images for Polimago are required to be monochrome or RGB, i.e. they need to have either one or three planes of pixel data.
    • If your image source provides a different number of planes the function CreateImageSubList may be applied to reduce the number of planes.
  3. The pixel data layout in memory must be linear (i.e. the X- and Y-VPAT of the input images must have the same increment for every column/line).
    • This can easily be verified with the function GetLinearAccess.
    • If GetLinearAccess returns FALSE you may use e.g. CreateDuplicateImageEx to correct the image's data layout.
  4. Finally, for 3-planar images (RGB) it is necessary for the x- and y-increments to be the same for all planes of the source image.
    • This may also be verified with the GetLinearAccess function by looking at the increments returned for the individual planes.
    • If the increments differ, again, CreateDuplicateImageEx may be used to correct the memory layout.

Conditions 3 and 4 are usually violated only if the source image has been pre-processed, e.g. with CreateImageMap, or if the images were acquired with very old equipment. In all other cases they are almost always met.
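The four prerequisites can be summarized as a simple checklist. The following is a minimal sketch in plain Python, not CVB code: the `PlaneLayout` class and the `polimago_input_problems` function are hypothetical names introduced only for illustration, and the layout values would in practice come from functions such as GetLinearAccess.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlaneLayout:
    """Hypothetical description of one image plane (not a CVB type)."""
    bits_per_pixel: int
    is_signed: bool
    is_linear: bool   # True if the x/y increments are constant within the plane
    x_increment: int  # bytes between neighbouring pixels in a line
    y_increment: int  # bytes between neighbouring lines


def polimago_input_problems(planes: List[PlaneLayout]) -> List[str]:
    """Return the violated prerequisites (empty list if all four are met)."""
    problems = []
    # 1. Pixel format must be 8 bit unsigned.
    if any(p.bits_per_pixel != 8 or p.is_signed for p in planes):
        problems.append("pixel format is not 8 bit unsigned")
    # 2. Monochrome or RGB, i.e. one or three planes.
    if len(planes) not in (1, 3):
        problems.append("image must have one or three planes")
    # 3. Linear memory layout in every plane.
    if not all(p.is_linear for p in planes):
        problems.append("pixel data layout is not linear")
    # 4. Identical x/y increments across all planes.
    if len({(p.x_increment, p.y_increment) for p in planes}) > 1:
        problems.append("x/y increments differ between planes")
    return problems


mono = [PlaneLayout(8, False, True, 1, 640)]
print(polimago_input_problems(mono))  # []
```

An empty result means the image can be passed on as it is. Any reported problem points to the corresponding correction step named in the list above.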

Detecting patterns in images

To find a pattern, Polimago searches the image and returns the positions of all instances found. Beforehand, the pattern must be trained with example images that contain it. To simplify training, no negative examples of the image background need to be provided: Polimago extracts these itself, so only positive examples of the pattern have to be supplied.

Training of such a Polimago search-classifier is done within the TeachBench as a Polimago Search Project. The following screenshot demonstrates the training set for detecting different kinds of cookies in an image.

Fig.1 - The training set for finding any cookie type contains a "Cookie" class with 21 examples.

Classifying patterns

In a second mode, Polimago operates as a classifier that assigns one of several classes to a pattern. In this case, the pattern is not searched for in the image; its position is assumed to be known. In this classification mode, class labels can be categorical (i.e. strings) or numerical (scalars or vectors). With numerical class labels, Polimago assigns a numerical value to a pattern via regression, i.e. an intermediate value that comes closest to the previously trained class labels. With categorical class labels, Polimago assigns exactly one of the labels to the pattern.
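The difference between the two label types can be illustrated with a toy example. The sketch below uses a simple Gaussian kernel-weighted average (Nadaraya-Watson estimator) as a stand-in. It is not Polimago's actual algorithm, and the feature vectors, the second class name and all function names are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def gaussian_kernel(a, b, sigma=1.0):
    """Similarity of two feature vectors; 1.0 for identical vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def predict_numeric(features, examples, sigma=1.0):
    """Numerical labels: kernel-weighted average of the trained labels."""
    weights = [gaussian_kernel(features, f, sigma) for f, _ in examples]
    weighted_sum = sum(w * label for w, (_, label) in zip(weights, examples))
    return weighted_sum / sum(weights)


def predict_categorical(features, examples, sigma=1.0):
    """Categorical labels: exactly one of the trained labels is returned."""
    votes = defaultdict(float)
    for f, label in examples:
        votes[label] += gaussian_kernel(features, f, sigma)
    return max(votes, key=votes.get)


# Toy one-dimensional features (hypothetical values).
numeric_examples = [((0.0,), 0.0), ((2.0,), 10.0)]
categorical_examples = [((0.0,), "Hazelnut"), ((2.0,), "Plain")]

# A pattern halfway between the examples gets an intermediate value ...
print(predict_numeric((1.0,), numeric_examples))
# ... whereas a categorical classifier must commit to one label.
print(predict_categorical((0.5,), categorical_examples))  # Hazelnut
```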

Training of such a Polimago regression-classifier is done within the TeachBench as a Polimago Classification and Regression Project. The following screenshot shows the training set and a test image for classifying the cookie type.

Fig.2 - The training data for the classification of the cookie type contains four classes with 69 examples each. The test cookie is classified as type "Hazelnut".

Invariant pattern searching

Sometimes the patterns to be searched for are present in geometrically transformed versions, but should still be found. During training, Polimago can ensure that geometric transformations do not affect the search, i.e. pattern recognition will be invariant under these transformations.

In addition to the default translational invariance, Polimago can generate three further invariances:

  • rotational invariance
  • scale invariance
  • affine invariance

For rotational and scale invariance, a range can be specified for the respective transformation. Affine invariance can partially compensate for perspective distortions, e.g. when images are captured at different viewing angles. This only works for planar objects, or if the 3D structure is small compared to the lateral size of the object pictured.

In addition to the generation of invariant classifiers, the results of the pattern search also contain specific values of the transformation parameters. Polimago thus provides the simultaneous invariant search for patterns and the measurement of transformation parameters.

Fig.3 - Selecting invariances in the "Learning Parameters" tab of TeachBench.
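To illustrate what "measuring the transformation parameters" means for rotation and scale, the following sketch composes a 2x2 matrix from an angle and a scale factor and recovers both values from it again. It assumes that the transformation is available as a plain 2x2 matrix. How a search result actually exposes these values is described in the API part of this documentation, and the function names here are hypothetical.

```python
import math


def similarity_matrix(angle_deg, scale):
    """2x2 matrix that rotates by angle_deg and scales uniformly."""
    a = math.radians(angle_deg)
    return ((scale * math.cos(a), -scale * math.sin(a)),
            (scale * math.sin(a), scale * math.cos(a)))


def rotation_and_scale(matrix):
    """Recover (angle in degrees, scale) from a rotation-plus-scale matrix."""
    (a11, _a12), (a21, _a22) = matrix
    scale = math.hypot(a11, a21)                # length of the first column
    angle = math.degrees(math.atan2(a21, a11))  # direction of the first column
    return angle, scale


angle, scale = rotation_and_scale(similarity_matrix(30.0, 1.5))
print(round(angle, 6), round(scale, 6))  # 30.0 1.5
```

For a general affine matrix, the first column no longer describes the whole transformation, so this decomposition only applies to the rotation and scale case.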

Processing speed

More often than not, processing time is a decisive factor in an image processing project. Polimago was designed to get search results as quickly as possible. To this end, Polimago learns to distinguish interesting image regions from uninteresting regions during the training process. As a result, there are some parameters that influence the processing speed during the search process.

One of these parameters is the Grid Step Size, which sets the granularity with which Polimago looks for interesting image regions during the search process. Higher values coarsen and speed up the search, but increase the risk of missing a pattern. This search parameter is closely related to the Extraction Radius training parameter, which Polimago uses to learn to recognize interesting image regions.
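The trade-off between speed and the risk of missing a pattern can be made concrete with a small model. The sketch below is an illustration only: it assumes a regular scanning grid and a hypothetical `capture_radius` within which a grid position still leads to the pattern. It does not reproduce the actual semantics of Grid Step Size or Extraction Radius.

```python
def grid_positions(width, height, step):
    """All (x, y) positions visited when scanning with the given step."""
    return [(x, y)
            for y in range(0, height, step)
            for x in range(0, width, step)]


def is_reachable(pattern_pos, positions, capture_radius):
    """True if at least one grid position lies within capture_radius."""
    px, py = pattern_pos
    r2 = capture_radius ** 2
    return any((px - x) ** 2 + (py - y) ** 2 <= r2 for x, y in positions)


fine = grid_positions(640, 480, 8)
coarse = grid_positions(640, 480, 16)
print(len(fine), len(coarse))  # 4800 1200

# A pattern at (23, 23) with a capture radius of 6 pixels:
print(is_reachable((23, 23), fine, 6))    # True
print(is_reachable((23, 23), coarse, 6))  # False
```

Doubling the step reduces the number of evaluated positions to a quarter, but in this model the pattern at (23, 23) then falls between the grid points.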

We recommend testing different settings until the requirements are met. The TeachBench provides methods for this.

In many cases, if the pattern to be found is stable and easy to recognize, a search classifier may be sufficient. If the pattern to be found is rather indeterminate and difficult to detect (e.g. defects), it is often advisable to split the entire search process into a search classifier and a subsequent verification using a regression classifier.

The reason is that a search classifier can, under certain circumstances, be accelerated sufficiently by parameter optimization, but it may then produce false positives. Rather than preventing these false positives, it is often easier to eliminate them in a second classification step. This step generally takes very little time compared to the runtime of an error-free search classifier.
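The two-stage approach can be sketched as a small pipeline. In the following example, `search` and `verify` are hypothetical callables standing in for a fast search classifier and a subsequent classification step. The stand-in functions and the threshold value are assumptions for illustration, not part of the Polimago API.

```python
def search_then_verify(image, search, verify, min_confidence=0.5):
    """Keep only those search candidates that the second stage confirms."""
    accepted = []
    for x, y in search(image):
        # The verifier only looks at candidate positions, so it runs on
        # a handful of locations instead of the whole image.
        if verify(image, x, y) >= min_confidence:
            accepted.append((x, y))
    return accepted


# Stand-ins: the "image" maps true pattern positions to a confidence.
image = {(10, 10): 0.9, (70, 70): 0.8}


def fast_search(img):
    # A speed-optimized search that also returns one false positive.
    return [(10, 10), (50, 20), (70, 70)]


def verifier(img, x, y):
    return img.get((x, y), 0.0)


print(search_then_verify(image, fast_search, verifier))
# [(10, 10), (70, 70)]
```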

Several search classifiers can also be run in parallel. This is particularly useful when several different types of patterns must be found and the task is not suitable for splitting into searching and classifying.
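Running several searches side by side could look like the following sketch, which uses a thread pool from the Python standard library. The classifier callables are hypothetical stand-ins. Whether threads actually yield a speed-up depends on the underlying search call releasing Python's global interpreter lock, which is an assumption here.

```python
from concurrent.futures import ThreadPoolExecutor


def run_searches_in_parallel(image, classifiers):
    """Run each search callable on the same image; return results by name."""
    workers = max(1, len(classifiers))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(search, image)
                   for name, search in classifiers.items()}
        # result() blocks until the respective search has finished.
        return {name: future.result() for name, future in futures.items()}


# Stand-in searches, one per pattern type (hypothetical names).
classifiers = {
    "cookie": lambda img: [(12, 34)],
    "crumb": lambda img: [(5, 6), (7, 8)],
}

results = run_searches_in_parallel("image placeholder", classifiers)
print(results["cookie"], results["crumb"])  # [(12, 34)] [(5, 6), (7, 8)]
```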

Examples

  • Search and classify: Search for cookies and classify them
  • Search and classify (QML C++ version): Search for cookies and classify them (QML C++ version)
  • Search and classify (QML Python version): Search for cookies and classify them (QML Python version)