Image analysis and pattern recognition systems often need to match two sets of points in space, because the analysed images are raster graphics or the extracted features are subsets of the pixels of the original image. There are many applications and many methods that yield satisfactory results, but in every case the problem comes down to this:
Given two sets of points in d-dimensional space, determine whether there is a transformation, among a specified group of space transformations, that maps the first set onto (or satisfactorily close to) the second set of points.
Most of the time the groups considered are translations and rigid motions in d dimensions. The meaning of "satisfactorily close to" depends on the method used and will be detailed later.
All of the methods must be robust to random perturbations, random point additions or deletions, and to scaling.
This tutorial is organized as follows. Section 1 outlines the applications of point pattern matching. Section 2 describes two methods often given as references: a branch-and-bound algorithm and an algorithm proposed by F. Murtagh.
We encounter point pattern matching problems in such diverse fields as machine vision, astronautics, document processing, computational biology and computational chemistry.
In machine vision it can be used to match corners and junctions extracted by detectors from pairs of stereoscopic images in order to reconstruct a three-dimensional scene.
The Institute for Aerospace and Astronautics of the Technical University of Berlin uses pattern matching to determine the attitude of its TUBSAT satellites. A TUBSAT is equipped with a camera that takes pictures of the starry sky; matching the star patterns against catalog information enables the control center to measure the satellite's position and orientation. The same method could be used to classify starfield pictures in a catalog or to match satellite images of landscapes.
In the disciplines of biology and chemistry, it has been used for autoradiograph alignment, pharmacophore identification and protein structure alignment. It is also used in image registration and model-based object recognition.
Overview
This method, proposed by Mount, Netanyahu and Le Moigne, is an approximation algorithm loosely based on a geometric branch-and-bound approach due to Huttenlocher and Rucklidge.
As mentioned above, the goal is to find the affine transformations that map one point set (A) onto the other (B). For the sake of clarity, we will consider only rotation and translation in two-dimensional space; the extension to higher dimensions is straightforward. The two point sets are obtained by feature extractors such as corner and junction detectors.
The method roughly consists in building and searching a tree in which each node is associated with a set of transformations contained in some axis-aligned hyperrectangle in the six-dimensional transformation space. A transformation can be expressed as a 2×2 rotation matrix together with a two-element translation vector, giving six dimensions. The tree is formed by splitting nodes until we reach the node that gives us the transformation t that minimizes the distance between the transformed set t(A) and B. We will use the Partial Hausdorff Distance (PHD) to measure this distance. If we fail to find a transformation within the error limits, the patterns don't match.
· The hyperrectangles are called cells C. Each cell contains a pair of transformations t_{hi} and t_{lo} whose coordinates are the upper and lower bounds on the transformations of the cell; any transformation whose coordinates lie between t_{lo} and t_{hi} is contained in the cell. The status of a cell is either Active, meaning that it is a candidate to provide the sought transformation, or Killed, meaning it cannot provide the solution. Each cell is associated with a set of uncertainty regions, one for each point to be matched.
· As stated above, the transformations are expressed as a 2×2 rotation matrix M and a two-element translation vector T. A transformation t applied to a point a of set A is: t(a) = Ma + T.
· By "uncertainty region of a point", we mean the rectangle defined by the two corner points obtained by applying the two transformations t_{hi} and t_{lo} to this point.
Building uncertainty regions. Here, t_{lo} is the identity transformation and t_{hi} is the composition of a 90° ccw rotation and a translation.
· The size of an uncertainty region is the length of its largest side.
· The size of a cell is the largest size among the uncertainty regions associated with the points of the set.
· The algorithm is said to be an approximation algorithm because we compute the Partial Hausdorff Distance on a quantile slightly lower than q, and, instead of killing a node when its lower bound on the PHD is not lower than the lowest found so far, we kill it when it is not significantly lower. To do this we introduce three new parameters (e_{a}, e_{r} and e_{q}): respectively the absolute error bound, the relative error bound and the quantile error bound. These parameters allow a trade-off between speed and robustness.
· The weak quantile q' is defined as q(1 − e_{q}).
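Concretely, the uncertainty regions and the two size definitions in the bullets above can be sketched as follows. This is a minimal Python sketch; the 6-tuple encoding of a transformation and the function names are our own, not part of the original method's description.

```python
def apply(t, a):
    """Apply t = (m11, m12, m21, m22, tx, ty), i.e. t(a) = Ma + T."""
    m11, m12, m21, m22, tx, ty = t
    x, y = a
    return (m11 * x + m12 * y + tx, m21 * x + m22 * y + ty)

def uncertainty_region(t_lo, t_hi, a):
    """Axis-aligned rectangle whose corners are t_lo(a) and t_hi(a)."""
    (x1, y1), (x2, y2) = apply(t_lo, a), apply(t_hi, a)
    return (min(x1, x2), min(y1, y2)), (max(x1, x2), max(y1, y2))

def region_size(region):
    """Size of an uncertainty region = length of its largest side."""
    (xmin, ymin), (xmax, ymax) = region
    return max(xmax - xmin, ymax - ymin)

def cell_size(t_lo, t_hi, A):
    """Size of a cell = largest uncertainty-region size over all of A."""
    return max(region_size(uncertainty_region(t_lo, t_hi, a)) for a in A)

# As in the figure: t_lo is the identity, t_hi composes a 90° ccw
# rotation with a translation (here by (1, 0), an arbitrary choice).
t_lo = (1, 0, 0, 1, 0, 0)
t_hi = (0, -1, 1, 0, 1, 0)
print(uncertainty_region(t_lo, t_hi, (1, 0)))  # ((1, 0), (1, 1))
```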
What a daunting name for such a simple concept! The Partial Hausdorff Distance between two sets is the k^{th} smallest distance between a point and its nearest neighbour in the other set. Here's how to compute it between set A and set B: for each point of A find its nearest neighbour in B, compute the Euclidean distance between each pair, and take the k^{th} smallest.
In this method, we express k in terms of a quantile 0 < q ≤ 1, where k = q|A|.
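The definition above translates directly into code. This is a minimal sketch (the function name is ours; it assumes Python 3.8+ for math.dist):

```python
import math

def partial_hausdorff(A, B, q):
    """Directed Partial Hausdorff Distance from point set A to B:
    the k-th smallest nearest-neighbour distance, k = ceil(q * |A|)."""
    dists = sorted(
        min(math.dist(a, b) for b in B)  # nearest neighbour of a in B
        for a in A
    )
    k = math.ceil(q * len(A))
    return dists[k - 1]

# q = 1 gives the classical directed Hausdorff distance.
A = [(0, 0), (1, 0), (5, 5)]   # (5, 5) plays the role of an outlier
B = [(0, 0), (1, 0)]
print(partial_hausdorff(A, B, 1.0))   # dominated by the outlier
print(partial_hausdorff(A, B, 0.5))   # outlier ignored: 0.0
```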
Now, let's explain the advantage of using the PHD as a distance measure. Nothing guarantees that the two sets will have the same number of points; generally, that won't be the case. Using two different feature extraction methods on the two base images can be one reason for this; another is the presence of occlusion. The feature points that are absent from the other image are called outliers.
If we took the maximum or the sum of the distances between each point in one set and its nearest neighbour in the other set, the result could be badly affected by the outliers. By selecting the k^{th} smallest distance, we reduce the probability of the measure being affected by these outliers.
The inputs are: the sets A and B, the first cell C_{0}, the Hausdorff quantile q, and the three approximation parameters e_{a}, e_{r} and e_{q}.
To compute the lower bound of a cell on the PHD, we first compute the uncertainty regions for all points in A. We then compute the distance from each region to its nearest neighbour in B (this is the shortest distance between the point of B and any point of the associated uncertainty region; if the point lies in the region, the distance is zero). Finally, we take the k^{th} smallest of these distances, where k = q|A|.
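The lower-bound computation described above can be sketched as follows. This is a minimal Python sketch (function names are ours); the uncertainty regions are taken as axis-aligned rectangles given by their two corners:

```python
import math

def rect_distance(p, rect):
    """Shortest Euclidean distance from point p to an axis-aligned
    rectangle; zero when p lies inside the rectangle."""
    (xmin, ymin), (xmax, ymax) = rect
    dx = max(xmin - p[0], 0, p[0] - xmax)
    dy = max(ymin - p[1], 0, p[1] - ymax)
    return math.hypot(dx, dy)

def cell_lower_bound(regions, B, q):
    """Lower bound of a cell on the PHD: for each uncertainty region,
    take the distance to the nearest point of B, then the k-th smallest."""
    dists = sorted(min(rect_distance(b, r) for b in B) for r in regions)
    k = math.ceil(q * len(regions))
    return dists[k - 1]

regions = [((0, 0), (1, 1)), ((3, 3), (4, 4))]
print(cell_lower_bound(regions, [(0.5, 0.5)], 0.5))  # 0.0 (point inside)
```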
To compute the upper bound of a cell, we take the mean transformation t_{m} of the cell and return the PHD on the weak quantile between t_{m}(A) and B. We can use t_{m} because the best PHD a cell can produce is smaller than (or equal to) that of any sample transformation contained in it.
The algorithm starts with an initial cell that is assumed to contain the sought transformation within its lower and upper bounds. Its status is Active. We keep track of the smallest PHD found so far; at the beginning, PHDbest = ∞.
We compute the lower bound of a cell on the PHD. If the lower bound exceeds PHDbest, we kill the cell. Otherwise, we compute the upper bound, and if it is less than PHDbest, we update PHDbest. If the PHD doesn't satisfy the stopping criterion, we split the cell into two new cells so as to reduce the size of the uncertainty regions as much as possible. The new cells' status is set to Active, and they are placed in a priority queue sorted by size for the next iteration. A cell is split along the dimension that influences its size the most.
The algorithm stops when the lower bound of a cell is lower than e_{a}, or when all the cells have been killed, which means that the patterns don't match.
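The search loop just described can be sketched structurally as follows. This is a hedged sketch, not the authors' implementation: the cell representation, the helper callables and the toy 1-D example are our own, and the relative and quantile error bounds e_{r} and e_{q} are omitted for brevity.

```python
import heapq, itertools

def match(A, B, cell0, lower_bound, upper_bound, split, size, eps_a):
    """Branch-and-bound over cells. `lower_bound`, `upper_bound`,
    `split` and `size` are the cell operations described above.
    Returns the best PHD found, or None when every cell is killed."""
    best = float("inf")
    tie = itertools.count()                        # heap tie-breaker
    queue = [(-size(cell0), next(tie), cell0)]     # largest cell first
    while queue:
        _, _, cell = heapq.heappop(queue)
        lo = lower_bound(cell, A, B)
        if lo >= best:
            continue                               # kill the cell
        up = upper_bound(cell, A, B)
        best = min(best, up)
        if lo <= eps_a:                            # stopping criterion
            return best
        for child in split(cell):                  # split the cell in two
            heapq.heappush(queue, (-size(child), next(tie), child))
    return None if best == float("inf") else best

# Toy check: 1-D points, cells are intervals (lo, hi) of a translation.
A, B = [0.0], [5.0]
lb = lambda c, A, B: max(0.0, B[0] - (A[0] + c[1]), (A[0] + c[0]) - B[0])
ub = lambda c, A, B: abs(A[0] + (c[0] + c[1]) / 2 - B[0])
split = lambda c: ((c[0], sum(c) / 2), (sum(c) / 2, c[1]))
size = lambda c: c[1] - c[0]
print(match(A, B, (0.0, 10.0), lb, ub, split, size, eps_a=0.01))  # 0.0
```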
For a better understanding, let's trace the algorithm on the following example.
We want to find the transformation that maps set A onto set B with a distance ≤ 0.25 (the threshold).
B is identical to A, with a rotation and a translation applied to it.
We begin with cell C_{0}. Instead of using a 2×2 rotation matrix and a two-element translation vector, we use a single transform matrix: the six dimensions of a cell are the first two rows of this matrix. Points of A and B must then be expressed as three-element vectors with the last component set to one.
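This homogeneous-coordinate convention can be sketched as follows (a minimal Python sketch; the function names and the example angle and translation are ours):

```python
import math

def transform_matrix(theta, tx, ty):
    """3x3 homogeneous matrix: rotation by theta then translation by
    (tx, ty). Its first two rows hold the six cell dimensions."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx],
            [s,  c, ty],
            [0,  0,  1]]

def apply_homogeneous(T, p):
    """Apply T to a point p given as (x, y, 1)."""
    return tuple(sum(T[i][j] * p[j] for j in range(3)) for i in range(3))

# A point of A, written with the last component set to one:
a = (1.0, 0.0, 1.0)
T = transform_matrix(math.pi / 2, 2.0, 0.0)  # 90° ccw rotation + shift
print(apply_homogeneous(T, a))  # ≈ (2.0, 1.0, 1.0)
```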
Although this algorithm uses brute force, it introduces an interesting concept: the world view vector of a point. This vector represents the "world" as seen from its associated point. The algorithm was developed to match star patterns, which contain a limited number of points, e.g. the Big Dipper (7 stars); that is why the use of brute force is not a big problem here.
Let's explain this algorithm in detail. Remember that we want to map point set A onto point set B and to find the transformation that realizes this mapping. First, we build a "world view vector" for each point i of the two sets. This vector contains the other n − 1 points of the set, represented by their polar coordinates (r, θ) with respect to i, and is sorted by the angle θ. The radius components of all vectors are normalized by the largest r for scale independence. To facilitate comparisons between world view vectors, all the angles are kept between 0° and 360° with increments of 1°. Here's what we do to match the sets:
For each point i in A, compare its world view vector with each world view vector in B. The difference between the xy coordinates of a matching point in B and the corresponding point in A gives us the translation vector. If the number of matching pairs is above a defined threshold, the patterns match.
Rotation is handled as follows: we consider all the possible matchings between A and 360 versions of B, i.e. the world view vectors of B are rotated as a whole by 1° in each successive version. This implies 360 runs of the above algorithm (remember: "brute force"...).
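The construction of a world view vector described above can be sketched as follows (a minimal Python sketch; the function name and the example point set are ours):

```python
import math

def world_view_vector(points, i):
    """World view vector of point i: the other n-1 points in polar
    coordinates (r, theta) relative to point i, sorted by theta, with
    radii normalized on the largest r and angles rounded to whole
    degrees in [0, 360)."""
    xi, yi = points[i]
    polar = []
    for j, (x, y) in enumerate(points):
        if j == i:
            continue
        r = math.hypot(x - xi, y - yi)
        theta = math.degrees(math.atan2(y - yi, x - xi)) % 360
        polar.append((r, round(theta) % 360))
    r_max = max(r for r, _ in polar)  # normalize for scale independence
    return sorted(((r / r_max, th) for r, th in polar), key=lambda p: p[1])

A = [(0, 0), (1, 0), (0, 2)]
print(world_view_vector(A, 0))  # [(0.5, 0), (1.0, 90)]
```

Matching then amounts to comparing such vectors between the two sets, entry by entry, for each of the 360 rotated versions of B.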
Hausdorff Distance
http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/CHETVERIKOV/mysquash/node4.html
http://www.cs.cornell.edu/vision/hausdorff/hausmatch.html
Model-based object recognition
http://graphics.lcs.mit.edu/~seth/pubs/taskforce/paragraph3_5_0_0_1.html
http://ais.gmd.de/AS/janus/new/janus/cubefit.html
http://www.cs.cf.ac.uk/Dave/AI2/node201.html
http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/OWENS/LECT13/node5.html
2D Gel Matching
http://gelmatching.inf.fuberlin.de/
Pharmacophore identification
http://www.netsci.org/Science/Cheminform/feature02.html
Autoradiograph alignment
http://www.cse.msu.edu/~dutanico/Rad/match2.html
D. M. Mount, N. S. Netanyahu, J. Le Moigne, Efficient algorithms for robust feature matching. Pattern Recognition, vol. 32 (1999), pp. 17–38.
F. Murtagh, A New Approach to Point Pattern Matching. Publications of the Astronomical Society of the Pacific, 1992, in press.
D. P. Huttenlocher, W. J. Rucklidge, A multi-resolution technique for comparing images using the Hausdorff distance. Proc. IEEE Conf. on Computer Vision and Pattern Recognition, New York, June 1993, pp. 705–706.