The Kohonen Self-Organizing Map

Introduction

This chapter discusses one type of unsupervised competitive learning, the Kohonen feature map, or self-organizing map (SOM). As you recall, in unsupervised learning there are no expected outputs presented to a neural network, as in a supervised training algorithm such as backpropagation. Instead, a network, by its self-organizing properties, is able to infer relationships and learn more as more inputs are presented to it. One advantage to this scheme is that you can expect the system to change with changing conditions and inputs. The system constantly learns. The Kohonen SOM is a neural network system developed by Teuvo Kohonen of Helsinki University of Technology and is often used to classify inputs into different categories. Applications for feature maps can be traced to many areas, including speech recognition and robot motor control.

Competitive Learning

A Kohonen feature map may be used by itself or as a layer of another neural network. A Kohonen layer is composed of neurons that compete with each other. As in Adaptive Resonance Theory, the Kohonen SOM uses a winner-take-all strategy. Inputs are fed into each of the neurons in the Kohonen layer (from the input layer). Each neuron determines its output according to a weighted sum formula:

output_j = Σ_i w_ij x_i

The weights and the inputs are usually normalized, which means that the magnitudes of the weight and input vectors are set equal to one. The neuron with the largest output is the winner. This neuron has a final output of 1. All other neurons in the layer have an output of zero. Differing input patterns end up firing different winner neurons. Similar or identical input patterns classify to the same output neuron, so you get like inputs clustered together. In the chapter on application to pattern recognition, you will see the use of a Kohonen network in pattern classification.
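
As a minimal sketch of this competition (the function name is chosen here only for illustration; the simulator developed later in this chapter stores weights in the same weights[i*num_outputs + j] layout), the winner can be found like this:

// minimal sketch: pick the winner by largest weighted sum
// weights are stored as weights[i*num_outputs + j], as in the
// simulator developed later in this chapter
int find_winner(const float *inputs, const float *weights,
        int num_inputs, int num_outputs)
{
int i, j, winner=0;
float sum, maxval= -1.0e30f;
 
for (j=0; j<num_outputs; j++)
        {
        sum=0;
        for (i=0; i<num_inputs; i++)
                sum += weights[i*num_outputs+j]*inputs[i];
        if (sum > maxval)
                {
                maxval=sum;
                winner=j;       // this neuron's output becomes 1; all others 0
                }
        }
return winner;
}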

Normalization of a Vector

Consider a vector, A = ax + by + cz. The normalized vector A' is obtained by dividing each component of A by the square root of the sum of the squares of all the components. In other words, each component is multiplied by 1/√(a² + b² + c²). Both the weight vector and the input vector are normalized during the operation of the Kohonen feature map. The reason for this is that the training law uses subtraction of the weight vector from the input vector. Normalizing both vectors puts them on a common, unit-length scale and hence makes the subtraction of like quantities meaningful. You will learn more about the training law shortly.
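
A small, illustrative routine (not part of the simulator's listings) that normalizes a vector stored as an array of floats might look like this:

#include <math.h>
 
// scale each component by 1/sqrt(sum of squares) so the
// resulting vector has length 1 (vector must not be all zeros)
void normalize(float *vec, int n)
{
int i;
float norm=0;
 
for (i=0; i<n; i++)
        norm += vec[i]*vec[i];
norm = 1/(float)sqrt((double)norm);
for (i=0; i<n; i++)
        vec[i] *= norm;
}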

Lateral Inhibition

Lateral inhibition is a process that takes place in some biological neural networks. Neurons within a layer form lateral connections to one another, and these connections squash distant neighbors: the strength of a connection is inversely related to distance. The positive, supportive connections are termed excitatory, while the negative, squashing connections are termed inhibitory.

A biological example of lateral inhibition occurs in the human vision system.

The Mexican Hat Function

The figure titled The Mexican hat function showing lateral inhibition shows the Mexican hat function, which relates the connection strength to the distance from the winning neuron. The effect of this function is to set up a competitive environment for learning. Only winning neurons and their neighbors participate in learning for a given input pattern.

The Mexican hat function showing lateral inhibition.

Training Law for the Kohonen Map

The training law for the Kohonen feature map is straightforward. The change in weight vector for a given output neuron is a gain constant, alpha, multiplied by the difference between the input vector and the old weight vector:

Wnew = Wold + alpha * (Input - Wold)

Both the old weight vector and the input vector are normalized to unit length. Alpha is a gain constant between 0 and 1.
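
As a minimal sketch (the simulator's own version appears later in Kohonen_layer::update_weights()), the update for one weight vector is simply:

// move the winner's weight vector a fraction alpha of the way
// toward the (normalized) input vector
void train_winner(float *weight, const float *input, int num_inputs,
        float alpha)
{
int i;
for (i=0; i<num_inputs; i++)
        weight[i] += alpha*(input[i]-weight[i]);
}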

Significance of the Training Law

Let us consider the case of a two-dimensional input vector. If you look at a unit circle, as in the figure titled The training law for the Kohonen map as shown on a unit circle, the effect of the training law is to try to align the weight vector with the input vector. Each pattern attempts to nudge the weight vector closer by a fraction determined by alpha. For three dimensions the surface becomes a unit sphere instead of a circle; for higher dimensions the surface is termed a hypersphere. It is not necessarily ideal to have perfect alignment of the input and weight vectors. You use neural networks for their ability to recognize patterns, but also to generalize input data sets. By aligning all input vectors exactly to the corresponding winner weight vectors, you are essentially memorizing the input data set classes. It may be more desirable to come close, so that noisy or incomplete inputs may still trigger the correct classification.

The training law for the Kohonen map as shown on a unit circle.

The Neighborhood Size and Alpha

In the Kohonen map, a parameter called the neighborhood size is used to model the effect of the Mexican hat function. Those neurons that are within the distance specified by the neighborhood size participate in training and weight vector changes; those that are outside this distance do not participate in learning. The neighborhood size typically starts at an initial value and is decreased as the input pattern cycles continue. This process tends to support the winner-take-all strategy by eventually singling out a winner neuron for a given pattern.

Figure Winner neuron with a neighborhood size of 2 for a Kohonen map shows a linear arrangement of neurons with a neighborhood size of 2. The hashed central neuron is the winner. The darkened adjacent neurons are those that will participate in training.

Winner neuron with a neighborhood size of 2 for a Kohonen map.

Besides the neighborhood size, alpha typically is also reduced during simulation. You will see these features when we develop a Kohonen map program.
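
As a minimal sketch of this schedule (the helper function here is introduced only for illustration; the simulator's main loop, shown later, applies essentially the same logic inline once every period cycles):

// reduce the neighborhood size and the gain constant alpha;
// called (conceptually) once every 'period' cycles
void decay_parameters(int &neighborhood_size, float &alpha)
{
if (neighborhood_size > 0)
        neighborhood_size--;      // shrink the set of neurons that train
if (alpha > 0.1)
        alpha -= (float)0.1;      // lower the gain constant
}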

C++ Code for Implementing a Kohonen Map

The C++ code for the Kohonen map draws on much of the code developed for the backpropagation simulator. The Kohonen map is a much simpler program and may not rely on as large a data set for input. The Kohonen map program uses only two files, an input file and an output file. In order to use the program, you must create an input data set and save this in a file called input.dat. The output file is called kohonen.dat and is saved in your current working directory. You will get more details shortly on the formats of these files.

The Kohonen Network

The Kohonen network has two layers, an input layer and a Kohonen output layer. (See Figure A Kohonen network.) The size of the input layer is determined by the user and must match the size of each row (pattern) in the input data file.

A Kohonen network.

Modeling Lateral Inhibition and Excitation

The Mexican hat function shows positive values for an immediate neighborhood around the neuron and negative values for distant neurons. A true method of modeling would incorporate mutual excitation or support for neurons that are within the neighborhood (with this excitation increasing for nearer neurons) and inhibition for distant neurons outside the neighborhood. For the sake of computational efficiency, we model lateral inhibition and excitation by finding the output neuron with the maximum output and declaring it the winner. All other outputs are inhibited by setting them to zero. Training, or weight update, is performed on all outputs that are within a neighborhood size distance from the winner neuron; neurons outside the neighborhood do not participate in training. The true way of modeling lateral inhibition would be too expensive, since the number of lateral connections is quite large. You will find that this approximation leads to a network with many, if not all, of the properties of a true model of a Kohonen network.

Classes to be Used

We use many of the classes from the backpropagation simulator. We require only two layers, the input layer and the Kohonen layer. We make a new layer class called the Kohonen_layer class, and a new network class called the Kohonen_network.

Revisiting the Layer Class

The layer class needs to be slightly modified, as shown in Listing Modification of layer.h.

// layer.h             V.Rao, H. Rao
// header file for the layer class hierarchy and
// the network class
 
#define MAX_LAYERS    5
#define MAX_VECTORS   100
 
class network;
class Kohonen_network;
 
class layer
{
 
protected:
 
       int num_inputs;
       int num_outputs;
       float *outputs;// pointer to array of outputs
       float *inputs; // pointer to array of inputs, which
                                     // are outputs of some other layer
 
       friend network;
       friend Kohonen_network; // update for Kohonen model
 
public:
 
       virtual void calc_out()=0;
};
...

The key change is the friend declaration: the Kohonen_network class is made a friend to the layer class, so that the Kohonen_network can have access to the data of a layer.

A New Layer Class for a Kohonen Layer

The next step to take is to create a Kohonen_layer class and a Kohonen_network class. This is shown in Listing The Kohonen_layer class and Kohonen_network class in layerk.h.

// layerk.h           V.Rao, H. Rao
// header file for the Kohonen layer and
// the Kohonen network
 
class Kohonen_network;
 
class Kohonen_layer: public layer {
 
protected:
       float * weights;
       int winner_index;
       float win_distance;
       int neighborhood_size;
       friend Kohonen_network;
 
public:
        Kohonen_layer(int, int, int);
        ~Kohonen_layer();
        virtual void calc_out();
        void randomize_weights();
        void update_neigh_size(int);
        void update_weights(const float);
        void list_weights();
        void list_outputs();
        float get_win_dist();
 
};
 
class Kohonen_network {
 
private:
        layer *layer_ptr[2];
        int layer_size[2];
        int neighborhood_size;
 
public:
        Kohonen_network();
        ~Kohonen_network();
        void get_layer_info();
        void set_up_network(int);
        void randomize_weights();
        void update_neigh_size(int);
        void update_weights(const float);
        void list_weights();
        void list_outputs();
        void get_next_vector(FILE *);
        void process_next_pattern();
        float get_win_dist();
        int get_win_index();
 
};

The Kohonen_layer is derived from the layer class, so it has pointers inherited that point to a set of outputs and a set of inputs. Let’s look at some of the functions and member variables.

Kohonen_layer:

  float * weights Pointer to the weights matrix.

  int winner_index Index value of the output, which is the winner.

  float win_distance The Euclidean distance of the winner weight vector from the input vector.

  int neighborhood_size The size of the neighborhood.

  Kohonen_layer(int, int, int) Constructor for the layer: inputs, outputs, and the neighborhood size.

  ~Kohonen_layer() Destructor.

  virtual void calc_out() The function to calculate the outputs; for the Kohonen layer this models lateral competition.

  void randomize_weights() A function to initialize weights with random normal values.

  void update_neigh_size(int) This function updates the neighborhood size with a new value.

  void update_weights(const float) This function updates the weights according to the training law using the passed parameter, alpha.

  void list_weights() This function can be used to list the weight matrix.

  void list_outputs() This function is used to write outputs to the output file.

  float get_win_dist() Returns the Euclidean distance between the winner weight vector and the input vector.

Kohonen_network:

  layer *layer_ptr[2] Pointer array; element 0 points to the input layer, element 1 points to the Kohonen layer.

  int layer_size[2] Array of layer sizes for the two layers.

  int neighborhood_size The current neighborhood size.

  Kohonen_network() Constructor.

  ~Kohonen_network() Destructor.

  void get_layer_info() Gets information about the layer sizes.

  void set_up_network(int) Connects layers and sets up the Kohonen map.

  void randomize_weights() Creates random normalized weights.

  void update_neigh_size(int) Changes the neighborhood size.

  void update_weights(const float) Performs weight update according to the training law.

  void list_weights() Can be used to list the weight matrix.

  void list_outputs() Can be used to list outputs.

  void get_next_vector(FILE *) Function gets another input vector from the input file.

  void process_next_pattern() Applies pattern to the Kohonen map.

  float get_win_dist() Returns the winner’s distance from the input vector.

  int get_win_index() Returns the index of the winner.

Implementation of the Kohonen Layer and Kohonen Network

This listing shows the layerk.cpp file, which has the implementation of the functions outlined.

Implementation file for the Kohonen layer and Kohonen network: layerk.cpp

// layerk.cpp          V.Rao, H.Rao
// compile for floating point hardware if available
#include "layer.cpp"
#include "layerk.h"
 
// ————————————————————-
//                          Kohonen layer
//—————————————————————
 
Kohonen_layer::Kohonen_layer(int i, int o, int  init_neigh_size) {
num_inputs=i;
num_outputs=o;
neighborhood_size=init_neigh_size;
weights = new float[num_inputs*num_outputs];
outputs = new float[num_outputs];
}
 
Kohonen_layer::~Kohonen_layer()
{
delete [] weights;
delete [] outputs;
}
void Kohonen_layer::calc_out()
{
// implement lateral competition
// choose the output with the largest
// value as the winner; neighboring
// outputs participate in next weight
// update. Winner’s output is 1 while
// all other outputs are zero
 
int i,j,k;
float accumulator=0.0;
float maxval;
winner_index=0;
maxval=-1000000;
 
for (j=0; j<num_outputs; j++)
        {
 
        for (i=0; i<num_inputs; i++)
 
                {
                k=i*num_outputs;
                if (weights[k+j]*weights[k+j] > 1000000.0)
                       {
                       cout << "weights are blowing up\n";
                       cout << "try a smaller learning constant\n";
                       cout << "e.g. alpha=0.02\n";
                       cout << "aborting...\n";
                       exit(1);
                       }
                outputs[j]=weights[k+j]*(*(inputs+i));
                accumulator+=outputs[j];
                }
        // no squash function
        outputs[j]=accumulator;
        if (outputs[j] > maxval)
                {
                maxval=outputs[j];
                winner_index=j;
                }
        accumulator=0;
        }
 
        // set winner output to 1
        outputs[winner_index]=1.0;
        // now zero out all other outputs
        for (j=0; j< winner_index; j++)
                outputs[j]=0;
        for (j=num_outputs-1; j>winner_index; j--)
                outputs[j]=0;
 
}
 
void Kohonen_layer::randomize_weights()
{
int i, j, k;
const unsigned first_time=1;
 
const unsigned not_first_time=0;
float discard;
float norm;
 
discard=randomweight(first_time);
 
for (i=0; i< num_inputs; i++)
        {
        k=i*num_outputs;
        for (j=0; j< num_outputs; j++)
                {
                weights[k+j]=randomweight(not_first_time);
                }
        }
 
// now need to normalize the weight vectors
// to unit length
// a weight vector is the set of weights for
// a given output
 
for (j=0; j< num_outputs; j++)
        {
        norm=0;
        for (i=0; i< num_inputs; i++)
                {
                k=i*num_outputs;
                norm+=weights[k+j]*weights[k+j];
                }
        norm = 1/((float)sqrt((double)norm));
 
        for (i=0; i< num_inputs; i++)
                {
                k=i*num_outputs;
                weights[k+j]*=norm;
                }
        }
 
}
 
void Kohonen_layer::update_neigh_size(int new_neigh_size)
{
neighborhood_size=new_neigh_size;
}
 
void Kohonen_layer::update_weights(const float alpha)
{
int i, j, k;
int start_index, stop_index;
// learning law: weight_change =
//             alpha*(input-weight)
//      zero change if input and weight
// vectors are aligned
// only update those outputs that
// are within a neighborhood’s distance
// from the last winner
start_index = winner_index -
                      neighborhood_size;
 
if (start_index < 0)
       start_index =0;
 
stop_index = winner_index +
                      neighborhood_size;
 
if (stop_index > num_outputs-1)
       stop_index = num_outputs-1;
 
for (i=0; i< num_inputs; i++)
        {
        k=i*num_outputs;
        for (j=start_index; j<=stop_index; j++)
               weights[k+j] +=
                      alpha*((*(inputs+i))-weights[k+j]);
    }
 
}
 
void Kohonen_layer::list_weights()
{
int i, j, k;
 
for (i=0; i< num_inputs; i++)
        {
        k=i*num_outputs;
        for (j=0; j< num_outputs; j++)
                cout << "weight["<<i<<","<<
                        j<<"] is: "<<weights[k+j];
        }
}
 
void Kohonen_layer::list_outputs()
{
int i;
 
for (i=0; i< num_outputs; i++)
       {
       cout << "outputs["<<i<<
                      "] is: "<<outputs[i];
       }
 
}
 
float Kohonen_layer::get_win_dist()
{
int i, j, k;
j=winner_index;
float accumulator=0;
 
float * win_dist_vec = new float [num_inputs];
 
for (i=0; i< num_inputs; i++)
       {
       k=i*num_outputs;
       win_dist_vec[i]=(*(inputs+i))-weights[k+j];
    accumulator+=win_dist_vec[i]*win_dist_vec[i];
    }
 
win_distance =(float)sqrt((double)accumulator);
 
delete [] win_dist_vec;
 
return win_distance;
 
}
 
Kohonen_network::Kohonen_network()
{
 
}
 
Kohonen_network::~Kohonen_network()
{
}
 
void Kohonen_network::get_layer_info()
{
int i;
 
//—————————————————————
//
//    Get layer sizes for the Kohonen network
//
// ————————————————————-
 
cout << " Enter in the layer sizes separated by spaces.\n";
cout << " A Kohonen network has an input layer \n";
cout << " followed by a Kohonen (output) layer \n";
 
for (i=0; i<2; i++)
        {
        cin >> layer_size[i];
        }
 
// ———————————————————————————
// size of layers:
//             input_layer       layer_size[0]
//             Kohonen_layer      layer_size[1]
//———————————————————————————-
 
}
void Kohonen_network::set_up_network(int nsz)
{
int i;
 
// set up neighborhood size
neighborhood_size = nsz;
 
//———————————————————————————-
// Construct the layers
//———————————————————————————-
 
layer_ptr[0] = new input_layer(0,layer_size[0]);
 
layer_ptr[1] =
       new Kohonen_layer(layer_size[0],
                      layer_size[1],neighborhood_size);
 
for (i=0;i<2;i++)
        {
        if (layer_ptr[i] == 0)
                {
                cout << "insufficient memory\n";
                cout << "use a smaller architecture\n";
                exit(1);
                }
        }
 
//———————————————————————————-
// Connect the layers
//———————————————————————————-
// set inputs to previous layer outputs for the Kohonen layer
layer_ptr[1]->inputs = layer_ptr[0]->outputs;
 
}
 
void Kohonen_network::randomize_weights()
{
 
((Kohonen_layer *)layer_ptr[1])
               ->randomize_weights();
}
 
void Kohonen_network::update_neigh_size(int n)
{
((Kohonen_layer *)layer_ptr[1])
               ->update_neigh_size(n);
}
 
void Kohonen_network::update_weights(const float a)
{
((Kohonen_layer *)layer_ptr[1])
               ->update_weights(a);
}
 
void Kohonen_network::list_weights()
{
((Kohonen_layer *)layer_ptr[1])
               ->list_weights();
}
 
void Kohonen_network::list_outputs()
{
((Kohonen_layer *)layer_ptr[1])
               ->list_outputs();
}
 
void Kohonen_network::get_next_vector(FILE * ifile)
{
int i;
float normlength=0;
int num_inputs=layer_ptr[1]->num_inputs;
float *in = layer_ptr[1]->inputs;
// get a vector and normalize it
for (i=0; i<num_inputs; i++)
        {
        fscanf(ifile,"%f",(in+i));
        normlength += (*(in+i))*(*(in+i));
        }
fscanf(ifile,"\n");
normlength = 1/(float)sqrt((double)normlength);
for (i=0; i< num_inputs; i++)
        {
        (*(in+i)) *= normlength;
        }
 
}
 
void Kohonen_network::process_next_pattern()
{
        layer_ptr[1]->calc_out();
}
 
float Kohonen_network::get_win_dist()
{
float  retval;
retval=((Kohonen_layer *)layer_ptr[1])
               ->get_win_dist();
 
return retval;
}
 
int Kohonen_network::get_win_index()
{
return ((Kohonen_layer *)layer_ptr[1])
               ->winner_index;
}

Flow of the Program and the main() Function

The main() function is contained in a file called kohonen.cpp, which is shown in the next listing. To compile this program, you need only compile this main file, kohonen.cpp; the other files are included by it.

The main implementation file, kohonen.cpp for the Kohonen Map program

// kohonen.cpp        V. Rao, H. Rao
// Program to simulate a Kohonen map
 
#include "layerk.cpp"
 
#define INPUT_FILE "input.dat"
#define OUTPUT_FILE "kohonen.dat"
#define dist_tol   0.05
 
int main()
{
 
int neighborhood_size, period;
float avg_dist_per_cycle=0.0;
float dist_last_cycle=0.0;
float avg_dist_per_pattern=100.0; // for the latest cycle
float dist_last_pattern=0.0;
float total_dist;
float alpha;
unsigned startup;
int max_cycles;
int patterns_per_cycle=0;
 
int total_cycles, total_patterns;
// create a network object
Kohonen_network knet;
 
FILE * input_file_ptr, * output_file_ptr;
 
// open input file for reading
if ((input_file_ptr=fopen(INPUT_FILE,"r"))==NULL)
               {
               cout << "problem opening input file\n";
               exit(1);
               }
 
// open output file for writing
if ((output_file_ptr=fopen(OUTPUT_FILE,"w"))==NULL)
               {
               cout << "problem opening output file\n";
               exit(1);
               }
 
// ————————————————————-
//     Read in an initial values for alpha, and the
//  neighborhood size.
//  Both of these parameters are decreased with
//  time. The number of cycles to execute before
//  decreasing the value of these parameters is
//            called the period. Read in a value for the
//            period.
// ————————————————————-
              cout << " Please enter initial values for:\n";
              cout << "alpha (0.01-1.0),\n";
              cout << "and the neighborhood size (integer between 0 and 50)\n";
              cout << "separated by spaces, e.g. 0.3 5 \n ";
 
              cin >> alpha >> neighborhood_size ;
 
              cout << "\nNow enter the period, which is the\n";
              cout << "number of cycles after which the values\n";
              cout << "for alpha and the neighborhood size are decremented\n";
              cout << "choose an integer between 1 and 500 , e.g. 50 \n";
 
              cin >> period;
 
              // Read in the maximum number of cycles
              // each pass through the input data file is a cycle
              cout << "\nPlease enter the maximum cycles for the simulation\n";
              cout << "A cycle is one pass through the data set.\n";
              cout << "Try a value of 500 to start with\n\n";
 
              cin >> max_cycles;
 
// the main loop
//
//    continue looping until the average distance is less
//            than the tolerance specified at the top of this file
//            , or the maximum number of
//            cycles is exceeded;
 
// initialize counters
total_cycles=0; // a cycle is once through all the input data
total_patterns=0; // a pattern is one entry in the input data
 
// get layer information
knet.get_layer_info();
 
// set up the network connections
knet.set_up_network(neighborhood_size);
// initialize the weights
 
// randomize weights for the Kohonen layer
// note that the randomize function for the
// Kohonen simulator generates
// weights that are normalized to length = 1
knet.randomize_weights();
 
// write header to output file
fprintf(output_file_ptr, "cycle\tpattern\twin index\tneigh_size\tavg_dist_per_pattern\n");
 
fprintf(output_file_ptr, "---------------------------------------------------------------\n");
 
// main loop
 
startup=1;
total_dist=0;
 
while (
                      (avg_dist_per_pattern > dist_tol)
                      && (total_cycles < max_cycles)
                      || (startup==1)
                      )
{
startup=0;
dist_last_cycle=0; // reset for each cycle
patterns_per_cycle=0;
// process all the vectors in the datafile
 
while (!feof(input_file_ptr))
        {
        knet.get_next_vector(input_file_ptr);
        // now apply it to the Kohonen network
        knet.process_next_pattern();
 
dist_last_pattern=knet.get_win_dist();
 
// print result to output file
fprintf(output_file_ptr,"%i\t%i\t%i\t\t%i\t\t%f\n",
     total_cycles,total_patterns,knet.get_win_index(),
     neighborhood_size,avg_dist_per_pattern);
     total_patterns++;
 
     // gradually reduce the neighborhood size
     // and the gain, alpha
     if (((total_cycles+1) % period) == 0)
             {
             if (neighborhood_size > 0)
                     neighborhood_size--;
             knet.update_neigh_size(neighborhood_size);
             if (alpha>0.1)
                    alpha -= (float)0.1;
             }
 
     patterns_per_cycle++;
     dist_last_cycle += dist_last_pattern;
     knet.update_weights(alpha);
     dist_last_pattern = 0;
  }
 
avg_dist_per_pattern= dist_last_cycle/patterns_per_cycle;
total_dist += dist_last_cycle;
total_cycles++;
 
fseek(input_file_ptr, 0L, SEEK_SET); // reset the file pointer
                              // to the beginning of
                              // the file
} // end main loop
 
cout << "\n\n\n\n\n\n\n\n\n\n\n";
cout << "---------------------------------------------------------------\n";
cout << "    done \n";
avg_dist_per_cycle= total_dist/total_cycles;
 
cout << "\n";
cout << "-->average dist per cycle = " << avg_dist_per_cycle << " <--\n";
cout << "-->dist last cycle = " << dist_last_cycle << " <--\n";
cout << "->dist last cycle per pattern= " << avg_dist_per_pattern << " <--\n";
cout << "------>total cycles = " << total_cycles << " <--\n";
cout << "------>total patterns = " << total_patterns << " <--\n";
cout << "---------------------------------------------------------------\n";
// close the input file
fclose(input_file_ptr);
}
 

Flow of the Program

The flow of the program is very similar to the backpropagation simulator. The criterion for ending the simulation in the Kohonen program is the average winner distance. This is a Euclidean distance measure between the input vector and the winner’s weight vector. This distance is the square root of the sum of the squares of the differences between individual vector components between the two vectors.
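
For example, with a made-up normalized input vector (0.6, 0.8) and a winner weight vector (1, 0), the winner distance would be √((0.6 - 1)² + (0.8 - 0)²) = √(0.16 + 0.64) ≈ 0.894.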

Results from Running the Kohonen Program

Once you compile the program, you need to create an input file to try it. We will first use a very simple input file and examine the results.

A Simple First Example

Let us create an input file, input.dat, which contains only two arbitrary vectors:

0.4 0.98 0.1 0.2

0.5 0.22 0.8 0.9

 

The file contains two four-dimensional vectors. We expect to see output that contains a different winner neuron for each of these patterns. If this is the case, then the Kohonen map has assigned different categories for each of the input vectors, and, in the future, you can expect to get the same winner classification for vectors that are close to or equal to these vectors.

By running the Kohonen map program, you will see the following output (the user's responses appear on the lines following each prompt):

Please enter initial values for:
alpha (0.01-1.0),
and the neighborhood size (integer between 0 and 50)
separated by spaces, e.g. 0.3 5
0.3 5
Now enter the period, which is the
number of cycles after which the values
for alpha and the neighborhood size are decremented
choose an integer between 1 and 500 , e.g. 50
50
Please enter the maximum cycles for the simulation
A cycle is one pass through the data set.
Try a value of 500 to start with
500
 
 Enter in the layer sizes separated by spaces.
 A Kohonen network has an input layer
 followed by a Kohonen (output) layer
4 10
——————————————————————————
       done
-->average dist per cycle = 0.544275 <--
-->dist last cycle = 0.0827523 <--
->dist last cycle per pattern= 0.0413762 <--
------>total cycles = 11 <--
------>total patterns = 22 <--
——————————————————————————

The layer sizes are given as 4 for the input layer and 10 for the Kohonen layer. You should choose the size of the Kohonen layer to be larger than the number of distinct patterns that you think are in the input data set. One of the outputs reported on the screen is the distance for the last cycle per pattern. This value, about 0.04, is less than the terminating value of 0.05 set at the top of the kohonen.cpp file, so the map converged on a solution. Let us look at the output file, kohonen.dat, to see the mapping to winner indexes:

cycle     pattern     win index     neigh_size    avg_dist_per_pattern
——————————————————————————————————————————————————————————————————————
0         0           1             5             100.000000
0         1           3             5             100.000000
1         2           1             5             0.304285
1         3           3             5             0.304285
2         4           1             5             0.568255
2         5           3             5             0.568255
3         6           1             5             0.542793
3         7           8             5             0.542793
4         8           1             5             0.502416
4         9           8             5             0.502416
5         10          1             5             0.351692
5         11          8             5             0.351692
6         12          1             5             0.246184
6         13          8             5             0.246184
7         14          1             5             0.172329
7         15          8             5             0.172329
8         16          1             5             0.120630
8         17          8             5             0.120630
9         18          1             5             0.084441
9         19          8             5             0.084441
10        20          1             5             0.059109
10        21          8             5             0.059109

In this example, the neighborhood size stays at its initial value of 5. In the first column you see the cycle number, and in the second the pattern number. Since there are two patterns per cycle, you see the cycle number repeated twice for each cycle.

The Kohonen map was able to find two distinct winner neurons for each of the patterns. One has winner index 1 and the other index 8.

Orthogonal Input Vectors Example

For a second example, look at Figure Orthogonal input vectors, where we choose input vectors on a two-dimensional unit circle that are 90° apart. The input.dat file should look like the following:

 1  0
 0  1
-1  0
 0 -1

Orthogonal input vectors.

Using the same parameters for the Kohonen network, but with layer sizes of 2 and 10, what result would you expect? The output file, kohonen.dat, follows:

cycle     pattern     win index    neigh_size      avg_dist_per_pattern
———————————————————————————————————————————————————————————————————————
0         0           4            5               100.000000
0         1           0            5               100.000000
0         2           9            5               100.000000
0         3           3            5               100.000000
1         4           4            5               0.444558
1         5           0            5               0.444558
 
497       1991        6            0               0.707107
498       1992        0            0               0.707107
498       1993        0            0               0.707107
498       1994        6            0               0.707107
498       1995        6            0               0.707107
499       1996        0            0               0.707107
499       1997        0            0               0.707107
499       1998        6            0               0.707107
499       1999        6            0               0.707107

You can see that this example doesn't quite work. Even though the neighborhood size gradually got reduced to zero, the four inputs did not get categorized to different outputs. The winner distance became stuck at the value of 0.707, which is the distance from each input vector to a weight vector that ends up pointing midway (at 45°) between two adjacent inputs. In other words, the map generalizes a little too much, arriving at the middle value for all of the input vectors.

You can fix this problem by starting with a smaller neighborhood size, which provides for less generalization. By using the same parameters and a neighborhood size of 2, the following output is obtained.

cycle     pattern     win index    neigh_size    avg_dist_per_pattern
—————————————————————————————————————————————————————————————————————
0         0           5            2             100.000000
0         1           6            2             100.000000
0         2           4            2             100.000000
0         3           9            2             100.000000
1         4           0            2             0.431695
1         5           6            2             0.431695
1         6           3            2             0.431695
1         7           9            2             0.431695
2         8           0            2             0.504728
2         9           6            2             0.504728
2         10          3            2             0.504728
2         11          9            2             0.504728
3         12          0            2             0.353309
3         13          6            2             0.353309
3         14          3            2             0.353309
3         15          9            2             0.353309
4         16          0            2             0.247317
4         17          6            2             0.247317
4         18          3            2             0.247317
4         19          9            2             0.247317
5         20          0            2             0.173122
5         21          6            2             0.173122
5         22          3            2             0.173122
5         23          9            2             0.173122
6         24          0            2             0.121185
6         25          6            2             0.121185
6         26          3            2             0.121185
6         27          9            2             0.121185
7         28          0            2             0.084830
7         29          6            2             0.084830
7         30          3            2             0.084830
7         31          9            2             0.084830
8         32          0            2             0.059381
8         33          6            2             0.059381
8         34          3            2             0.059381
8         35          9            2             0.059381

For this case, the network quickly converges on a unique winner for each of the four input patterns, and the distance criterion is below the set criterion within eight cycles. You can experiment with other input data sets and combinations of Kohonen network parameters.

Variations and Applications of Kohonen Networks

There are many variations of the Kohonen network. Some of these will be briefly discussed in this section.

Using a Conscience

DeSieno has used a conscience factor in a Kohonen network. For a winning neuron, if the neuron is winning more than a fair share of the time (roughly more than 1/n, where n is the number of neurons), then this neuron has a threshold that is applied temporarily to allow other neurons the chance to win. The purpose of this modification is to allow more uniform weight distribution while learning is taking place.
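
A minimal sketch of the idea (an illustration only, not DeSieno's exact formulation and not code from this chapter's simulator) is to penalize neurons that have won far more often than the fair share 1/n when choosing the winner:

// illustrative only: bias the competition against neurons that have
// won much more often than the fair share 1/n
// win_count[j] holds the number of times neuron j has won so far
int winner_with_conscience(const float *outputs, const int *win_count,
        int num_outputs, int total_wins, float bias)
{
int j, winner=0;
float best= -1.0e30f;
float fair_share, frequency, score;
 
for (j=0; j<num_outputs; j++)
        {
        fair_share = 1.0f/(float)num_outputs;
        frequency = (total_wins > 0) ?
                (float)win_count[j]/(float)total_wins : 0.0f;
        score = outputs[j] - bias*(frequency - fair_share);
        if (score > best)
                {
                best = score;
                winner = j;
                }
        }
return winner;
}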

LVQ: Learning Vector Quantizer

You have read about LVQ (Learning Vector Quantizer) in previous chapters. In light of the Kohonen map, it should be pointed out that the LVQ is simply a supervised version of the Kohonen network. Inputs and expected output categories are presented to the network for training. You get data clustered, just as in a Kohonen network, according to similarity to other data inputs.
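
A rough sketch of the supervised update (this follows the commonly described LVQ1 rule; it is not code from this book's simulators) moves the winner toward the input when its class label matches the desired category, and away from it otherwise:

// illustrative LVQ1-style update for the winner's weight vector
// class_matches is nonzero when the winner's label equals the desired category
void lvq_update(float *weight, const float *input, int num_inputs,
        float alpha, int class_matches)
{
int i;
for (i=0; i<num_inputs; i++)
        {
        if (class_matches)
                weight[i] += alpha*(input[i]-weight[i]); // pull toward input
        else
                weight[i] -= alpha*(input[i]-weight[i]); // push away from input
        }
}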

Counterpropagation Network

A neural network topology, called a counterpropagation network, is a combination of a Kohonen layer with a Grossberg layer. This network was developed by Robert Hecht-Nielsen and is useful for prototyping of systems, with a fairly rapid training time compared to backpropagation. The Kohonen layer provides for categorization, while the Grossberg layer allows for Hebbian conditioned learning. Counterpropagation has been used successfully in data compression applications for images. Compression ratios of 10:1 to 100:1 have been obtained, using a lossy compression scheme that codes the image with a technique called vector quantization, where the image is broken up into representative subimage vectors. The statistics of these vectors are such that a large part of the image can be adequately represented by a subset of all the vectors. The vectors with the highest frequency of occurrence are coded with the shortest bit strings; hence you achieve data compression.
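
To make the vector quantization idea concrete, here is a small illustrative sketch (not the counterpropagation network itself) that codes one subimage vector as the index of its nearest codebook vector:

// illustrative only: code one subimage vector as the index of the
// closest (smallest Euclidean distance) codebook vector
int nearest_codeword(const float *vec, const float *codebook,
        int vec_len, int num_codewords)
{
int i, j, best_index=0;
float diff, dist, best_dist=1.0e30f;
 
for (j=0; j<num_codewords; j++)
        {
        dist=0;
        for (i=0; i<vec_len; i++)
                {
                diff = vec[i] - codebook[j*vec_len+i];
                dist += diff*diff;
                }
        if (dist < best_dist)
                {
                best_dist = dist;
                best_index = j;
                }
        }
return best_index;
}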

Application to Speech Recognition

Kohonen created a phonetic typewriter by classifying speech waveforms of different phonemes of Finnish speech into different categories using a Kohonen SOM. The Kohonen phoneme map used 50 samples of each phoneme for calibration. These samples caused excitation in a neighborhood of cells more strongly than in other cells. A neighborhood was labeled with the particular phoneme that caused excitation. For an utterance of speech made to the network, the exact neighborhoods that were active during the utterance were noted, and for how long, and in what sequence. Short excitations were taken as transitory sounds. The information obtained from the network was then pieced together to find out the words in the utterance made to the network.

Summary

In this chapter, you have learned about one of the important types of competitive learning, the Kohonen feature map. The most significant points of this discussion are outlined as follows:

  The Kohonen feature map is an example of an unsupervised neural network that is mainly used as a classifier system or data clustering system. As more inputs are presented to this network, the network improves its learning and is able to adapt to changing inputs.

  The training law for the Kohonen network tries to align the weight vectors along the same direction as input vectors.

  The Kohonen network models lateral competition as a form of self-organization. One winner neuron is derived for each input pattern to categorize that input.

  Only neurons within a certain distance (neighborhood) from the winner are allowed to participate in training for a given input pattern.