This chapter discusses one type of unsupervised competitive learning, the *Kohonen feature map*, or *self-organizing map* (*SOM*). As you recall, in unsupervised learning no expected outputs are presented to a neural network, as they are in a supervised training algorithm such as backpropagation. Instead, the network, by virtue of its self-organizing properties, infers relationships as more inputs are presented to it. One advantage of this scheme is that you can expect the system to adapt to changing conditions and inputs: the system constantly learns. The Kohonen SOM is a neural network system developed by Teuvo Kohonen of Helsinki University of Technology and is often used to classify inputs into different categories. Applications for feature maps can be traced to many areas, including speech recognition and robot motor control.

A Kohonen feature map may be used by itself or as a layer of another neural network. A Kohonen layer is composed of neurons that compete with each other. As in Adaptive Resonance Theory, the Kohonen SOM is another case of a winner-take-all strategy. Inputs are fed from the input layer into each of the neurons in the Kohonen layer. Each neuron determines its output according to a weighted sum formula:

output_j = Σ_i w_{ij} x_{i}

The weights and the inputs are usually normalized, which means that the magnitude of each weight and input vector is set equal to one. The neuron with the largest output is the winner; this neuron has a final output of 1, while all other neurons in the layer have an output of zero. Differing input patterns end up firing different winner neurons, whereas similar or identical input patterns classify to the same output neuron: like inputs cluster together. In Chapter **Application to Pattern Recognition**, you will see the use of a Kohonen network in pattern classification.
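As a concrete illustration, the winner selection amounts to a dot product followed by picking the maximum. The following is a minimal sketch, separate from the simulator developed later in this chapter; the names here are illustrative only:

```cpp
#include <vector>

// Pick the winning neuron: the one whose weight vector has the
// largest weighted sum (dot product) with the normalized input.
int pick_winner(const std::vector<std::vector<float> >& w, // w[j] = weight vector of neuron j
                const std::vector<float>& x)               // normalized input vector
{
    int winner = 0;
    float maxval = -1.0e9f;
    for (int j = 0; j < (int)w.size(); j++)
        {
        float sum = 0;
        for (int i = 0; i < (int)x.size(); i++)
            sum += w[j][i] * x[i];   // weighted sum, no squashing function
        if (sum > maxval)
            {
            maxval = sum;
            winner = j;              // this neuron's output becomes 1
            }
        }
    return winner;                   // all other outputs are set to zero
}
```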

Consider a vector, **A** = *a*x + *b*y + *c*z. The normalized vector **A′** is obtained by dividing each component of **A** by the square root of the sum of squares of all the components; in other words, each component is multiplied by 1/√(*a*² + *b*² + *c*²). Both the weight vector and the input vector are normalized during the operation of the Kohonen feature map. The reason is that the training law uses subtraction of the weight vector from the input vector; normalizing both reduces them to unit-less quantities and hence makes the subtraction of like quantities meaningful. You will learn more about the training law shortly.
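In code, the normalization step looks like the following minimal sketch (the simulator later in this chapter performs the same arithmetic inline when reading input vectors and initializing weights):

```cpp
#include <cmath>

// Scale an n-component vector v to unit length, as described above:
// divide each component by the square root of the sum of squares.
void normalize(float *v, int n)
{
    float norm = 0;
    for (int i = 0; i < n; i++)
        norm += v[i] * v[i];                 // sum of squares
    norm = 1 / (float)std::sqrt((double)norm);
    for (int i = 0; i < n; i++)
        v[i] *= norm;                        // each component scaled
}
```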

*Lateral inhibition* is a process that takes place in some biological neural networks. Neurons in a given layer form lateral connections to one another, and these connections squash distant neighbors: the strength of a connection is inversely related to distance. The positive, supportive connections are termed *excitatory*, while the negative, squashing connections are termed *inhibitory*.

A biological example of lateral inhibition occurs in the human vision system.

Figure **The Mexican hat function showing lateral inhibition** shows a function, called the **Mexican hat** function, which relates the connection strength to the distance from the winning neuron. The effect of this function is to set up a competitive environment for learning: only winning neurons and their neighbors participate in learning for a given input pattern.
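Although this chapter's simulator approximates the Mexican hat with a simple neighborhood cutoff, one common analytic form of the curve (an illustrative choice, not the book's code) is:

f(d) = (1 − d²/σ²) exp(−d² / (2σ²))

where d is the distance from the winning neuron and σ sets the radius of the excitatory center; f is positive for d < σ, negative beyond it, and decays toward zero at large distances.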

The **Mexican hat** function showing lateral inhibition.

The training law for the Kohonen feature map is straightforward. The change in weight vector for a given output neuron is a gain constant, alpha, multiplied by the difference between the input vector and the old weight vector:

**W**_{new} = **W**_{old} + alpha * (Input − **W**_{old})

Both the old weight vector and the input vector are normalized to unit length. Alpha is a gain constant between 0 and 1.
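As a quick worked example of this law, take a two-dimensional case with **W**_{old} = (1, 0), Input = (0.707, 0.707), and alpha = 0.3:

**W**_{new} = (1, 0) + 0.3 * ((0.707, 0.707) − (1, 0)) = (0.912, 0.212)

The weight vector has rotated from 0° to about 13°, roughly 30% (alpha's worth) of the angular gap to the 45° input. Note that the updated vector is slightly shorter than unit length; the simulator in this chapter normalizes weights only at initialization, not after each update.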

Let us consider the case of a two-dimensional input vector. If you look at a
unit circle, as shown in Figure **The
training law for the Kohonen map as shown on a unit
circle**, the effect of the training law is to try to align the weight vector
and the input vector. Each pattern attempts to nudge the weight vector closer
by a fraction determined by alpha. For three dimensions the surface becomes a
unit sphere instead of a circle. For higher dimensions you term the surface a *hypersphere*. It is not necessarily ideal to have
perfect alignment of the input and weight vectors. You use neural networks for
their ability to recognize patterns, but also to generalize input data sets. By
aligning all input vectors to the corresponding winner weight vectors, you are
essentially *memorizing* the input data set classes. It may be more
desirable to come close, so that noisy or incomplete inputs may still trigger
the correct classification.

The training law for the Kohonen map as shown on a unit circle.

In the Kohonen map, a parameter called the *neighborhood size* is used to model the effect of the **Mexican hat** function. Neurons within the distance specified by the neighborhood size participate in training and weight vector changes; those outside this distance do not participate in learning. The neighborhood size typically starts at an initial value and is decreased as the input pattern cycles continue. This process tends to support the winner-take-all strategy by eventually singling out a winner neuron for a given pattern.

Figure **Winner neuron with a
neighborhood size of 2 for a Kohonen map** shows a
linear arrangement of neurons with a neighborhood size of 2. The hashed central
neuron is the winner. The darkened adjacent neurons are those that will
participate in training.

Winner neuron with a neighborhood size of 2 for a Kohonen map.

Besides the neighborhood size, **alpha** is typically also reduced as the simulation progresses. You will see these features when we develop the Kohonen map program.
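Concretely, in the simulator developed below, both reductions are stepwise: every *period* cycles, the neighborhood size is decremented by one (but never below zero), and alpha is reduced by 0.1 as long as it remains above 0.1. In spirit, the neighborhood follows a schedule like

neighborhood_size(t) ≈ max(0, N₀ − ⌊t / period⌋)

where t is the cycle count and N₀ is the initial size (the code applies the decrement inside the pattern loop, so the exact step points differ slightly).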

The C++ code for the Kohonen map draws on much of the code developed for the backpropagation simulator. The Kohonen map is a much simpler program and does not rely on as large an input data set. The Kohonen map program uses only two files: an input file and an output file. In order to use the program, you must create an input data set and save it in a file called **input.dat**. The output file, called **kohonen.dat**, is saved in your current working directory. You will get more details shortly on the formats of these files.

The Kohonen network has two layers, an *input* layer and a *Kohonen output* layer (see Figure **A Kohonen network**). The size of the input layer is determined by the user and must match the size of each row (pattern) in the input data file.

A Kohonen network.

The **Mexican hat** function shows positive values for an immediate neighborhood around the neuron and negative values for distant neurons. A true model would incorporate mutual excitation, or support, for neurons within the neighborhood (with this excitation increasing for nearer neurons) and inhibition for distant neurons outside the neighborhood. For the sake of computational efficiency, we model lateral inhibition and excitation by finding the maximum output among the output neurons and declaring that neuron the winner. The other outputs are inhibited by being set to zero. Training, or weight update, is then performed on all outputs that are within a neighborhood-size distance from the winner neuron; neurons outside the neighborhood do not participate in training. Modeling lateral inhibition faithfully would be too expensive, since the number of lateral connections is quite large. You will find that this approximation leads to a network with many, if not all, of the properties of a true model of a Kohonen network.

We use many of the classes from the backpropagation simulator. We require only two layers, the input layer and the Kohonen layer. We make a new layer class called the **Kohonen_layer** class, and a new network class called the **Kohonen_network**.

The **layer** class needs to be slightly modified, as shown in Listing **Modification of layer.h**.

```cpp
// layer.h V.Rao, H.Rao
// header file for the layer class hierarchy and
// the network class

#define MAX_LAYERS 5
#define MAX_VECTORS 100

class network;
class Kohonen_network;

class layer
{

protected:

    int num_inputs;
    int num_outputs;
    float *outputs; // pointer to array of outputs
    float *inputs;  // pointer to array of inputs, which
                    // are outputs of some other layer

    friend class network;
    friend class Kohonen_network; // update for Kohonen model

public:

    virtual void calc_out()=0;

};

...
```

Here the change is the line marked with the comment *update for Kohonen model*. You notice that the **Kohonen_network** is made a friend of the **layer** class, so that the **Kohonen_network** can have access to the data of a layer.

The next step is to create a **Kohonen_layer** class and a **Kohonen_network** class. This is shown in Listing **The Kohonen_layer class and Kohonen_network class in layerk.h**.

```cpp
// layerk.h V.Rao, H.Rao
// header file for the Kohonen layer and
// the Kohonen network

class Kohonen_network;

class Kohonen_layer: public layer
{

protected:

    float * weights;
    int winner_index;
    float win_distance;
    int neighborhood_size;

    friend class Kohonen_network;

public:

    Kohonen_layer(int, int, int);
    ~Kohonen_layer();
    virtual void calc_out();
    void randomize_weights();
    void update_neigh_size(int);
    void update_weights(const float);
    void list_weights();
    void list_outputs();
    float get_win_dist();

};

class Kohonen_network
{

private:

    layer *layer_ptr[2];
    int layer_size[2];
    int neighborhood_size;

public:

    Kohonen_network();
    ~Kohonen_network();
    void get_layer_info();
    void set_up_network(int);
    void randomize_weights();
    void update_neigh_size(int);
    void update_weights(const float);
    void list_weights();
    void list_outputs();
    void get_next_vector(FILE *);
    void process_next_pattern();
    float get_win_dist();
    int get_win_index();

};
```

The **Kohonen_layer** class is derived from the **layer** class, so it inherits the pointers to a set of outputs and a set of inputs. Let's look at some of the functions and member variables.

**Kohonen_layer:**

**•** **float * weights** Pointer to the weights matrix.

**•** **int winner_index** Index value of the output that is the winner.

**•** **float win_distance** The Euclidean distance of the winner weight vector from the input vector.

**•** **int neighborhood_size** The size of the neighborhood.

**•** **Kohonen_layer(int, int, int)** Constructor for the layer: inputs, outputs, and the neighborhood size.

**•** **~Kohonen_layer()** Destructor.

**•** **virtual void calc_out()** The function to calculate the outputs; for the Kohonen layer, this models lateral competition.

**•** **void randomize_weights()** A function to initialize weights with random normal values.

**•** **void update_neigh_size(int)** This function updates the neighborhood size with a new value.

**•** **void update_weights(const float)** This function updates the weights according to the training law, using the passed parameter, alpha.

**•** **void list_weights()** This function can be used to list the weight matrix.

**•** **void list_outputs()** This function is used to write outputs to the output file.

**•** **float get_win_dist()** Returns the Euclidean distance between the winner weight vector and the input vector.

**Kohonen_network:**

**•** **layer *layer_ptr[2]** Pointer array; element 0 points to the input layer, element 1 points to the Kohonen layer.

**•** **int layer_size[2]** Array of layer sizes for the two layers.

**•** **int neighborhood_size** The current neighborhood size.

**•** **Kohonen_network()** Constructor.

**•** **~Kohonen_network()** Destructor.

**•** **void get_layer_info()** Gets information about the layer sizes.

**•** **void set_up_network(int)** Connects layers and sets up the Kohonen map.

**•** **void randomize_weights()** Creates random normalized weights.

**•** **void update_neigh_size(int)** Changes the neighborhood size.

**•** **void update_weights(const float)** Performs weight updates according to the training law.

**•** **void list_weights()** Can be used to list the weight matrix.

**•** **void list_outputs()** Can be used to list outputs.

**•** **void get_next_vector(FILE *)** Gets another input vector from the input file.

**•** **void process_next_pattern()** Applies a pattern to the Kohonen map.

**•** **float get_win_dist()** Returns the winner's distance from the input vector.

**•** **int get_win_index()** Returns the index of the winner.

The following listing shows the *layerk.cpp* file, which has the implementation of the functions outlined above.

Implementation file for the Kohonen layer and Kohonen network: layerk.cpp

```cpp
// layerk.cpp V.Rao, H.Rao
// compile for floating point hardware if available

#include "layer.cpp"
#include "layerk.h"

// --------------------------------------------------
// Kohonen layer
// --------------------------------------------------

Kohonen_layer::Kohonen_layer(int i, int o, int init_neigh_size)
{
num_inputs=i;
num_outputs=o;
neighborhood_size=init_neigh_size;
weights = new float[num_inputs*num_outputs];
outputs = new float[num_outputs];
}

Kohonen_layer::~Kohonen_layer()
{
delete [] weights;
delete [] outputs;
}

void Kohonen_layer::calc_out()
{
// implement lateral competition
// choose the output with the largest
// value as the winner; neighboring
// outputs participate in the next weight
// update. Winner's output is 1 while
// all other outputs are zero
int i,j,k;
float accumulator=0.0;
float maxval;
winner_index=0;
maxval=-1000000;

for (j=0; j<num_outputs; j++)
    {
    for (i=0; i<num_inputs; i++)
        {
        k=i*num_outputs;
        if (weights[k+j]*weights[k+j] > 1000000.0)
            {
            cout << "weights are blowing up\n";
            cout << "try a smaller learning constant\n";
            cout << "e.g. alpha=0.02, aborting...\n";
            exit(1);
            }
        outputs[j]=weights[k+j]*(*(inputs+i));
        accumulator+=outputs[j];
        }
    // no squash function
    outputs[j]=accumulator;
    if (outputs[j] > maxval)
        {
        maxval=outputs[j];
        winner_index=j;
        }
    accumulator=0;
    }

// set winner output to 1
outputs[winner_index]=1.0;
// now zero out all other outputs
for (j=0; j< winner_index; j++)
    outputs[j]=0;
for (j=num_outputs-1; j>winner_index; j--)
    outputs[j]=0;
}

void Kohonen_layer::randomize_weights()
{
int i, j, k;
const unsigned first_time=1;
const unsigned not_first_time=0;
float discard;
float norm;

discard=randomweight(first_time);

for (i=0; i< num_inputs; i++)
    {
    k=i*num_outputs;
    for (j=0; j< num_outputs; j++)
        {
        weights[k+j]=randomweight(not_first_time);
        }
    }

// now need to normalize the weight vectors
// to unit length
// a weight vector is the set of weights for
// a given output
for (j=0; j< num_outputs; j++)
    {
    norm=0;
    for (i=0; i< num_inputs; i++)
        {
        k=i*num_outputs;
        norm+=weights[k+j]*weights[k+j];
        }
    norm = 1/((float)sqrt((double)norm));
    for (i=0; i< num_inputs; i++)
        {
        k=i*num_outputs;
        weights[k+j]*=norm;
        }
    }
}

void Kohonen_layer::update_neigh_size(int new_neigh_size)
{
neighborhood_size=new_neigh_size;
}

void Kohonen_layer::update_weights(const float alpha)
{
int i, j, k;
int start_index, stop_index;
// learning law: weight_change =
//    alpha*(input-weight)
// zero change if input and weight
// vectors are aligned
// only update those outputs that
// are within a neighborhood's distance
// from the last winner
start_index = winner_index - neighborhood_size;
if (start_index < 0)
    start_index = 0;
stop_index = winner_index + neighborhood_size;
if (stop_index > num_outputs-1)
    stop_index = num_outputs-1;

for (i=0; i< num_inputs; i++)
    {
    k=i*num_outputs;
    for (j=start_index; j<=stop_index; j++)
        weights[k+j] += alpha*((*(inputs+i))-weights[k+j]);
    }
}

void Kohonen_layer::list_weights()
{
int i, j, k;
for (i=0; i< num_inputs; i++)
    {
    k=i*num_outputs;
    for (j=0; j< num_outputs; j++)
        cout << "weight["<<i<<","<<j<<"] is: "<<weights[k+j];
    }
}

void Kohonen_layer::list_outputs()
{
int i;
for (i=0; i< num_outputs; i++)
    {
    cout << "outputs["<<i<<"] is: "<<outputs[i];
    }
}

float Kohonen_layer::get_win_dist()
{
int i, j, k;
j=winner_index;
float accumulator=0;

float * win_dist_vec = new float [num_inputs];

for (i=0; i< num_inputs; i++)
    {
    k=i*num_outputs;
    win_dist_vec[i]=(*(inputs+i))-weights[k+j];
    accumulator+=win_dist_vec[i]*win_dist_vec[i];
    }
win_distance =(float)sqrt((double)accumulator);
delete [] win_dist_vec;
return win_distance;
}

Kohonen_network::Kohonen_network()
{
}

Kohonen_network::~Kohonen_network()
{
}

void Kohonen_network::get_layer_info()
{
int i;
// --------------------------------------------------
//
// Get layer sizes for the Kohonen network
//
// --------------------------------------------------
cout << " Enter in the layer sizes separated by spaces.\n";
cout << " A Kohonen network has an input layer \n";
cout << " followed by a Kohonen (output) layer \n";

for (i=0; i<2; i++)
    {
    cin >> layer_size[i];
    }

// --------------------------------------------------
// size of layers:
//    input_layer      layer_size[0]
//    Kohonen_layer    layer_size[1]
// --------------------------------------------------
}

void Kohonen_network::set_up_network(int nsz)
{
int i;
// set up neighborhood size
neighborhood_size = nsz;
// --------------------------------------------------
// Construct the layers
// --------------------------------------------------
layer_ptr[0] = new input_layer(0,layer_size[0]);
layer_ptr[1] = new Kohonen_layer(layer_size[0],
                    layer_size[1],neighborhood_size);

for (i=0;i<2;i++)
    {
    if (layer_ptr[i] == 0)
        {
        cout << "insufficient memory\n";
        cout << "use a smaller architecture\n";
        exit(1);
        }
    }

// --------------------------------------------------
// Connect the layers
// --------------------------------------------------
// set inputs to previous layer outputs for the Kohonen layer
layer_ptr[1]->inputs = layer_ptr[0]->outputs;
}

void Kohonen_network::randomize_weights()
{
((Kohonen_layer *)layer_ptr[1])->randomize_weights();
}

void Kohonen_network::update_neigh_size(int n)
{
((Kohonen_layer *)layer_ptr[1])->update_neigh_size(n);
}

void Kohonen_network::update_weights(const float a)
{
((Kohonen_layer *)layer_ptr[1])->update_weights(a);
}

void Kohonen_network::list_weights()
{
((Kohonen_layer *)layer_ptr[1])->list_weights();
}

void Kohonen_network::list_outputs()
{
((Kohonen_layer *)layer_ptr[1])->list_outputs();
}

void Kohonen_network::get_next_vector(FILE * ifile)
{
int i;
float normlength=0;
int num_inputs=layer_ptr[1]->num_inputs;
float *in = layer_ptr[1]->inputs;
// get a vector and normalize it
for (i=0; i<num_inputs; i++)
    {
    fscanf(ifile,"%f",(in+i));
    normlength += (*(in+i))*(*(in+i));
    }
fscanf(ifile,"\n");
normlength = 1/(float)sqrt((double)normlength);
for (i=0; i< num_inputs; i++)
    {
    (*(in+i)) *= normlength;
    }
}

void Kohonen_network::process_next_pattern()
{
layer_ptr[1]->calc_out();
}

float Kohonen_network::get_win_dist()
{
float retval;
retval=((Kohonen_layer *)layer_ptr[1])->get_win_dist();
return retval;
}

int Kohonen_network::get_win_index()
{
return ((Kohonen_layer *)layer_ptr[1])->winner_index;
}
```

The **main()** function is contained in a file called **kohonen.cpp**, which is shown in the next listing. To compile this program, you need only compile this main file, kohonen.cpp; the other files are pulled in by its include directives.

The main implementation file, kohonen.cpp, for the Kohonen map program

```cpp
// kohonen.cpp V.Rao, H.Rao
// Program to simulate a Kohonen map

#include "layerk.cpp"

#define INPUT_FILE "input.dat"
#define OUTPUT_FILE "kohonen.dat"
#define dist_tol 0.05

int main()
{
int neighborhood_size, period;
float avg_dist_per_cycle=0.0;
float dist_last_cycle=0.0;
float avg_dist_per_pattern=100.0; // for the latest cycle
float dist_last_pattern=0.0;
float total_dist;
float alpha;
unsigned startup;
int max_cycles;
int patterns_per_cycle=0;
int total_cycles, total_patterns;

// create a network object
Kohonen_network knet;

FILE * input_file_ptr, * output_file_ptr;

// open input file for reading
if ((input_file_ptr=fopen(INPUT_FILE,"r"))==NULL)
    {
    cout << "problem opening input file\n";
    exit(1);
    }

// open output file for writing
if ((output_file_ptr=fopen(OUTPUT_FILE,"w"))==NULL)
    {
    cout << "problem opening output file\n";
    exit(1);
    }

// --------------------------------------------------
// Read in initial values for alpha and the
// neighborhood size.
// Both of these parameters are decreased with
// time. The number of cycles to execute before
// decreasing the value of these parameters is
// called the period. Read in a value for the
// period.
// --------------------------------------------------
cout << " Please enter initial values for:\n";
cout << "alpha (0.01-1.0),\n";
cout << "and the neighborhood size (integer between 0 and 50)\n";
cout << "separated by spaces, e.g. 0.3 5 \n ";

cin >> alpha >> neighborhood_size;

cout << "\nNow enter the period, which is the\n";
cout << "number of cycles after which the values\n";
cout << "for alpha and the neighborhood size are decremented\n";
cout << "choose an integer between 1 and 500, e.g. 50 \n";

cin >> period;

// Read in the maximum number of cycles
// each pass through the input data file is a cycle
cout << "\nPlease enter the maximum cycles for the simulation\n";
cout << "A cycle is one pass through the data set.\n";
cout << "Try a value of 500 to start with\n\n";

cin >> max_cycles;

// the main loop
//
// continue looping until the average distance is less
// than the tolerance specified at the top of this file,
// or the maximum number of cycles is exceeded

// initialize counters
total_cycles=0;   // a cycle is once through all the input data
total_patterns=0; // a pattern is one entry in the input data

// get layer information
knet.get_layer_info();

// set up the network connections
knet.set_up_network(neighborhood_size);

// initialize the weights
// randomize weights for the Kohonen layer
// note that the randomize function for the
// Kohonen simulator generates
// weights that are normalized to length = 1
knet.randomize_weights();

// write header to output file
fprintf(output_file_ptr,
    "cycle\tpattern\twin index\tneigh_size\tavg_dist_per_pattern\n");
fprintf(output_file_ptr,
    "------------------------------------------------------------\n");

// main loop
startup=1;
total_dist=0;

while (
        ((avg_dist_per_pattern > dist_tol)
        && (total_cycles < max_cycles))
        || (startup==1)
      )
{
startup=0;
dist_last_cycle=0; // reset for each cycle
patterns_per_cycle=0;

// process all the vectors in the datafile
while (!feof(input_file_ptr))
    {
    knet.get_next_vector(input_file_ptr);

    // now apply it to the Kohonen network
    knet.process_next_pattern();

    dist_last_pattern=knet.get_win_dist();

    // print result to output file
    fprintf(output_file_ptr,"%i\t%i\t%i\t\t%i\t\t%f\n",
        total_cycles,total_patterns,knet.get_win_index(),
        neighborhood_size,avg_dist_per_pattern);

    total_patterns++;

    // gradually reduce the neighborhood size
    // and the gain, alpha
    if (((total_cycles+1) % period) == 0)
        {
        if (neighborhood_size > 0)
            neighborhood_size--;
        knet.update_neigh_size(neighborhood_size);
        if (alpha>0.1)
            alpha -= (float)0.1;
        }

    patterns_per_cycle++;
    dist_last_cycle += dist_last_pattern;
    knet.update_weights(alpha);
    dist_last_pattern = 0;
    }

avg_dist_per_pattern = dist_last_cycle/patterns_per_cycle;
total_dist += dist_last_cycle;
total_cycles++;

// reset the file pointer to the beginning of the file
fseek(input_file_ptr, 0L, SEEK_SET);

} // end main loop

cout << "\n\n\n\n\n\n\n\n\n\n\n";
cout << "------------------------------------------------\n";
cout << "    done \n";

avg_dist_per_cycle = total_dist/total_cycles;

cout << "\n";
cout << "--->average dist per cycle = " << avg_dist_per_cycle << " <---\n";
cout << "--->dist last cycle = " << dist_last_cycle << " <---\n";
cout << "->dist last cycle per pattern= " << avg_dist_per_pattern << " <---\n";
cout << "------>total cycles = " << total_cycles << " <---\n";
cout << "------>total patterns = " << total_patterns << " <---\n";
cout << "------------------------------------------------\n";

// close the files
fclose(input_file_ptr);
fclose(output_file_ptr);

return 0;
}
```

The flow of the program is very similar to the backpropagation simulator. The criterion for ending the simulation in the Kohonen program is the *average winner distance*, a Euclidean distance measure between the input vector and the winner's weight vector: the square root of the sum of the squares of the differences between corresponding components of the two vectors.
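In symbols, for an n-component input vector **x** and winner weight vector **w**:

distance = √( (x_1 − w_1)² + (x_2 − w_2)² + ... + (x_n − w_n)² )

This is exactly what the **get_win_dist()** function computes.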

Once you compile the program, you need to create an input file to try it. We will first use a very simple input file and examine the results.

Let us create an input file, input.dat, which contains only two arbitrary vectors:

```
0.4 0.98 0.1 0.2
0.5 0.22 0.8 0.9
```

The file contains two four-dimensional vectors. We
expect to see output that contains a different winner neuron for each of these
patterns. If this is the case, then the Kohonen map
has assigned different categories for each of the input vectors, and, in the
future, you can expect to get the same winner classification for vectors that
are close to or equal to these vectors.

By running the Kohonen map program, you will see the following output (the values the user types in appear on their own line after each prompt):

```
 Please enter initial values for:
alpha (0.01-1.0),
and the neighborhood size (integer between 0 and 50)
separated by spaces, e.g. 0.3 5
0.3 5

Now enter the period, which is the
number of cycles after which the values
for alpha and the neighborhood size are decremented
choose an integer between 1 and 500, e.g. 50
50

Please enter the maximum cycles for the simulation
A cycle is one pass through the data set.
Try a value of 500 to start with
500

 Enter in the layer sizes separated by spaces.
 A Kohonen network has an input layer
 followed by a Kohonen (output) layer
4 10

------------------------------------------------------------
    done

--->average dist per cycle = 0.544275 <---
--->dist last cycle = 0.0827523 <---
->dist last cycle per pattern= 0.0413762 <---
------>total cycles = 11 <---
------>total patterns = 22 <---
------------------------------------------------------------
```

The layer sizes are given as 4 for the input layer and 10 for the Kohonen layer. You should choose the size of the Kohonen layer to be larger than the number of distinct patterns that you think are in the input data set. One of the outputs reported on the screen is the distance for the last cycle per pattern. This value is listed as 0.04, which is less than the terminating value of 0.05 set at the top of the kohonen.cpp file. The map converged on a solution. Let us look at the output file, kohonen.dat, to see the mapping to winner indexes:

```
cycle  pattern  win index  neigh_size  avg_dist_per_pattern
------------------------------------------------------------
0      0        1          5           100.000000
0      1        3          5           100.000000
1      2        1          5           0.304285
1      3        3          5           0.304285
2      4        1          5           0.568255
2      5        3          5           0.568255
3      6        1          5           0.542793
3      7        8          5           0.542793
4      8        1          5           0.502416
4      9        8          5           0.502416
5      10       1          5           0.351692
5      11       8          5           0.351692
6      12       1          5           0.246184
6      13       8          5           0.246184
7      14       1          5           0.172329
7      15       8          5           0.172329
8      16       1          5           0.120630
8      17       8          5           0.120630
9      18       1          5           0.084441
9      19       8          5           0.084441
10     20       1          5           0.059109
10     21       8          5           0.059109
```

In this example, the neighborhood size stays at its initial value of 5. In the first column you see the cycle number, and in the second the pattern number. Since there are two patterns per cycle, you see the cycle number repeated twice for each cycle.

The Kohonen map was able to find a distinct winner neuron for each of the two patterns: one has winner index 1, the other winner index 8.

For a second example, look at Figure **Orthogonal input vectors**, where we choose input vectors on a two-dimensional unit circle that are 90° apart. The input.dat file should look like the following:

```
 1  0
 0  1
-1  0
 0 -1
```

Orthogonal input vectors.

Using
the same parameters for the Kohonen network, but with
layer sizes of 2 and 10, what result would you expect? The output file, kohonen.dat, follows:

```
cycle  pattern  win index  neigh_size  avg_dist_per_pattern
------------------------------------------------------------
0      0        4          5           100.000000
0      1        0          5           100.000000
0      2        9          5           100.000000
0      3        3          5           100.000000
1      4        4          5           0.444558
1      5        0          5           0.444558
...
497    1991     6          0           0.707107
498    1992     0          0           0.707107
498    1993     0          0           0.707107
498    1994     6          0           0.707107
498    1995     6          0           0.707107
499    1996     0          0           0.707107
499    1997     0          0           0.707107
499    1998     6          0           0.707107
499    1999     6          0           0.707107
```

You can see that this example doesn't quite work. Even though the neighborhood size was gradually reduced to zero, the four inputs were not categorized to different outputs. The winner distance became stuck at the value of 0.707, which is the distance from a vector at 45°. In other words, the map generalizes a little too much, arriving at the middle value for all of the input vectors.

You can fix this problem by starting with a smaller neighborhood size, which provides for less generalization. Using the same parameters but a neighborhood size of 2, the following output is obtained:

```
cycle  pattern  win index  neigh_size  avg_dist_per_pattern
------------------------------------------------------------
0      0        5          2           100.000000
0      1        6          2           100.000000
0      2        4          2           100.000000
0      3        9          2           100.000000
1      4        0          2           0.431695
1      5        6          2           0.431695
1      6        3          2           0.431695
1      7        9          2           0.431695
2      8        0          2           0.504728
2      9        6          2           0.504728
2      10       3          2           0.504728
2      11       9          2           0.504728
3      12       0          2           0.353309
3      13       6          2           0.353309
3      14       3          2           0.353309
3      15       9          2           0.353309
4      16       0          2           0.247317
4      17       6          2           0.247317
4      18       3          2           0.247317
4      19       9          2           0.247317
5      20       0          2           0.173122
5      21       6          2           0.173122
5      22       3          2           0.173122
5      23       9          2           0.173122
6      24       0          2           0.121185
6      25       6          2           0.121185
6      26       3          2           0.121185
6      27       9          2           0.121185
7      28       0          2           0.084830
7      29       6          2           0.084830
7      30       3          2           0.084830
7      31       9          2           0.084830
8      32       0          2           0.059381
8      33       6          2           0.059381
8      34       3          2           0.059381
8      35       9          2           0.059381
```

For this case, the network quickly converges on a unique winner for each of the four input patterns, and the distance measure falls below the set criterion within about eight cycles. You can experiment with other input data sets and combinations of Kohonen network parameters.

There are many variations of the Kohonen network. Some of these will be briefly discussed in this section.

DeSieno has used a *conscience* factor in a Kohonen network. If a neuron is winning more than its fair share of the time (roughly more than 1/*n* of the time, where *n* is the number of neurons), then a threshold is temporarily applied to that neuron to give other neurons the chance to win. The purpose of this modification is to allow a more uniform weight distribution while learning is taking place.
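A minimal sketch of how such a conscience bias might be implemented follows. The parameter names B and C, the win-frequency array p, and the example values are illustrative only, following the spirit of DeSieno's formulation rather than code from this chapter:

```cpp
// Conscience sketch: each neuron tracks its win frequency p[j];
// neurons that win more than the fair share 1/n are penalized
// when the winner is chosen by smallest (biased) distance.
int conscience_winner(const float *dist, // dist[j] = distance of neuron j from input
                      float *p,          // p[j] = running win frequency of neuron j
                      int n,
                      float B,           // frequency update rate, e.g. 0.0001
                      float C)           // bias strength, e.g. 10
{
    int winner = 0;
    float best = 1.0e30f;
    for (int j = 0; j < n; j++)
        {
        // subtracting C*(1/n - p[j]) raises the effective distance
        // of frequent winners and lowers it for rare ones
        float biased = dist[j] - C * (1.0f/n - p[j]);
        if (biased < best) { best = biased; winner = j; }
        }
    for (int j = 0; j < n; j++)          // update the win frequencies
        p[j] += B * ((j == winner ? 1.0f : 0.0f) - p[j]);
    return winner;
}
```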

You have read about LVQ (Learning Vector Quantizer) in previous chapters. In light of the Kohonen map, it should be pointed out that the LVQ is simply a supervised version of the Kohonen network: inputs and expected output categories are presented to the network for training. You get data clustered, just as in a Kohonen network, according to similarity to other data inputs.
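To make the contrast concrete, the standard LVQ1 update rule (a common formulation, not code from this book) moves the winner toward the input only when the classification is correct, and away from it otherwise:

**W**_{new} = **W**_{old} + alpha * (Input − **W**_{old})   if the winner's class matches the input's label

**W**_{new} = **W**_{old} − alpha * (Input − **W**_{old})   otherwise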

A neural network topology called a *counterpropagation* network is a combination of a Kohonen layer with a Grossberg layer. This network was developed by Robert Hecht-Nielsen and is useful for prototyping systems, with a fairly rapid training time compared to backpropagation. The Kohonen layer provides for categorization, while the Grossberg layer allows for Hebbian conditioned learning. Counterpropagation has been used successfully in data compression applications for images. Compression ratios of 10:1 to 100:1 have been obtained, using a lossy compression scheme that codes the image with a technique called *vector quantization*, where the image is broken up into representative subimage vectors. The statistics of these vectors are such that a large part of the image can be adequately represented by a subset of all the vectors. The vectors with the highest frequency of occurrence are coded with the shortest bit strings; hence you achieve data compression.

Kohonen created a *phonetic typewriter* by classifying
speech waveforms of different phonemes of Finnish speech
into different categories using a Kohonen SOM. The Kohonen phoneme map used 50 samples of each phoneme for
calibration. These samples caused excitation in a neighborhood of cells more
strongly than in other cells. A neighborhood was labeled with the particular
phoneme that caused excitation. For an utterance of speech made to the network,
the exact neighborhoods that were active during the utterance were noted, and
for how long, and in what sequence. Short excitations were taken as transitory
sounds. The information obtained from the network was then pieced together to
find out the words in the utterance made to the network.

In this chapter, you have learned about one of the important types of competitive learning, the Kohonen feature map. The most significant points of this discussion are outlined as follows:

**•** The Kohonen feature map is an example of an unsupervised neural network that is mainly used as a classifier or data clustering system. As more inputs are presented to this network, the network improves its learning and is able to adapt to changing inputs.

**•** The training law for the Kohonen network tries to align the weight vectors along the same direction as the input vectors.

**•** The Kohonen network models lateral competition as a form of self-organization. One winner neuron is derived for each input pattern to categorize that input.

**•** Only neurons within a certain distance (neighborhood) from the winner are allowed to participate in training for a given input pattern.