Application to Nonlinear Optimization

Introduction

Nonlinear optimization is an area of operations research, and efficient algorithms for some of the problems in this area are hard to find. In this chapter, we describe the traveling salesperson problem and discuss how this problem is formulated as a nonlinear optimization problem in order to use neural networks (Hopfield and Kohonen) to find an optimum solution. We start with an explanation of the concepts of linear, integer linear and nonlinear optimization.

An optimization problem has an objective function and a set of constraints on the variables. The problem is to find the values of the variables that lead to an optimum value for the objective function, while satisfying all the constraints. The objective function may be a linear function in the variables, or it may be a nonlinear function. For example, it could be a function expressing the total cost of a particular production plan, or a function giving the net profit from a group of products that share a given set of resources. The objective may be to find the minimum value for the objective function, if, for example, it represents cost, or to find the maximum value of a profit function. The resources shared by the products in their manufacturing are usually in limited supply or have some other restrictions on their availability. This consideration leads to the specification of the constraints for the problem.

Each constraint is usually in the form of an equation or an inequality. The left side of such an equation or inequality is an expression in the variables for the problem, and the right-hand side is a constant. The constraints are said to be linear or nonlinear depending on whether the expression on the left-hand side is a linear function or nonlinear function of the variables. A linear programming problem is an optimization problem with a linear objective function as well as a set of linear constraints. An integer linear programming problem is a linear programming problem where the variables are required to have integer values. A nonlinear optimization problem has one or more of the constraints nonlinear and/or the objective function is nonlinear.

Here are some examples of statements that specify objective functions and constraints:

  Linear objective function: Maximize Z = 3X1 + 4X2 + 5.7X3

  Linear equality constraint: 13X1 - 4.5X2 + 7X3 = 22

  Linear inequality constraint: 3.6X1 + 8.4X2 - 1.7X3 ≤ 10.9

  Nonlinear objective function: Minimize Z = 5X2 + 7XY + Y2

  Nonlinear equality constraint: 4X + 3XY + 7Y + 2Y2 = 37.6

  Nonlinear inequality constraint: 4.8X + 5.3XY + 6.2Y2 ≥ 34.56

An example of a linear programming problem is the blending problem. An example of a blending problem is that of making different flavors of ice cream blending different ingredients, such as sugar, a variety of nuts, and so on, to produce different amounts of ice cream of many flavors. The objective in the problem is to find the amounts of individual flavors of ice cream to produce with given supplies of all the ingredients, so the total profit is maximized.

A nonlinear optimization problem example is the quadratic programming problem. The constraints are all linear but the objective function is a quadratic form. A quadratic form is an expression of two variables with 2 for the sum of the exponents of the two variables in each term.

An example of a quadratic programming problem, is a simple investment strategy problem that can be stated as follows. You want to invest a certain amount in a growth stock and in a speculative stock, achieving at least 25% return. You want to limit your investment in the speculative stock to no more than 40% of the total investment. You figure that the expected return on the growth stock is 18%, while that on the speculative stock is 38%. Suppose G and S represent the proportion of your investment in the growth stock and the speculative stock, respectively. So far you have specified the following constraints. These are linear constraints:

G + S = 1

This says the proportions add up to 1.

S ≤ 0.4

This says the proportion invested in speculative stock is no more than 40%.

1.18G + 1.38S ≥ 1.25

This says the expected return from these investments should be at least 25%.

Now the objective function needs to be specified. You have specified already the expected return you want to achieve. Suppose that you are a conservative investor and want to minimize the variance of the return. The variance works out as a quadratic form. Suppose it is determined to be:

 
    2G2  + 3S2 - GS

This quadratic form, which is a function of G and S, is your objective function that you want to minimize subject to the (linear) constraints previously stated.

Neural Networks for Optimization Problems

It is possible to construct a neural network to find the values of the variables that correspond to an optimum value of the objective function of a problem. For example, the neural networks that use the Widrow-Hoff learning rule find the minimum value of the error function using the least mean squared error. Neural networks such as the feedforward backpropagation network use the steepest descent method for this purpose and find a local minimum of the error, if not the global minimum. On the other hand, the Boltzmann machine or the Cauchy machine uses statistical methods and probabilities and achieves success in finding the global minimum of an error function. So we have an idea of how to go about using a neural network to find an optimum value of a function. The question remains as to how the constraints of an optimization problem should be treated in a neural network operation. A good example in answer to this question is the traveling salesperson problem. Let’s discuss this example next.

Traveling Salesperson Problem

The traveling salesperson problem is well-known in optimization. Its mathematical formulation is simple, and one can state a simple solution strategy also. Such a strategy is often impractical, and as yet, there is no efficient algorithm for this problem that consistently works in all instances. The traveling salesperson problem is one among the so- called NP-complete problems, about which you will read more in what follows. That means that any algorithm for this problem is going to be impractical with certain examples. The neural network approach tends to give solutions with less computing time than other available algorithms for use on a digital computer. The problem is defined as follows. A traveling salesperson has a number of cities to visit. The sequence in which the salesperson visits different cities is called a tour. A tour should be such that every city on the list is visited once and only once, except that he returns to the city from which he starts. The goal is to find a tour that minimizes the total distance the salesperson travels, among all the tours that satisfy this criterion.

A simple strategy for this problem is to enumerate all feasible tours—a tour is feasible if it satisfies the criterion that every city is visited but once—to calculate the total distance for each tour, and to pick the tour with the smallest total distance. This simple strategy becomes impractical if the number of cities is large. For example, if there are 10 cities for the traveling salesperson to visit (not counting home), there are 10! = 3,628,800 possible tours, where 10! denotes the factorial of 10—the product of all the integers from 1 to 10—and is the number of distinct permutations of the 10 cities. This number grows to over 6.2 billion with only 13 cities in the tour, and to over a trillion with 15 cities.

The TSP in a Nutshell

For n cities to visit, let Xij be the variable that has value 1 if the salesperson goes from city i to city j and value 0 if the salesperson does not go from city i to city j. Let dij be the distance from city i to city j. The traveling salesperson problem (TSP) is stated as follows:

Minimize the linear objective function:

subject to:

for each j=1, …, n (linear constraint)

for each i=1, …, n (linear constraint)

ɷXij for 1 for all i and j (integer constraint)

This is a 0-1 integer linear programming problem.

Solution via Neural Network

This section shows how the linear and integer constraints of the TSP are absorbed into an objective function that is nonlinear for solution via Neural network.

The first consideration in the formulation of an optimization problem is the identification of the underlying variables and the type of values they can have. In a traveling salesperson problem, each city has to be visited once and only once, except the city started from. Suppose you take it for granted that the last leg of the tour is the travel between the last city visited and the city from which the tour starts, so that this part of the tour need not be explicitly included in the formulation. Then with n cities to be visited, the only information needed for any city is the position of that city in the order of visiting cities in the tour. This suggests that an ordered n-tuple is associated with each city with some element equal to 1, and the rest of the n – 1 elements equal to 0. In a neural network representation, this requires n neurons associated with one city. Only one of these n neurons corresponding to the position of the city, in the order of cities in the tour, fires or has output 1. Since there are n cities to be visited, you need n2 neurons in the network. If these neurons are all arranged in a square array, you need a single 1 in each row and in each column of this array to indicate that each city is visited but only once.

Let xij be the variable to denote the fact that city i is the jth city visited in a tour. Then xij is the output of the jth neuron in the array of neurons corresponding to the ith city. You have n2 such variables, and their values are binary, 0 or 1. In addition, only n of these variables should have value 1 in the solution. Furthermore, exactly one of the x’s with the same first subscript (value of i) should have value 1. It is because a given city can occupy only one position in the order of the tour. Similarly, exactly one of the x’s with the same second subscript (value of j) should have value 1. It is because a given position in the tour can be only occupied by one city. These are the constraints in the problem. How do you then describe the tour? We take as the starting city for the tour to be city 1 in the array of cities. A tour can be given by the sequence 1, a, b, c, …, q, indicating that the cities visited in the tour in order starting at 1 are, a, b, c, …, q and back to 1. Note that the sequence of subscripts a, b, …, q is a permutation of 2, 3, … n – 1, xa1=1, xb2=1, etc.

Having frozen city 1 as the first city of the tour, and noting that distances are symmetric, the distinct number of tours that satisfy the constraints is not n!, when there are n cities in the tour as given earlier. It is much less, namely, n!/2n. Thus when n is 10, the number of distinct feasible tours is 10!/20, which is 181,440. If n is 15, it is still over 43 billion, and it exceeds a trillion with 17 cities in the tour. Yet for practical purposes there is not much comfort knowing that for the case of 13 cities, 13! is over 6.2 billion and 13!/26 is only 239.5 million—it is still a tough combinatorial problem.

The cost of the tour is the total distance traveled, and it is to be minimized. The total distance traveled in the tour is the sum of the distances from each city to the next. The objective function has one term that corresponds to the total distance traveled in the tour. The other terms, one for each constraint, in the objective function are expressions, each attaining a minimum value if and only if the corresponding constraint is satisfied. The objective function then takes the following form. Hopfield and Tank formulated the problem as one of minimizing energy. Thus it is, customary to refer to the value of the objective function of this problem, while using Hopfield-like neural network for its solution, as the energy level, E, of the network. The goal is to minimize this energy level.

In formulating the equation for E, one uses constant parameters A1, A2, A3, and A4 as coefficients in different terms of the expression on the right-hand side of the equation. The equation that defines E is given as follows. Note that the notation in this equation includes dij for the distance from city i to city j.

E = A1 ΣiΣk Σj≠kxikxij + A2 Σi Σk xiΣk Σj≠kxkixji + A3[( ΣiΣkxik) - n]2 + A4Σk Σk Σj≠k Σidkjxki(xj,i+1 + xj,i-1)

Our first observation at this point is that E is a nonlinear function of the x’s, as you have quadratic terms in it. So this formulation of the traveling salesperson problem renders it a nonlinear optimization problem.

All the summations indicated by the occurrences of the summation symbol Σ, range from 1 to n for the values of their respective indices. This means that the same summand such as x12x33 also as x33x12, appears twice with only the factors interchanged in their order of occurrence in the summand. For this reason, many authors use an additional factor of 1/2 for each term in the expression for E. However, when you minimize a quantity z with a given set of values for the variables in the expression for z, the same values for these variables minimize any whole or fractional multiple of z, as well.

The third summation in the first term is over the index j, from 1 to n, but excluding whatever value k has. This prevents you from using something like x12x12. Thus, the first term is an abbreviation for the sum of n2(n – 1) terms with no two factors in a summand equal. This term is included to correspond to the constraint that no more than one neuron in the same row can output a 1. Thus, you get 0 for this term with a valid solution. This is also true for the second term in the right-hand side of the equation for E. Note that for any value of the index i, xii has value 0, since you are not making a move like, from city i to the same city i in any of the tours you consider as a solution to this problem. The third term in the expression for E has a minimum value of 0, which is attained if and only if exactly n of the n2 x’s have value 1 and the rest 0.

The last term expresses the goal of finding a tour with the least total distance traveled, indicating the shortest tour among all possible tours for the traveling salesperson. Another important issue about the values of the subscripts on the right-hand side of the equation for E is, what happens to i + 1, for example, when i is already equal to n, and to i–1, when i is equal to 1. The i + 1 and i – 1 seem like impossible values, being out of their allowed range from 1 to n. The trick is to replace these values with their moduli with respect to n. This means, that the value n + 1 is replaced with 1, and the value 0 is replaced with n in the situations just described.

Modular values are obtained as follows. If we want, say 13 modulo 5, we subtract 5 as many times as possible from 13, until the remainder is a value between 0 and 4, 4 being 5 – 1. Since we can subtract 5 twice from 13 to get a remainder of 3, which is between 0 and 4, 3 is the value of 13 modulo 5. Thus (n + 3) modulo n is 3, as previously noted. Another way of looking at these results is that 3 is 13 modulo 5 because, if you subtract 3 from 13, you get a number divisible by 5, or which has 5 as a factor. Subtracting 3 from n + 3 gives you n, which has n as a factor. So 3 is (n + 3) modulo n. In the case of –1, by subtracting (n – 1) from it, we get -n, which can be divided by n getting –1. So (n – 1) is the value of (–1) modulo n.

Example of a Traveling Salesperson Problem for Hand Calculation

Suppose there are four cities in the tour. Call these cities, C1, C2, C3, and C4. Let the matrix of distances be the following matrix D.

           0   10   14    7
     D =  10    0    6   12
          14    6    0    9
           7   12    9    0

From our earlier discussion on the number of valid and distinct tours, we infer that there are just three such tours. Since it is such a small number, we can afford to enumerate the three tours, find the energy values associated with them, and pick the tour that has the smallest energy level among the three. The three tours are:

  Tour 1. 1 - 2 - 3 - 4 - 1 In this tour city 2 is visited first, followed by city 3, from where the salesperson goes to city 4, and then returns to city 1. For city 2, the corresponding ordered array is (1, 0, 0, 0), because city 2 is the first in this permutation of cities. Then x21 = 1, x22 = 0, x23 = 0, x24 = 0. Also (0, 1, 0, 0), (0, 0, 1, 0), and (0, 0, 0, 1) correspond to cities 3, 4, and 1, respectively. The total distance of the tour is d12 + d23 + d34 + d41= 10 + 6 + 9 + 7 = 32.

  Tour 2. 1 - 3 - 4 - 2 - 1

  Tour 3. 1 - 4 - 2 - 3 - 1

There seems to be some discrepancy here. If there is one, we need an explanation. The discrepancy is that we can find many more tours that should be valid because no city is visited more than once. You may say they are distinct from the three previously listed. Some of these additional tours are:

  Tour 4. 1 - 2 - 4 - 3 - 1

  Tour 5. 3 - 2 - 4 - 1 - 3

  Tour 6. 2 - 1 - 4 - 3 - 2

There is no doubt that these three tours are distinct from the first set of three tours. And in each of these three tours, every city is visited exactly once, as required in the problem. So they are valid tours as well. Why did our formula give us 3 for the value of the number of possible valid tours, while we are able to find 6?

The answer lies in the fact that if two valid tours are symmetric and have the same energy level, because they have the same value for the total distance traveled in the tour, one of them is in a sense redundant, or one of them can be considered degenerate, using the terminology common to this context. As long as they are valid and give the same total distance, the two tours are not individually interesting, and any one of the two is enough to have. By simple inspection, you find the total distances for the six listed tours. They are 32 for tour 1, 32 also for tour 6, 45 for each of tours 2 and 4, and 39 for each of tours 3 and 5. Notice also that tour 6 is not very different from tour 1. Instead of starting at city 1 as in tour 1, if you start at city 2 and follow tour 1 from there in reverse order of visiting cities, you get tour 6. Therefore, the distance covered is the same for both these tours. You can find similar relationships for other pairs of tours that have the same total distance for the tour. Either by reversing the order of cities in a tour, or by making a circular permutation of the order of the cities in a tour, you get another tour with the same total distance. This way you can find tours. Thus, only three distinct total distances are possible, and among them 32 is the lowest. The tour 1 - 2 - 3 - 4 - 1 is the shortest and is an optimum solution for this traveling salesperson problem. There is an alternative optimum solution, namely tour 6, with 32 for the total length of the tour. The problem is to find an optimal tour, and not to find all optimal tours if more than one exist. That is the reason only three distinct tours are suggested by the formula for the number of distinct valid tours in this case.

The formula we used to get the number of valid and distinct tours to be 3 is based on the elimination of such symmetry. To clarify all this discussion, you should determine the energy levels of all your six tours identified earlier, hoping to find pairs of tours with identical energy levels.

Note that the first term on the right-hand side of the equation results in 0 for a valid tour, as this term is to ensure there is no more than a single 1 in each row. That is, in any summand in the first term, at least one of the factors, xik or xij, where kj has to be 0 for a valid tour. So all those summands are 0, making the first term itself 0. Similarly, the second term is 0 for a valid tour, because in any summand at least one of the factors, xki or xji, where kj has to be 0 for a valid tour. In all, exactly 4 of the 16 x’s are each 1, making the total of the x’s 4. This causes the third term to be 0 for a valid tour. These observations make it clear that it does not matter for valid tours, what values are assigned to the parameters A1, A2, and A3. Assigning large values for these parameters would cause the energy levels, for tours that are not valid, to be much larger than the energy levels for valid tours. Thereby, these tours become unattractive for the solution of the traveling salesperson problem. Let us use the value 1/2 for the parameter A4.

Let us demonstrate the calculation of the value for the last term in the equation for E, in the case of tour 1. Recall that the needed equation is

E = A1 ΣiΣk Σj≠k xikxij + A2 ΣiΣk Σj≠kxkixji + A3[( ΣiΣkxjk) - n]2 + A4Σk Σj≠kΣidkjxki(xj,i+1 + xj,i-1)

The last term expands as given in the following calculation:

    A4{ d12[x12(x23 +x21) + x13(x24 + x22) + x14(x21 + x23)] +
    d13[x12(x33 + x31) + x13(x34 + x32) + x14(x31 + x33)] +
    d14[x12(x43 + x41) + x13(x44 + x42) + x14(x41 + x43)] +
    d21[x21(x12 + x14) + x23(x14 + x12) + x24(x11 + x13)] +
    d23[x21(x32 + x34) + x23(x34 + x32) + x24(x31 + x33)] +
    d24[x21(x42 + x44) + x23(x44 + x42) + x24(x41 + x43)] +
    d31[x31(x12 + x14) + x32(x13 + x11) + x34(x11 + x13)] +
    d32[x31(x22 + x24) + x32(x23 + x21) + x34(x21 + x23)] +
    d34[x31(x42 + x44) + x32(x43 + x41) + x34(x41 + x43)] +
    d41[x41(x12 + x14) + x42(x13 + x11) + x43(x14 + x12)] +
    d42[x41(x22 + x24) + x42(x23 + x21) + x43(x24 + x22)] +
    d43[x41(x32 + x34) + x42(x33 + x31) + x43(x34 + x32)] }

When the respective values are substituted for the tour 1 - 2 - 3 - 4 - 1, the previous calculation becomes:

    1/2{10[0(0 + 1) + 0(0 + 0) + 1(1 + 0)] + 14[0(0 + 0) + 0(0 + 1) +   1(0 + 0)] +
    7[0(1 + 0) + 0(0 + 0) + 1(0 + 1)] + 10[1(0 + 1) + 0(1 + 0) +   0(0 + 0)] +
    6[1(1 + 0) + 0(0 + 1) + 0(0 + 0)] + 12[1(0 + 0) + 0(0 + 0) +   0(0 + 1)] +
    14[0(0 + 1) + 1(0 + 0) + 0(0 + 0)] + 6[0(0 + 0) + 1(0 + 1) +   0(1 + 0)] +
    9[0(0 + 0) + 1(1 + 0) + 0(0 + 1)] + 7[0(0 + 1) + 0(0 + 0) +   1(1 + 0)] +
    12[0(0 + 0) + 0(0 + 1) + 1(0 + 0)] + 9[0(1 + 0) + 0(0 + 0) +  1(0 + 1)]}
    = 1/2( 10 + 0 + 7 + 10 + 6 + 0 + 0 + 6 + 9 + 7 + 0 + 9) = 1/2(64) = 32

Table 15.1 contains the values we get for the fourth term on the right-hand side of the equation, and for E, with the six listed tours.

Energy Levels for Six of the Valid Tours

Tour #

Non-Zero x’s

Value for the Last Term

Energy Level

Comment

1

x14, x21, x32, x43

32

32

1 - 2 - 3 - 4 - 1 tour

2

x14, x23, x31, x42

45

45

1 - 3 - 4 - 2 - 1 tour

3

x14, x22, x33, x41

39

39

1 - 4 - 2 - 3 - 1 tour

4

x14, x21, x33, x42

45

45

1 - 2 - 4 - 3 - 1 tour

5

x13, x21, x34, x42

39

39

3 - 2 - 4 - 1 - 3 tour

6

x11, x24, x33, x42

32

32

2 - 1 - 4 - 3 - 2 tour

Neural Network for Traveling Salesperson Problem

Hopfield and Tank used a neural network to solve a traveling salesperson problem. The solution you get may not be optimal in certain instances. But by and large you may get a solution close to an optimal solution. One cannot find for the traveling salesperson problem a consistently efficient method of solution, as it has been proved to belong to a set of problems called NP-complete problems. NP-complete problems have the distinction that there is no known algorithm that is efficient and practical, and there is little likelihood that such an algorithm will be developed in the future. This is a caveat to keep in mind when using a neural network to solve a traveling salesperson problem.

Network Choice and Layout

We will describe the use of a Hopfield network to attempt to solve the traveling salesperson problem. There are n2 neurons in the network arranged in a two-dimensional array of n neurons per row and n per column. The network is fully connected. The connections in the network in each row and in each column are lateral connections.

Layout of a Hopfield network for the traveling salesperson problem.

The most important task on hand then is finding an appropriate connection weight matrix. It is constructed taking into account that nonvalid tours should be prevented and valid tours should be preferred. A consideration in this regard is, for example, no two neurons in the same row (column) should fire in the same cycle of operation of the network. As a consequence, the lateral connections should be for inhibition and not for excitation.

In this context, the Kronecker delta function is used to facilitate simple notation. The Kronecker delta function has two arguments, which are given usually as subscripts of the symbol δ. By definition δik has value 1 if i = k, and 0 if ik. That is, the two subscripts should agree for the Kronecker delta to have value 1. Otherwise, its value is 0.

We refer to the neurons with two subscripts, one for the city it refers to, and the other for the order of that city in the tour. Therefore, an element of the weight matrix for a connection between two neurons needs to have four subscripts, with a comma after two of the subscripts. An example is wik,lj referring to the weight on the connection between the (ik) neuron and the (lj) neuron. The value of this weight is set as follows:

Wik,lj = -A1δil(1-δkj)-A2δkj(1-δkj(1-δil)-A3-A4 dilj, k+1 + δj,k-1)

Here the negative signs indicate inhibition through the lateral connections in a row or column. The -A3 is a term for global inhibition.

Inputs

The inputs to the network are chosen arbitrarily. If as a consequence of the choice of the inputs, the activations work out to give outputs that add up to the number of cities, an initial solution for the problem, a legal tour, will result. A problem may also arise that the network will get stuck at a local minimum. To avoid such an occurrence, random noise can be added. Usually you take as the input at each neuron a constant times the number of cities and adjust this adding a random number, which can differ for different neurons.

Activations, Outputs, and Their Updating

We denote the activation of the neuron in the ith row and jth column by aij, and the output is denoted by xij. A time constant τ, and a gain λ are used as well. A constant m is another parameter used. Also, Δt denotes the increment in time, from one cycle to the next. Keep in mind that the index for the summation Σ ranges from 1 to n, the number of cities. Excluded values of the index are shown by the use of the symbol ≠.

The change in the activation is then given by Δaij, where:

Δaij = Δt (Term1 + Term2 + Term3 + Term4 + Term5)

Term1 = - aij

Term2 = - A1 k≠jxik

Term3 = - A2

Σk≠ixkj

Term4 = - A3i Σkxik - m)

Term5 = - A4 Σk≠idik(xk,j+1 + xk,j-1)

To update the activation of the neuron in the ith row and jth column, you take:

aijnew = aijold + Δaij

The output of a neuron in the ith row and jth column is calculated from:

xin = (1 + tanh(λaij))/2


NOTE: 

, which is the original sigmoid function


The function used here is the hyperbolic tangent function. The gain parameter mentioned earlier λ is. The output of each neuron is calculated after updating the activation. Ideally, you want to get the outputs as 0’s and 1’s, preferably a single one for each row and each column, to represent a tour that satisfies the conditions of the problem. But the hyperbolic tangent function gives a real number, and you have to settle for a close enough value to 1 or 0. You may get, for example, 0.96 instead of 1, or 0.07 instead of 0. The solution is to be obtained from such values by rounding up or down so that 1 or 0 will be used, as the case may be

Performance of the Hopfield Network

Let us now look at Hopfield and Tank’s approach at solving the TSP.

Hopfield and Tank Example

The Hopfield network’s use in solving the traveling salesperson problem is a pioneering effort in the use of the neural network approach for this problem. Hopfield and Tank’s example is for a problem with 10 cities. The parameters used were, A1= 500, A2 = 500, A3 = 200, A4 = 500, τ = 1, λ = 50, and m = 15. A good solution corresponding to a local minimum for E is the expected, if not the best, solution (global minimum). An annealing process could be considered to move out of any local minimum. As was mentioned before, the traveling salesperson problem is one of those problems for which a single approach cannot be found that will be successful in all cases. There isn’t very much guidance as to how to choose the parameters in general for the use of the Hopfield network to solve the traveling salesperson problem.

C++ Implementation of the Hopfield Network for the Traveling Salesperson Problem

We present a C++ program for the Hopfield network operation for the traveling salesperson problem. The header file is in the Listing 15.1, and the source file is in the Listing 15.2. A tsneuron class is declared for a neuron and a network class for the network. The network class is declared a friend in the tsneuron class. The program follows the procedure described for setting inputs, connection weights, and updating.

Program Details

The following is a listing of the characteristics of the C++ program along with definitions and/or functions.

  The number of cities, the number of iterations, and the distances between the cities are solicited from the user.

  The distances are taken as integer values. If you want to use real numbers as distances, the type for distance matrix needs to be changed to float, and corresponding changes are needed for calcdist ( ) function, etc.

  tourcity and tourorder arrays keep track of the cities that have to be covered and the order in which it is to be done.

  A neuron corresponds to each combination of a city and its order in the tour. The ith city visited in the order j , is the neuron corresponding to the element j + i*n, in the array for neurons. Here n is the number of cities. The i and the j vary from 0 to n – 1. There are n2 neurons.

  mtrx is the matrix giving the weights on the connections between the neurons. It is a square matrix of order n2.

  An input vector is generated at random in the function main ( ), and is later referred to as ip

  asgninpt ( ) function presents the input vector ip to the network and determines the initial activations of the neurons.

  getacts ( ) function updates the activations of the neurons after each iteration.

  getouts ( ) function computes the outputs after each iteration. la is used as abbreviation for lambda in its argument.

  iterate ( ) function carries out the number of iterations desired.

  findtour ( ) function determines the tour orders of cities to be visited using the outputs of the neurons. When used at the end of the iterations, it gives the solution obtained for the traveling salesperson problem.

  calcdist ( ) function calculates the distance of the tour in the solution.

Header file for the C++ program for the Hopfield network for the traveling salesperson problem

//trvslsmn.h  V. Rao,  H. Rao
#include <iostream.h>
#include <stdlib.h>
#include <math.h>
#include <stdio.h>
 
#define MXSIZ 11
 
class tsneuron
        {
        protected:
        int cit,ord;
        float output;
        float activation;
        friend class network;
 
        public:
        tsneuron() { };
        void getnrn(int,int);
        };
 
class network
        {
        public:
        int  citnbr;
        float pra,prb,prc,prd,totout,distnce;
 
        tsneuron (tnrn)[MXSIZ][MXSIZ];
        int dist[MXSIZ][MXSIZ];
        int tourcity[MXSIZ];
        int tourorder[MXSIZ];
        float outs[MXSIZ][MXSIZ];
        float acts[MXSIZ][MXSIZ];
        float mtrx[MXSIZ][MXSIZ];
        float citouts[MXSIZ];
        float ordouts[MXSIZ];
 
        network() { };
        void getnwk(int,float,float,float,float);
        void getdist(int);
        void findtour();
        void asgninpt(float *);
        void calcdist();
        void iterate(int,int,float,float,float);
        void getacts(int,float,float);
        void getouts(float);
 
//print functions
 
        void prdist();
        void prmtrx(int);
        void prtour();
        void practs();
        void prouts();
        };

Source File for Hopfield Network for Traveling Salesperson Problem

The following listing gives the source code for the C++ program for the Hopfield network for traveling salesperson problem. The user is prompted to input the number of cities and the maximum number of iterations for the operation of the network.

The parameters a, b, c, d declared in the function main correspond to A1, A2, A3, and A4, respectively. These and the parameters tau, lambda, and nprm are all given values in the declaration statements in the function main. If you change these parameter values or change the number of cities in the traveling salesperson problem, the program will compile but may not work as you’d like.

Listing 15.2 Source file for the C++ program for the Hopfield network for the traveling salesperson problem

//trvslsmn.cpp V. Rao,  H. Rao
#include “trvslsmn.h”
#include <stdlib.h>
#include <time.h>
 
//generate random noise
 
int randomnum(int maxval)
{
// random number generator
// will return an integer up to maxval
 
return rand() % maxval;
}
 
//Kronecker delta function
 
int krondelt(int i,int j)
       {
       int k;
       k= ((i == j) ? (1):(0));
       return k;
       };
 
void tsneuron::getnrn(int i,int j)
       {
       cit = i;
       ord = j;
       output = 0.0;
       activation = 0.0;
       };
 
//distances between cities
 
void network::getdist(int k)
       {
       citnbr = k;
       int i,j;
       cout<<”\n”;
 
for(i=0;i<citnbr;++i)
       {
       dist[i][i]=0;
       for(j=i+1;j<citnbr;++j)
              {
              cout<<”\ntype distance (integer) from city “<<
              i<<” to city “<<j<<”\n”;
              cin>>dist[i][j];
              }
       cout<<”\n”;
       }
 
for(i=0;i<citnbr;++i)
       {
       for(j=0;j<i;++j)
              {
              dist[i][j] = dist[j][i];
              }
       }
prdist();
cout<<”\n”;
}
 
//print distance matrix
 
void network::prdist()
       {
       int i,j;
       cout<<”\n Distance Matrix\n”;
 
       for(i=0;i<citnbr;++i)
              {
              for(j=0;j<citnbr;++j)
                     {
                     cout<<dist[i][j]<<”  “;
                     }
       cout<<”\n”;
       }
}
 
//set up network
 
void network::getnwk(int citynum,float a,float b,float c,float d)
        {
        int i,j,k,l,t1,t2,t3,t4,t5,t6;
        int p,q;
        citnbr = citynum;
        pra = a;
        prb = b;
        prc = c;
        prd = d;
        getdist(citnbr);
 
        for(i=0;i<citnbr;++i)
                {
                for(j=0;j<citnbr;++j)
                       {
                       tnrn[i][j].getnrn(i,j);
                       }
                }
 
        //find weight matrix
 
        for(i=0;i<citnbr;++i)
                 {
                 for(j=0;j<citnbr;++j)
                        {
                        p = ((j == citnbr-1) ? (0) : (j+1));
                        q = ((j == 0) ? (citnbr-1) : (j-1));
                        t1 = j + i*citnbr;
                        for(k=0;k<citnbr;++k)
                                {
                 for(l=0;l<citnbr;++l)
                        {
                        t2 = l + k*citnbr;
                        t3 = krondelt(i,k);
                        t4 = krondelt(j,l);
                        t5 = krondelt(l,p);
                        t6 = krondelt(l,q);
                        mtrx[t1][t2] =
                        -a*t3*(1-t4) -b*t4*(1-t3)
                        -c -d*dist[i][k]*(t5+t6);
            }
                 }
            }
        }
        prmtrx(citnbr);
}
 
//print weight matrix
 
void network::prmtrx(int k)
         {
         int i,j,nbrsq;
         nbrsq = k*k;
         cout<<”\nWeight Matrix\n”;
         for(i=0;i<nbrsq;++i)
                {
                for(j=0;j<nbrsq;++j)
                       {
                       if(j%k == 0)
                              {
                              cout<<”\n”;
                              }
                       cout<<mtrx[i][j]<<”  “;
                       }
                cout<<”\n”;
                }
         }
 
//present input to network
 
void network::asgninpt(float *ip)
       {
       int i,j,k,l,t1,t2;
 
       for(i=0;i<citnbr;++i)
             {
             for(j =0;j<citnbr;++j)
                    {
                    acts[i][j] = 0.0;
                    }
             }
 
       //find initial activations
       for(i=0;i<citnbr;++i)
             {
             for(j =0;j<citnbr;++j)
                    {
                    t1 = j + i*citnbr;
                    for(k=0;k<citnbr;++k)
                           {
                           for(l=0;l<citnbr;++l)
                                  {
                                  t2 = l + k*citnbr;
                                  acts[i][j] +=
                                  mtrx[t1][t2]*ip[t1];
                                  }
                           }
                    }
             }
 
       //print activations
       cout<<”\ninitial activations\n”;
       practs();
       }
 
//find activations
 
void network::getacts(int nprm,float dlt,float tau)
       {
       int i,j,k,p,q;
       float r1, r2, r3, r4,r5;
       r3 = totout - nprm ;
 
       for(i=0;i<citnbr;++i)
             {
             r4 = 0.0;
             p = ((i == citnbr-1) ? (0) : (i+1));
             q = ((i == 0) ? (citnbr-1) : (i-1));
             for(j=0;j<citnbr;++j)
                    {
                    r1 = citouts[i] - outs[i][j];
                    r2 = ordouts[i] - outs[i][j];
                    for(k=0;k<citnbr;++k)
                           {
                           r4 += dist[i][k] *
                           (outs[k][p] + outs[k][q]);
                           }
                    r5 = dlt*(-acts[i][j]/tau -
                    pra*r1 -prb*r2 -prc*r3 -prd*r4);
                    acts[i][j] += r5;
                    }
             }
       }
 
//find outputs and totals for rows and columns
 
void network::getouts(float la)
       {
       float b1,b2,b3,b4;
       int i,j;
       totout = 0.0;
 
       for(i=0;i<citnbr;++i)
             {
             citouts[i] = 0.0;
             for(j=0;j<citnbr;++j)
             {
             b1 = la*acts[i][j];
             b4 = b1/500.0;
             b2 = exp(b4);
             b3 = exp(-b4);
             outs[i][j] = (1.0+(b2-b3)/(b2+b3))/2.0;
             citouts[i] += outs[i][j];};
             totout += citouts[i];
             }
       for(j=0;j<citnbr;++j)
             {
             ordouts[j]  = 0.0;
             for(i=0;i<citnbr;++i)
                    {
                    ordouts[j] += outs[i][j];
                    }
             }
       }
 
//find tour
 
void network::findtour()
       {
       int i,j,k,tag[MXSIZ][MXSIZ];
       float tmp;
       for (i=0;i<citnbr;++i)
              {
              for(j=0;j<citnbr;++j)
                     {
                     tag[i][j] = 0;
                     }
              }
              for (i=0;i<citnbr;++i)
                     {
                     tmp = -10.0;
                     for(j=0;j<citnbr;++j)
                           {
                           for(k=0;k<citnbr;++k)
                                 {
                                 if((outs[i][k] >=tmp)&&
                                 (tag[i][k] ==0))
                                        tmp = outs[i][k];
                                 }
                           if((outs[i][j] ==tmp)&&
                           (tag[i][j]==0))
                                 {
                                 tourcity[i] =j;
                                 tourorder[j] = i;
                                 cout<<”\ntourcity “<<i
                                 <<” tour order “<<j<<”\n”;
                                 for(k=0;k<citnbr;++k)
                                        {
                                        tag[i][k] = 1;
                                        tag[k][j] = 1;
                                        }
                                 }
                           }
                     }
              }
 
//print outputs
 
void network::prouts()
       {
       int i,j;
       cout<<”\nthe outputs\n”;
       for(i=0;i<citnbr;++i)
              {
              for(j=0;j<citnbr;++j)
                     {
                     cout<<outs[i][j]<<”  “;
                     }
              cout<<”\n”;
              }
       }
 
//calculate total distance for tour
 
void network::calcdist()
       {
       int i, k, l;
       distnce = 0.0;
 
       for(i=0;i<citnbr;++i)
              {
              k = tourorder[i];
              l = ((i == citnbr-1 ) ? (tourorder[0]):(tourorder[i+1]));
              distnce += dist[k][l];
              }
       cout<<”\n distance of tour is : “<<distnce<<”\n”;
       }
 
// print tour
 
void network::prtour()
       {
       int i;
       cout<<”\n the tour :\n”;
 
       for(i=0;i<citnbr;++i)
              {
              cout<<tourorder[i]<<”  “;
              cout<<”\n”;
              }
       }
 
//print activations
 
void network::practs()
       {
       int i,j;
       cout<<”\n the activations:\n”;
       for(i=0;i<citnbr;++i)
              {
              for(j=0;j<citnbr;++j)
                     {
                     cout<<acts[i][j]<<”  “;
                     }
              cout<<”\n”;
              }
       }
 
//iterate specified number of times
 
void network::iterate(int nit,int nprm,float dlt,float tau,float la)
       {
       int k;
 
       for(k=1;k<=nit;++k)
              {
              getacts(nprm,dlt,tau);
              getouts(la);
              }
       cout<<”\n” <<nit<<” iterations completed\n”;
       practs();
       cout<<”\n”;
       prouts();
       cout<<”\n”;
       }
 
void main()
{
 
//numit = #iterations; n = #cities; u=intial input; nprm - parameter n’
//dt = delta t;
// —————————————————-
// parameters to be tuned are here:
       int u=1;
       int nprm=10;
       float a=40.0;
       float b=40.0;
       float c=30.0;
       float d=60.0;
       float dt=0.01;
       float tau=1.0;
       float lambda=3.0;
//——————————————————-
       int i,n2;
       int numit=100;
       int n=4;
       float input_vector[MXSIZ*MXSIZ];
 
       srand ((unsigned)time(NULL));
 
       cout<<”\nPlease type number of cities, number of iterations\n”;
 
       cin>>n>>numit;
       cout<<”\n”;
       if (n>MXSIZ)
              {
              cout << “choose a smaller n value\n”;
              exit(1);
              }
       n2 = n*n;
       for(i=0;i<n2;++i)
              {
              if(i%n == 0)cout<<”\n”;
              input_vector[i] =(u + (float)(randomnum(100)/100.0))/20.0;
              cout<<input_vector[i]<<” “;
              }
 
//create network and operate
 
       network *tntwk = new network;
       if (tntwk==0)
              {
              cout << “not enough memory\n”;
              exit(1);
              }
       tntwk->getnwk(n,a,b,c,d);
       tntwk->asgninpt(input_vector);
       tntwk->getouts(lambda);
       tntwk->prouts();
       tntwk->iterate(numit,nprm,dt,tau,lambda);
       tntwk->findtour();
       tntwk->prtour();
       tntwk->calcdist();
       cout<<”\n”;
}

Output from Your C++ Program for the Traveling Salesperson Problem

A three-city tour problem is trivial, since there is just one value for the total distance no matter how you permute the cities for the order of the tour. In this case the natural order is itself an optimal solution. The program is run for two cases, for illustration. The first run is for a problem with four cities. The second one is for a five-city problem. By the way, the cities are numbered from 0 to n – 1. The same parameter values are used in the two runs. The number of cities, and consequently, the matrix of distances were different. In the first run, the number of iterations asked for is 30, and in the second run it is 40.

The solution you get for the four-city problem is not the one in natural order. The total distance of the tour is 32. The tour in the solution is 1 - 0 - 3 - 2 - 1. This tour is equivalent to the tour in natural order, as you can see by starting at 0 and reading cyclically right to left in the previous sequence of cities.

The solution you get for the five-city problem is the tour 1 - 2 - 0 - 4 - 3 - 1. This reads, starting from 0 as either 0 - 4 - 3 - 1 - 2 - 0, or as 0 - 2 - 1 - 3 - 4 - 0. It is different from the tour in natural order. It has a total distance of 73, as compared to the distance of 84 with the natural order. It is not optimal though, as a tour of shortest distance is 0 - 4 - 2 - 1 - 3 - 0 with total distance 50.

Can the program give you the shorter tour? Yes, the solution can be different if you run the program again because each time you run the program, the input vector is different, as it is chosen at random. The parameter values given in the program are by guess.

Note that the seed for the random number generator is given in the statement

srand ((unsigned)time(NULL));

The program gives you the order in the tour for each city. For example, if it says tourcity 1 tour order 2, that means the second (tour) city is the city visited third (in tour order). Your tour orders are also with values from 0 to n – 1, like the cities.

The user input is in italic, and computer output is normal, as you have seen before.

Output for a Four-City Problem

Please type number of cities, number of iterations
4 30
 
0.097 0.0585 0.053 0.078                //input vector—there
are 16 neurons in the network
0.0725 0.0535 0.0585 0.0985
0.0505 0.061 0.0735 0.057
0.0555 0.075 0.0885 0.0925
 
type distance (integer) from city 0 to city 1
10
type distance (integer) from city 0 to city 2
14
type distance (integer) from city 0 to city 3
7
 
type distance (integer) from city 1 to city 2
6
type distance (integer) from city 1 to city 3
12
 
type distance (integer) from city 2 to city 3
9
 
 Distance Matrix
0  10  14  7
10  0  6  12
14  6  0  9
7  12  9  0
 
 Weight Matrix              //16x16 matrix of weights. There are
16 neurons in the network.
-30  -70   -70  -70
-70  -630  -30  -630
-70  -870  -30  -70
-30  -70   -70  -630
 
-70   -30  -70  -70
-630  -70  -630 -30
-870  -70  -870 -70
-70   -30  -70  -30
 
-70  -70   -30  -70
-30  -630  -70  -630
-30  -870  -70  -70
-70  -70   -30  -630
 
-70   -70  -70   -30
-630  -30  -630  -70
-870  -30  -870  -70
-630  -30  -630  -30
 
-70  -630  -30  -630
-30  -70   -70  -70
-70  -390  -30  -630
-70  -630  -30  -70
 
-630  -70  -630  -30
-70   -30  -70   -70
-390  -70  -390  -30
-630  -70  -630  -70
 
-30  -630  -70  -630
-70  -70   -30  -70
-30  -390  -70  -630
-30  -630  -70  -70
 
-630  -30  -630  -70
-70   -70  -70   -30
-390  -30  -390  -70
-870  -30  -870  -70
 
-70  -870  -30  -870
-70  -390  -30  -390
-30  -70   -70  -870
-70  -870  -30  -390
 
-870  -70  -870  -30
-390  -70  -390  -30
-70   -30  -70   -30
-870  -70  -870  -30
 
-30  -870  -70  -870
-30  -390  -70  -390
-70  -70   -30  -870
-30  -870  -70  -390
 
-870  -30  -870  -70
-390  -30  -390  -70
-70   -70  -70   -70
-450  -30  -450  -70
 
-70  -450  -30  -450
-70  -750  -30  -750
-70  -570  -30  -450
-70  -450  -30  -750
 
-450  -70  -450  -30
-750  -70  -750  -30
-570  -70  -570  -30
-450  -70  -450  -30
 
-30  -450  -70  -450
-30  -750  -70  -750
-30  -570  -70  -450
-30  -450  -70  -750
 
-450  -30  -450  -70
-750  -30  -750  -70
-570  -30  -570  -70
-70   -70  -70   -30
 
initial activations
 
 the activations:
-333.680054  -215.280014  -182.320023  -371.280029
-255.199997  -207.580002  -205.920013  -425.519989
-258.560028  -290.360016  -376.320007  -228.000031
-278.609985  -363  -444.27005  -377.400024
 
the outputs
0.017913  0.070217  0.100848  0.011483
0.044685  0.076494  0.077913  0.006022
0.042995  0.029762  0.010816  0.060882
0.034115  0.012667  0.004815  0.010678
 
 30 iterations completed
 
 the activations:
-222.586884  -176.979172  -195.530823  -380.166107
-164.0271    -171.654053  -214.053177  -421.249023
-158.297867  -218.833755  -319.384827  -245.097473
-194.550751  -317.505554  -437.527283  -447.651581
 
the outputs
0.064704  0.10681  0.087355  0.010333
0.122569  0.113061  0.071184  0.006337
0.130157  0.067483  0.021194  0.050156
0.088297  0.021667  0.005218  0.004624
 
tourcity 0 tour order 1
 
tourcity 1 tour order 0
 
tourcity 2 tour order 3
 
tourcity 3 tour order 2
 
 the tour :
1
0
3
2
 
 distance of tour is : 32

Output for a Five-City Problem

Please type number of cities, number of iterations
5 40
 
0.0645 0.069 0.0595 0.0615 0.0825   //input vector—there are 25
neurons in the network
0.074 0.0865 0.056 0.095 0.06
0.0625 0.0685 0.099 0.0645 0.0615
0.056 0.099 0.065 0.093 0.051
0.0675 0.094 0.0595 0.0635 0.0515
 
type distance (integer) from city 0 to city 1
10
type distance (integer) from city 0 to city 2
14
type distance (integer) from city 0 to city 3
7
type distance (integer) from city 0 to city 4
6
 
type distance (integer) from city 1 to city 2
12
type distance (integer) from city 1 to city 3
9
type distance (integer) from city 1 to city 4
18
 
type distance (integer) from city 2 to city 3
24
type distance (integer) from city 2 to city 4
16
 
type distance (integer) from city 3 to city 4
32
 
 Distance Matrix
0   10  14   7  6
10   0  12   9  18
14  12  0   24  16
7   9   24   0  32
6   18  16  32  0
 
Weight Matrix    //25x25 matrix of weights. There are 25 neurons
in the network.
 
-30  -70   -70  -70   -70
-70  -630  -30  -30   -630
-70  -70   -30  -70   -70
-70  -630  -70  -630  -30
-30  -870  -70  -70   -30
 
-70   -30  -70   -70  -70
-630  -70  -630  -30  -30
-870  -70  -70   -30  -70
-70   -30  -630  -70  -630
-30   -30  -70   -70  -70
 
-70  -70   -30  -70   -70
-30  -630  -70  -630  -30
-30  -70   -70  -70   -30
-70  -30   -30  -630  -70
-630 -30   -70  -70   -70
 
-70  -70   -70   -30   -70
-30  -30   -630  -70   -630
-30  -70   -70   -70   -70
-30  -630  -30   -30   -630
-70  -870  -70   -630  -30
 
-70   -70  -70   -70   -30
-630  -30  -30   -630  -70
-870  -70  -630  -30   -30
-630  -30  -70   -70   -70
-70   -70  -630  -70   -630
 
-70  -630  -30  -30   -630
-30  -70   -70  -70   -70
-70  -630  -70  -630  -30
-30  -70   -30  -70   -70
-70  -750  -30  -630  -70
 
-630  -70  -630  -30  -30
-70   -30  -70   -70  -70
-750  -30  -630  -70  -630
-30   -70  -70   -30  -70
-70   -30  -30   -30  -630
 
-30  -630  -70   -630  -30
-70  -70   -30   -70   -70
-30  -30   -30   -630  -70
-630 -70   -70   -70   -30
-70  -30   -630  -30   -30
 
-30  -30   -630  -70   -630
-70  -70   -70   -30   -70
-30  -630  -30   -30   -630
-70  -70   -70   -70   -70
-30  -750  -70   -870  -30
 
-630  -30  -30   -630  -70
-70   -70  -70   -70   -30
-750  -70  -870  -30   -30
-870  -70  -750  -30   -30
-750  -30  -870  -70   -870
 
-70  -870  -30  -30   -870
-70  -750  -30  -30   -750
-30  -870  -70  -870  -30
-30  -750  -70  -750  -30
-30  -70   -30  -870  -70
 
-870  -70  -870  -30  -30
-750  -70  -750  -30  -30
-70   -30  -870  -70  -870
-30   -30  -750  -70  -750
-30   -70  -30   -30  -870
 
-30   -870  -70  -870  -30
-30   -750  -70  -750  -30
-70   -30   -30  -870  -70
-870  -30   -30  -750  -70
-750  -70   -870 -30   -30
 
-30  -30   -870  -70   -870
-30  -30   -750  -70   -750
-70  -870  -30   -30   -870
-70  -750  -30   -30   -750
-70  -70   -70   -450  -30
 
-870  -30  -30   -870  -70
-750  -30  -30   -750  -70
-70   -70  -450  -30   -30
-450  -70  -570  -30   -30
-570  -70  -450  -70   -450
 
-70  -450  -30  -30   -450
-70  -570  -30  -30   -570
-70  -450  -70  -450  -30
-30  -570  -70  -570  -30
-30  -1470 -30  -450  -70
 
-450  -70  -450  -30  -30
-570  -70  -570  -30  -30
-1470 -30  -450  -70  -450
-30   -30  -570  -70  -570
-30   -30  -30   -30  -450
 
-30  -450  -70   -450  -30
-30  -570  -70   -570  -30
-30  -30   -30   -450  -70
-450 -30   -30   -570  -70
-570 -30   -450  -30   -30
 
-30  -30    -450  -70   -450
-30  -30    -570  -70   -570
-30  -450   -30   -30   -450
-70  -570   -30   -30   -570
-70  -1470  -70   -390  -30
 
-450   -30   -30    -450  -70
-570   -30   -30    -570  -70
-1470  -70   -390   -30   -30
-390   -70   -1110  -30   -30
-1110  -70   -390   -70   -390
 
-70  -390   -30  -30    -390
-70  -1110  -30  -30    -1110
-70  -390   -70  -390   -30
-30  -1110  -70  -1110  -30
-30  -990   -30  -390   -70
 
-390   -70  -390   -30  -30
-1110  -70  -1110  -30  -30
-990   -30  -390   -70  -390
-30    -30  -1110  -70  -1110
-30    -30  -30    -30  -390
 
-30    -390   -70   -390   -30
-30    -1110  -70   -1110  -30
-30    -30    -30   -390   -70
-390   -30    -30   -1110  -70
-1110  -30    -390  -30    -30
 
-30  -30    -390   -70  -390
-30  -30    -1110  -70  -1110
-30  -390   -30    -30  -390
-70  -1110  -30    -30  -1110
-70  -990   -30    -30  -990
 
-390  -30  -30  -390   -70
-1110 -30  -30  -1110  -70
-990  -30  -30  -990   -70
-1950 -30  -30  -1950  -70
-70   -70  -70  -70    -30
 
initial activations
 
 the activations:
-290.894989  -311.190002  -218.365005  -309.344971  -467.774994
-366.299957  -421.254944  -232.399963  -489.249969  -467.399994
-504.375     -552.794983  -798.929871  -496.005005  -424.964935
-374.639984  -654.389832  -336.049988  -612.870056  -405.450012
-544.724976  -751.060059  -418.285034  -545.465027  -500.065063
 
the outputs
0.029577  0.023333  0.067838  0.023843  0.003636
0.012181  0.006337  0.057932  0.002812  0.003652
0.002346  0.001314  6.859939e-05  0.002594  0.006062
0.011034  0.000389  0.017419  0.000639  0.00765
0.001447  0.000122  0.006565  0.001434  0.002471
 
40 iterations completed
 
 the activations:
-117.115494  -140.58519   -85.636215   -158.240143  -275.021301
-229.135956  -341.123871  -288.208496  -536.142212  -596.154297
-297.832794  -379.722595  -593.842102  -440.377625  -442.091064
-209.226883  -447.291016  -283.609589  -519.441101  -430.469696
-338.93219   -543.509766  -386.950531  -538.633606  -574.604492
 
the outputs
0.196963  0.156168  0.263543  0.130235  0.035562
0.060107  0.016407  0.030516  0.001604  0.000781
0.027279  0.010388  0.000803  0.005044  0.004942
0.07511   0.004644  0.032192  0.001959  0.005677
0.016837  0.001468  0.009533  0.001557  0.001012
 
tourcity 0 tour order 2
 
tourcity 1 tour order 0
 
tourcity 2 tour order 1
 
tourcity 3 tour order 4
 
tourcity 4 tour order 3
 
 the tour :
1
2
0
4
3
 
 distance of tour is : 73

Other Approaches to Solve the Traveling Salesperson Problem

The following describes a few other methods for solving the traveling salesperson problem.

Anzai’s Presentation

Yuichiro Anzai describes the Hopfield network for the traveling salesperson problem in a slightly different way. For one thing, a global inhibition term is not used. A threshold value is associated with each neuron, added to the activation, and taken as the average of A1 and A2, using our earlier notation. The energy function is formulated slightly differently, as follows:

E = A1 Σ1k xik-1)2 + A2 Σik xki-1)2 + A4 Σk Σj≠k Σi≠k dkj xki(xj,i+1 + xj,i-1)

The first term is 0 if the sum of the outputs is 1 in each column. The same is true for the second term with regard to rows.

The output is calculated using a parameter λ, here called the reference activation level, as:

xij = (1 + tan tanh(aij/λ))/2

The parameters used are A1 = 1/2, A2 = 1/2, A4 = 1/2, Δt = 1, τ = 1, and λ = 1. An attempt is made to solve the problem for a tour of 10 cities. The solution obtained is not crisp, in the sense that exactly one 1 occurs in each row and each column, there are values of varying magnitude with one dominating value in each row and column. The prominent value is considered to be part of the solution.

Kohonen’s Approach for the Traveling Salesperson Problem

Kohonen’s self-organizing maps can be used for the traveling salesperson problem. We summarize the discussion of this approach described in Eric Davalo’s and Patrick Naim’s work. Each city considered for the tour is referenced by its x and y coordinates. To each city there corresponds a neuron. The neurons are placed in a single array, unlike the two-dimensional array used in the Hopfield approach. The first and the last neurons in the array are considered to be neighbors.

There is a weight vector for each neuron, and it also has two components. The weight vector of a neuron is the image of the neuron in the map, which is sought to self-organize. There are as many input vectors as there are cities, and the coordinate pair of a city constitutes an input vector. A neuron with a weight vector closest to the input vector is selected. The weights of neurons in a neighborhood of the selected neuron are modified, others are not. A gradually reducing scale factor is also used for the modification of weights.

One neuron is created first, and its weight vector has 0 for its components. Other neurons are created one at a time, at each iteration of learning. Neurons may also be destroyed. The creation of the neuron and destruction of the neuron suggest adding a city provisionally to the final list in the tour and dropping a city also provisionally from that list. Thus the possibility of assigning any neuron to two inputs or two cities is prevented. The same is true about assigning two neurons to the same input.

As the input vectors are presented to the network, if an unselected neuron falls in the neighborhood of the closest twice, it is created. If a created neuron is not selected in three consecutive iterations for modification of weights, along with others being modified, it is destroyed.

That a tour of shortest distance results from this network operation is apparent from the fact that the closest neurons are selected. It is reported that experimental results are very promising. The computation time is small, and solutions somewhat close to the optimal are obtained, if not the optimal solution itself. As was before about the traveling salesperson problem, this is an NP-complete problem and near efficient (leading to suboptimal solutions, but faster) approaches to it should be accepted.

Algorithm for Kohonen’s Approach

A gain parameter λ and a scale factor q are used while modifying the weights. A value between 0.02 and 0.2 was tried in previous examples for q. A distance of a neuron from the selected neuron is defined to be an integer between 0 and n – 1, where n is the number of cities for the tour. This means that these distances are not necessarily the actual distances between the cities. They could be made representative of the actual distances in some way. One such attempt is described in the following discussion on C++ implementation. This distance is denoted by dj for neuron j. A squashing function similar to the Gaussian density function is also used.

The details of the algorithm are in a paper referenced in Davalo. The steps of the algorithm to the extent given by Davalo are:

  Find the weight vector for which the distance from the input vector is the smallest

  Modify the weights using
wjnew = wjold + (lnew - wjold)g(λ,dj), where g(λ,dj) = exp(- dj2/λ) /√

  Reset λ as λ (1 - q)

C++ Implementation of Kohonen’s Approach

Our C++ implementation of this algorithm (described above) is with small modifications. We create but do not destroy neurons explicitly. That is, we do not count the number of consecutive iterations in which a neuron is not selected for modification of weights. This is a consequence of our not defining a neighborhood of a neuron. Our example is for a problem with five neurons, for illustration, and because of the small number of neurons involved, the entire set is considered a neighborhood of each neuron.

When all but one neuron are created, the remaining neuron is created without any more work with the algorithm, and it is assigned to the input, which isn’t corresponded yet to a neuron. After creating n – 1 neurons, only one unassigned input should remain.

In our C++ implementation, the distance matrix for the distances between neurons, in our example, is given as follows, following the stipulation in the algorithm that these values should be integers between 0 and n – 1.

         0 1 2 3 4
         1 0 1 2 3
  d =    2 1 0 1 2
         3 2 1 0 1
         4 3 2 1 0

We also ran the program by replacing the previous matrix with the following matrix and obtained the same solution. The actual distances between the cities are about four times the corresponding values in this matrix, more or less. We have not included the output from this second run of the program.

        0 1 3 3 2
        1 0 3 2 1
  d =   3 3 0 4 2
        3 2 4 0 1
        2 1 2 1 0

In our implementation, we picked a function similar to the Gaussian density function as the squashing function. The squashing function used is:

f(d,λ) = exp( -d2/λ) / √ (2 )

Header File for C++ Program for Kohonen’s Approach

Listing 15.3 contains the header file for this program, and listing 15.4 contains the corresponding source file:

Listing 15.3 Header file for C++ program for Kohonen’s approach

//tsp_kohn.h V.Rao, H.Rao
#include<iostream.h>
#include<math.h>
 
#define MXSIZ 10
#define pi 3.141592654
 
class city_neuron
       {
       protected:
              double x,y;
              int mark,order,count;
              double weight[2];
              friend class tspnetwork;
       public:
              city_neuron(){};
              void get_neuron(double,double);
       };
 
class tspnetwork
       {
       protected:
              int chosen_city,order[MXSIZ];
              double gain,input[MXSIZ][2];
              int citycount,index,d[MXSIZ][MXSIZ];
              double gain_factor,diffsq[MXSIZ];
              city_neuron (cnrn)[MXSIZ];
       public:
              tspnetwork(int,double,double,double,double*,double*);
              void get_input(double*,double*);
              void get_d();
              void find_tour();
              void associate_city();
              void modify_weights(int,int);
              double wtchange(int,int,double,double);
              void print_d();
              void print_input();
              void print_weights();
              void print_tour();
       };

Source File Listing

The following is the source file listing for the Kohonen approach to the traveling salesperson problem.

Listing 15.4 Source file for C++ program for Kohonen’s approach

//tsp_kohn.cpp  V.Rao, H.Rao
#include “tsp_kohn.h”
 
void city_neuron::get_neuron(double a,double b)
       {
       x = a;
       y = b;
       mark = 0;
       count = 0;
       weight[0] = 0.0;
       weight[1] = 0.0;
       };
 
tspnetwork::tspnetwork(int k,double f,double q,double h,
double *ip0,double *ip1)
 
       {
       int i;
       gain = h;
       gain_factor = f;
       citycount = k;
 
       // distances between neurons as integers between 0 and n-1
 
       get_d();
       print_d();
       cout<<”\n”;
 
       // input vectors
 
       get_input(ip0,ip1);
       print_input();
 
       // neurons in the network
 
       for(i=0;i<citycount;++i)
              {
              order[i] = citycount+1;
              diffsq[i] = q;
              cnrn[i].get_neuron(ip0[i],ip1[i]);
              cnrn[i].order = citycount +1;
              }
       }
 
void tspnetwork::associate_city()
       {
       int i,k,j,u;
       double r,s;
 
       for(u=0;u<citycount;u++)
              {
              //start a new iteration with the input vectors
              for(j=0;j<citycount;j++)
                       {
                       for(i=0;i<citycount;++i)
                              {
                              if(cnrn[i].mark==0)
                                      {
                                      k = i;
                                      i =citycount;
                                      }
                              }
 
                       //find the closest neuron
 
                       for(i=0;i<citycount;++i)
                              {
                              r = input[j][0] - cnrn[i].weight[0];
                              s = input[j][1] - cnrn[i].weight[1];
                              diffsq[i] = r*r +s*s;
                              if(diffsq[i]<diffsq[k]) k=i;
                              }
 
                       chosen_city = k;
                       cnrn[k].count++;
                       if((cnrn[k].mark<1)&&(cnrn[k].count==2))
                              {
                              //associate a neuron with a position
                              cnrn[k].mark = 1;
                              cnrn[k].order = u;
                              order[u] = chosen_city;
                              index = j;
                              gain *= gain_factor;
 
                              //modify weights
                              modify_weights(k,index);
                              print_weights();
                              j = citycount;
                              }
                       }
              }
       }
 
void tspnetwork::find_tour()
       {
       int i;
       for(i=0;i<citycount;++i)
              {
              associate_city();
              }
 
              //associate the last neuron with remaining position in
       // tour
              for(i=0;i<citycount;++i)
                       {
                       if( cnrn[i].mark ==0)
                              {
                              cnrn[i].order = citycount-1;
                              order[citycount-1] = i;
                              cnrn[i].mark = 1;
                              }
                       }
 
                       //print out the tour.
                       //First the neurons in the tour order
                       //Next cities in the tour
                       //order with their x,y coordinates
 
                       print_tour();
              }
 
void tspnetwork::get_input(double *p,double *q)
       {
       int i;
 
       for(i=0;i<citycount;++i)
              {
              input[i][0] = p[i];
              input[i][1] = q[i];
              }
       }
 
//function to compute distances (between 0 and n-1) between
//neurons
 
void tspnetwork::get_d()
       {
       int i,j;
 
       for(i=0;i<citycount;++i)
              {
              for(j=0;j<citycount;++j)
                      {
                      d[i][j] = (j-i);
                      if(d[i][j]<0) d[i][j] = d[j][i];
                      }
              }
       }
 
//function to find the change in weight component
 
double tspnetwork::wtchange(int m,int l,double g,double h)
       {
       double r;
       r = exp(-d[m][l]*d[m][l]/gain);
       r *= (g-h)/sqrt(2*pi);
       return r;
       }
 
//function to determine new weights
 
void tspnetwork::modify_weights(int jj,int j)
       {
       int i;
       double t;
       double w[2];
       for(i=0;i<citycount;++i)
              {
              w[0] = cnrn[i].weight[0];
              w[1] = cnrn[i].weight[1];
              //determine new first component of weight
              t = wtchange(jj,i,input[j][0],w[0]);
              w[0] = cnrn[i].weight[0] +t;
              cnrn[i].weight[0] = w[0];
 
              //determine new second component of weight
              t = wtchange(jj,i,input[j][1],w[1]);
              w[1] = cnrn[i].weight[1] +t;
              cnrn[i].weight[1] = w[1];
              }
       }
 
//different print routines
 
void tspnetwork::print_d()
       {
       int i,j;
       cout<<”\n”;
 
       for(i=0;i<citycount;i++)
              {
              cout<<” d: “;
              for(j=0;j<citycount;j++)
                      {
                      cout<<d[i][j]<<”   “;
                      }
              cout<<”\n”;
              }
       }
 
void tspnetwork::print_input()
       {
       int i,j;
 
       for(i=0;i<citycount;i++)
              {
              cout<<”input : “;
              for(j=0;j<2;j++)
                      {
                      cout<<input [i][j]<<”   “;
                      }
              cout<<”\n”;
              }
       }
 
void tspnetwork::print_weights()
       {
       int i,j;
       cout<<”\n”;
 
       for(i=0;i<citycount;i++)
              {
              cout<<” weight: “;
              for(j=0;j<2;j++)
                      {
                      cout<<cnrn[i].weight[j]<<”   “;
                      }
              cout<<”\n”;
              }
       }
 
void tspnetwork::print_tour()
       {
       int i,j;
       cout<<”\n tour : “;
 
       for(i=0;i<citycount;++i)
              {
              cout<<order[i]<<” —> “;
              }
       cout<<order[0]<<”\n\n”;
 
       for(i=0;i<citycount;++i)
              {
              j = order[i];
              cout<<”(“<<cnrn[j].x<<”, “<<cnrn[j].y<<”) —> “;
              }
       j= order[0];
       cout<<”(“<<cnrn[j].x<<”, “<<cnrn[j].y<<”)\n\n”;
       }
 
void main()
       {
       int nc= 5;//nc = number of cities
       double q= 0.05,h= 1.0,p= 1000.0;
 
       double input2[][5]= {7.0,4.0,14.0,0.0,5.0,3.0,6.0,13.0,12.0,10.0};
       tspnetwork tspn2(nc,q,p,h,input2[0],input2[1]);
       tspn2.find_tour();
 
       }

Output from a Sample Program Run

The program, as mentioned, is created for the Kohonen approach to the traveling salesperson problem for five cities. There is no user input from the keyboard. All parameter values are given to the program with appropriate statements in the function main. A scale factor of 0.05 is given to apply to the gain parameter, which is given as 1. Initially, the distance of each neuron weight vector from an input vector is set at 1000, to facilitate finding the closest for the first time. The cities with coordinates (7,3), (4,6), (14,13), (0,12), (5,10) are specified for input vectors.

The tour found is not the one in natural order, namely 0 → 1 → 2 → 3 → 4 → 0, with a distance of 43.16. The tour found has the order 0 → 3 → 1 → 4 → 2 → 0, which covers a distance of 44.43, which is slightly higher, as shown in:

City placement and tour found for TSP.

The best tour, 0 → 2 → 4 → 3 → 1 → 0 has a total distance of 38.54.

Table         gives for the five-city example, the 12 (5!/10) distinct tour distances and corresponding representative tours. These are not generated by the program, but by enumeration and calculation by hand. This table is provided here for you to see the different solutions for this five-city example of the traveling salesperson problem.

Distances and Representative Tours for Five-City Example

Distance

Tour

Comment

49.05

0-3-2-1-4-0

worst case

47.59

0-3-1-2-4-0

 

45.33

0-2-1-4-3-0

 

44.86

0-2-3-1-4-0

 

44.43

0-3-1-4-2-0

tour given by the program

44.30

0-2-1-3-4-0

 

43.29

0-1-4-2-3-0

 

43.16

0-1-2-3-4-0

 

42.73

0-1-2-4-3-0

 

42.26

0-1-3-2-4-0

 

40.00

0-1-4-3-2-0

 

38.54

0-2-4-3-1-0

optimal tour

There are 12 different distances you can get for tours with these cities by hand calculation, and four of these are higher and seven are lower than the one you find from this program. The worst case tour (0 [rarr] 3 [rarr] 2 [rarr] 1 [rarr] 4 [rarr] 0) gives a distance of 49.05, and the best, as you saw above, 38.54. The solution from the program is at about the middle of the best and worst, in terms of total distance traveled.

The output of the program being all computer generated is given below as follows:

d: 0   1   2   3   4
d: 1   0   1   2   3
d: 2   1   0   1   2
d: 3   2   1   0   1
d: 4   3   2   1   0
 
input : 7   3
input : 4   6
input : 14  13
input : 0   12
input : 5   10
 
weight: 1.595769   2.393654
weight: 3.289125e-09   4.933688e-09
weight: 2.880126e-35   4.320189e-35
weight: 1.071429e-78   1.607143e-78
weight: 1.693308e-139  2.539961e-139
 
weight: 1.595769   2.393654
weight: 5.585192   5.18625
weight: 2.880126e-35   4.320189e-35
weight: 1.071429e-78   1.607143e-78
weight: 1.693308e-139  2.539961e-139
 
weight: 1.595769   2.393654
weight: 5.585192   5.18625
weight: 5.585192   5.18625
weight: 1.071429e-78   1.607143e-78
weight: 1.693308e-139  2.539961e-139
 
weight: 1.595769   2.393654
weight: 5.585192   5.18625
weight: 5.585192   5.18625
weight: 5.585192   5.18625
weight: 1.693308e-139   2.539961e-139
 
weight: 1.595769   2.393654
weight: 5.585192   5.18625
weight: 5.585192   5.18625
weight: 5.585192   5.18625
weight: 5.585192   5.18625
 
tour : 0 –> 3–> 1 –> 4 –> 2–> 0
 
(7, 3) –> (0, 12) –> (4, 6) –> (5, 10) –> (14, 13) –> (7, 3)

Optimizing a Stock Portfolio

Development of a neural network approach to a stock selection process in securities trading is similar to the application of neural networks to nonlinear optimization problems. The seminal work of Markowitz in making a mathematical formulation of an objective function in the context of portfolio selection forms a basis for such a development. There is risk to be minimized or put a cap on, and there are profits to be maximized. Investment capital is a limited resource naturally.

The objective function is formulated in such a way that the optimal portfolio minimizes the objective function. There would be a term in the objective function involving the product of each pair of stock prices. The covariance of that pair of prices is also used in the objective function. A product renders the objective function a quadratic. There would of course be some linear terms as well, and they represent the individual stock prices with the stock’s average return as coefficient in each such term. You already get the idea that this optimization problem falls into the category of quadratic programming problems, which result in real number values for the variables in the optimal solution. Some other terms would also be included in the objective function to make sure that the constraints of the problem are satisfied.

A practical consideration is that a real number value for the amount of a stock may be unrealistic, as fractional numbers of stocks may not be purchased. It makes more sense to ask that the variables be taking 0 or 1 only. The implication then is that either you buy a stock, in which case you include it in the portfolio, or you do not buy at all. This is what is usually called a zero-one programming problem. You also identify it as a combinatorial problem.

You already saw a combinatorial optimization problem in the traveling salesperson problem. The constraints were incorporated into special terms in the objective function, so that the only function to be computed is the objective function.

Deeming the objective function as giving the energy of a network in a given state, the simulated annealing paradigm and the Hopfield network can be used to solve the problem. You then have a neural network in which each neuron represents a stock, and the size of the layer is determined by the number of stocks in the pool from which you want to build your stock portfolio. The paradigm suggested here strives to minimize the energy of the machine. The objective function needs therefore to be stated for minimization to get the best portfolio possible.

Tabu Neural Network

Tabu search, popularized by Fred Glover with his contributions, is a paradigm that has been used successfully in many optimization problems. It is a method that can steer a search procedure from a limited domain to an extended domain, so as to seek a solution that is better than a local minimum or a local maximum.

Tabu search (TS), suggests that an adaptive memory and a responsive exploration need to be part of an algorithm. Responsive exploration exploits the information derivable from a selected strategy. Such information may be more substantial, even if the selected strategy is in some sense a bad strategy, than what you can get even in a good strategy that is based on randomness. It is because there is an opportunity provided by such information to intelligently modify the strategy. You can get some clues as to how you can modify the strategy.

When you have a paradigm that incorporates adaptive memory, you see the relevance of associating a neural network:. a TANN is a Tabu neural network. Tabu search and Kohonen’s self-organizing map have a common approach in that they work with “neighborhoods.” As a new neighborhood is determined, TS prohibits some of the earlier solutions, as it classifies them as tabu. Such solutions contain attributes that are identified as tabu active.

Tabu search, has STM and LTM components as well. The short-term memory is sometimes called recency-based memory. While this may prove adequate to find good solutions, the inclusion of long-term memory makes the search method much more potent. It also does not necessitate longer runs of the search process.

Some of the examples of applications using Tabu search are:

  Training neural nets with the reactive Tabu search

  Tabu Learning: a neural network search method for solving nonconvex optimization problems

  Massively parallel Tabu search for the quadratic assignment problem

  Connection machine implementation of a Tabu search algorithm for the traveling salesman problem

  A Tabu search procedure for multicommodity location/allocation with balancing requirements

Summary

The traveling salesperson problem is presented in this chapter as an example of nonlinear optimization with neural networks. Details of formulation are given of the energy function and its evaluation. The approaches to the solution of the traveling salesperson problem using a Hopfield network and using a Kohonen self-organizing map are presented. C++ programs are included for both approaches.

The output with the C++ program for the Hopfield network refers to examples of four- and five-city tours. The output with the C++ program for the Kohonen approach is given for a tour of five cities, for illustration. The solution obtained is good, if not optimal. The problem with the Hopfield approach lies in the selection of appropriate values for the parameters. Hopfield’s choices are given for his 10-city tour problem. The same values for the parameters may not work for the case of a different number of cities. The version of this approach given by Anzai is also discussed briefly.

Use of neural networks for nonlinear optimization as applied to portfolio selection is also presented in this chapter. You are introduced to Tabu search and its use in optimization with neural computing.