[From Bill Powers (971016.0711 MDT)]

Chris Cherpas (971015.1752 PT) --

I've seen a couple of references to "problem solving" in B:CP

and in _Introduction to Modern Psychology: The Control Theory View_,

in particular in relation to the Program Level of HPCT. I was

curious as to whether there are any other PCT treatments. My interest

is in trying to decide whether/how to teach "mathematical problem solving"

as an explicit subject, as opposed to a more implicit treatment. My

understanding is that attempts to teach the methods of Polya, for example,

do not have much impact.

The problem with "problem solving" is that there isn't any way I ever

learned to systematize it. If I give you the problem of taking the square

root of 2, and you know a method for doing it, it's not a problem. If you

don't know a method, it's a problem. So how do you get from not knowing how

to knowing how? _That's_ the problem.

If I know a method, I can demonstrate it for you and you can try to

reproduce it. If you reproduce it literally, then you'll become able to

take the square root of 2, but not the square root of any other number. To

be able to take the square root of any number, you have to get away from

reproducing the specific operations you saw me carry out with specific

numbers, and "generalize" -- see how the "same" operations could be carried

out with _different_ numbers. You have to learn to conceive of the specific

numbers as variables, and look at the form of the operations as distinct

from their specific application.

This is how I think of the program level. The perceptions at the program

level are not "specific numbers" but simply symbols that can be associated

with any specific lower-level perceptions. At the program level, we learn

x1 = (Q/x0 + x0)/2

x2 = (Q/x1 + x1)/2

and so on, where Q is the number we want the square root of and the x's are

trial values that get closer and closer to the real square root. The "and

so on" appeals to our ability to perceive the program, which is

x[n+1] = (Q/x[n] + x[n])/2
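The iteration can be written out as a short function (Python is my own choice of language here, and the stopping tolerance is likewise an assumption):

```python
def newton_sqrt(Q, x0=1.0, tolerance=1e-12, max_steps=100):
    """The program above: x[n+1] = (Q/x[n] + x[n]) / 2, repeated
    until successive trial values stop changing appreciably."""
    x = x0
    for _ in range(max_steps):
        x_next = (Q / x + x) / 2.0
        if abs(x_next - x) < tolerance:
            return x_next
        x = x_next
    return x

# For Q = 2 and a starting guess of 1:
# x1 = (2/1 + 1)/2 = 1.5, x2 = (2/1.5 + 1.5)/2 = 1.41666...,
# converging on 1.41421...
root = newton_sqrt(2.0)
```

The "and so on" of the text is just the loop: the same form applied over and over with different specific numbers.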

So how do we teach people to recognize the program, the "and so on"? The

answer is that we can't. Either they have the ability to recognize a

program or they don't. If they have a working program level they can come

to recognize the program; if they don't have it, we can't give it to them.

This is one reason I think there really is a program level.

But this still doesn't get to the bottom of problem-solving. Newton's

method for taking square roots was, of course, invented by Newton (or at

least by somebody). Whoever really invented it did so without being told

how to do it; without being given examples from which to generalize. We can

imagine that for this person, the problem was easily stated: find a number

which, when multiplied by itself, equals another specified number.

That's a control problem, isn't it? The reference level is Q, the number we

want the square root of. The input is a number R, and the perceptual signal

is R*R -- we want R*R to equal Q. If there is an error, what do we do to

the input R to make the error smaller?

The most basic method is to keep selecting R at random, multiplying it by

itself to get R*R, and seeing if this product equals Q. If it doesn't, try

another value of R at random. At some point you'll get a number that makes

the error small enough for you to say you're satisfied. This is the method

of reorganization: random variation and selective retention. It could take

a long time, and you do not learn any systematic method.
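Random variation and selective retention can be sketched as follows (Python again by my choice; the guessing range and the "satisfied" threshold are assumptions):

```python
import random

def reorganize_sqrt(Q, good_enough=0.01, seed=1):
    """Reorganization: pick R blindly at random, and retain it
    only when R*R comes close enough to Q."""
    rng = random.Random(seed)
    tries = 0
    while True:
        tries += 1
        R = rng.uniform(0.0, max(Q, 1.0))  # blind guess in a plausible range
        if abs(R * R - Q) < good_enough:
            return R, tries
```

It works, but the number of tries grows rapidly as the required accuracy goes up, which is the point: you end with an acceptable answer and no method.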

The next step would be to see a _relationship_ between the size of R and

the size of the error. If R is very large, as large as Q, the error is

positive. If R is very small in comparison with Q, the error is negative.

So this leads (somehow) to seeing that the right value of R must be between

the value that gives a negative error and the value that gives a positive

error. If you can't see that, you're stuck. So perceiving this relationship

must also be a built-in ability, or at least an ability that exists before

you try to solve this problem.

Once you see that the solution lies between a value of R that gives a negative error and one that gives a positive error,

you can try various systematic approaches. You might try starting with a

value of R that's way too small, and then increasing it little by little.

Now, if you have the equipment, you can see a relationship between a

_change_ in R and a _change_ in the error. If someone points it out to you

and you still don't see it, you're stuck again.

Obviously (it says here), what you want is for the error to get smaller. So

you start with a small value of R and a large error, and increase R step by

step while the error gets smaller step by step. At some point the error

switches from a small positive value to a small negative value and then

starts getting more negative. The value of R*R closest to Q at the point

where the error changes sign gets you pretty close to the right number.

Now it has to occur to you (somehow) that you can (a) make the steps

smaller and thus get closer to zero error, and (b) reverse the direction of

the steps when the error changes sign so you don't have to start over. More

relationships that you must be able to perceive. This will get you to the

square root much faster than the method of reorganization will, and if you

keep going you can get as small an error as you like.
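The stepping-and-reversing scheme just described can be sketched like this (Python is my choice; the initial step size is an assumption):

```python
def step_search_sqrt(Q, step=0.5, tolerance=1e-9):
    """Start with R too small and step upward; whenever the error
    R*R - Q changes sign, reverse direction and halve the step."""
    R = 0.0
    direction = 1.0
    while step > tolerance:
        R += direction * step
        if (R * R - Q) * direction > 0:  # overshot: error changed sign
            direction = -direction       # reverse the steps...
            step /= 2.0                  # ...and make them smaller
    return R
```

Each reversal halves the step, so the trial value closes in on the square root far faster than blind reorganization would.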

Ultimately, you may wonder if there isn't some way to choose the step size

so you can get to the point where the error changes sign as soon as

possible, and some way to reduce the step size so you get to the next

approximation as quickly as possible. What you end up with, by trial and

error or systematic algebra or calculus, is Newton's Method.

There are a couple of points I'm trying to make here. One is that there are

certain basic abilities that are required to solve this or any other

problem, and if they don't exist there's no way to teach them. They're part

of the innate structure of the brain; a dog will never learn to take a

square root.

If a person lacks a specific required mental ability, that person will not

be able to solve certain problems no matter what the teaching method. So it

behooves teachers to learn to measure the basic mental abilities needed to

solve particular problems.

The second point is that solving any problem would seem to benefit from

setting it up as a _control_ problem. There has to be some way to know when

the problem has been solved (a reference condition), and there must be some

way to compare the result of the current solution being tried (a

perception) against the reference condition, to tell whether you're getting

closer to or farther from the solution. If you try something new, and can't

tell whether the result is better or worse than the previous result, you're

stuck -- all that's left is trial-and-error reorganization, which is not

likely to get you to the goal very quickly.

This happens to me all the time in mathematics. I know many of the basic

manipulations that are possible. I can transform equations into other

forms. I have a fair idea of the result I want. But the sticking point is

always knowing whether the current mess on the paper is closer to a

solution than the previous mess. It's like being in a maze with a light

directly over the goal visible from anywhere, but with only the nearby

branch-points being visible. I can go forward or back, take this turn or

the other, but there's no indication of whether any move is taking me into

a blind alley, in a circle, or toward the goal. There's no measure of

actual progress.

Here's an example that comes up in solving the control equations. Suppose

(to simplify) we have

o = k*e

e = r - o.

What we want is to solve for o. We substitute the second equation into the

first to eliminate e (first goal), and get

o = k*(r - o)

Now we have a problem. We want something of the form o = (expression), but

what we have is o on _both sides_ of the equal sign. We can't solve for o

unless we already know the value of o (to plug in on the right side). The

immediate goal, therefore, is to transform this equation so that at least o

is found only on the left side. We certainly don't want it on the right side.

A good strategy in general is to expand all parentheses and see what we've

got:

o = k*r - k*o

Clearly, we can leave the k*r on the right, because there's no o in it. Now

we have to get k*o onto the left side and eliminate it from the right side.

Elementary, as they say: adding the same amount to both sides of an

equation leaves it still true. So we add k*o to both sides:

o + k*o = k*r - k*o + k*o

The positive k*o cancels out the negative k*o on the right, leaving

o + k*o = k*r

Now we're closer to the solution because o appears only on the left side.

However, it appears twice, while what we want is for it to appear just

once, if possible. So that becomes the new goal: to have o appear on the

left only once.

This can be done by extracting the common factor o from the expression on

the left:

o * (k + 1) = k*r

Now there is only one o on the left. It is, however, not the only thing on

the left; it is multiplied by something in parentheses. Since we really

want o all by itself on the left, we have to get rid of that parenthesized

expression. That's the next subgoal.

If we divide both sides of an equation by the same number or expression, we

don't change the truth of the equation. If we divide the left side by (k +

1), we will have o * (k+1)/(k+1), which is o * 1, which is o. So let's

divide both sides by (k+1), to get

o = k*r/(k+1)

And there we are: o is all by itself on the left. We have solved the

equations for o.
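As a quick check (the numerical values of k and r below are arbitrary, and the check itself is my own addition), the closed form can be substituted back into the two original equations:

```python
def check_solution(k, r):
    """Verify that o = k*r/(k+1) satisfies o = k*e with e = r - o."""
    o = k * r / (k + 1)   # the solution we derived
    e = r - o             # the second original equation
    return abs(o - k * e) < 1e-9  # does the first equation still hold?

print(check_solution(3.0, 4.0))  # -> True
```

The reference condition here is exactly the one the derivation controlled for: o alone on the left, and both original equations still true.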

From this we can see something new: we have to have available a set of

possible outputs -- actions -- that we can try. Basically, manipulating an

equation into new forms involves doing things to it that don't change the

truth of the equation -- the equal sign must still mean that the two sides

are equal in value. Here we used two such operations: adding the same thing

to both sides, and dividing both sides by the same thing. We could have

done many other things: taking the square root of both sides, taking the

logarithm, raising both sides to the same power, and so on. But each of

those other operations would have resulted in o still appearing on both

sides; we would be no closer to the goal of getting o by itself on the

left. If we had done the division by (1 + k) first, we would have found

o/(1+k) = k*r/(1+k) - k*o/(1+k)

which is more complicated and still leaves an o on the right. The error

would be unchanged by this operation -- or, if you count complexity as part of the error, it would get larger. It's only when we add k*o to both sides that the error gets smaller: o appears only on the left, although it still appears twice.

Notice, however, that if we _then_ add k*o/(1+k) to both sides, we get

o/(1+k) + k*o/(1+k) = k*r/(1+k).

We can then extract the common o on the left:

o*(1/(1+k) + k/(1+k)) = k*r/(1+k),

and the expression on the left in parentheses reduces to (1+k)/(1+k) or 1.

But in getting to the same result, we have made the equation look much more

complicated for a while, and have had to use more complicated

transformations of the expression in parentheses. Even though the second method is mathematically just as valid as the first, it leads to greater error for a while along the way, with the error measured by the complexity of the equations' appearance.
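One toy illustration of such a measure (counting non-space characters is my own stand-in for "complexity of appearance," not any standard metric): score each step of the two derivations and compare.

```python
def complexity(expr):
    """Toy measure of error: number of non-space characters."""
    return sum(1 for ch in expr if not ch.isspace())

# The derivation in the text, adding k*o first:
direct_path = [
    "o = k*(r - o)",
    "o = k*r - k*o",
    "o + k*o = k*r",
    "o*(k + 1) = k*r",
    "o = k*r/(k+1)",
]

# The alternative, dividing by (1+k) first:
divide_first_path = [
    "o = k*(r - o)",
    "o/(1+k) = k*r/(1+k) - k*o/(1+k)",
    "o/(1+k) + k*o/(1+k) = k*r/(1+k)",
    "o*(1/(1+k) + k/(1+k)) = k*r/(1+k)",
    "o = k*r/(k+1)",
]

for path in (direct_path, divide_first_path):
    print([complexity(step) for step in path])
```

By this measure the divide-first path bulges to roughly three times the complexity of the direct path before collapsing to the same answer, which matches the intuition above about error increasing for a while.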

Ideally, we would like the errors to decrease or at least never increase.

Whether this actually happens clearly depends on picking the right

operations in the right sequence -- and of course on defining what we mean

by "error" appropriately. In many mathematical derivations and proofs that

I have seen, there is an enormous increase in complexity between the first

and last statements; I wonder whether it might be possible to find either a

measure of error or a sequence of operations that would keep the error from

ever increasing. Complexity is only one measure; maybe it's not the most

useful one. Would it be possible, through studying mathematical derivations

and proofs, to find measures of error that consistently decrease with each

step of the process? Or at least that never increase? If so, this would be

of enormous help in the teaching of mathematics -- as well as to

mathematicians.

I don't know if any of this is relevant to your project.

Best,

Bill P.