SIMULATION

Session 1: Variables, Functions, and Systems

For purposes of simulation, the world does not consist of

objects, but of variables and functions. Let's talk about that

for a while.

VARIABLES

A variable is something that can vary along a scale. We use names

or temporary symbols like x and y to distinguish one scale from

another, like size, position, rotation rate, force, sweetness,

and sadness. The _identity_ of the variable is the name of the

scale. The _value_ of the variable is its position on the scale.

We use the name of the scale, commonly, to refer both to the

identity of the scale and to the present value of the variable on

that scale. Thus the variable F (for "force") refers to some

number that tells us how much force exists, and at the same time

it reminds us that the number 5 belongs on the force scale, and

not, for example, on the sweetness scale. When we say that F = 5,

we are saying that the magnitude of the variable is 5, and also

that it is the variable named F, for Force, that we are talking

about.

The important difference from common usage is that when we name a

scale, such as sadness, we still have to specify a position on

that scale. So we don't just say a person is "sad." We establish

a maximum value of the variable and a minimum value -- say, 100

and 0 -- and then give a number that says where on the scale the

variable is right now: sadness = 37% of the maximum. If someone

says he is sad, and that the amount of the sadness is 2% of the

maximum value, we don't have to feel very sorry for him, not as

sorry as if he said he was 98% sad. Everybody is sad; it's just

that most people are 0% sad, while others are other amounts of

sad.

When we deal with these scales without the concept of variable

amounts, we make them, in effect, into binary scales. The wind is

blowing or it is not. We are sad or we are not. Something is big

or it is small; it is spinning or stationary; sweet or bland. We

turn variables into things and states, as if sadness were a state

that exists or doesn't exist, or as if the only positions that

something could take were near and far.

The reason we deal with variables and not things or states is

that in simulations we're trying to reproduce the _behavior_ of

something: how it acts through time. In the real world, there are

no instantaneous events; all processes, no matter how rapidly

they occur, take time to happen. Every event has a beginning, a

middle, and an end, if you look at it on a fine enough time

scale. In simulations, we want to reproduce not just the

"occurrence" of events, but the particular way in which variables

change during the event. We need to say not just whether

something is near or far, but where it is at every instant.

This means that all processes that we simulate (other than those

on the subatomic scale) must be defined in terms of continuous

changes in the values of variables. "Continuous" means that when

a variable changes from one value to another, it has to pass

through all the intermediate values. When we "run" a simulation,

the variables we're dealing with all obey this rule: to get from

one value to another, they have to pass through all the

intervening values. Sometimes those intervening values become

important.
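The rule that a variable must pass through all intervening values is exactly what a step-by-step simulation enforces. Here is a minimal sketch (the variable name, speed, and target are invented for illustration) of one variable moving continuously toward a new value, one small time step at a time:

```python
# A sketch of continuity in a simulation: a variable moving from one
# value to another visits every intermediate value on the way, one
# small step per iteration. All numbers here are illustrative.

position = 0.0   # current value of the variable
target = 10.0    # value it is moving toward
dt = 0.1         # size of one time step
speed = 2.0      # rate of change per unit time

trajectory = [position]
while position < target:
    position = min(position + speed * dt, target)
    trajectory.append(position)

# The variable passes through 0.0, 0.2, 0.4, ... on its way to 10.0;
# no intervening value is skipped, and each step is small.
```

Nothing about this loop is specific to position; the same pattern applies to sadness, force, or any other variable on a scale.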

To represent the world in terms of variables is to recognize its

fundamentally continuous nature.

FUNCTIONS

Another basic aspect of simulations is the way we represent the

effects of some variables on other variables. It's a basic

assumption in simulation that the laws of nature are regular and

reliable. Even if the relationships among variables change, they

change for reasons that can be represented as the lawful and

regular effects of other variables. We represent regular and

lawful effects by saying that an effect-variable is a _function_

of a set of causal variables. More on that shortly.

In physics, at least when I studied it, the objects and things in

nature are treated as if they have _properties_. The properties

of something consist of the way some variables associated with it

change when other variables are changed. For example, when the

temperature of a given weight of water is raised toward the

boiling point (temperature is one variable), the volume of water

increases in a certain way (volume is another variable). That

relation of temperature to volume is a property of water. Water

has many properties, each property being a relationship among the

variables that, together, make up what we call water.

Directionality of cause and effect

One famous property of matter is stated in the formula, F = MA.

If a force (a particular value of one variable on a scale called

F) is applied to a piece of matter, the matter's acceleration (a

particular value of a second variable on its own scale, called A)

will be proportional to the force. Both force and acceleration

are observable variables. The constant of proportionality, M, is

not directly observable in the same way. It represents the way in

which a certain amount of matter converts forces into

accelerations. Calling this constant of proportionality "mass"

makes it seem like something having the same kind of measurable

reality that the force and the acceleration have, but

measurements of mass always require us to observe other

variables, like force and acceleration, and _define_ the mass in

terms of them: M = F/A. Mass is not a variable in the same sense

that force and acceleration are; it's a property of matter. For

any given hunk of matter, if we measure both F and A under many

different conditions, we find that F/A is always the same. That's why we

think of it as a property of that piece of matter -- its

mass. And since it's a very reliable property that doesn't

change, we represent it not as a variable, but as a constant: a

variable with only one fixed value.

By applying a force, we can produce an acceleration. But by

applying an acceleration, can we produce a force? This is a trick

question, because if you ask how you produce an acceleration of

an object, you find that you must have applied a force to do so,

the very force you're trying to produce by producing the

acceleration. We can apply a force by stretching a spring or

heating a container of gas, but we can't apply an acceleration

except by applying a force. In terms of causation, therefore, we

should write A = F/M (putting the effect on the left and the

cause on the right, by convention).

This says that "F = MA" is incorrect in one regard: it implies

that we can produce a force by (magically) making an acceleration

appear. Physicists typically ignore this directionality in

equations like this. In fact, they treat F, M, and A as if they

were all equivalently "real", and F = MA as meaning exactly the

same thing as A = F/M, just because algebra lets you transform

the equations that way. Algebra can't reveal causality.

The expression F = MA really means, for the simulator of systems,

that if we have an acceleration A and a mass M, the applied force

_must have been_ F. Writing the equation in this way makes it

into a deduction of a prior cause, F, from observations of its

effect, A. Writing it the other way, A = F/M, is a _prediction_

of an effect, A, from observation of the cause, F. In both cases

we are describing _the same unidirectional relationship_, in

which the causal arrow runs only from F to A, not the other way

around, despite the fact that according to convention, F = MA

implies that A is the cause and F the effect.
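The unidirectional reading is natural to express in code: the cause is a function's input and the effect is its output, while the algebraic inverse serves only as a deduction about what the cause must have been. A small sketch (the mass value here is invented):

```python
M = 2.0  # mass: a constant property of this hypothetical piece of matter

def acceleration(force):
    """Prediction: given the cause F, the effect is A = F/M."""
    return force / M

def inferred_force(accel):
    """Deduction: given an observed effect A, the force must have been M*A."""
    return M * accel

a = acceleration(6.0)   # predicts the effect from the cause
f = inferred_force(a)   # recovers the cause from the effect -- the same
                        # one-way relationship, read in the other direction
```

Both functions describe the single causal arrow from F to A; neither implies that inducing an acceleration by some other means would manufacture a force.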

Another example closer to home may help. In the eye, there are

photoreceptors that respond with a neural signal of a certain

frequency S when light of a certain intensity I is absorbed (the

real relation is more complex, but the point is not affected by

that). If the light has an intensity I and the neural signal

representing light has a frequency S, we can say that

S = k*I,

where the asterisk means "times", and k is a constant number set

to make all measurements of S and I consistent with each other.

By artificially creating a light intensity with a magnitude I,

the equation says, we can produce a neural signal S with a

magnitude equal to k*I.

Clearly, algebra lets us write, just as truthfully as far as

algebra is concerned,

I = S/k.

By the same interpretation, this says that if we artificially

induce a neural signal S, the result will be absorption of light

of intensity I. But that is clearly not what would happen.

Creating light causes a signal to appear, but creating a signal

does not cause light to appear. This is a one-way relationship.

In simulating systems, we must pay attention to the

directionality of the relationship between one variable and

another. In cases where there is true bidirectionality, as in the

relation between the positions of the two ends of a lever, we

represent the relation by two arrows, one going in each

direction. If side B of a lever moves 3 times as much as side A,

then side A moves 1/3 as much as side B. Since we can in fact

push on either end of the lever to move the other end, we can

say, using Y for distance moved,

Yb = 3*Ya, and

Ya = (1/3)*Yb

The direction of causality (that is, which equation to use) has

to be determined in some other way, since this is a truly

bidirectional relationship.
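The lever's genuinely bidirectional relation can be sketched as two separate one-way functions, with the choice between them made outside the equations, by whichever end is actually being pushed (the factor of 3 follows the example above):

```python
# Two one-way arrows for one bidirectional relationship.
# Which function applies is decided by the physical situation --
# by where the push occurs -- not by the algebra.

def move_b_given_a(ya):
    # pushing on side A: side B moves 3 times as far
    return 3.0 * ya

def move_a_given_b(yb):
    # pushing on side B: side A moves 1/3 as far
    return yb / 3.0
```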

The point I'm making here is a very important one that is often

overlooked. One mustn't just blindly manipulate

mathematical relationships and interpret the results according to

the convention that the effect is always the variable on the left

side of the equation. It's necessary to look at the physical

situation and think through the question of what is actually

causing what.

To help keep causation straight, all relationships represented by

arrows in a diagram of a simulation are taken to be

unidirectional in the direction of the arrow. Any truly

bidirectional relationship is shown by two distinct arrows, one

running in each direction between two variables.

Mathematical functions

As I learned the term 50 years ago, a _function_ is a

mathematical expression describing how ONE variable depends on a

SET of other variables (and possibly itself). This is in contrast

to a _relation_, in which a SET of variables depends on a SET of

other variables. A relation can always be represented as a

collection of functions, each producing one output variable as a

particular function of all the "input" variables, with as many

functions as there are "output" variables.

The simplest function is one with a single input variable and a

single output variable, like A = F/M (the causal arrow running to

the left). The two variables are A (the output) and F (the

input), where we say "output" to designate the head end of the

causal arrow.

Suppose that we have applied different amounts of F and have

measured A each time, and that we've written the results in a

list:

 F  -->    A

  5       15
  8       24
  1        3
 17       51
 55      165
 22       66
etc.

Now if we want to predict A from a known amount of F, all we have

to do is haul out the list, look up the value of F, and read off

the value of A. This is called a lookup table. Of course such

tables are usually sorted so the left-hand entries increase from

smallest to largest; this makes the input entry easier to find,

and also allows us to interpolate when there's no value of the

input that exactly matches the one we want to look up. The

simulation program Vensim has a "lookup" function for exactly

this purpose.
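A sorted lookup table with linear interpolation can be written in a few lines. This sketch is an illustrative stand-in, not Vensim's actual implementation; the entries are the F --> A values from the list above:

```python
from bisect import bisect_left

# Observed (input, output) pairs, sorted by the input variable F.
table_f = [1, 5, 8, 17, 22, 55]
table_a = [3, 15, 24, 51, 66, 165]

def lookup(f):
    """Return A for a given F, interpolating linearly between
    the nearest table entries; clamp outside the table's range."""
    i = bisect_left(table_f, f)
    if i == 0:
        return table_a[0]
    if i == len(table_f):
        return table_a[-1]
    f0, f1 = table_f[i - 1], table_f[i]
    a0, a1 = table_a[i - 1], table_a[i]
    return a0 + (a1 - a0) * (f - f0) / (f1 - f0)
```

For an input of 10, which appears nowhere in the table, the interpolation gives a value between the entries for 8 and 17.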

The great advantage of the lookup table as a way to describe a

relationship between two variables is that there is no need for

any mathematics. One could put together an entire simulation in

which all relationships between variables were represented by

lookup tables containing only observed values of variables. Of

course this would get awkward when one variable depended on two,

three, or more other variables; the lookup tables would become

two, three, or higher-dimensional tables requiring large amounts

of storage space. But with disk space costing ten cents per

megabyte or less, that's no big problem any more.

This concept of a simulation without mathematics may surprise

some people. There's a tendency to think that the mathematical

forms that show up in simulations are important in themselves, as

if there were hidden meanings in the variables and their square

roots and cosines and reciprocals and so on. But there are no

hidden meanings. The only purpose of the mathematical forms is to

save us the trouble of creating and then referring to large

numbers of very extensive lookup tables. The mathematical form F

= MA can replace a whole long lookup table like the one above,

provided that we find M ( = F/A) to be the same for every entry

(is it?).
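The parenthetical question is easy to check: compute F/A for every entry in the table and see whether it comes out the same each time. A small sketch using the values above:

```python
# The (F, A) pairs from the table above.
pairs = [(5, 15), (8, 24), (1, 3), (17, 51), (55, 165), (22, 66)]

# Is M = F/A the same for every entry?
ratios = [f / a for f, a in pairs]
constant = all(abs(r - ratios[0]) < 1e-12 for r in ratios)
# Here constant comes out True: every entry gives M = 1/3, so the
# single form F = M*A can replace the whole table.
```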

If you don't like lookup tables, mathematical forms are a way to

APPROXIMATE the actual relationship. In the table above, it

happens that for each value of F, the measured value of

acceleration is 3*F. So by using the mathematical form A = 3*F,

we can generate the right value of A for any value of F, without

needing the lookup table and interpolating between its entries.

In this case the mathematical form gives the exact relationship,

but this will not generally be true. The measurements will be somewhat

uncertain, and nature seldom cooperates by relating its

variables in ways that conform _exactly_ to a simple mathematical

form.

The upshot is that most real simulations will not use more than

one or two lookup tables, and will definitely rely on

mathematical forms to represent real relations among variables.

But going through this side-issue has been useful, I hope, in

conveying what these mathematical forms are FOR. Their only

purpose is to allow one variable to be evaluated given the values

of all the other variables on which it directly depends. The

mathematical forms themselves are only a means of doing this

easily and with reasonable accuracy.

When we use a mathematical form to represent the way one variable

is affected by one or more others, we call this form a

_function_, and say that the one variable is a specific function

of the other variables. In the equation z = 2x + 3y - 27, z is a

function of x and y, with the specific function being 2x + 3y -

27. It is possible that some other set of variables might be

related in a way that is the same except for the names of the

variables: we might find that p = 2r + 3s - 27. In that case we

would say that p is a function of r and s, and z is the _same

function_ of x and y. This can make us suspect that there is some

deeper level at which the variables named x,y, and z are

connected to those named p,r, and s -- the suspicion usually

being incorrect. The fact that gravitational attraction and sound

intensity both follow an inverse-square function of distance does

NOT mean that gravity is like sound.

SYSTEMS

The word "system" has been tossed around quite a lot. It's really

just a convenient label for the collection of variables and

functions we have decided to study. We obviously can't study the

entire universe, so we bite off a small chunk of it, making note

of the connections we broke in doing so, and try to understand

how all the variables related by functions in this chunk will

behave when left to themselves. The broken connections may or may

not be important; usually we find they are important when

ignoring them results in a simulation that behaves differently

from the real world.

To set up a system for simulation, we have to go through it and

pick out all the variables that matter. When we first start, we

probably don't know all the variables that matter, but we can

discover them. The way we discover them is that we start building

the diagram of the system, and find that there are variables

unaccounted for -- that is, they aren't functions of other

variables, they haven't yet been set up as arbitrary constants to

be adjusted by the user of the simulation, and they aren't

outputs of the system. The variables that are either arbitrary

constants or outputs form the boundaries of the system, its

connections to the world it was separated from. The internal

variables that haven't been accounted for simply have to be accounted for.

A variable is either an input to the whole system,

an output from the whole system, or some function of other

variables in the system. There are no variables that are "just

there."

This doesn't take care of _omitted_ variables -- variables we

didn't realize were there in the real system and are important. I

don't know of any formula that will bring them to light. Without

them, the simulation won't behave like the real system, so you

have to find them. If I could tell you a formula for doing this,

I would be richer than Bill Gates, and I wouldn't tell you.

Aside from the input and output boundary variables, the other

variables in the system can be connected in an uncountable

(literally) number of ways. First, any variable can be a function

of one or more other system variables, up to the total number of

variables. And second, the forms of the functions can be anything

imaginable that can be accomplished by the physical system in

question. I believe that this takes us beyond the Aleph-null

degree of infinity.

With a menu of systems that is transfinite in size, there is

obviously no point in studying all possible types of systems. I

don't believe that anybody knows how to enumerate or classify

them, much less characterize them. The only useful approach is to

start with the real system, and find a representation of it that

does it justice. This means that in simulating real systems, we

inevitably must join theory with experiment.

The basic reason for simulating real systems is that we can't

understand the real systems in one whole chunk, unless they're so

simple as to be trivial. But we can take one variable, and by

hook or by crook identify the other system variables on which it

depends. We can write a function that completely (as far as we

know) accounts for the variable in terms of the values and

changes in value of other variables.

This amounts to identifying a subsystem within the whole system

that could be chopped out of the whole system without changing

the way the output variable depends on the input variables. In

fact, its boundaries are the very input and output variables we

have identified. If the rest of the system has ANY other way of

influencing the output variable, we simply add that way to the

list of input variables, and keep doing this until this subsystem

is connected to the rest of the system ONLY through its input

variables and its single output variable.

Having accounted for one system variable as the output of a

subsystem, we then proceed to account for ALL the other system

variables in the same way, except the input and output variables

of the whole system. This results in an analysis of the whole

system into a collection of interacting subsystems, each

subsystem being a function that converts a set of input variables

into the state of one output variable.
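The end result, a collection of blocks, each a function producing one output variable, can be sketched directly in code. The two subsystems below are invented purely for illustration; their forms and coefficients do not come from any real system:

```python
# A hypothetical system analyzed into two subsystems, each a function
# that converts its input variables into one output variable.

def subsystem_1(u, x2):
    # output x1 depends on the system's boundary input u and on x2
    return 0.5 * u + 0.2 * x2

def subsystem_2(x1):
    # output x2 depends only on x1
    return 0.8 * x1

x1, x2 = 0.0, 0.0
u = 1.0  # boundary condition: the whole system's input, held constant
history = []
for step in range(50):
    x1 = subsystem_1(u, x2)
    x2 = subsystem_2(x1)
    history.append((x1, x2))

# Run long enough, this loop settles to a steady state: the two
# equations are simultaneously satisfied at x1 = 0.5/(1 - 0.16).
```

Connecting the blocks so that outputs of some become inputs of others, and then stepping through time, is all a simulation run amounts to.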

If we have done this analysis of the real system correctly, we

should be able to represent each subsystem as a block in a

computer simulation, connect the blocks so that the outputs of

some blocks become inputs to others, set up the input boundary

conditions, run the simulation, and see that the behavior of ALL

THE REAL SYSTEM VARIABLES matches the behavior of ALL THE

SIMULATED SYSTEM VARIABLES. Not only the input and output

variables should match; all the other internal variables in the

real system should match their simulated counterparts, too.

The check against the real system variables (as many as possible)

is a vital part of simulating real systems; it's the reality

check. But that's not the main thing we get out of a simulation.

The main result is that we see the system doing things we never

could have predicted using unaided brain power. If we're lucky,

we also come to understand _how_ the behavior of the system

emerges from its organization, which means we come to

_understand_ the system in ways that were never open to us

before.

Thus endeth the lesson. This has been just a very rapid run-

through of the basic ideas behind simulation. The detailed how-

to-do it comes next, to the extent that I and any others who care

to chime in can convey it. This has been very much my own slant

on the principles of modeling through simulation, and others may

see things differently. While I work up the start of the

applications phases of these lessons, maybe we can talk about the

concepts as described so far.

Bill Powers

[Copyright 1998, by William T. Powers]