What happened to CSGnet?

[From Bill Powers (2009.11.12.0905 MDT)]

Martin Taylor 2009.11.11.23.43 --

Would you check the scaling of the two plots of AP and BR? It occurred to me that the range of the SS# is from 0 to 99, which (omitting the zero) gives a range of logarithms (base 10) from 0 to 2. This means that a log plot of residuals will vary only 2% as much as a linear plot if you use the same scale for both. Does your treatment compensate for that effect?
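The arithmetic behind that 2% figure fits in a couple of lines; a minimal Python check (taking the usable range as 1 to 99, per the parenthetical above):

    # Ratio of log-scale range to linear range for the SS# variable.
    import math
    linear_range = 99 - 1                        # omitting the zero
    log_range = math.log10(99) - math.log10(1)   # just under 2
    print(log_range / linear_range)              # about 0.02, i.e. ~2%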

Best,

Bill P.

[From Rick Marken (2009.11.12.0900)]

Martin Taylor (2009.11.11.23.43)–

MT: To test the degree of control implied by each proposal, I took the ratio between the predicted and actual bids for all 30 data points, and subtracted 1.0 (since with perfect control the ratio would be exactly 1.0).

[Attached image 1618548.jpg: residual error ratios for the AP and BR proposals across SS# bands]

This seems to be a graph showing the goodness of fit (in terms of ratios) of the two models to the data. The AP model fits the data nearly perfectly while the BR model shows a comparatively large discrepancy. I would sure like to know why this difference occurs. If what you did was similar to what I did in my “control of size” test, then the difference should be a result of using different definitions of the CV in the two cases, all else being equal. That’s why I would like to see your spreadsheet; I would like to see how you came up with these results.

Yes, it would. To answer Rick, my proposal is that what is controlled is indeed the affected perceptual signal – the perceived magnitude of the number that is to be made to correspond to the perceived value of the object.

Ah. So you are assuming that each object has a perceived value, v, and that the bid is being made to match that value. So where does SS# come in? Is your hypothesis that the controlled variable is

CV = v - bid/SS#     (1)

or something like that? The SS# must affect the CV in a way that requires a larger bid as SS# increases.

If your model (AP) included an estimate (from the data) of a v value for each object and my model (BR) didn’t, then that would go a long way towards explaining why the AP model fit the data better. To make a fair comparison, v should be included in the BR model as well. Indeed, equation (1) is a better version of the BR model since it takes into account the fact that people do consistently bid different amounts for the different objects.

It’s an ordinary compensatory tracking model with the object’s perceived value as a reference level, but with a logarithmic transform for the perceptual magnitude of the bid number (the cursor).

This sounds good. I hope the value reference was included in both models.

The spreadsheet doesn’t implement the control model as a feedback loop. It simply says “if X is the controlled perception, the output should produce these data”, sidestepping the actual process of control. I just ran it for X = c + k*bid/SS# with c and k optimized for each object separately, and for X = v*scale, with v being the average across subjects for each object individually and scale estimated for each SS# band (meaning I was wrong to say this proposal used only 5 df in the data fit; it actually uses 10 df, the same as for the bid-ratio proposal).

So one controlled variable was X = c + k*bid/SS# and the other was X = v*scale? If so, then why does v show up in only one controlled variable? I would really like to know what you used as the controlled variables for the AP and BR models. I’d appreciate it if you could send your spreadsheet so I can see what you did.
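For anyone who wants to try reproducing the comparison before Martin's spreadsheet arrives, here is a minimal Python sketch of his residual-ratio metric, using the group-mean table quoted later in this thread. The AP-style predictor here (object value = row mean, band scale = column mean over grand mean) is only a guess at what the spreadsheet might do, not Martin's actual calculation:

    # Group-mean bids: rows are objects, columns are SS# bands 00-19..80-99.
    bids = [
        [ 8.64, 11.82, 13.45, 21.18, 26.18],   # trackball
        [16.09, 26.82, 29.27, 34.55, 55.64],   # keyboard
        [12.82, 16.18, 15.82, 19.27, 30.00],   # design book
        [ 9.55, 10.64, 12.45, 13.27, 20.64],   # chocolates
        [ 8.64, 14.45, 12.55, 15.45, 27.91],   # 1998 wine
        [11.73, 22.45, 18.09, 24.55, 37.55],   # 1996 wine
    ]
    n_obj, n_band = 6, 5

    # Guessed AP-style predictor: bid(i, j) ~ v_i * scale_j.
    row_mean = [sum(r) / n_band for r in bids]
    grand = sum(row_mean) / n_obj
    col_scale = [sum(bids[i][j] for i in range(n_obj)) / n_obj / grand
                 for j in range(n_band)]
    pred = [[row_mean[i] * col_scale[j] for j in range(n_band)]
            for i in range(n_obj)]

    # Martin's metric: predicted/actual - 1, SD over all 30 points.
    resid = [pred[i][j] / bids[i][j] - 1
             for i in range(n_obj) for j in range(n_band)]
    m = sum(resid) / len(resid)
    sd = (sum((r - m) ** 2 for r in resid) / len(resid)) ** 0.5
    print(round(sd, 3))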

Thanks.

Best

Rick



Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[From Bill Powers (2009.11.12.0905 MDT)]

From Richard Pfau, 11/09/2009

The instructions for the Ariely experiment:

"[write the] last two digits of your social security number at the top of the [response sheet], and then write them again next to each of the items in the form of a price. [and then] When you are finished with that I want you to indicate on your sheets, with a simple yes or no, whether you would pay that amount for each of the products." Next, they were to write down the minimum they would pay for each of the products.
I believe that last sentence should say "maximum", not "minimum." Were the subjects to say the least they would pay for something? If so, why wasn't that zero for all items? If the word was meant to be minimum, the only way for subjects to make sense of it would be to interpret the instruction to mean "State the smallest amount you would expect to have to pay for the item." The term "bid" has been used, though the above instructions don't mention it. Interpreted that way, the instruction as quoted would mean "what is the least you think you could bid for this item with some chance of getting it for that price?" That's not the same as asking for the smallest amount you would be willing to pay.

We can view the SS# as a price tag put on each item. A control-system
model can be constructed by assuming that there is some price the person
would pay without experiencing any error, and that paying prices higher
than that would have negative effects offsetting the desire to acquire
the item.

We can estimate reference level and slope of the error-output curve using
the first two columns of the data:

                      Range of last two digits of SS number

Products              00-19   20-39   40-59   60-79   80-99   Correlations
Cordless Trackball     8.64   11.82   13.45   21.18   26.18       0.42
Cordless Keyboard     16.09   26.82   29.27   34.55   55.64       0.52
Design Book           12.82   16.18   15.82   19.27   30.00       0.32
Chocolates             9.55   10.64   12.45   13.27   20.64       0.42
1998 Wine              8.64   14.45   12.55   15.45   27.91       0.33
1996 Wine             11.73   22.45   18.09   24.55   37.55       0.33

Consider the cordless keyboard. The person would pay $6.09 more than a price tag of $10, but $3.18 less than a price tag of $30. The slope of a straight line between these points is -$9.27/20 or -0.46. This implies a reference price of $23.14, the price at which the line with this slope crosses the price axis.
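That two-point arithmetic is compact enough to script; a minimal Python sketch, using the cordless-keyboard numbers from the table:

    # Reference price and slope of the error-output line from two
    # (price-tag, mean-bid) pairs -- the keyboard example worked above.
    def ref_and_slope(tag1, bid1, tag2, bid2):
        d1, d2 = bid1 - tag1, bid2 - tag2     # error at each price tag
        slope = (d2 - d1) / (tag2 - tag1)
        ref = tag1 - d1 / slope               # price tag where error is zero
        return ref, slope

    print(ref_and_slope(10, 16.09, 30, 26.82))   # about (23.14, -0.46)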

Here’s my spreadsheet for the gain and reference level values. The
entries are the errors between reference level and price tag for each
item. I haven’t tried to make a model, so this is just something to think
about for now. Gain really just means slope of the line. The ref and gain
were computed using only the first two columns.

[Attached image 19e3ccc.jpg: spreadsheet of reference levels and gains, with the errors between reference level and price tag for each item]

This should really be done for each individual who took part, not with
group averages. Would the raw data still be available?

.xlt spreadsheet attached.

Best,

Bill P.

Areilly.xlt (8.5 KB)


[From Rick Marken (2009.11.13.1620)]

Rick Marken (2009.11.12.0900) to Martin Taylor (2009.11.11.23.43)–

So one controlled variable was: X = c + kbid/SS# and the other was X = vscale? If this is it, then why does v show up in only one controlled variable? I would really like to know what you used for the controlled variables for the AP and BR models? I’d appreciate it if you could send your spreadsheet so I can see what you did.

While I await Martin’s description of the difference between the variables controlled by the AP and BR models, I’ve gone to the trouble of creating my own spreadsheet with the Ariely data and writing up a quick version of my model (which I think is the BR model which Martin found to do so much more poorly than the AP model). I see that Bill did the same. My model still needs work but I would like to get it into a form where I can compare it to the AP model. So I would like to see what your AP model looks like, Martin.

Best regards

Rick



Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[From Bill Powers (2009.11.15/0116 MDT)]

Rick Marken (2009.11.13.1620) –

While I await Martin’s description of the difference between the variables controlled by the AP and BR models, I’ve gone to the trouble of creating my own spreadsheet with the Ariely data and writing up a quick version of my model (which I think is the BR model which Martin found to do so much more poorly than the AP model). I see that Bill did the same. My model still needs work but I would like to get it into a form where I can compare it to the AP model. So I would like to see what your AP model looks like, Martin.

My model, too, needs work – I realized last night that it doesn’t make
any sense the way I wrote it because I forgot the idea I had previously
come up with, which is probably close to yours.

Look at the first column of data. There are three cases (bold-faced)
where the SS# is lower than the average of the binned “bid” by
the population of subjects in that column. In the other 27 cases, the
mean bid number is lower than the mean SS#. Note that until further
notice I’m treating this table of population data as if it applies to
some average individual. Later I’ll correct that.

                      Range of last two digits of SS number

Products              00-19   20-39   40-59   60-79   80-99   Correlations
Cordless Trackball     8.64   11.82   13.45   21.18   26.18       0.42
Cordless Keyboard     16.09   26.82   29.27   34.55   55.64       0.52
Design Book           12.82   16.18   15.82   19.27   30.00       0.32
Chocolates             9.55   10.64   12.45   13.27   20.64       0.42
1998 Wine              8.64   14.45   12.55   15.45   27.91       0.33
1996 Wine             11.73   22.45   18.09   24.55   37.55       0.33

If the “subjects” (i.e. a fictitious average subject) are thinking of the SS# as a price tag, then they clearly would purchase those three items at the asked price because they’re willing to pay more than the asked price. Notice the term “asked price.” This table begins to make sense if we think of it as the first step in each of 30 bargaining sessions. The SS# is the opening “asked” price, which the subject sees before deciding what to do. The “bid”, in all but those three cases, is the amount the subject offers in return to end that round of bargaining.

The three bold-faced bids would never be made; if the offering price is
already lower than the amount the subject would bid, the bargaining would
end there and the transaction would occur unless the bid and asked prices
were sealed before the end of each round. And then there would have to be
some prior agreement about what to do if the asked price is less than the
bid – split the difference or whatever. The bargaining would still end
there.

In the other 27 cases, the item would not be purchased because the bid
price is less than the (assumed) asked price, under this way of
conceptualizing the situation.

If this is seen as a bargaining session, in the 27 “live” cases
the next step would be for the seller to lower the asked price and/or the
bidder to increase the bid. If the bidder does not increase the bid and
the seller does not lower the price there is no deal and the item is not
purchased.

A deal is made when the seller lowers the asked price to or below the
buyer’s reference level for the price. So how do we determine the buyer’s
reference level? It might change on every round, so we need a way to
determine its apparent value after every round. Start by considering the
two differences between bid and asked price (from the first two columns,
in this case)

Asked price    Mean bid price    Difference, bid - asked
   10.00            8.64                 -1.36
   30.00           11.82                -18.18

The question now is, “At what asked price would the difference go to
zero?” Here is a plot of a straight-line relationship between the
asked price A and the difference D between bid and asked, bid - asked,
with an easy geometric solution for the reference level and the
slope:

[Figure: straight line through the two (asked, bid - asked) points, showing the reference level where the line crosses zero, and its slope]

For the first two entries in the first row of data we have

slope = (-18.18 - (-1.36))/(30 - 10) = -16.82/20

      = -0.841

ref = (30*(-1.36) - 10*(-18.18))/(-1.36 + 18.18)

    = (-40.8 + 181.8)/16.82

    = 141/16.82

    = 8.383

Check: diff = -0.841*(asked - 8.383):

asked = 10; diff = -0.841*1.617 = -1.36

asked = 30; diff = -0.841*21.617 = -18.18

Rick, I’m sure you can set this up as a spreadsheet a lot faster than I
can. To get rid of the extra step of computing the difference, use

diff = bid - asked,

so the equation becomes

bid = (k + 1)asked - kref

where k and ref are determined as in the graphic above.

You can do this for any two columns.
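To see the numbers come out, here is the same fit as a short Python sketch, using the trackball pair from the worked example above:

    # Fit k and ref from two (asked, bid) points, then predict bids with
    # bid = (k + 1)*asked - k*ref.  Trackball row, first two columns.
    a1, b1 = 10.0, 8.64
    a2, b2 = 30.0, 11.82
    k = ((b2 - a2) - (b1 - a1)) / (a2 - a1)   # slope of diff vs. asked
    ref = a1 - (b1 - a1) / k                  # asked price where diff = 0
    print(round(k, 3), round(ref, 3))         # -0.841  8.383
    bid = lambda asked: (k + 1) * asked - k * ref
    print(round(bid(10.0), 2), round(bid(30.0), 2))   # 8.64  11.82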

==========================================================================

To set this up as a simulation, clearly you have to have two parties: a
seller and a bidder, both adjusting their respective asked and bid prices
on each round. And just as clearly, you have to treat each individual and
each trial as a separate case; the population bid and asked prices are
meaningless for individuals, they are of interest only to a seller
Runkeling the data to take advantage of the average properties of a
population.

Runkeling: do not go gentle into that good night; rage, rage against the
dying of the light.

This model should be applied to each experimental run, and each person in
the population has to make repeated bids as the asked price is changed.
The reference levels and slopes have to be calculated for each
individual, not for the averaged measures over the population.

The Arielly data is clearly insufficient to allow any conclusions to be
drawn about individuals; it would be like ranking chess teams by having
each member of one team make one move against just one member of the
other team, then trying to evaluate how good the average move was. I
assume that each participant in the experiment made just one bid on each
item against the last two digits of his own SS#.

It would be somewhat interesting to give one person repeated trials while
the asked price is varied randomly. But the most interesting case would
be to have pairs of subjects playing buyer and seller seeing whether they
can reach a deal on various items, and calculating the reference level
and slope for each person after each round. Another situation would be an
auction with a video camera to record successive bids for the same item
by each person. We’d have to work out the equations for that
model.

We can be quite sure of one thing. The SS# does not “leak” into
the bid number, somehow. That was just a stab in the dark. The meaning of
these data depends entirely on how the subject conceived the situation.
This is not a simple cause-effect phenomenon. We have to try out various
models: price tag, bid-asked, whatever else you can come up with. Each
proposal for how the subjects perceived the situation will lead to a
slightly different model. And of the utmost importance, we should NOT use
population data. If we want to say anything about the population, we
should first determine the parameters for each individual, then find
their means and distributions.

Best,

Bill P.

[From Dick Robertson,2009.11.15.1137CST]

[From Bill Powers (2009.11.15/0116 MDT)]
Rick Marken (2009.11.13.1620) –

I’m probably about to ask another of my stupid questions, but what the heck…

We are not told the mean SS# of the subjects in each group in Ariely’s data. Since there are only 55 Ss in all, I would think that the distribution of SS#s in a group could vary far from resembling a normal distribution. Thus, wouldn’t there be the possibility that in the three groups you noted the mean SS# could be above or below the mean bid price? In that case wouldn’t there be the possibility that we can’t know what the mean S would do about buying the object?


[From Bill Powers (2009.11.15.1036 MDT)]

Dick Robertson,2009.11.15.1137CST --

We are not told the mean SS# of the subjects in each group in Ariely's data. Since there are only 55 Ss in all, I would think that the distribution of SS#s in a group could vary far from resembling a normal distribution. Thus, wouldn't there be the possibility that in the three groups you noted the mean SS# could be above or below the mean bid price? In that case wouldn't there be the possibility that we can't know what the mean S would do about buying the object?

Yes. The only way to answer detailed questions like this would be to obtain the raw data for each subject -- but even then, we have only one data point per subject for each item. All the manipulations I show in my post would make sense only when applied to data for a single subject presented with different asked prices. The whole experiment needs to be done over again.

What you say is true for all items in the first column. In the second column there are two more cases where the asking price might be lower than the bid price. For all the others, the bid price is below the lowest possible asking price. But all this is moot because these data are meaningless for use in making a model.

Best,

Bill P.

[From Rick Marken (2009.11.15.1050)]

Bill Powers (2009.11.15/0116 MDT)

My model, too, needs work – I realized last night that it doesn’t make
any sense the way I wrote it because I forgot the idea I had previously
come up with, which is probably close to yours.

This is a great post. And your and Dick Robertson’s observations about the problems created by this being average data (not least because we have no idea about the distribution of SS#s in each bin) are, of course, right on target. I would like to emphasize your conclusion:

We can be quite sure of one thing. The SS# does not “leak” into
the bid number, somehow. That was just a stab in the dark. The meaning of
these data depends entirely on how the subject conceived the situation.
This is not a simple cause-effect phenomenon.

This is the point I was trying to make in an earlier post (and I will try to make it in a paper some day): what the subject does in a conventional experiment depends on what they take their purpose to be (how they “conceive” the situation, in your terminology).

In every experiment (with humans) subjects are given instructions about what their purpose is to be. But these are verbal instructions and what each subject actually does depends on what perceptual variable(s) they end up controlling. This was true in the Schouten experiment; it’s true in the Ariely experiment and I think it can be shown to be true in every psychological experiment ever performed. Indeed, if the subjects don’t adopt a purpose that is somewhat like the one the experimenter wants the subjects to adopt then, as I said, nothing at all will happen in the experiment. For example, if the subject in a simple reaction time experiment (similar to Schouten’s) doesn’t adopt a purpose (control a perception) something like “press bar as soon as possible after light comes on” then, when the light comes on, nothing will happen. People are not automatically caused to press buttons when lights come on; they are not automatically caused to make bids of a certain size when they are shown the names of objects (with the last 2 digits of their SS# written over and under them); etc.

Conventional psychology experiments depend on subjects being purposeful and adopting purposes in experiments that are like the ones the experimenter wants them to adopt. Then, once the data is collected, the experimenter ignores the purposeful nature of the subjects’ behavior and analyzes the data as though it were a cause-effect process. And it seems to work (at least statistically) since the experiment is set up so that in order to achieve their purpose the subjects must act (DV) to protect the controlled perception from disturbances created by the IV (when the light comes on in a reaction time experiment, for example, the CV – pressing a button after the light comes on – is kept under control by the subject pressing the button). It looks like the subjects’ behavior (DV) is caused by the IV but, in fact, this is just the subjects carrying out the instructed purpose: acting (DV) to protect a CV from disturbance (IV). Since every subject either controls a slightly different CV or controls similar CVs slightly differently (such as with different gain), you get variability of behavior across subjects; and thus statistics (my bread and butter) has become the basis of psychological science.

Best

Rick



Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[Martin Taylor 2009.11.16.10.27]

[From Bill Powers (2009.11.11.1424 MDT)]

Martin Taylor 2009.11.11.13.46 –

MT: To test the degree of control implied by each proposal, I took the ratio between the predicted and actual bids for all 30 data points, and subtracted 1.0 (since with perfect control the ratio would be exactly 1.0). The standard deviations of these residual error ratios were 0.152 for BR and 0.097 for AP. Interestingly, although neither explicitly uses the average values over objects at any stage in the analysis, there is a big difference in how well the two proposals fit those averages. This figure shows the deviation of the average residual error ratios across objects for the different SS# bands under the two proposals.

BP: My hat is off to you, Martin. This is a beautiful job of teasing a
startling regularity out of data in which it is well hidden.

[Attached image 1618548.jpg: residual error ratios for the AP and BR proposals across SS# bands]

Bill, I think you should put your hat back on. This has been bugging me
in the back of my mind all weekend, although I haven’t had time to work
on it. I am almost sure that the smoothness of the AP trace has to be a
non-obvious artifact of the algorithm. Why do I believe this? Because
it’s too good to be true. If the standard deviation of the individual
data points is about 0.1, the standard deviation of the mean of 6 of
them should be about 0.1/sqrt(6), or about 0.04, whereas the standard
deviation of the measures plotted is about 0.003. That makes no sense
statistically, and it would be an enormous fluke for these numbers to
come out that way by chance.
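The statistical step here is just the standard error of a mean; as a quick check:

    # SD of a mean of 6 points, each with SD ~0.1 (Martin's argument).
    import math
    print(0.1 / math.sqrt(6))   # ~0.041 -- an order of magnitude above 0.003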

I think that the average values must be getting into their own
calculation in a way that is not yet obvious to me. I’ll get back to
it, perhaps later this week, and report back if I find the problem. I
don’t think the same comment applies to the BR values, the model for
which requires a different and more obvious algorithm.

Rick had asked for the spreadsheet. I had intended to clean it up
before sending it, adding notations as to what leads to what, and
eliminating sections that turned out to be dead-ends, but I haven’t had
time to do that, and now I’d rather wait until I (with luck) have found
the artifact and corrected it.

Martin

[From Bill Powers (2009.11.16.1519 MDT)]

Martin Taylor 2009.11.16.10.27 –

Bill, I think you should put
your hat back on. This has been bugging me in the back of my mind all
weekend, although I haven’t had time to work on it. I am almost sure that
the smoothness of the AP trace has to be a non-obvious artifact of the
algorithm.

Did you see a later post by me about the scaling? The logs of variables
will have a much smaller range than the original variables. If one
variable has a range from 1 to 100 (avoiding zero for obvious reasons),
the log-base-10 will have a range from 0 to 2. I think you have to scale
up the log plot by a factor of 50 to normalize it to the range of the
variable.

I’ve been following Rick’s suggestion and working on a control-system
model. It is going to work. I’m getting a correlation (in spite of what
I’ve been saying about correlations) between predicted and actual values
of the “bid” variable between 0.86 and 0.97. It’s just a linear
equation, but it fits the data very well. Each row has a different gain
and reference level.

The Areilly experiment is basically flawed, in that all the people are
assumed to be organized exactly the same, so the result is supposedly the
equivalent of one person judging each of the six items with five
different values of the “price” or SS# variable. I can’t
swallow that, but it’s possible that because this is a linear model, the
population averages might superimpose to give a virtual controller with
properties that someone like a retail store could rely on, even though it
doesn’t resemble any individual.

I seem to have lost a step or two in my mathematical prowess. Working out
the details has taken two days so far and I still don’t have all the
loose ends tied up. I’ll keep at it until it’s all neat, then write it up
for CSGnet.

Best,

Bill P.

[Martin Taylor 2009.11.17.00.20]

[From Bill Powers (2009.11.16.1519 MDT)]

Martin Taylor 2009.11.16.10.27 –

Bill, I think you should put your hat back on. This has been bugging me in the back of my mind all weekend, although I haven’t had time to work on it. I am almost sure that the smoothness of the AP trace has to be a non-obvious artifact of the algorithm.

Did you see a later post by me about the scaling? The logs of variables will have a much smaller range than the original variables. If one variable has a range from 1 to 100 (avoiding zero for obvious reasons), the log-base-10 will have a range from 0 to 2. I think you have to scale up the log plot by a factor of 50 to normalize it to the range of the variable.

Yes, I was preparing a response to that one when I decided that the
probable artifact had to be sorted out before I did so.

I’ve been following Rick’s suggestion and working on a control-system model. It is going to work. I’m getting a correlation (in spite of what I’ve been saying about correlations) between predicted and actual values of the “bid” variable between 0.86 and 0.97. It’s just a linear equation, but it fits the data very well. Each row has a different gain and reference level.

Yes, I know that the linear model fits pretty well. I included a graph
to show that in an earlier post. The same graph showed that the linear
model has a characteristic and consistent departure from the data
curve. It tends to give values that are too low at the ends of the SS#
scale and too high in the middle. Not by much, which is why you get
good correlations, I suspect.

My intention was to suggest using The Test in the same way Rick did for
the geometry of rectangular shapes, comparing the residual errors for
different proposed controlled variables, all of which have some face
validity. I stand by the calculation of the standard deviations for the
bid-ratio and the scaling proposals, which suggests that the scaling
proposal is appreciably closer. On the other hand, a good part of the
variance associated with the bid-ratio proposal is due to the
characteristic failure to fit the shape of the data curves, so a
bid-ratio proposal that used a scale of perceived magnitude of the SS#
stretched at its ends might fit even better.

The Areilly experiment is basically flawed, in that all the people are assumed to be organized exactly the same, so the result is supposedly the equivalent of one person judging each of the six items with five different values of the “price” or SS# variable. I can’t swallow that, but it’s possible that because this is a linear model, the population averages might superimpose to give a virtual controller with properties that someone like a retail store could rely on, even though it doesn’t resemble any individual.

I think we agree. If I may quote from myself to Rick [Martin Taylor
2009.11.10.12.06]:
Is there something in the reported data that would lead you to guess that any subject is controlling for a relationship of any kind between the SS# and the bid? Each subject has only one SS#, so how could any subject control for such a ratio? All a subject can do is to bid on the six different objects. It’s the relationship among bids across different subjects that leads to the reported correlations. As Bill P. has pointed out many times, it’s quite possible that the correlation for each single subject might be in the opposite sense.

However, we went ahead and analyzed the situation knowing of this
limitation, knowing that the results we get would make sense only under
the conditions you state.

I can’t imagine, however, how one could redo the experiment in a
satisfactory way. It wouldn’t make any sense to get one subject to bid
multiple times on the same object after having been exposed to
different “priming” numbers, because you would never know whether they
were remembering their earlier bid on the object. In other words, you
couldn’t vary the disturbance to look for control.

I seem to have lost a step or two in my mathematical prowess. Working out the details has taken two days so far and I still don’t have all the loose ends tied up. I’ll keep at it until it’s all neat, then write it up for CSGnet.

Maybe you could make a scaling control model as well, and compare the
two proposals as to how well they fit the data by that route?

Martin

[From Bill Powers (2009.11.17.0621 MDT)]

Martin Taylor 2009.11.17.00.20 –

Yes, I know that the linear
model fits pretty well. I included a graph to show that in an earlier
post. The same graph showed that the linear model has a characteristic
and consistent departure from the data curve. It tends to give values
that are too low at the ends of the SS# scale and too high in the middle.
Not by much, which is why you get good correlations, I suspect.

Yes. I’m not speculating about why the correlation gets as low as
0.86 for one of the items, though it’s 0.89 for two other items and
higher than that for the rest. With different individuals generating each
data point for an item, and only five points available for each item
being evaluated, it’s kind of a miracle that the data make any sense at
all.

The control model I’m using is

          R             R = Reference signal
          |
          v
     --->Comp----
    |           |
    |           G       G = gain
    |           |
    |           v
    B<----------O       O = output
    ^                   B = bid
    |
    A                   A = asked (SS#)

The output is not observable since it’s whatever the person does
internally to keep the bid from being completely determined by the asked
price, as in a store where there is no choice. So I have had to use the
controlled variable, B, as the basis for computing gain and reference
level. For really tight control systems this usually exaggerates small
errors and gives a poor fit. Fortunately, the gains are relatively low
here.

The control equations are

(1) B = O + A

(2) O = G(R - B)

Solving for B we get

         A + GR
(3) B = --------
          1 + G

Using two sets of values of B and A from the data for one item (row), we
can calculate G and R:

         A1 - A2
    G = --------- - 1
         B1 - B2

        (G+1)B1 - A1
    R = --------------
              G

These values of G and R are then used to predict all the values of B from
the asking price A for one item, using equation (3). All this is done for
each row of the data. The plots are made by a Delphi program, the Unit
from which is attached (read as a text file).

I would appreciate an independent check that I have done all this right.
A pretty good check is found in the bid predictions, where there is
always an exact match for the two data points used to calculate G and
R.
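For anyone who wants to do that check without running the Delphi unit, here is the same fitting step as a Python sketch. The asked prices 10 through 90 are band midpoints, an assumption consistent with the earlier worked examples:

    # Equations (1)-(3): fit G and R from two points of one row, then
    # predict all five bids.  Keyboard row; midpoint asked prices assumed.
    def fit_GR(A1, B1, A2, B2):
        G = (A1 - A2) / (B1 - B2) - 1
        R = ((G + 1) * B1 - A1) / G
        return G, R

    def predict_bid(A, G, R):
        return (A + G * R) / (1 + G)          # equation (3)

    G, R = fit_GR(10, 16.09, 30, 26.82)
    print(round(G, 3), round(R, 2))           # gain ~0.864, ref ~23.14
    print([round(predict_bid(a, G, R), 2) for a in (10, 30, 50, 70, 90)])

As expected from the fitting procedure, the first two predictions reproduce 16.09 and 26.82 exactly, since those two points were used to calculate G and R.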

Since B = O + A, the real O can be deduced from the data by O = B - A. I
could get higher correlations if I deduced the invisible output from the
data, as bid - SS#. Adding a second correlation (and showing three
decimal places for the correlations instead of two), we get the last
column on the right:

[Attached image 6cf995.jpg: fit table with correlations to three decimal places, including the added model-output vs. (bid - SS#) column]

As expected, the model’s output correlates much better with the real
output – a pity we can’t observe the real output directly, and don’t
even know what it is. At least we can say that the gain and reference
parameters are obtained only from observable data.

Richard Pfau, what are you making of all this? Would it be of interest to
the behavioral economics people?

Best,

Bill P.

AriellyUn.pas (4.08 KB)

[From Richard Pfau (2009.11.17.1332 DST)]

From Bill Powers (2009.11.17.0621 MDT)]

Richard Pfau, what are you making of all this? Would it be of interest to the behavioral economics people?

I can’t say for sure if the CSGNET ideas and work being done with the Dan Ariely data would be of interest to behavioral economics people, since the closest I’ve gotten to behavioral economics is the Ariely book “Predictably Irrational” (2009) that I’m presently reading.

However, I presume that Ariely (Professor of Behavioral Economics at Duke University) would be interested to learn that his data has generated so much thought and discussion on CSGNET. Perhaps someone involved in the modeling efforts could contact him to brief him on what's occurring and offer to share all or some of the more relevant e-mails, past and future, starting with the one that set off our discussions [Dick Robertson, 2009.11.08.1428CDT].

As for myself, I’m assimilating ideas being presented and don’t have more to add at the present time.

With Regards,

Rich Pfau

[From Rick Marken (2009.11.17.1125)]

Bill Powers (2009.11.17.0621 MDT)–

The control model I’m using is

          R             R = Reference signal
          |
          v
     --->Comp----
    |           |
    |           G       G = gain
    |           |
    |           v
    B<----------O       O = output
    ^                   B = bid
    |
    A                   A = asked (SS#)

Very nice! So the observed bid is the controlled variable. Nifty.

I can see why the references would be different for each product but why the gains? It would be nice if we could get a model that would work as well as this but with only the references differing across products. To do that the gain would have to be incorporated in some way into the model. I’ll go off and see if I can think of anything that might work (gosh, now I’m starting to sound like Martin;-). But this is really nice work; I think this will help a lot with my article on the purposeful nature of behavior in conventional psychological experiments.

Best

Rick



Richard S. Marken PhD
rsmarken@gmail.com
www.mindreadings.com

[From Bill Powers (2009.11.17.1240 MDT)]

Rick Marken (2009.11.17.1125) --

I can see why the references would be different for each product but why the gains?

I don't know. But one could guess the gain determines how important it is to bid the correct amount. If it wasn't important at all, the gain would be zero and the composite person would just pay the asking price. The lowest gains are for the trackball and the keyboard, which makes the bid amount depend more on the price tag. Maybe this group just didn't care if they paid the right amount because they weren't going to buy either one or had no idea what they were. They were the book-chocolate-wine crowd. It would really be embarrassing to be caught paying the wrong amount for wine, apparently.

It would be nice if we could get a model that would work as well as this but with only the references differing across products.

If the gain were constant, the data would show that. The fact is that if this is the right model, the gain isn't constant. It could have been constant; I didn't forbid that result. I just accepted whatever the gain turned out to be.

To do that the gain would have to be incorporated in some way into the model.

It is. It's the G in the diagram of the model and in the equations.

Best,

Bill P.

[Martin Taylor 2009.11.17.17.17]

[From Bill Powers (2009.11.17.0621 MDT)]

The control model I'm using is

          R             R = Reference signal
          |
          v
     --->Comp----
    |           |
    |           G       G = gain
    |           |
    |           v
    B<----------O       O = output
    ^                   B = bid
    |
    A                   A = asked (SS#)

The output is not observable since it's whatever the person does internally to keep the bid from being completely determined by the asked price, as in a store where there is no choice. So I have had to use the controlled variable, B, as the basis for computing gain and reference level. For really tight control systems this usually exaggerates small errors and gives a poor fit. Fortunately, the gains are relatively low here.

The control equations are

(1) B = O + A

(2) O = G(R - B)

Solving for B we get

         A + GR
(3) B = --------
          1 + G

Using two sets of values of B and A from the data for one item (row), we can calculate G and R:

              A1 - A2
         G = --------- - 1
              B1 - B2

              (G+1)B1 - A1
         R = --------------
                   G

These values of G and R are then used to predict all the values of B from the asking price A for one item, using equation (3). All this is done for each row of the data. The plots are made by a Delphi program, the Unit from which is attached (read as a text file).

This is a third proposal. That's nice. It differs structurally from both Rick's and mine, which is even nicer.

Rick's controlled variable is (bid number)/(SS#), which should be controlled to the same reference value for each individual object. (Actually, in my Excel modelling, I took the controlled variable to be (bid for SS#=50) + k*(bid number/(50 - SS#)), since using the simple ratio gave wildly wrong results. In that formula, SS# is the mid-number of the range).

My controlled variable is "perceived bid magnitude" with a reference value of "perceived object value". In mine, the disturbance (SS#) is a multiplicative factor on the bid number, making "perceived bid magnitude" be (bid number)*f(SS#). I did not make any assumption about f(), but estimated the scale factor separately for each SS# range.

Your controlled variable is (bid number)+(f(SS#)), which has a reference value of the perceived value of the object. I'm not clear where A comes from, given the data with which we are working. Your wording in the diagram seems to say that it is the SS#, but that doesn't make sense. That's why I labelled it f(SS#).
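To make the contrast concrete, here is a toy Python sketch that evaluates the three candidate controlled variables, as characterized above, on one row of the group data. The band-midpoint SS#s and the identity function standing in for f() are placeholders, not anyone's fitted model; under The Test, the proposal whose candidate CV stays most nearly constant across the row is the strongest:

    # Three candidate CVs on the trackball row (f() taken as identity).
    bids = [8.64, 11.82, 13.45, 21.18, 26.18]
    ss = [10, 30, 50, 70, 90]                      # assumed band midpoints
    ratio = [b / s for b, s in zip(bids, ss)]      # Rick: bid / SS#
    product = [b * s for b, s in zip(bids, ss)]    # Martin: bid * f(SS#)
    total = [b + s for b, s in zip(bids, ss)]      # Bill: bid + f(SS#)
    for name, cv in (("ratio", ratio), ("product", product), ("sum", total)):
        print(name, [round(v, 2) for v in cv])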

Incidentally, at the moment I can't run your programs, since I recently upgraded my PC virtualizer software, and the process seems to have corrupted the virtual PC in which I was doing the programming. If I can't recover it, I'm going to have to start all over again.

Martin

[From Rick Marken (2009.11.17.2140)]

Bill Powers (2009.11.17.1240 MDT)–

Rick Marken (2009.11.17.1125) –

It would be nice if we could get a model that would work as well as this but with only the references differing across products.

If the gain were constant, the data would show that. The fact is that if this is the right model, the gain isn’t constant. It could have been constant; I didn’t forbid that result. I just accepted whatever the gain turned out to be.

Yes. Of course.

To do that the gain would have to be incorporated in some way into the model.

It is. It’s the G in the diagram of the model and in the equations.

Yes, I know. I said it wrong. I meant that perhaps what’s being seen as different gain could be picked up, in a different version of the model, by something else, like the nature of the CV.

I’ll play around with this in a spreadsheet, which, in this case, can do the calculations done by your Pascal model much more easily, I think.

Best

Rick



Richard S. Marken PhD

rsmarken@gmail.com
www.mindreadings.com

[From Bill Powers (2009.11.18.0625 MDT)]

Martin Taylor 2009.11.17.17.17 --

MT: Your controlled variable is (bid number)+(f(SS#)), which has a reference value of the perceived value of the object. I'm not clear where A comes from, given the data with which we are working.

BP: A is the SS#.

Since B = SS# + output in this model, it follows that output = B - SS#.

MT: Your wording in the diagram seems to say that it is the SS#, but that doesn't make sense.

That depends on what interpretation you give to the variables. My interpretation is that the SS# represents an asking price (the instructions said, according to Dick R., that the SS# was to be written "as a price"). The reference signal represents the price the person thinks the item is really worth, which is what he would pay if it were offered at that price. If the asking price is higher than that, the person does some sort of internal calculation like "OK, if they want me to pay more, I'll pay more, but not that much more." The person comes back with an offer for some smaller amount, the "bid", whatever amount strikes a balance between not having the object and not wanting to pay more. How much smaller the amount is depends on the gain. I haven't tried to fill in what the other control systems are that don't want to pay so much.


--------------------------------------------------------------------------

Now I have a really confusing result -- nothing I did seemed to alter those correlations. I put sliders in so I could manually adjust gain and reference, and changing them didn't alter the correlations one bit. I sort of guessed at this result before, when I commented that maybe with a linear control system, the correlations wouldn't depend on the parameters of the model, and that has turned out to be exactly right. The goodness of fit of the model to the real behavior was obviously changing: I could see the numbers for the bid values changing. But the way they changed didn't alter the correlations. Well that's screwy, I thought, not wanting to believe it. I've spent hours looking for the bug in the program, but I guess there isn't one.

But if that's the truth, it follows that correlations don't tell us anything about goodness of fit. They may tell us if we have the right architecture, but not if we have the right parameters. The only way to find the right parameters is to look at the RMS differences between model and real data, which is how all the PCT models have been fit to experimental data, so far. Dumb luck.
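That screwy result is real: in a linear model, changing the gain and reference just applies a positive linear transform to the model's output, and the Pearson correlation is invariant under exactly that transform, while the RMS error is not. A small Python sketch (the model numbers are invented for illustration):

    # Correlation is blind to linear rescaling; RMS error is not.
    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        syy = sum((b - my) ** 2 for b in y)
        return sxy / (sxx * syy) ** 0.5

    def rms(x, y):
        return (sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)) ** 0.5

    data  = [8.64, 11.82, 13.45, 21.18, 26.18]   # trackball row
    model = [9.0, 12.0, 15.0, 18.0, 21.0]        # some linear model's bids
    moved = [2 * m - 5 for m in model]           # different gain/reference
    print(pearson(data, model), pearson(data, moved))   # same r
    print(rms(data, model), rms(data, moved))           # very different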

So now I have to modify the program one more time to calculate (and minimize) the RMS difference between the real and modeled bid numbers. I'm afraid this is going to turn out to be nothing more than a least-squares fit of a straight line to the data, with the control-system model being just one way to get the straight line. If I'm making some dumb mistake here, I hope someone hurries up and tells me what it is, because I'm about to waste another day on this.

This means that saying a control system is at work in the AReilly experiment (do I finally understand the spelling?) is no more justified by the data than saying the SS#, the asking price, affects the bid price directly. We can't observe the output of the proposed control system, so there's no way to prove a control system is there instead of some "priming" phenomenon. But because the control-system model explains the data so well, we can't rule priming in, either.

It would be easy to set up the experiment with a mouse so the bid price was displayed and affected by the mouse and by the asking price. Then we could see the output and that would cinch it. This is a very nice demonstration of the fact that the Test for the Controlled Variable can't be done just by disturbing the controlled variable and finding that it doesn't change as much as it should. You also have to establish WHY it doesn't change -- because the supposed control system produces a measurable output that counteracts the effect of the disturbance. If you can't measure that output, the door is still open to other interpretations.

Best,

Bill P.

[Martin Taylor 2009.11.18.10.36]

[From Bill Powers (2009.11.18.0625 MDT)]

Martin Taylor 2009.11.17.17.17 --

MT: Your controlled variable is (bid number)+(f(SS#)), which has a reference value of the perceived value of the object. I'm not clear where A comes from, given the data with which we are working.

BP: A is the SS#.

Since B = SS# + output in this model, it follows that output = B - SS#.

MT: Your wording in the diagram seems to say that it is the SS#, but that doesn't make sense.

That depends on what interpretation you give to the variables. My interpretation is that the SS# represents an asking price (the instructions said, according to Dick R., that the SS# was to be written "as a price"). The reference signal represents the price the person thinks the item is really worth, which is what he would pay if it were offered at that price. If the asking price is higher than that, the person does some sort of internal calculation like "OK, if they want me to pay more, I'll pay more, but not that much more." The person comes back with an offer for some smaller amount, the "bid", whatever amount strikes a balance between not having the object and not wanting to pay more. How much smaller the amount is depends on the gain. I haven't tried to fill in what the other control systems are that don't want to pay so much.

I guess I didn't read Dick R's original message carefully enough. I've now re-read it after reading the above. I hadn't realized that they started by asking whether the subject would pay the SS# for the object. I was working on the assumption that they simply had to contemplate the SS# and then offer a price for the object. That changes the face validity of the different proposals.

It's interesting that you have a problem with your control analysis that is analogous to my problem with the (probably) artifactual uniformity of the mean error averaged across objects within SS# bands. I have no answer to either, at the moment.

Martin

[From Dick Robertson,2009.11.18.1053,CST]

[From Bill Powers (2009.11.18.0625 MDT)]

BP: A is the SS#.

Since B = SS# + output in this model, it follows that output = B - SS#.

MT: Your wording in the diagram seems to say that it is the SS#, but that doesn't make sense.

That depends on what interpretation you give to the variables. My interpretation is that the SS# represents an asking price (the instructions said, according to Dick R., that the SS# was to be written "as a price"). The reference signal represents the price [...]

This means that saying a control system is at work in the AReilly experiment (do I finally understand the spelling?)

No, it’s Ariely.

Best,

Dick R