[From Bill Powers (2010.07.06.0851 MDT)]
David Goldstein (2010.07.06.04:50 EDT) --
DMG: A subject was asked to judge the length of a line. In one condition, a subject would call out the length of the line in inches to the nearest one-quarter of an inch. In another condition, a subject would move two markers so that the distance between them was the same as the length of the line. In one condition, the target line was the shortest one. In another condition it was the longest one. The "context effect" was the difference in judgement of the target line length made by being the shortest or longest line in the set. A context effect was obtained for all conditions involving verbal estimates, but not for all conditions involving matching. How a person makes the judgement made a difference in the judged length. The context effect is not a pure perceptual experience.
I don't understand. What is the "target line"? Are there two lines? You say "shortest in the set." In what set? Was there a whole bunch of lines? I guess I just have to read the paper.
OK, I've read most of it. One by one, each line in a set of seven horizontal lines with different lengths was presented briefly (for 0.2 second or 1 second) on a projection screen. A subject estimated each length verbally, in inches and quarter inches, in one set of trials, or set the visual distance between two sliding markers to be the same as the length, in another set of trials. Two different sets of lines were used, a long set and a short set, with one medium-length line (called the "trace" line) being of the same length in both sets. The trace line was the longest line in the short set, and the shortest line in the long set. It was always the last one projected, while the other 6 lines were presented in random order.
The judgements of the length of the trace line were used to measure a "context" effect: the trace was found to be estimated as longer when it appeared in the short set (where it was the longest line) than when it appeared in the long set (where it was the shortest line), but only for the verbal estimates.
Eighty subjects were assigned to 8 groups in a 2 x 2 x 2 factorial design.
This enormously complex experiment, with several unnoticed experimental variables that were not investigated, produced practically no information. "Main effects due to mode of response and stimulus duration were not reliable. Only the Context by Mode of Response interaction reached significance." Nothing was said about the presentation of the trace stimulus as the last element in every series, or the fact that the judgement always had to be made about a stimulus that was no longer present, or that in the length-comparison judgement, one stimulus was present but had to be compared with a different one that was being remembered.
The difference between the two methods of judging length was striking in several dimensions. Here are the results for the longer stimulus presentation, 1 second, showing the mean minus standard deviation, mean, and mean plus standard deviation in each case. The actual length of the trace line was 216 mm (8-1/2 inches).
                       Mean - SD    Mean    Mean + SD   (all in mm)
VERBAL    Short set      196.2      206.5     216.8
          Long set       137.1      148.3     159.5
MATCHING  Short set      198.1      204.1     210.3
          Long set       199.1      205.0     210.9
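The spreads and the between-set differences can be read back out of these figures with a little arithmetic. Here is a minimal Python sketch that uses only the numbers quoted in this post; the dictionary layout and the names `summarize` and `context_effect` are my own illustrative choices, not anything from the paper.

```python
# (mean - SD, mean, mean + SD) in mm for the 1-second presentation, as quoted in the post.
TRACE_MM = 216.0  # actual length of the trace line

results = {
    ("verbal", "short"): (196.2, 206.5, 216.8),
    ("verbal", "long"): (137.1, 148.3, 159.5),
    ("matching", "short"): (198.1, 204.1, 210.3),
    ("matching", "long"): (199.1, 205.0, 210.9),
}

def summarize(low, mean, high):
    """Return (approximate SD, underestimation of the trace line), both in mm."""
    sd = (high - low) / 2  # half the distance between mean - SD and mean + SD
    return round(sd, 2), round(TRACE_MM - mean, 1)

def context_effect(mode):
    """Short-set mean minus long-set mean for one response mode, in mm."""
    return round(results[(mode, "short")][1] - results[(mode, "long")][1], 1)

for (mode, line_set), row in results.items():
    sd, under = summarize(*row)
    print(f"{mode:8s} {line_set:5s}  SD ~ {sd:5.2f} mm  underestimate {under:5.1f} mm")

for mode in ("verbal", "matching"):
    print(f"{mode:8s} short-set mean minus long-set mean: {context_effect(mode):+.1f} mm")
```

On these numbers the verbal means differ by about 58 mm between the two sets, while the matching means differ by less than 1 mm.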
Subjects were asked to give their verbal estimates to the nearest quarter-inch, or 6.3 mm, which is just about the size of the standard deviation of the matching data. One wonders if all the data were recorded to the nearest quarter inch and then converted to the nearest tenth of a millimeter (!).
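To make the quarter-inch grid concrete, here is a small Python sketch of what recording to the nearest quarter inch and then converting to millimeters would look like. The function name `to_quarter_inch_mm` and the sample value 8.4 are hypothetical, and nothing here says this is how the data were actually recorded.

```python
MM_PER_INCH = 25.4

def to_quarter_inch_mm(inches):
    """Round a length in inches to the nearest quarter inch; return it in mm."""
    quarters = round(inches * 4)        # whole number of quarter inches
    return quarters * MM_PER_INCH / 4   # convert back to millimeters

step_mm = MM_PER_INCH / 4               # size of one quarter-inch step in mm
print(f"one quarter-inch step = {step_mm:.2f} mm")

# A hypothetical response of 8.4 inches lands on the 8-1/2 inch grid point.
print(f"8.4 in -> {to_quarter_inch_mm(8.4):.1f} mm")
```

Individual responses recorded this way could only fall on multiples of about 6.35 mm, although a mean over several subjects can still come out with tenths of a millimeter.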
The matching measures are so consistent that one has to wonder why they consistently underestimated the length of the trace line by 11 to 12 millimeters on the average, which is 2 standard deviations. A hint might be obtained from the fact that the movable markers used to indicate length in the matching case were 6.4 mm wide. If most subjects used the outer edges of the two markers to indicate length, while the scale (not visible to the subject) was based on the inner edges of the markers, the measurements would be 12.8 millimeters low, just about the observed underestimation. But who knows? There's no report of asking the subjects how they were making the comparisons. Subjects were instructed to use the inner edges, but did they?
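The outer-edge guess is easy to check numerically. This is a rough Python sketch under my own assumptions: both markers are the same width, the hidden scale reads the gap between inner edges, and the subject perceives the trace line accurately but lines up the outer edges. The name `recorded_length` is illustrative only.

```python
MARKER_WIDTH_MM = 6.4   # width of each sliding marker, from the text
TRACE_MM = 216.0        # actual trace-line length

def recorded_length(perceived_mm, marker_width_mm=MARKER_WIDTH_MM):
    """Inner-edge reading when the OUTER edges are set perceived_mm apart.

    Assumption: two markers of equal width, scale measured between inner edges.
    """
    return perceived_mm - 2 * marker_width_mm

predicted = recorded_length(TRACE_MM)               # reading if perception were exact
observed = {"short set": 204.1, "long set": 205.0}  # matching means quoted in the post

print(f"predicted reading: {predicted:.1f} mm")
for name, mean in observed.items():
    print(f"{name}: observed {mean:.1f} mm, {mean - predicted:+.1f} mm from prediction")
```

The predicted reading of 203.2 mm falls within a millimeter or two of both matching means, which is all the hypothesis claims; it does not show that subjects actually used the outer edges.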
On the other hand, the verbal estimates are also consistently low, and in both kinds of estimates the estimates are lower when the stimulus is presented more briefly. There was no measurement of the elapsed time between the presentation and the time the estimate was given, so the rate of decay of the remembered line length is unknown. The presentations were always given at 15-second intervals; no investigation was made of the effect of shortening or lengthening the intervals, or of varying them at random.
For an undergraduate study this is probably the best your brother could have done, given the orientation of his teachers. But it is a meaningless experiment. Even Fig. 1 is contaminated by the presentation once known as the Time Magazine chart: Time Magazine used to present charts showing huge-looking changes in variables like rainfall, with headlines saying "drought threatens!", with a vertical scale that ran from 10 inches of rainfall at the bottom to 10.5 inches at the top. At least your brother's chart has a couple of marks on the vertical scale (which runs from 150 to 200) reminding the reader that the zero of the vertical axis is about where Table 1 starts near the bottom of the page. These effects are not very big.
Also, as you mention, the results are group averages, so they say nothing about any individual's characteristics. If you want to use groups of ten people to make estimates of lengths, this experiment can tell you some useful things about one of the conditions. The main message is, "Don't use verbal estimates to determine lengths." But since we only use them when accuracy isn't important, and use rulers when it is, we haven't learned much.
Relevance to the current thread is minimal, since no attempt was made to determine the relationship between actual lengths and estimated lengths.
Best,
Bill