Test of compression method

[From Bill Powers (940418.1155 MDT)]

Martin Taylor (9404xx and previous) --

OK, I couldn't stand it so I reloaded the disturbance tables and did
the test. The amount of compression achieved by sending the
difference between real and model handles instead of just the real
handle positions is not as dramatic as I had thought it would be.

                  not          handle       difference   improve-
                  compressed   compressed   compressed   ment

Diff 7: 0.8 Hz          7362         4132         3587        13%
Diff 3: 0.2 Hz          7362         2961         2177        27%

All files include the header of 162 bytes.

I checked that the reconstruction error is exactly 0.
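(For anyone who wants to try the same comparison, here is a minimal sketch
of the procedure, assuming the real-handle and model-handle tracks are
available as 16-bit integer arrays and using Python's zlib purely as a
stand-in for whatever compressor was actually used; the array names, sizes,
and values below are made up for illustration.)

    import zlib
    import numpy as np

    def compressed_size(samples):
        """Bytes needed for the zlib-compressed 16-bit sample stream."""
        return len(zlib.compress(samples.astype(np.int16).tobytes()))

    # Made-up stand-ins for the real-handle and model-handle traces.
    rng = np.random.default_rng(0)
    model = np.cumsum(rng.integers(-5, 6, size=1800)).astype(np.int16)
    handle = model + rng.integers(-3, 4, size=1800).astype(np.int16)

    handle_size = compressed_size(handle)        # compress the handle alone
    diff = (handle - model).astype(np.int16)
    diff_size = compressed_size(diff)            # compress handle - model

    # Lossless check: the handle track is rebuilt exactly from model + diff.
    assert np.array_equal(model + diff, handle)

    print(handle_size, diff_size)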

Best,

Bill P.

<Martin Taylor 940418 15:30>

Bill Powers (940418.1155 MDT)

The amount of compression achieved by sending the
difference between real and model handles instead of just the real
handle positions is not as dramatic as I had thought it would be.

That may well relate to the compression algorithm, to the model, or to the
system noise. All three matter, but we can't tell directly which is the
most important.

How well a compression algorithm does depends on the kinds of redundancy
its designers anticipated, that is, on which patterns it looks for. With
waveforms, the most effective compression algorithms are sometimes wildly
different from those that work best with text.

One thing that makes a big difference is the ability to reduce the size
of the numbers from two bytes to one. If the compression algorithm starts
out by looking at sequences of bytes (as a text-compression algorithm is
likely to do), then a string of byte pairs 001 135 001 135 ... will not
be compressed well, even though every pair represents the same number.
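To make the 001 135 example concrete, here is what a byte-oriented
compressor actually sees when one 16-bit number repeats; the big-endian
layout and the value 391 are assumptions chosen only for illustration.

    import struct

    # The same 16-bit value, 391 (0x0187), repeated four times, big-endian.
    stream = struct.pack(">4H", 391, 391, 391, 391)
    print(list(stream))    # [1, 135, 1, 135, 1, 135, 1, 135]
    # A byte-oriented compressor sees the alternating pair 001 135,
    # not one repeated number.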

Waveform compression algorithms may work quite differently from text
compression algorithms. For one thing, they have to work with numbers.
They could start by reducing the data into principal component form,
or by differentiating it, or both (using principal components on the
values and the derivatives simultaneously). Or they might use the new
fractal decomposition techniques. But they wouldn't use the sequential
statistics of English! With waveforms, the peak sample-to-sample difference
is likely to be much smaller than the peak values of the waveform, and
one might get a two-byte to one-byte compression simply by presenting
a starting value and coding the rest as successive differences before
compression.
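Here is a minimal sketch of that starting-value-plus-differences idea,
assuming 16-bit samples whose sample-to-sample changes fit in a signed
byte, and again using zlib only as a convenient stand-in compressor; the
waveform itself is invented for the example.

    import zlib
    import numpy as np

    def delta_encode(samples):
        """First sample as two bytes, then signed one-byte successive
        differences. Assumes every change fits in a signed byte."""
        diffs = np.diff(samples.astype(np.int32))
        assert np.abs(diffs).max() <= 127
        return samples[:1].astype(np.int16).tobytes() + \
               diffs.astype(np.int8).tobytes()

    def delta_decode(blob):
        start = np.frombuffer(blob[:2], dtype=np.int16).astype(np.int32)
        diffs = np.frombuffer(blob[2:], dtype=np.int8).astype(np.int32)
        return np.concatenate([start, start + np.cumsum(diffs)]).astype(np.int16)

    # Made-up smooth 16-bit waveform.
    t = np.arange(1800)
    wave = (1000.0 * np.sin(t / 50.0)).astype(np.int16)

    packed = delta_encode(wave)
    assert np.array_equal(delta_decode(packed), wave)    # exact round trip

    print(len(zlib.compress(wave.tobytes())),    # two bytes per sample
          len(zlib.compress(packed)))            # one byte per difference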

However, if the compression algorithm isn't the reason for the small
improvement, there are two other possibilities: system noise, or model
failure. If the system noise is truly Gaussian, there's nothing much
any compression algorithm can do. But if there is some improvement in
the model that would reduce the noise level appreciably, it would be
worth spending quite a few bytes of model description to get it.
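A quick way to see why truly Gaussian noise leaves a compressor little to
work with is to compress a buffer of quantized Gaussian samples and compare
sizes; the standard deviation and buffer length below are arbitrary.

    import zlib
    import numpy as np

    # Made-up residual track: Gaussian noise quantized to one byte per sample.
    rng = np.random.default_rng(1)
    noise = np.clip(rng.normal(0.0, 30.0, size=4096), -128, 127).astype(np.int8)

    raw = noise.tobytes()
    print(len(raw), len(zlib.compress(raw)))
    # The compressed stream stays close to the original size: apart from the
    # slightly non-uniform amplitude histogram, there is no structure to exploit.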

So, don't be discouraged by a reduction of only 13% in the amount of data
needed after compression. There are lots of reasons why this might happen
that do not reflect on the value of the underlying concept.

Martin