3.3 Version 3 resource storage
by Lance Ewing
<be@ihug.co.nz>
Last updated: 27 January 1998
AGIv3 stores resources in a slightly different way from AGIv2.
The first significant difference is in the length of the resource
header, which is now seven bytes long.
 ________ ________ ________ _________________ _________________
|        |        |        |                 |                 |
|  0x12  |  0x34  | VOLNum |   UncompSize    |    CompSize     |
|________|________|________|_________________|_________________|
VOLNum:     Bits 0-3 = VOL file number.
            Bit 7 = this resource is a PICTURE.
UncompSize: Uncompressed resource size [LO-HI]
CompSize:   Compressed resource size [LO-HI]
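As an illustration, the header layout above could be unpacked
along the following lines in Python. This is only a sketch: the
names V3Header and parse_v3_header are made up for this example
and do not come from any interpreter source.

```python
import struct
from typing import NamedTuple


class V3Header(NamedTuple):
    vol: int           # VOL file number (bits 0-3 of the third byte)
    is_picture: bool   # bit 7 of the third byte
    uncomp_size: int   # uncompressed resource size in bytes
    comp_size: int     # compressed (stored) resource size in bytes


def parse_v3_header(data: bytes) -> V3Header:
    """Decode the seven-byte AGIv3 resource header at the start of data."""
    if len(data) < 7:
        raise ValueError("need at least 7 bytes for an AGIv3 header")
    if data[0] != 0x12 or data[1] != 0x34:
        raise ValueError("missing 0x12 0x34 signature")
    # One byte for VOLNum, then two little-endian (LO-HI) 16-bit sizes.
    volnum, uncomp_size, comp_size = struct.unpack_from("<BHH", data, 2)
    return V3Header(volnum & 0x0F, bool(volnum & 0x80), uncomp_size, comp_size)
```

For example, the bytes 12 34 83 10 02 00 01 would describe a
PICTURE in VOL.3 with an uncompressed size of 0x0210 and a
compressed size of 0x0100.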
Instead of the single resource size used in AGIv2, there are now
two sizes. Most of the resources in AGIv3 games are compressed
with a form of LZW, but some are not. The interpreter determines
whether a resource is compressed by comparing the two sizes given
in the header. If they are equal, the resource is stored
uncompressed. If the sizes do not match, however, this does not
necessarily mean that the resource is compressed with LZW. A
PICTURE resource is stored with its own limited form of
compression. This is why the top bit of the third byte in the
header is used to tell the interpreter that the resource is a
PICTURE; otherwise it would assume that the resource was
compressed with LZW.
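The decision described above can be sketched as a small Python
function. The function name and the returned strings are
hypothetical, and the order of the two checks simply follows the
order in which they are described here; the text does not say
which one the interpreter actually performs first.

```python
def storage_method(is_picture: bool, uncomp_size: int, comp_size: int) -> str:
    """Guess how an AGIv3 resource is stored from its header fields."""
    if uncomp_size == comp_size:
        # Equal sizes: the data is stored as-is.
        return "uncompressed"
    if is_picture:
        # Top bit of the third header byte set: 4-bit colour packing.
        return "picture"
    # Sizes differ and it is not a PICTURE: adaptive LZW.
    return "lzw"
```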
As far as I can tell, none of the PICTUREs are compressed with
LZW, although this may well be possible. It could also be
possible for a PICTURE to be stored totally uncompressed (i.e.
without the PICTURE compression method), but I haven't seen an
example of either case.
LZW COMPRESSION
The compression used with version 3 games is an adaptive form
of LZW. The LZW algorithm is not explained here, but it basically
compresses data by representing previously seen strings by single
codes. When one of these strings is encountered again, its code
can be stored instead. There are plenty of places on the net
where you can find a description of the LZW algorithm if you are
not familiar with it. The following information states how the
AGIv3 algorithm differs from the standard LZW algorithm.
AGIv3 starts by using 9-bit codes and, when the code space is
full, progresses to 10 bits and so on. As with normal LZW, codes
0-255 represent the single byte values 0-255 themselves. The
next two codes have a special meaning:
- 256 is used as a start over code. The table is cleared, the
number of bits set back to 9, and the process begins again with
the next code being 258.
- 257 tells the interpreter that it has reached the end of the
resource.
Code 256 seems to be the first code stored in all compressed
resources. This is probably just to make sure everything is
initialized before the process begins. As mentioned above, the
first code used for the LZW table itself is code 258. From there
it stores pairs of prefix codes and appended characters for each
table entry until it reaches code 512, at which stage it switches
to storing the codes using 10 bits, and then 11, and so on. It
appears that it will never get to 12 bits, because code 256
always seems to turn up just before it needs to switch up to 12
bits, i.e. when code 2048 is required. Carl Muckenhoupt's decrypt
routine for SCI games specifically prevents it from switching to
12 bits anyway. Whether there is ever a case where code 256 does
not intervene has not yet been determined.
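To make the description above more concrete, here is a minimal
decoder sketch in Python. It is not taken from any interpreter
and rests on several assumptions that the text does not settle:
codes are assumed to be packed least significant bit first, the
code width is assumed to grow as soon as the next free table
entry no longer fits in the current width, and the width is
capped at 11 bits. The name lzw_expand and its parameters are
made up for this example.

```python
def lzw_expand(data: bytes, out_len: int) -> bytes:
    """Expand AGIv3-style LZW data into at most out_len bytes (sketch)."""
    START_BITS, MAX_BITS = 9, 11        # assumption: 12-bit codes never occur
    RESET, END, FIRST = 256, 257, 258   # special codes and first table code

    pos = bitbuf = bitcnt = 0
    width = START_BITS

    def read_code() -> int:
        # Assumption: codes are packed starting at the low bit of each byte.
        nonlocal pos, bitbuf, bitcnt
        while bitcnt < width:
            if pos >= len(data):
                return END              # ran out of input: treat as end
            bitbuf |= data[pos] << bitcnt
            pos += 1
            bitcnt += 8
        code = bitbuf & ((1 << width) - 1)
        bitbuf >>= width
        bitcnt -= width
        return code

    table = {}          # code -> byte string, for codes 258 and up
    next_code = FIRST
    prev = None         # string produced by the previous code
    out = bytearray()

    while len(out) < out_len:
        code = read_code()
        if code == END:
            break
        if code == RESET:
            # Start over: clear the table and go back to 9-bit codes.
            table.clear()
            next_code, width, prev = FIRST, START_BITS, None
            continue
        if code < 256:
            entry = bytes([code])
        elif code in table:
            entry = table[code]
        elif prev is not None and code == next_code:
            # Usual LZW special case: the code is not in the table yet.
            entry = prev + prev[:1]
        else:
            raise ValueError("invalid LZW code %d" % code)
        out += entry
        if prev is not None:
            # New entry = previous string + first byte of the current one.
            table[next_code] = prev + entry[:1]
            next_code += 1
            # Assumption: widen once the next free code needs another bit.
            if next_code >= (1 << width) and width < MAX_BITS:
                width += 1
        prev = entry
    return bytes(out[:out_len])
```

Under the bit order assumed here, the codes 256, 'A', 'B', 258,
257 pack into the six bytes 00 83 08 11 18 10, and
lzw_expand(bytes.fromhex("008308111810"), 4) gives b"ABAB".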
Note: I should point out that Carl and I both arrived at the
above algorithm independently, which confirms that the
compression used in the early SCI games was identical to that
used in AGIv3.
PICTURE COMPRESSION
Pictures in AGI version 3 use a simple form of compression to
shrink their size by a tiny amount. The interpreter coders
evidently recognised that four bits were being wasted in the
argument to picture codes 0xF0 and 0xF2. These are the two codes
that change the visual and the priority colour respectively.
Since there are only 16 colours, a whole byte need not be set
aside for storing the colour. All the picture compression does is
store these colours in 4 bits rather than 8, so everything that
follows is shifted by half a byte until the next colour is
stored.
Example:
Original picture codes:   F0 06 F8 12 45 F0 07 F2 05 F8 14 67 ...
Compressed picture codes: F0 6F 81 24 5F 07 F2 5F 81 46 7 ...
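The packing shown in the example can be sketched in Python as a
pair of functions working on a stream of 4-bit nibbles. Both
function names are made up for this example. The sketch assumes
that every 0xF0 or 0xF2 byte in the data is a colour code, and
that a trailing half byte is padded with a zero nibble; neither
point is stated in the text above.

```python
def compress_picture(raw: bytes) -> bytes:
    """Pack the colour after each 0xF0/0xF2 code into a single nibble."""
    nibbles = []
    i = 0
    while i < len(raw):
        b = raw[i]
        i += 1
        nibbles += [b >> 4, b & 0x0F]
        if b in (0xF0, 0xF2) and i < len(raw):
            nibbles.append(raw[i] & 0x0F)   # colour needs only 4 bits
            i += 1
    if len(nibbles) % 2:
        nibbles.append(0)                   # assumption: pad with zero
    return bytes(
        (nibbles[j] << 4) | nibbles[j + 1] for j in range(0, len(nibbles), 2)
    )


def expand_picture(comp: bytes) -> bytes:
    """Undo compress_picture, giving each colour a whole byte again."""
    nibbles = []
    for b in comp:
        nibbles += [b >> 4, b & 0x0F]
    out = bytearray()
    i = 0
    while i + 1 < len(nibbles):
        b = (nibbles[i] << 4) | nibbles[i + 1]
        i += 2
        out.append(b)
        if b in (0xF0, 0xF2) and i < len(nibbles):
            out.append(nibbles[i])          # 4-bit colour back to a byte
            i += 1
    return bytes(out)
```

Applied to the twelve original bytes of the example,
compress_picture returns F0 6F 81 24 5F 07 F2 5F 81 46 70 (the
final 0 being the assumed padding), and expand_picture turns
those eleven bytes back into the original twelve.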