
L. Spiro Image (.LSI) Format

Overview

On low-end devices such as iPhone and Android phones, small files are essential.  On high-end devices with large disc capacity, file size at first seems not to matter much, but L. Spiro Engine targets next-generation machines, and it is reasonable to assume that developers will put a substantial amount of resources into their games, so small files are once again essential.  In addition, a lossless format is good for general texture quality and essential for certain types of images, such as normal maps.  With this in mind I developed a proprietary image format called L. Spiro Image (.LSI).

 

Compression

An average .LSI file is ~50% of the size of the same image in .PNG format.  Cartoon images average below 5% of the size of the same .PNG image (I have gotten 0.05% in some cases), and photographic images are ~100% of the size of the same .PNG image, and can be slightly larger.  All examples here assume 32-bit R8G8B8A8 format (lossless, full alpha).

.LSI files are designed to be used with games, so .LSI also supports (8-bit)[A8, R3G3B2], (16-bit)[R5G6B5, R4G4B4A4, R5G5B5A1], (24-bit)[R8G8B8], (64-bit)[R16G16B16A16], (64-bit floating-point)[R16G16B16A16F], and (128-bit floating-point)[R32G32B32A32F].  8-bit, 16-bit, and 24-bit images are always smaller than the same .PNG image, averaging ~2% of the .PNG size.

There are 4 types of compression routines used by .LSI.  It compresses its image data with each type and then selects the smallest result.
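The "try everything, keep the smallest" step can be sketched in a few lines of Python.  This is only an illustration, not the actual .LSI code: the function name pick_smallest and the two stand-in encoders are hypothetical, and zlib is used purely as a placeholder for the customized LZW routine described below.

```python
import zlib
from typing import Callable, Dict, Tuple


def pick_smallest(pixels: bytes,
                  encoders: Dict[str, Callable[[bytes], bytes]]) -> Tuple[str, bytes]:
    """Run every encoder on the same pixel data and keep the shortest output."""
    results = {name: encode(pixels) for name, encode in encoders.items()}
    best = min(results, key=lambda name: len(results[name]))
    return best, results[best]


# Stand-in encoders (hypothetical). zlib is a placeholder, not the .LSI LZW variant.
encoders = {
    "raw": lambda data: zlib.compress(data),
    "stored": lambda data: data,
}

# 1,024 zero bytes compress very well, so the "raw" encoder wins here.
name, payload = pick_smallest(bytes(1024), encoders)
```

A decoder would also need to know which routine won, so a real file would presumably record that choice somewhere; how .LSI does so is not described here.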

 

#1: Index Compression

This is usually the one that results in the smallest file.  A color table is built containing every unique color in the image.  It can be of any size, and naturally grows with the number of colors in the image.  Once the table is built, the size of each index into that table is determined by the size of the table.  For example, if the table size is 256, 8 bits are needed to access any entry in the table.  Likewise, a table of 2,536 entries requires 12 bits per index.  The index table is then built using that number of bits per index.
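As a rough Python sketch of the table-and-index idea: the helper names below are hypothetical, and the first-seen color order and MSB-first bit packing are assumptions, since the actual .LSI layout is not specified here.

```python
from typing import Hashable, List, Sequence, Tuple


def bits_per_index(table_size: int) -> int:
    """Smallest number of bits able to address every entry in the table."""
    return max(1, (table_size - 1).bit_length())


def build_palette(pixels: Sequence[Hashable]) -> Tuple[list, List[int]]:
    """Return (color table, indices). Colors are kept in first-seen order (assumption)."""
    lookup = {}
    indices = []
    for color in pixels:
        if color not in lookup:
            lookup[color] = len(lookup)
        indices.append(lookup[color])
    return list(lookup), indices


def pack_indices(indices: Sequence[int], bits: int) -> bytes:
    """Pack indices MSB-first, padding the last byte with zero bits (assumption)."""
    acc = 0
    for index in indices:
        acc = (acc << bits) | index
    total = len(indices) * bits
    pad = (-total) % 8
    return (acc << pad).to_bytes((total + pad) // 8, "big")


pixels = [(255, 0, 0, 255), (0, 0, 255, 255), (255, 0, 0, 255)]
table, indices = build_palette(pixels)
packed = pack_indices(indices, bits_per_index(len(table)))
```

With the sizes mentioned above, bits_per_index(256) gives 8 and bits_per_index(2536) gives 12.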

Finally, the color table and the index table are each compressed using a customized LZW algorithm, reducing both by over half.  The result is stunning on cartoon images, where the color table is likely to be below 64 entries and each index only 6 bits.  The indexing alone reduces a 32-bit image to just over 18.75% of its original size (6 bits in place of 32, plus the color table), and the additional LZW compression, which works well on the repeating data that is common in cartoon images, takes the image down to ~2% of its original size.  I have had some images reach 0.05% of their original sizes, and once again this is 32-bit lossless.
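For readers unfamiliar with LZW, here is a textbook version in Python.  It is not the customized variant used by .LSI (whose modifications are not described in this post), and it returns a plain list of integer codes rather than a packed bit stream.

```python
from typing import List


def lzw_compress(data: bytes) -> List[int]:
    """Textbook LZW: emit one code per longest already-seen byte sequence."""
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the match
        else:
            codes.append(table[current])   # emit the longest known match
            table[candidate] = len(table)  # learn the new sequence
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the same table while decoding, so no table needs to be stored."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):           # code refers to the entry being built
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


codes = lzw_compress(b"ABABAB")  # [65, 66, 256, 256]
```

Six input bytes become four codes here, and the saving grows quickly as the repetition gets longer, which is why flat-colored cartoon images do so well.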

 

#2: RLE Compression

This is a standard run-length encoding scheme, except that each channel is RLE-compressed individually.  This is because the individual channels (red, green, blue, and alpha) across an image change less frequently than the actual colors do.  Each channel is then compressed with my modified LZW compression routine and packed together one after the other.  For photographic images, this actually makes the file size marginally larger after compression, and it usually does not beat #1.
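A small Python sketch of per-channel RLE follows.  The (count, value) byte-pair layout and the 255 run limit are assumptions made for illustration, and the follow-up LZW pass is omitted.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode as (run length, value) byte pairs; runs are capped at 255 (assumption)."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Expand (run length, value) pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)


# Three RGBA pixels: red and alpha are constant, green changes every pixel.
pixels = bytes([10, 0, 0, 255,  10, 1, 0, 255,  10, 2, 0, 255])
red = rle_encode(pixels[0::4])    # 3 bytes in, 2 bytes out
green = rle_encode(pixels[1::4])  # 3 bytes in, 6 bytes out
```

The green channel shows why photographic data can grow: when almost every value differs from its neighbor, each one costs two bytes instead of one.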

 

#3: Channel Compression

This is designed to take advantage of the fact that the LZW compression algorithm is extremely effective on repeating data.  As mentioned above, the individual channels in an image change less frequently than the actual colors.  Although the colors may be changing as you move across a row in the image, the red channel, for instance, may not be changing at all; the changing colors are only due to the green and blue channels.

This compression strips each channel individually, storing all of the reds, greens, blues, and alphas in separate buffers, which are each then compressed with my modified LZW routine.  This can in some cases compete with #1.
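Splitting an interleaved buffer into per-channel buffers, and putting it back together, might look like this in Python.  The function names are hypothetical and the LZW pass over each buffer is left out.

```python
from typing import List


def deinterleave(pixels: bytes, channels: int = 4) -> List[bytes]:
    """Split RGBARGBA... into one buffer per channel: RR..., GG..., BB..., AA..."""
    return [bytes(pixels[c::channels]) for c in range(channels)]


def interleave(planes: List[bytes]) -> bytes:
    """Inverse of deinterleave; all planes must have the same length."""
    count = len(planes)
    out = bytearray(len(planes[0]) * count)
    for c, plane in enumerate(planes):
        out[c::count] = plane
    return bytes(out)


# Two RGBA pixels whose colors differ, yet red and alpha never change.
pixels = bytes([200, 1, 2, 255,  200, 3, 4, 255])
planes = deinterleave(pixels)
```

After the split, the red and alpha buffers are runs of a single value, which is exactly the kind of repetition a dictionary coder such as LZW handles well.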

 

#4: Raw Compression

The entire image buffer is compressed using my modified LZW routine.  This sometimes wins in photographic images.

 

Lossy Compression

The above routines are all lossless, and even with 8 bits per channel almost always beat .PNG in file size.  If lossy images are acceptable, a new layer of compression is applied over the lossless ones that guarantees a file size less than half that of .PNG (usually ~2%).

First, a histogram of the 32-bit (or 24-bit) image is made for each channel.  Usually only a small range of each channel is used.  The little-used high and low values are clipped, and the range with the largest number of values is expanded to fill the gap.  For example, suppose we have an 8-bit red channel, with most of the red values being from 32 to 189.  We want to use this range of values to fill all 8 bits, so we move that range down by 32, giving us a new range of values from 0 to 157.  Then we scale this range up to 255 by multiplying each value by 1.6242 (255 / 157).  The values that were spaced evenly between 0 and 157 are now spaced evenly between 0 and 255.

This is done for each channel.  The resulting data is then converted to one of the 16-bit formats.  If there is no alpha, R5G6B5 is used.  If there is a wide range of alpha values, R4G4B4A4 is used.  Otherwise R5G5B5A1 is used.  After this conversion, the resulting image data is fed into the above-mentioned lossless routines in order to further decrease the file size.
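The format choice can be sketched as follows in Python.  The thresholds are assumptions: "no alpha" is taken to mean every alpha value is 255 (or there is no alpha channel at all), and "a wide range" is taken to mean anything that does not fit in a single on/off bit.  The bit layout in pack_r5g6b5 is the conventional one for that format name, not something confirmed for .LSI.

```python
from typing import Iterable


def choose_16bit_format(alphas: Iterable[int]) -> str:
    """Pick a 16-bit target format from the 8-bit alpha values in use."""
    distinct = set(alphas)
    if distinct <= {255}:       # fully opaque, or no alpha channel at all
        return "R5G6B5"
    if distinct <= {0, 255}:    # alpha is only on/off, so 1 bit is enough
        return "R5G5B5A1"
    return "R4G4B4A4"           # intermediate alpha values are present


def pack_r5g6b5(r: int, g: int, b: int) -> int:
    """Pack 8-bit channels into 16 bits by keeping the top 5/6/5 bits."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
```

For example, choose_16bit_format([0, 128, 255]) selects R4G4B4A4, while an image whose alpha is only ever 0 or 255 gets R5G5B5A1 and keeps an extra bit for each color channel.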

The shift and scale for each channel is also stored so that, during unpacking, the approximate original colors can be obtained.
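The histogram, clip, shift, and scale steps for one channel can be sketched in Python like this.  How "little-used" is decided is not stated above, so the clip_fraction parameter (the share of samples that may be clipped from each end) is an assumption, as are all of the function names.

```python
from typing import Sequence, Tuple


def find_range(values: Sequence[int], clip_fraction: float = 0.01) -> Tuple[int, int]:
    """Return (lo, hi) after trimming up to clip_fraction of the samples per end."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    budget = int(len(values) * clip_fraction)
    lo, dropped = 0, 0
    while lo < 255 and dropped + hist[lo] <= budget:
        dropped += hist[lo]
        lo += 1
    hi, dropped = 255, 0
    while hi > lo and dropped + hist[hi] <= budget:
        dropped += hist[hi]
        hi -= 1
    return lo, hi


def encode(value: int, lo: int, hi: int) -> int:
    """Clip to [lo, hi], shift down by lo, then scale so hi maps to 255."""
    scale = 255.0 / (hi - lo) if hi > lo else 1.0
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) * scale)


def decode(stored: int, lo: int, hi: int) -> float:
    """Undo the scale and the shift to approximate the original value."""
    scale = 255.0 / (hi - lo) if hi > lo else 1.0
    return stored / scale + lo


# Mostly 32..189, plus two stray outliers that get clipped away.
red = [32] * 50 + [100] * 100 + [189] * 50 + [5, 250]
lo, hi = find_range(red)  # (32, 189)
```

Only lo and the scale (or equivalently lo and hi) need to be stored per channel for unpacking.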

 

Why the Shift and Scale?

Going back to our example of a range between 32 and 189 in 8-bit format, consider what happens to the data if we simply convert it to 4 bits (for the R4G4B4A4 format).  After the conversion, 32 will become 2 and 189 will become 11.  The original data covered 158 unique values (32 through 189), but the converted data covers only 10 (2 through 11).  That means when we convert back to an 8-bit format later, there will be only 10 unique values within the original 32-189 range: 32, 48, 64, 80, 96, 112, 128, 144, 160, and 176.

By shifting and scaling we are able to use all 4 bits to express the valid range, giving us 16 unique values instead of 10.  The new decompressed data becomes: 32, 41, 51, 61, 71, 81, 91, 100, 110, 120, 130, 140, 150, 160, 169, and 179.  This is a lot nicer than a naive downgrade to 16 bits.
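Both lists of values can be reproduced with a few lines of Python.  The sketch assumes a 4-bit code is expanded back to 8 bits by multiplying by 16, which is what the numbers above imply, and that the final value is truncated to an integer.

```python
LO, HI = 32, 189
SCALE = 255 / (HI - LO)  # ~1.6242


def naive_levels() -> list:
    """8 -> 4 bits by dropping the low nibble, then back to 8 bits."""
    return sorted({(v >> 4) << 4 for v in range(LO, HI + 1)})


def shifted_levels() -> list:
    """Every 4-bit code expanded to 8 bits, then un-scaled, un-shifted, truncated."""
    return [int((code << 4) / SCALE + LO) for code in range(16)]


print(naive_levels())    # 10 levels: 32, 48, ..., 176
print(shifted_levels())  # 16 levels: 32, 41, ..., 179
```

The naive path only ever produces codes 2 through 11, while the shifted and scaled path makes all 16 codes land inside the range that the image actually uses.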

In L. Spiro Engine, this decompression is performed in the shader.  Not only does a shader allow us to retain the fractions (the second value is actually 41.85098, but when stored in integral format it will be truncated to 41) for more precision, but it also allows us to store the actual texture using 16 bits per pixel instead of 32.  This effectively doubles the number of textures that fit in memory at run-time with only a slight loss in quality.  The developer will still be able to use any texture format desired, however, so crisp 32-bit textures are still fully usable.

 

 

L. Spiro

About L. Spiro

L. Spiro is a professional actor, programmer, and artist, with a bit of dabbling in music.

[Senior Core Tech Engineer]/[Motion Capture] at Deep Silver Dambuster Studios on:
* Homefront: The Revolution
* UNANNOUNCED

[Senior Graphics Programmer]/[Motion Capture] at Square Enix on:
* Luminous Studio engine
* Final Fantasy XV

[R&D Programmer] at tri-Ace on:
* Phantasy Star Nova
* Star Ocean: Integrity and Faithlessness
* Silent Scope: Bone Eater
* Danball Senki W

[Programmer] on:
* Leisure Suit Larry: Beach Volley
* Ghost Recon 2 Online
* HOT PXL
* 187 Ride or Die
* Ready Steady Cook
* Tennis Elbow

L. Spiro is currently a GPU performance engineer at Apple Inc.

Hyper-realism (pencil & paper): https://www.deviantart.com/l-spiro/gallery/4844241/Realism

Music (live-played classical piano, remixes, and original compositions): https://soundcloud.com/l-spiro/
