Includes an MSVC 7 project, compiled exe and lena.gif.
Wavelet specificsThe wavelet transform used is the lifting version of the Daubechies 4 wavelet. The lifting wavelet transform operates on a 1D array of doubles and separates the data into a low frequency half and a high frequency half. This allows us to disregard the (mostly) non-important high frequency information in the images.
To acquire the 2D wavelet transform we run the 1D transform on all the rows of the image, then transpose the image and run the wavelet transform on the rows (which are the columns of the original image).
The forward transform produces a 2D array of wavelet transforms which are arranged into sub-bands. These sub-bands are organized in a recursive square pattern with the square being subdivided into 4 squares and the top-left corner of that square being recursively subdivided into squares. Each sub-band is independently quantized as each requires a different amount of precision to accurately reproduce the original image. The largest top-right, bottom-left and bottom-right sub-bands can lose the most information while still allowing us to reproduce the picture accurately.
QuantizationFor quantization of the values we use a simple uniform-step-size quantizer which is given the minimal and maximal values that occur in the sub-band and the number of discrete levels we will quantize to. Then the quantized value is determined using the following formula.
q(v) = (v – minimum) / ((maximum – minimum) / (levels – 1))
Where q(v) is the quantized value. The inverse is determined by the following formula.
i(v) = minimum + (q(v) * ((maximum – minimum) / (levels – 1)))
Where i(v) is the inverse value. Once the value has been quantized to an integer it is then arithmetic coded using a simple order-0 predictor and standard integer based arithmetic coder.
Colour dataColour data is slightly more difficult to deal with as each colour must be treated separately and the traditional RGB representation is inefficient to encode. Instead an approximation to the RS0S1 transform is used that makes two of the components vary much less on average. This transform is represented as follows.
i0 = 0.299 * r + 0.587 * g + 0.114 * b
i1 = -0.16875 * r – 0.33126 * g + 0.5 * b
i2 = 0.5 * r – 0.41869 * g + 0.08131 * b
The inverse transform is as follows.
r = i0 + 1.402 * i2
g = i0 – 0.34413 * i1 – 0.71414 * i2
b = i0 + 1.772 * i1
|Image information||Compression ratio||PSNR||Image|
|Lena colour (high)||786432 to 546975 - 69.5515%||19.538|
|Lena b/w (high)||65536 to 42027 - 64.1281%||31.475|
|Peppers (high)||786432 to 428271 - 54.4575%||39.568|
|Lena colour (low)||786432 to 182263 - 23.1759%||19.142|
|Lena b/w (low)||65536 to 12975 - 19.7983%||28.266|
|Peppers (low)||786432 to 107367 - 13.6524%||19.549|
These results subjectively compare well to JPEG’s results which is a DCT based compression scheme. However, this simplified scheme does not perform well comparatively against the state of the art wavelet coders, such as SPHIT, which shows impressive visual results even at 0.5bpp (4,096 bytes) while our coder begins to degrade around 1bpp. This is quite possibly due to the appearance of square like effects in the decoded image caused by improper handling of border cases in the wavelet transform. Also, the quantizer is quite simple and uses constant values for determining the “weight” to give to a particular sub-band. If the assignment of weights to sub-bands were to be improved, the quantizer better adapted to the data and perhaps a Daubechies 9/7 transform were used these results could be much improved.