Interactive Selection of JPEG Quantization Tables for Digital X-Ray Image Compression
L. E. Berman, Babak Nouri, Bautam Roy, L. Neve
IS&T/SPIE, San Jose, CA
Feb. 1-4, 1993.
Selecting an appropriate quantization table for Joint Photographic Experts Group (JPEG) data compression of a class of images can be an arduous task. We have designed a graphical user interface to study the effects of quantization on compression ratio and the resulting image quality. The tool calculates several measures of the difference between the original and lossy compressed image, including entropy, mean square error, and normalized mean square error. These measures aid the user in selecting the optimal quantization values with respect to image fidelity and compression ratio for a particular class of images.
Workstations for a project named Digital X-ray prototype workstations Linked via Internet (DXPNET) are being built so that radiographs can be downloaded over the Internet from an electronic archive at the National Library of Medicine (NLM). These images, derived from the National Health and Nutrition Examination Survey (NHANES), currently include cervical and lumbar spine radiographs. Remote evaluators, including radiologists and researchers, will grade the imagery and transmit their evaluations back to NLM for eventual inclusion in a national database.
One of DXPNET's system objectives is to meet the evaluators' real-time analysis requirements. The combination of heavy point-to-point traffic on the Internet, large images (5-10 Mbytes), and limited archive capacity (72 Gbytes) has forced us to consider data compression. Previous analysis of the entropy of the cervical image set suggests that lossless compression will not provide the necessary savings to meet the system objective.[1] Therefore, several experiments will be conducted using the JPEG lossy compression algorithm, selected partly because of its emergence as an international standard. However, lossy data compression is a viable alternative only if it does not interfere with the evaluator's ability to detect disease and conditions specific to the image class. The question is: Is there a compression ratio consistent with the required image quality?[2]
JPEG lossy data compression has several processing steps. First the image is broken down into a stream of 8x8 blocks which in turn are transformed into the frequency domain using a Forward Discrete Cosine Transform (FDCT).[3]
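The block transform described above can be sketched as follows. This is a minimal, unoptimized illustration of the 8x8 forward DCT with the scaling given in [3]; the function name fdct_8x8 is hypothetical, and the level shift that a full JPEG encoder applies before the transform is omitted for brevity.

```python
import math

N = 8  # JPEG block size

def fdct_8x8(block):
    """Forward 2-D DCT of one 8x8 block (direct, unoptimized evaluation)."""
    def scale(k):
        # The k == 0 basis function carries an extra 1/sqrt(2) factor.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * scale(u) * scale(v) * s
    return out

# A perfectly flat block: all of its energy lands in the DC term out[0][0].
flat = [[100] * N for _ in range(N)]
coeffs = fdct_8x8(flat)
```

For a homogeneous block such as this one, every coefficient other than the DC term is zero to within floating-point error, which is the behavior the later discussion of low-variance regions relies on.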
Following the FDCT, the resulting basis-signal amplitudes are uniformly quantized with a 64-element quantization table (QTABLE). Each scaled coefficient is then rounded to the nearest integer (see equation 2). Each element of the QTABLE can range from 1 to ((2^12) - 1) for 12-bit images. This step delivers the greatest amount of compression, but it is also the greatest source of pixel reconstruction error in JPEG.
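The quantization step can be illustrated with a short sketch. It assumes the usual JPEG form, in which each coefficient is divided by its QTABLE entry and rounded, and the decoder multiplies back by the same entry; the function names and the sample coefficient values are hypothetical.

```python
import math

def quantize(coeffs, qtable):
    """Divide each DCT coefficient by its step size and round to the nearest integer."""
    return [[math.floor(coeffs[u][v] / qtable[u][v] + 0.5) for v in range(8)]
            for u in range(8)]

def dequantize(levels, qtable):
    """Decoder side: scale the integer levels back up by the same step sizes."""
    return [[levels[u][v] * qtable[u][v] for v in range(8)] for u in range(8)]

# Hypothetical example: a uniform step size of 32 and three non-zero coefficients.
qtable = [[32] * 8 for _ in range(8)]
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[1][0] = 800.0, 45.0, -7.0

levels = quantize(coeffs, qtable)
restored = dequantize(levels, qtable)
max_error = max(abs(coeffs[u][v] - restored[u][v])
                for u in range(8) for v in range(8))
```

With rounding to the nearest integer, the reconstruction error of any coefficient is bounded by half its step size, so larger QTABLE entries trade fidelity for more zero-valued levels and hence more compression.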
Current efforts to compress digital imagery with the JPEG standard have emphasized statistical means for adjusting the quantization table.[4] In contrast, our efforts have relied first on perceptual and subjective analysis and second mathematical analysis of lossy compressed images to manipulate the quantization table. We have made this step in the process interactive by supplying the user with a graphical user interface to manipulate the QTABLE. By choosing which coefficients to emphasize and de-emphasize, one can quickly determine the impact of QTABLE manipulation.
2 JPEG EVALUATION TOOL (JET)
2.1 JET man-machine interface
The man-machine interface (MMI) has several tools that aid the user in studying the effects of quantization on compression ratio and the resulting subjective image quality. The MMI starts up with a JPEG Compression/Decompression Tool window as shown in Figure 1. This window has several sections. First, there are file chooser buttons: Compress File Chooser and Decompress File Chooser. Using a mouse to select either button will result in a pop-up window. In these windows a user can select the name of the file by depressing a mouse button when the mouse pointer is covering the desired file. Optionally, a user can type in the name of the file. After selecting the file to be compressed, the user provides a name for the resulting lossy compressed image file. The compressed file size is displayed in the upper right side of the window.
There are sixty-four quantization coefficients (q00 - q77). q00 corresponds to the DC term in the spatial frequency domain, q01 to a low frequency term, and q77 to a high frequency term. All other quantization coefficients correspond to the remaining spectral components in the spatial frequency domain. Each quantization coefficient can be individually manipulated by using the button fields in the bottom of the MMI. Alternatively, if the user wants all coefficients to be scaled with the same step size, he can click the mouse pointer on the buttons labelled 2's, 4's, 8's,...,4096's in the lower left-hand corner. To select the quantization coefficient values for a particular row or column, the following buttons on the lower right-hand side of the window are used: Row, Column, Row Value, Column Value. We set coefficients q02, q12,..., q72 to 4095 and coefficients q07, q17,..., q77 to 32.
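The table operations offered by these buttons can be mimicked in a few lines. The sketch below is not the JET implementation; the helper names are hypothetical, the starting step size of 1 is an arbitrary choice, and it assumes that in the label qrc the first digit r is the row index and the second digit c is the column index, so that q02, q12,..., q72 form column 2.

```python
Q_MIN, Q_MAX = 1, 2 ** 12 - 1  # legal step-size range for 12-bit images

def uniform_qtable(step):
    """All 64 entries share one step size (like the 2's, 4's, 8's,... buttons)."""
    if not Q_MIN <= step <= Q_MAX:
        raise ValueError("step size out of range")
    return [[step] * 8 for _ in range(8)]

def set_row(qtable, row, value):
    """Assign one value to every entry of a row (Row / Row Value)."""
    for col in range(8):
        qtable[row][col] = value

def set_column(qtable, col, value):
    """Assign one value to every entry of a column (Column / Column Value)."""
    for row in range(8):
        qtable[row][col] = value

# Reproduce the settings quoted in the text, starting from an arbitrary table of 1's.
qtable = uniform_qtable(1)
set_column(qtable, 2, 4095)  # q02, q12, ..., q72
set_column(qtable, 7, 32)    # q07, q17, ..., q77
```

Keeping the table as a plain 8x8 list of integers also makes it straightforward to save to and reload from a text file, in the spirit of the Save Qtable and Read Qtable buttons.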
After deciding on the values of the quantization coefficients, there is an option for saving these values. Conversely, a previously stored table can be read in and used. This eliminates the effort required to select all the quantization coefficients interactively. The buttons which perform these utilities are labelled Save Qtable and Read Qtable. The values for the quantization coefficients are saved in a user-defined file name.
After selecting the values of the quantization coefficients, we can compress the selected file by clicking on the Start Compression button. After compressing the file we can later decompress it by clicking on the Start Decompression button. Before decompressing, we must provide a file name if we want to save the decompressed lossy image. There are several image processing utilities: Entropy, Compression Ratio, Display Histogram, Root Mean Square Error, and Normalized Mean Square Error. These features are useful, following compression and the ensuing decompression of the lossy file, in analyzing the loss incurred by compression. Entropy estimates the average number of bits per pixel that lossless entropy encoding of an image can achieve. The higher the entropy, the more difficult it will be to compress the image in a lossless manner. The compression ratio plays an integral part in the study of optimizing the QTABLE in that it facilitates the comparison of file sizes and the corresponding fidelity of lossy compressed versions of the same image. Moreover, this function can serve the general purpose of measuring how efficiently the images can be compressed according to the needs of the user.
Another useful feature is the histogram function. The histogram function computes a distribution of the number of times a specific pixel intensity occurs. This function is used both before compression of the original image and following decompression of the lossy files. This shows the user how the energy levels have been redistributed following the compression/decompression process. A change in the shape of the histogram indicates a change in contrast of the image. Whether such a change is positive or negative is again left to the user's judgement.
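A histogram comparison of this kind can be sketched as below. The pixel values are made up purely for illustration, and the helper names are hypothetical.

```python
from collections import Counter

def histogram(pixels):
    """Count how many times each pixel intensity occurs."""
    return Counter(pixels)

def changed_levels(before, after):
    """Intensities whose counts differ between the two histograms."""
    return sorted(level for level in set(before) | set(after)
                  if before[level] != after[level])

# Hypothetical pixel values before compression and after decompression.
original = [10, 10, 11, 12, 12, 12, 40, 41]
decompressed = [10, 10, 10, 12, 12, 12, 40, 40]

h_before = histogram(original)
h_after = histogram(decompressed)
moved = changed_levels(h_before, h_after)
```

Here the levels 11 and 41 disappear and their counts migrate to neighboring levels, which is the kind of redistribution the user would look for when judging a change in contrast.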
The last two features essentially provide the same information. Both the root mean square error and the normalized mean square error are used to measure how effectively the image was compressed. These features measure the amount of pixel reconstruction error. Finally, there is a quit button that exits the MMI gracefully.
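The two error measures can be sketched as follows. The RMSE form is standard; for the NMSE the sketch assumes one common definition, the summed squared error divided by the summed squared original pixel values, which may differ from the normalization used in JET. The sample pixel values are hypothetical.

```python
import math

def rmse(original, reconstructed):
    """Root mean square error between two equally sized pixel sequences."""
    n = len(original)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n)

def nmse(original, reconstructed):
    """Squared error normalized by the energy of the original (assumed definition)."""
    error = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    energy = sum(a ** 2 for a in original)
    return error / energy

# Hypothetical pixel values for an original and a decompressed lossy image.
original = [10, 20, 30, 40]
reconstructed = [12, 18, 30, 44]

error_rms = rmse(original, reconstructed)
error_norm = nmse(original, reconstructed)
```

The RMSE is expressed in pixel intensity units, whereas the normalized measure is dimensionless, which makes it easier to compare across images with different intensity ranges.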
2.2 Image viewing
A critical stage in evaluating lossy compression is viewing the image. JET contains two ways of displaying an image. The first method is via the Megascan feature. If the workstation is equipped with a Megascan monitor (2048 x 2560 x 8 bits/pixel), the megascan yes box may be highlighted and the image viewed accordingly. If a Megascan monitor is not available, then the user can display the images on a SUN workstation 8-bit color monitor under OpenWindows 3.0. We have developed a program, Imview, for this purpose. This program has evolved into a tool that is valuable in analyzing the digitized radiographs and in optimizing the JPEG compression algorithm for cervical radiographs.
The Imview program not only displays images but also includes certain functions basic to our study. Two functions in particular are the variance and the FDCT. The variance function calculates the deviation of the pixel energy levels from the average inside a user-defined rectangular region in a specified image. The FDCT function computes the energy of the signal across the frequency spectrum. If the variance is low, the FDCT shows that the signal energy is primarily composed of DC and low frequency components. Conversely, high variance suggests that the signal energy spans more of the frequency spectrum and contains mid and high frequency components. This information can then be applied to the QTABLE for effective compression of the specified class of images. Imview has another useful feature that allows the user to choose a rectangular region anywhere within the image and calculate the mean, median, and center of mass. Finally, Imview can plot a histogram of all or part of the image.
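The link between variance and the DCT can be made concrete with a small sketch. It is not Imview code: the function names and the two test blocks are hypothetical. Because the 8x8 DCT with the scaling in [3] is orthonormal, the energy in all non-DC coefficients equals 64 times the (population) variance of the block, so a low-variance block necessarily has little energy outside the DC term. How that energy is spread over low, mid, and high frequencies still depends on the block content.

```python
import math

N = 8

def fdct_8x8(block):
    """Forward 2-D DCT of one 8x8 block (direct, unoptimized evaluation)."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    return [[0.25 * scale(u) * scale(v) * sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / 16)
                * math.cos((2 * y + 1) * v * math.pi / 16)
                for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def variance(block):
    """Population variance of the 64 pixel values."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)

def ac_energy(coeffs):
    """Sum of squared coefficients, excluding the DC term."""
    total = sum(value * value for row in coeffs for value in row)
    return total - coeffs[0][0] ** 2

# Two hypothetical blocks: a gentle ramp and a high-contrast checkerboard.
smooth = [[100 + x + 2 * y for y in range(N)] for x in range(N)]
busy = [[100 + 40 * ((x + y) % 2) for y in range(N)] for x in range(N)]

for name, block in (("smooth", smooth), ("busy", busy)):
    print(name, variance(block), ac_energy(fdct_8x8(block)))
```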
3 DISCUSSION
In this section we discuss how the MMI may be used to select effective quantization values with respect to image fidelity and compression ratio for a specific class of images, illustrating the tool's operation with actual data derived from the cervical radiograph image set. Although the most prominent feature in the MMI is the QTABLE, there are several tools within the package that are used before actual manipulation of the QTABLE becomes necessary. The first is the entropy function. This calculates the average expected lossless compression rate. Measuring the entropy provides the impetus to either employ lossless entropy encoding techniques or investigate lossy compression schemes. The entropy for a particular image may be expressed as[5]:
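A common first-order form of this measure is H = -sum over i of p(i) log2 p(i), where p(i) is the fraction of pixels with intensity i. The sketch below assumes that definition, which may differ in detail from the expression in [5]; the function name is hypothetical.

```python
import math
from collections import Counter

def entropy_bits_per_pixel(pixels):
    """First-order entropy of the pixel intensity distribution, in bits/pixel."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Two equally likely intensities need one bit per pixel on average;
# four equally likely intensities need two.
two_levels = entropy_bits_per_pixel([0, 0, 1, 1])
four_levels = entropy_bits_per_pixel([0, 1, 2, 3])
```

This first-order figure ignores correlation between neighboring pixels, so it is a guide to what a simple entropy coder can achieve rather than a strict lower bound for every lossless scheme.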
Using the entropy equation, the measured entropy for the test set of cervical radiographs has been calculated to be 10.82 bits/pixel. This results in a lossless compression ratio of 1.47:1. Such a compression ratio is not suitable for our project, because the space saved by employing lossless techniques would not be sufficient with respect to transmission speed and traffic. Hence the use of lossy compression techniques and the JPEG lossy compression algorithm. Once we decided against using lossless compression we began to investigate properties of the FDCT. Imview was useful in this analysis since it allows the user to compute an FDCT and variance on any rectangular region in the image. In particular we were interested in 8x8 blocks. Using Imview it was possible to study the relationship between the DCT and variance in areas of pure signal, pure noise, and areas where both signal and noise are present. The goal of this approach is to determine whether the FDCT of different areas yields concentrations of energy in different regions of the frequency domain. If this is the case then the QTABLE can be manipulated to preserve the coefficients containing most of the signal energy. The net effect would be a lossy image with signal preserved and noise reduced.
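The arithmetic behind the quoted lossless ratio can be checked in a few lines. The sketch assumes each pixel occupies a 16-bit word in the stored file; the text does not state the storage size, but that assumption gives a ratio of about 1.48, consistent with the quoted 1.47:1 if the figure was truncated. The 10 Mbyte image size is taken from the range given in the introduction.

```python
# Assumption: pixels are stored in 16-bit words (not stated in the text).
stored_bits_per_pixel = 16
entropy_bits_per_pixel = 10.82  # measured value quoted in the text

# Best average ratio a first-order entropy coder could reach.
lossless_ratio = stored_bits_per_pixel / entropy_bits_per_pixel

# Effect on a 10 Mbyte radiograph, the upper end of the quoted image sizes.
image_mbytes = 10
compressed_mbytes = image_mbytes / lossless_ratio

print(f"lossless ratio   ~ {lossless_ratio:.2f}:1")
print(f"10 Mbyte image   ~ {compressed_mbytes:.2f} Mbytes after lossless coding")
```

Under this assumption a 10 Mbyte image would still occupy well over 6 Mbytes, which illustrates why lossless coding alone does not meet the transmission and storage constraints.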
Segmenting the signal from noise and achieving high compression ratios, based on energy concentration in the frequency spectrum, was not possible. Regardless of the type of region analyzed the results show that most of the energy resides in the DC and several low frequency components. This occurs because all regions had low variances, implying generally homogeneous 8x8 blocks throughout the image. We investigated this result further by setting the QTABLE so that certain columns and rows had step sizes relative to energy concentration. As previously mentioned the majority of the energy in a DCT of the cervical radiographs resides in the DC and the low frequency components. Thus, the last four rows and columns within the QTABLE were set to 255 (the maximum threshold value at the time). This effectively eliminated the existence of any mid or high frequency components in the spatial frequency domain. This method provided only minimal compression since so much of the energy was concentrated in the region that we preserved. Therefore, it did not lead to an effective scheme for JPEG compression of the cervical radiographs.
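The table used in this experiment can be sketched as follows. The value 255 for the last four rows and columns comes from the text; the step size of 1 in the preserved region is an assumption made only for illustration, and the function name is hypothetical.

```python
def low_frequency_qtable(keep_step=1, reject_step=255, cutoff=4):
    """Small steps in the top-left cutoff x cutoff corner, large steps elsewhere."""
    return [[keep_step if (u < cutoff and v < cutoff) else reject_step
             for v in range(8)] for u in range(8)]

qtable = low_frequency_qtable()

# Entries in the last four rows or the last four columns.
rejected = sum(1 for row in qtable for step in row if step == 255)
preserved = 64 - rejected

# With rounding to nearest, any coefficient smaller in magnitude than half
# the step size quantizes to zero.
zeroing_threshold = 255 / 2
```

With this layout 48 of the 64 coefficients receive the large step size, and any of them with magnitude below 127.5 is quantized to zero. When those coefficients were already small, as in the homogeneous blocks described above, zeroing them removes little information and correspondingly yields little extra compression.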
As stated before, the predominant feature within the MMI is the QTABLE. In our experiment the QTABLE values were set in conjunction with a subjective study of the information content in each bit plane. For example, if it is determined that the first b bits in a pixel carry little or no information pertinent to the image, then these bits can be shifted out of the pixel without a loss of information. Since a shift right of b bits is equivalent to division by 2^b, the QTABLE factors are accordingly set to 2^b. Similarly, multiplying by 2^b, equivalent to a left shift by b bits, would bring back the relative magnitude. Consider the case where the first bit plane within an image was the only bit plane not carrying any information. In this case all coefficients would be set to 2. Likewise, if the first 8 bit planes carried little or no information, all coefficient levels would be set to 256 and the user would click the appropriate button. In the previously mentioned study,[1] it was determined that the first five bit planes carry no information pertinent to the image. Consequently, all the coefficients in the QTABLE could be set equal to 32. It should be noted that had the bits carrying little or no information not resided in the lowermost contiguous bits, preprocessing of the bit planes would be needed before the JPEG tool could be employed in such a manner.
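The shift and divide equivalence behind this choice can be verified directly. The sketch below illustrates only the integer arithmetic on 12-bit values; in JPEG itself the division is applied to DCT coefficients and uses rounding rather than truncation, so the correspondence with bit planes is a guide for choosing step sizes rather than an exact identity. The function names are hypothetical.

```python
def drop_low_bits(value, b):
    """Discard the b least significant bits (shift right by b)."""
    return value >> b

def restore_magnitude(value, b):
    """Bring the value back to its original scale (shift left by b)."""
    return value << b

b = 5          # five low-order bit planes judged to carry no information [1]
step = 2 ** b  # the matching QTABLE entry, 32

# Largest difference between a 12-bit value and its shifted-and-restored version.
worst_loss = max(v - restore_magnitude(drop_low_bits(v, b), b)
                 for v in range(4096))

# A single example value.
example = 1234
coarse = drop_low_bits(example, b)
back = restore_magnitude(coarse, b)
```

For every 12-bit value the right shift by b agrees with integer division by 2^b, and the value recovered by the left shift differs from the original by at most 2^b - 1, which is exactly the content of the discarded bit planes.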
There are several tools that are quite useful in analyzing the compressed and decompressed lossy images. These features include compression ratio, histogram, and the normalized and root mean square errors. The histogram function provides a frequency distribution of the number of times a specific pixel energy level occurs. This function is used before compression of the original image and following decompression of the lossy files. The histogram shows the user how the energy levels have been redistributed following the compression/decompression process. Specifically, a change in the shape of the histogram for the cervical radiographs would indicate a change in contrast of the image. The effect could vary from being detrimental to actually enhancing the image. The last two features essentially provide the same information. Both the root mean square error (RMSE) and the normalized mean square error (NMSE) are used to measure how effectively the image was compressed. These measures play an integral part in the study of optimizing the QTABLE in that they facilitate the comparison of file sizes and the corresponding error for lossy compressed versions of the original image. For a typical cervical radiograph the RMSE increases linearly with the step size and compression ratio, as expected. However, at the highest step size used in this experiment the RMSE decreased. This effect is not yet fully understood and will be studied further. These tools can serve the general purpose of measuring how efficiently the images can be compressed according to the needs of the user. Finally, in one study, compression ratios of up to 40:1 resulted in no detectable difference between the original image and the decompressed lossy image.[1]
4 SUMMARY AND CONCLUSIONS
The tool described in this paper is a useful means for selecting coefficients in the JPEG QTABLE. This tool can be used in either of two ways. First, the user can quantize coefficients in a prescribed manner based on some measurable criteria. Second, the user can iteratively change the step sizes in a way that has the greatest impact on visual perception. In both cases some mathematical analysis is provided to determine the difference between the original and lossy compressed image. As shown in section 3, this analysis should always be used in conjunction with visual perception, not as a substitute for it.
Future versions of this tool will include means for tabulating experimental results for a group of images and some pre-processing steps (such as thresholding and filtering) to determine the effects on data compression.
5 REFERENCES
1. L. E. Berman, R. Long, S. R. Pillemer, "Effects of Quantization Table Manipulation on JPEG Compression of Cervical Radiographs," Society for Information Display, 1993 International Symposium, Seminar, & Exhibition, Seattle, WA, May 16-21, 1993.
2. G. R. Thoma, L. R. Long, and L. E. Berman, "Access to a Digital Xray Archive over Internet," Proc. SPIE, Enabling Technologies for High-Bandwidth Applications. Vol. 1785, Sept. 8-11, 1992, pp. 79-86, Boston, MA.
3. G. Wallace, "The JPEG Still Picture Compression Standard," Communications of the ACM, Vol. 34, Number 4, pp. 30-44, April 1991.
4. H. Lee, Y. Kim, A. H. Rowberg, E. A. Riskin, "Statistical Distributions of DCT Coefficients and Their Application to an Interframe Compression Algorithm for 3-D Medical Images," under revision for the IEEE Trans. on Medical Imaging.
5. J. S. Lim, "Two-Dimensional Signal and Image Processing," Ch. 10, Prentice-Hall, Englewood Cliffs, 1990.









Equation 1:
Equation 2:
Figure 1.
Figure 2.
Figure 3.
Figure 4.
Figure 5.
Figure 6.
Equation 3:
Figure 7.
Table 1.