Design Issues for a Digital Xray Archive Accessed over Internet

Thoma GR.,Long LR.,Berman LE.
"Design Issues for a Digital Xray Archive Accessed over Internet."
Proceedings of SPIE: Storage and Retrieval for Image and Video Databases II. San Jose, CA: 1994
Vol. 2185, 129-138.


The design of an electronic archive of digitized images of thousands of xrays collected as part of nationwide health surveys has raised several issues related to user interface design, image presentation and image compression. The project involves developing an image archive implemented with an optical disk jukebox, and user workstations that allow Internet access to the images. This paper describes: the physical layout design of the workstation screens; desirable image processing functions contributing to better viewing and minimizing artifacts; interface design factors contributing to ease-of-use and speed of task completion; and work toward the selection of a suitable image compression technique.


The system described here is designed as part of a project designated Digital Xray Prototype Workstations linked via InterNET, or DXPNET. This project aims to create an electronic image archive of digitized xrays and provide access to it over the wideband wide area network, the Internet[1]. It is a collaborative effort among the National Library of Medicine (NLM), the National Center for Health Statistics (NCHS), and the National Institute of Arthritis, Musculoskeletal and Skin Diseases (NIAMS). The Lister Hill Center, the R&D division of NLM, serves as the technical developer of the systems in the project.

The impetus for the DXPNET project is a periodic nationwide survey of public health conditions, the National Health and Nutrition Examination Survey (NHANES). The second such survey, NHANES II, yielded a broad spectrum of information on each of approximately 20,000 participants including such data as age, sex, dietary habits and blood chemistry analyses. In addition, a subset of the participants received a detailed examination that included radiographs of the cervical and lumbar spine. This resulted in a collection of approximately 17,000 films. The third survey, NHANES III, currently in progress, is expected to produce an estimated additional 10,000 films of hands, wrists and knees.

The goal of the DXPNET project is to provide access to the digital images of these xrays over the Internet, initially to radiologists who will view them and produce standardized readings, which are scored visual assessments of various physical attributes. Eventually, the images will be linked to other NHANES data and provided to a larger scientific community, e.g., for epidemiological research. The availability of the existing Internet and the promise of its eventual upgrade to gigabit-per-second speeds offer the opportunity to meet such requirements with a network solution.

The xrays are digitized by the Radix Corp. using a Lumiscan He-Ne laser scanner at a scan density of about 146 pixels per inch and 12 bits per pixel. Films 14x17 inches in dimension result in images 2048 x 2487 pixels in size. The large (lumbar spine) films result in 10 Megabyte files, and the small (cervical spine) films in 5 Megabyte files. Radix supplies the images on standard erasable optical disks which are stored in an Electronic Xray Archive (EXA) consisting of a high capacity optical disk jukebox controlled by a Sun 670MP.
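The image dimensions and file sizes quoted above can be checked with a few lines of arithmetic. The sketch below assumes that each 12-bit pixel is stored in one 16-bit word (2 bytes); the storage format is not stated in the text, so this is an assumption, and the function names are purely illustrative.

```python
# Assumption: each 12-bit pixel occupies one 16-bit word (2 bytes) on disk.
SCAN_DENSITY_PPI = 146
BYTES_PER_PIXEL = 2


def scanned_pixels(inches):
    """Pixels produced along one film dimension at the stated scan density."""
    return round(inches * SCAN_DENSITY_PPI)


def file_size_bytes(width_px, height_px):
    """Uncompressed file size for an image of the given pixel dimensions."""
    return width_px * height_px * BYTES_PER_PIXEL


# A 14 x 17 inch film at 146 pixels per inch:
print(scanned_pixels(14), scanned_pixels(17))   # 2044 2482
# The stated 2048 x 2487 image at 2 bytes per pixel:
print(file_size_bytes(2048, 2487))              # 10186752
```

Under this assumption the large film comes to roughly 10 million bytes, in line with the 10 Megabyte figure given above.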


In the system's client-server architecture, the EXA is the server while the client is the Standardized Readings Workstation (SRW). The SRW will enable radiologists to retrieve images, view them, manipulate them as required, and key in standardized readings. These readings are then transmitted back over the Internet to a database in the EXA for storage.

2.1. Hardware and Software Platform

The SRW is the user workstation. Its current version uses a Sun 4/260 platform with a Sun 3 cgtwo video card, color monitor, 16 MB of RAM and a hard drive with 227 MB of formatted capacity. The images are displayed on an E-Systems Megascan monitor (portrait mode) with a spatial resolution of 2048 x 2560 pixels. This high resolution monitor is supported by two VME 9U video cards in the Sun 4/260. E-Systems provides a proprietary low-level frame-copy library to move images from program memory to display memory. The Sun platform runs OS 4.1.1 and the OpenLook Window Manager. The SRW screens described later in this paper are implemented by GUIDE, an Xview GUI development toolkit.

The current SRW design relies on an older platform and a high resolution monitor supported only by a VME interface, but the next version of the SRW will be based on an open systems approach to supplant the current mixture of open and proprietary components. Its design will conform with cross-platform compatibility as defined by the Common Open Software Environment (COSE) and the COSE Desktop Environment (CDE). It will use a Sun Sparc10 as the platform with 32 MB of RAM and a 1.2 GB disk drive. This platform offers three Sbus slots, one of which will be used by a high resolution (at least 2048 x 2048 pixels) graphics card that supports MIT's X11R5 windowing system and the Motif 1.2 GUI. The GUI is to be developed using the Transportable Applications Environment Plus (TAE+) toolkit. Graphics cards from E-Systems as well as from Dome Imaging Systems and Tech Source, Inc., all of which support at least 2048 x 2048 pixels spatial resolution and eight bits of contrast resolution, will be evaluated.

2.2. Design factors promoting ease of use

A basic consideration in the design of the SRW is the use of a graphical user interface (GUI) offering the type of controls (buttons, sliders, text entry fields, mouse interaction, etc.) which have become standard in human-computer interfaces.

A second consideration is to graphically separate major functions, and to group together related subfunctions. Of the two major screens provided, one (titled Standardized Readings Workstation) is dedicated almost entirely to the work required to produce the standardized readings; the second (Enhance X-ray Image) is designed to enable the user to alter the image for better viewing in terms of more contrast, convenient viewing orientation and greater spatial detail. Moreover, the five contrast-enhancing functions are grouped together, as are the two functions which reorient the pixels on the screen (Flip and Rotate).

Third, the options on the screens have been kept simple, by providing only essential controls. Should more advanced controls be required as a result of field evaluation of the current prototype system, they will be provided, perhaps on hidden pop-up screens accessible if needed.

Finally, context-sensitive help is provided. By placing the mouse pointer over any control and pressing the HELP key on the keyboard, the user may pop up a window of information relevant to that control. This implementation keeps unnecessary and possibly confusing text off the main screens, while allowing access to help at any time.

It may be noted that, in order to view the processed image on the high-resolution monitor (such as "through" a digital magnifying glass), the user must interact with the image tablet (a small low-resolution version of the image) on the Sun monitor. This implementation is a consequence of technical limitations to working directly on the high-resolution monitor. The next generation of this workstation is expected to support direct user interaction on the high-resolution monitor.

2.3. Physical layout design of the SRW screens

The user interface for the current SRW is presented on the Sun monitor, with the Megascan monitor used only to display the radiograph at full resolution. Functions which control the appearance of the radiograph on the Megascan, such as magnification or contrast enhancement, are invoked through the Sun monitor user interface, with the results appearing on the Megascan monitor.

Figure 1 shows the first user interface screen as it appears on the Sun monitor. This screen accepts the user's login i.d. and password.

Figure 1.

Following a successful login request, the main user interface screens appear. Figure 2 shows the screens when a cervical spine image is requested. The left screen, titled Standardized Readings Workstation, supports functions related to user requests for images and user entry of the standardized readings data. The right screen, titled Enhance X-ray Image, offers the user image processing functions to alter the appearance of an image which is already being displayed. As shown, a reduced copy of the image also appears on the right screen.

Figure 2.

The user must first interact with the left screen to request an image to be read (and to be presented with a data collection template to record the readings; the template is appropriate to the type of xray, either cervical or lumbar spine). Then the right screen may be used to enhance the visual appearance of that image as an aid to the reading.

The template, in matrix form, accepts integer-valued inputs, ranging from 0 (interpreted as "normal" or "absence of condition of interest") to 4 (interpreted as "maximum degree of condition of interest"). Each row corresponds to a particular location in the cervical spine; for example, the row labelled "C2-3" corresponds to the region between and including the C2 and C3 vertebrae. Each column corresponds to a particular medical condition of interest; for example, the column labelled "Osteophytes" refers to the condition of having the bony protuberances known by that name. The user may also enter free-form comments in the "Remarks" section; these are also stored in the database.

A description of each of the buttons and fields on the left screen follows. The right screen functions are discussed in the next section.


Do New Reading: When this button is pressed, the system provides the user with the next image to be read, along with a data recording template appropriate for that particular image type. The request for a new image is sent via TCP/IP from the SRW to the EXA server at NLM, which responds by sending via TCP/IP the next image requiring a reading. In line with the type of image, the user is presented a data recording template on the left screen. The progress of the incoming image data appears on a "thermometer-like" gauge graphic. When the image has been received at the SRW, the user interface shows a reduced copy of the image which simultaneously appears at full resolution on the Megascan monitor.

Save Reading: The standardized readings data entered into the template by the radiologist is transmitted via TCP/IP to NLM and stored in the system database.

Modify Previous Reading: The user is prompted for the image identification number of a previously-read image; that image is fetched by the EXA server and re-transmitted to the SRW. The previous standardized reading is also retrieved from the system database by the server and transmitted to the SRW to be viewed by the user.

Change Password: The user may enter a new password.

Help: When this button is pressed, a short message is displayed which tells the user how to get the context-sensitive help provided by the system.

Quit: The session is terminated, and all SRW windows are closed.


Last Reading Taken: Each reading is assigned a unique identification number.

Remaining Readings: Each radiologist has an assigned, predetermined number of images to read. This number reflects how many are yet to be read.

2.4. Image Processing Functions

Available to the user on the right screen of the interface is a group of contrast-enhancement functions, a function to flip the image about a central vertical axis, a function to invert the gray-scale sense of the image pixels, a movable, digital magnifying glass, and a function to restore the image to the system default presentation. Two of these functions (Region-Based Histogram Equalization, and the digital magnifying glass) require the use of a mouse to draw a rectangle on the Sun monitor screen. In both cases, the rectangle is drawn on the small copy of the image, not on the high-resolution monitor. The results of the image processing operations, however, are displayed at full resolution on the Megascan monitor.

Since the image data is 12 bits per pixel, while the monitor can only display 8 bits per pixel, a look-up table (LUT) is used to map the input data values to displayable values. The user may control the mapping used by selecting one of the five contrast enhancement functions. A description of each of the buttons on the Enhance X-ray Image screen follows.

Histogram Equalize (HE): When this button is pressed, the system does a global histogram equalization on the image data. First a histogram is computed for all 12-bit pixels in the image; this histogram data is used to compute the cumulative distribution function (cdf) for the image, and the cdf values are placed directly into a display look-up table. Conceptually, the process follows an algorithm described in the literature[2]. This function enhances the contrast of the image in the global sense. However, contrast in particular regions of interest (ROI) may be adversely affected by the pixel distributions in other, irrelevant regions. For example, HE might lower the contrast in a small ROI if the image contains large regions that are predominantly white or black.
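The histogram, cdf and look-up table steps described for global HE can be sketched as follows. This is a minimal illustration and not the SRW code: the image is taken to be a flat, non-empty list of 12-bit integers, and scaling the cdf to the range 0-255 with integer division is an assumption about how the cdf values become displayable values.

```python
def histogram_equalization_lut(pixels, in_levels=4096, out_levels=256):
    """Build a display look-up table from the cdf of the pixel values."""
    # Step 1: histogram of all 12-bit pixel values.
    hist = [0] * in_levels
    for p in pixels:
        hist[p] += 1

    # Step 2: running sum of the histogram gives the (unnormalized) cdf.
    # Step 3: scale each cdf value to the displayable range (assumption).
    total = len(pixels)
    lut = []
    running = 0
    for count in hist:
        running += count
        lut.append((out_levels - 1) * running // total)
    return lut


def apply_lut(pixels, lut):
    """Map every input value through the look-up table."""
    return [lut[p] for p in pixels]


pixels = [0, 0, 1000, 4095]
lut = histogram_equalization_lut(pixels)
print(apply_lut(pixels, lut))   # [127, 127, 191, 255]
```

Because the table is indexed by input value, it is computed once per image and then applied to every pixel, which is what makes the look-up table approach inexpensive at display time.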

Region-Based HE: The user must first draw a rectangular region on the tablet before using this button. Then, pressing this button causes the system to compute a histogram for the 12-bit values located within the rectangle. At this point, the algorithm is the same as for global HE, except that only the histogram of the region is used, rather than the histogram for the entire image. The new 12-bit values are mapped to displayable values by a linear mapping. This function is intended to enhance the contrast of a particular ROI in the image. By limiting the histogram used to that of the ROI, the pixel mapping is not affected by the distribution of pixel values outside the ROI. Note that the entire image is changed, however; hence, the overall contrast of the image may be degraded, even though the ROI is enhanced.
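A sketch of the region-based variant follows, under stated assumptions: the image is a list of rows of 12-bit integers, the rectangle is given by hypothetical `top`, `left`, `bottom`, `right` indices (bottom and right exclusive), and the equalized values are kept at 12 bits before a linear map to 8 bits, as the description suggests. It is illustrative only.

```python
def region_he_lut(image, top, left, bottom, right, levels=4096):
    """Equalization table built only from the pixels inside the rectangle."""
    hist = [0] * levels
    count = 0
    for row in image[top:bottom]:
        for p in row[left:right]:
            hist[p] += 1
            count += 1

    lut = []
    running = 0
    for h in hist:
        running += h
        lut.append((levels - 1) * running // count)   # new 12-bit value
    return lut


def to_display(value12):
    """Linear map of a 12-bit value onto the 8-bit display range."""
    return value12 * 255 // 4095


def enhance(image, top, left, bottom, right):
    lut = region_he_lut(image, top, left, bottom, right)
    # The table is applied to every pixel, not only those in the ROI.
    return [[to_display(lut[p]) for p in row] for row in image]


image = [
    [0,   4095, 4095],
    [100, 200,  4095],
    [300, 400,  4095],
]
# ROI covers the four closely spaced values 100, 200, 300, 400.
print(enhance(image, top=1, left=0, bottom=3, right=2))
# [[0, 255, 255], [63, 127, 255], [191, 255, 255]]
```

In this toy case the four ROI values, which span less than a tenth of the input range, are spread across most of the display range, while every pixel brighter than the ROI maximum saturates at 255. This illustrates the note above that the image outside the ROI may be degraded.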

The next three buttons control the mapping of the 12-bit image data to the 8-bit displayable values; the three mappings available are discussed below and are illustrated in Figure 3.

Figure 3.

Min-Max: This mapping takes the minimum actual image data value to 0, and the maximum actual data value to 255; intermediate image data values are mapped to the range 0-255 using a linear map. Note that this mapping uses the full dynamic range (0-255) of the monitor, even though the image data may not span its full potential range of values (0-4095).

Linear: This mapping takes an image data value of 0 to displayable value 0, image data value of 4095 to displayable value of 255, and uses a linear relationship for taking intermediate image data values to the range 0-255. If the input data does not span the range 0-4095, the full dynamic range of the monitor will not be used. This is the current system default.

Min-Max w/Average: This produces a map which is the average of the Min-Max and Linear maps. If the input data does not span the range 0-4095, the full dynamic range of the monitor will not be used.
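The three mappings can be written as look-up tables indexed by the 12-bit input value. The sketch below is illustrative: the function names are hypothetical, values outside the actual data range are clamped in the Min-Max table, and "average" is interpreted as the pointwise mean of the two tables, which is an assumption.

```python
def linear_lut(levels=4096):
    """0 -> 0 and 4095 -> 255, linear in between."""
    return [v * 255 // (levels - 1) for v in range(levels)]


def min_max_lut(vmin, vmax, levels=4096):
    """Actual data minimum -> 0 and actual data maximum -> 255."""
    span = max(vmax - vmin, 1)          # avoid division by zero
    lut = []
    for v in range(levels):
        clamped = min(max(v, vmin), vmax)
        lut.append((clamped - vmin) * 255 // span)
    return lut


def min_max_average_lut(vmin, vmax, levels=4096):
    """Pointwise mean of the Linear and Min-Max tables (assumed meaning)."""
    lin = linear_lut(levels)
    mm = min_max_lut(vmin, vmax, levels)
    return [(a + b) // 2 for a, b in zip(lin, mm)]


# Example: image data that only spans 1000..3000.
lin = linear_lut()
mm = min_max_lut(1000, 3000)
avg = min_max_average_lut(1000, 3000)
print(lin[1000], lin[3000])   # 62 186
print(mm[1000], mm[3000])     # 0 255
print(avg[1000], avg[3000])   # 31 220
```

For data spanning 1000 to 3000, only the Min-Max table reaches both ends of the display range; the Linear and averaged tables use a narrower band, consistent with the remarks above.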

Flip: This feature flips the image around a central vertical axis. A left-facing image becomes a right-facing image, and vice versa.

Rotate: A 90° rotate capability is planned.

Invert: This changes the gray-scale sense of the pixels to produce a negative image (i.e., white appears as black).

Restore: This takes the image back to its default presentation (a linear map of the 12-bit pixels to displayable values) but maintains the current flip orientation.

Zoom: This is a slider tool which allows the user to select the degree of magnification (up to 10x) in the digital magnifying glass.

Measure: To be implemented in the next version of the workstation, this feature will be in the form of a digital ruler providing measurements in the most relevant units (e.g., centimeters). It may also measure specific physical structures such as the spacing between vertebral joints.

2.5. Design factors promoting speed

The speed-limiting factors in the current system are: (1) the time to retrieve data from the optical archive (tens of seconds); (2) the time to transmit the data across the Internet (minutes); and (3) the time to display the image and perform image processing functions on it at the local SRW after the image has been received (seconds).

As shown in Table 1, "instantaneous" response time is not possible across the current Internet for the images discussed here (time estimates in the table are based on preliminary tests). Indeed, the achievable delay between requesting an image and seeing the image appear on the local SRW display is on the order of minutes. Yet a dramatic shortening of the response time would be possible, even on today's Internet, if lossy compression could be used. A significant reduction in response time may also be achievable, even with uncompressed images, by sending the image over multiple sockets.

Table 1. Estimates of Response Times (Seconds) on Current Internet, Based on Time to Transmit One Cervical X-ray Image

Configuration                                                              Retrieval   Transmit       Display
Standard TCP/IP, Optical Storage, Sun 4/260 computer                       20-30       approx. 300    10
Optimized TCP/IP Transmission, Fast Magnetic Storage, Sparc 10 computer    1-2         approx. 100*   5
Same as above, with 20:1 compressed images                                 1-2         approx. 5      5

*Based on experiments with image transmissions using five sockets.
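The transmit column of Table 1 can be related to the image size with simple arithmetic. The sketch below takes the 5 Megabyte cervical image as 5,000,000 bytes (an assumption about the unit) and derives the effective end-to-end throughput implied by each transmit time; the function names are illustrative.

```python
CERVICAL_BYTES = 5_000_000   # "5 Megabyte" cervical image, taken as 5e6 bytes


def effective_rate(transmit_seconds, size_bytes=CERVICAL_BYTES):
    """Effective end-to-end throughput in bytes per second."""
    return size_bytes / transmit_seconds


def transmit_time(size_bytes, rate_bytes_per_s):
    """Seconds needed to move size_bytes at the given throughput."""
    return size_bytes / rate_bytes_per_s


standard = effective_rate(300)    # single socket: about 16.7 kB/s
optimized = effective_rate(100)   # five sockets: 50 kB/s
# A 20:1 compressed image sent at the optimized rate:
compressed = transmit_time(CERVICAL_BYTES / 20, optimized)

print(round(standard), round(optimized), compressed)   # 16667 50000 5.0
```

Under these assumptions the 5-second estimate in the last row of the table is what a 20:1 reduction of the image size yields at the throughput implied by the five-socket row.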

Furthermore, retrieval time for images which may be predictably requested may be improved by staging the images to fast magnetic storage. Investigation of a disk-striping system as a front-end to the optical storage is planned. Also, display and image-processing times are expected to improve as the system is migrated from the existing hardware (Sun 4/260) to its target platform (Sun Sparc10).

Discussed below are two approaches currently being pursued to increase speed, viz., multiple sockets and image compression.

Multisocket approach

Internet transmission times will always be limited by the effective end-to-end speed, determined by the backbone speed available as well as the type of link to the institution. The performance of FTP, which operates at the application layer of the ISO Open Systems Interconnection (OSI) model, will be evaluated, as will an application-level protocol based on the Berkeley sockets mechanism. The latter protocol is being developed in-house and is customized for the xray image transmission application. The evaluation will compare access time, ease of use and level of security. Work is also underway to determine the improvements possible with a multisocket approach, i.e., dividing the image data into multiple segments and transmitting each segment over its own socket interface.

In this scheme, the image is conceptually divided into N segments, and each segment is transmitted across its own dedicated connection. A separate system process is created to transmit each image segment; likewise, on the receiving end, a separate system process is created to receive each segment. A master process which has access to each received segment handles the image reconstruction. Preliminary tests show a two- to three-fold increase in throughput when five socket connections are used. The increased throughput is possibly due to the overall reduction of delays between packet transmissions in the multiple socket approach. In a single socket transmission, all packets to be transmitted may be thought of as being lined up for transmission. The packet at the head of the line is transmitted when the network protocol determines that its turn has come, which is determined by the receipt of an acknowledgment of successful delivery of some previously-sent packet or packets; all of the lined-up packets must wait for transmission until the acknowledgment arrives. In contrast, in the multiple socket approach, one socket may be transmitting packets while another is waiting for acknowledgments of previously-sent packets. Investigation of this approach is under way.
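The segmentation and reassembly logic of the multisocket scheme can be sketched in a few lines. This is not the in-house protocol: for a self-contained illustration it uses threads rather than separate system processes, and connected local socket pairs (`socket.socketpair()`) rather than TCP connections across the Internet, so it demonstrates only the structure, not the throughput gain. All function names are hypothetical.

```python
import socket
import threading


def split_segments(data, n):
    """Divide data into n contiguous segments of nearly equal size."""
    size = -(-len(data) // n)            # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(n)]


def send_segment(sock, segment):
    """Sender side: push one segment down its own connection."""
    with sock:                           # closing signals end-of-segment
        sock.sendall(segment)


def recv_segment(sock, received, index):
    """Receiver side: collect one segment until the sender closes."""
    chunks = []
    with sock:
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    received[index] = b"".join(chunks)


def multisocket_transfer(data, n=5):
    """Send data over n connections and reassemble it in segment order."""
    segments = split_segments(data, n)
    received = [b""] * n
    workers = []
    for i, segment in enumerate(segments):
        tx, rx = socket.socketpair()     # stand-in for one TCP connection
        workers.append(threading.Thread(target=send_segment, args=(tx, segment)))
        workers.append(threading.Thread(target=recv_segment, args=(rx, received, i)))
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # The "master" step: concatenate the segments in their original order.
    return b"".join(received)


image = bytes(range(256)) * 1000
assert multisocket_transfer(image, n=5) == image
```

Because each segment carries its index implicitly through the connection it arrives on, the master step needs no per-packet bookkeeping; it simply concatenates the N buffers once every receiver has finished.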

Image Compression

To counteract the low speed of retrieving and displaying the images, both lossy and lossless compression techniques are being investigated by a consortium consisting of in-house researchers at NLM and researchers at Stanford University, the National Center for Research Resources at the National Institutes of Health (NIH/NCRR), the Canada-France-Hawaii Telescope Corporation (CFHTC), and Monash University (Australia).

Lossless techniques are being investigated at Monash University and CFHTC. At Monash, the technique used is context-sensitive DPCM. DPCM is used to produce a prediction error which is then encoded by arithmetic encoding. Results have been reported for 29 test images (14 cervical and 15 lumbar spine xrays)[3]: for cervical images, the compression ratio (CR) is 1.99 on the average, and for lumbar images, CR is 1.93.
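The idea behind DPCM can be illustrated with the simplest possible predictor, in which each pixel is predicted by its left neighbor. This is not the context-sensitive predictor used at Monash, and the arithmetic coder is omitted; the sketch only shows that the prediction errors are exactly invertible and have a much lower first-order entropy than the raw pixels when neighboring values are similar.

```python
import math
from collections import Counter


def dpcm_encode(row):
    """Prediction errors, using the previous pixel as the predictor."""
    errors = []
    prev = 0
    for p in row:
        errors.append(p - prev)
        prev = p
    return errors


def dpcm_decode(errors):
    """Rebuild the pixel row by accumulating the prediction errors."""
    row = []
    prev = 0
    for e in errors:
        prev += e
        row.append(prev)
    return row


def entropy_bits(symbols):
    """First-order entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


row = list(range(100, 200))          # a smooth intensity ramp
errors = dpcm_encode(row)
assert dpcm_decode(errors) == row    # lossless
print(entropy_bits(row) > entropy_bits(errors))   # True
```

An entropy coder such as the arithmetic coder mentioned above would then spend fewer bits per symbol on the error sequence than on the raw pixels.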

The CFHTC group has developed a two-step process, the first step being to partition image data into a low order partition (lop) and a high order partition (hop) based on the first order entropy of successive bitplanes. The lop is assumed to be mainly noise. Following partitioning, the hop is compressed (using any compression routine, though for the purpose here a modified Lempel-Ziv-Welch lossless technique was used), while the lop is stored in uncompressed form. For a test image set of 62 xray images, equally divided between cervical and lumbar, the CFHTC group reported CR of 2.03 for cervical, and 2.19 for the lumbar images[4].
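The two-step partitioning idea can be sketched as follows. Several details here are assumptions and not the CFHTC method: the rule for choosing the split (counting low-order bitplanes whose first-order entropy is at least a hypothetical threshold of 0.99 bits), the 16-bit packing of the hop, and the use of Python's `zlib` (DEFLATE) as a stand-in for the modified Lempel-Ziv-Welch coder.

```python
import math
import zlib


def bitplane_entropy(pixels, bit):
    """First-order entropy (bits) of one bitplane of the image."""
    ones = sum((p >> bit) & 1 for p in pixels)
    p1 = ones / len(pixels)
    if p1 in (0.0, 1.0):
        return 0.0
    return -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))


def choose_split(pixels, bits=12, threshold=0.99):
    """Count low-order bitplanes that look like noise (assumed rule)."""
    split = 0
    while split < bits and bitplane_entropy(pixels, split) >= threshold:
        split += 1
    return split


def partition(pixels, split):
    """Separate each pixel into low-order (lop) and high-order (hop) bits."""
    mask = (1 << split) - 1
    lop = [p & mask for p in pixels]
    hop = [p >> split for p in pixels]
    return lop, hop


def encode(pixels):
    split = choose_split(pixels)
    lop, hop = partition(pixels, split)
    packed = b"".join(h.to_bytes(2, "big") for h in hop)
    # zlib stands in for the LZW-type coder; the lop is left uncompressed.
    return split, lop, zlib.compress(packed)


def decode(split, lop, packed_hop):
    raw = zlib.decompress(packed_hop)
    hop = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    return [(h << split) | l for h, l in zip(hop, lop)]


pixels = [(i * 37) % 4096 for i in range(1000)]
assert decode(*encode(pixels)) == pixels   # the scheme is lossless
```

Whatever split is chosen, the round trip is exact, since the lop and hop together contain every bit of the original pixel.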

Since lossless compression yields poor CR, on the order of 2, the focus of the other members of the consortium has been on lossy compression. At NLM, the JPEG standard has been investigated. Based on visual observation as well as mathematical analyses, it was shown that the lower four bits in the images consist of noise, i.e., there is no discernible structure. Using this information, CR of 30 to 40 were achieved with no perceptible loss of information[5]. In support of this effort, a program called JPEG Evaluation Tool (JET) was developed at NLM[6]. JET is an Xview-based GUI that lets the user modify the DCT quantization matrices by point-and-click manipulation and view the results seconds later. JET also computes the mean square error and entropy.

Studies of other lossy techniques include the following: the Stanford group[7] has investigated vector quantization and its variants; and the NIH/NCRR group[8] is developing a technique combining pyramidal decomposition with wavelets and spline functions. These techniques will employ the same image test sets used already.


The outcome of this project will be the development and field evaluation of a prototype image database accessible via Internet, and workstation hardware and software to use the database.


1. Thoma GR, Long LR, Berman LE. "Access to a Digital Xray Archive over Internet." Proc. SPIE, Enabling Technologies for High-Bandwidth Applications. Vol. 1785, Sept 1992, pp. 79-86.

2. Gonzalez RC, Woods RE. Digital image processing. Addison-Wesley, New York: 1992; pp. 171-80.

3. Tischer PE, Worley RT, Maeder AJ, and Goodwin M. Context-based Lossless Image Compression. The Computer Journal. Vol. 36, No.1, 1993, pp. 68-77.

4. Veran J-L, Wright J. Private communications. 1993.

5. Berman LE, Long LR, Pillemer SR. "Effects of Quantization Table Manipulation on JPEG Compression of Cervical Radiographs." SID 93 Digest. Vol. XXIV, May 1993, Seattle WA, pp. 937-41.

6. Berman LE, Nouri B, Roy G, Neve L. "Interactive Selection of JPEG Quantization Tables for Digital Xray Image Compression." Proc. SPIE. Vol. 1913, Feb 1993, pp. 217-28.

7. Riskin EA, Gray RM. A greedy tree growing algorithm for the design of variable rate vector quantizers. IEEE Trans. Signal Proc. Vol. 39, No. 11, Nov 1991, pp. 2500-7.

8. Unser M, Aldroubi A, Eden M. Polynomial spline signal approximations: filter design and asymptotic equivalence with Shannon's sampling theorem. IEEE Trans. Info. Th. Vol. 38, No. 1, Jan 1992, pp. 95-103.

Communications Engineering Branch