Vol. 14 • Issue 6 • Page 44
image standardization and/or feature normalization is needed
Steve Rogers, PhD, Ken Bauer, PhD and Matt Kabrisky, PhD
Due to an increasing number of data acquisition devices, the emergence of telemedicine and increased acceptance of picture archiving and communication systems (PACS), electronic databases are becoming increasingly heterogeneous. This trend poses a significant challenge to computer-aided detection (CAD) for digital mammography, and requires an extensible CAD approach. Our answer is to develop a sensor-independent (“plug-n-play”) CAD system for digital mammography applications.
Although standards imposed by Digital Imaging and Communications in Medicine (DICOM) protocols will ensure critical sensor parameters are accessible, strict standards are unlikely to be imposed on raw image data. This is significant since CAD systems estimate statistical parameters of measurements from the imagery, and commonly employ features and/or thresholds that are functions of absolute or relative gray scales and/or morphological characteristics. Therefore, for a CAD system to interpret imagery from multiple digital mammography vendors, image standardization and/or feature normalization is needed.
We are developing techniques to support a vendor-independent CAD system and believe sensor independence is possible over the range of sensor parameters typically employed in mammography. As part of our digital mammography CAD development efforts, we measured CAD performance using three different digital mammography sensor systems. In all cases, the resulting CAD performance was consistent with our baseline Second Look™ film CAD system.
Figure 1 shows an overview of the CAD system used in this analysis. As shown, the system accepts data from a sensor, transforms data to a common image space and processes along two parallel paths: one path for density and one path for clustered micro-calcifications. Although not explicitly shown, several preprocessing steps are completed prior to the first detection stage, including re-sampling (density only), breast segmentation, and pectoral muscle segmentation. The initial breast segmentation eliminates the film background while the pectoral muscle segmentation further isolates pixels of interest (that is, breast pixels). Detections within the pectoral muscle area are possible if the region’s characteristics are significantly different from surrounding tissue. Each step, left to right, along the two paths uses features with increasing discriminatory power to isolate suspicious tissue (both mass densities and micro-calcifications) while reducing the number of false markings.
Every digital mammography system maps the X-ray attenuation of irradiated tissue to numerical values, where the pixel value is generally a non-linear function of the breast transmittance. Typically, higher attenuation results in greater pixel numerical values. The relationship between the tissue attenuation and numerical pixel values will vary from vendor to vendor under equivalent image acquisition conditions. This variability in response is generally a non-linear function of the device pre-amplifiers, analogue-to-digital (A/D) converter quantization function, bit resolution and system noise.
In addition to different amplitude responses, systems’ spatial sampling characteristics, as defined by the focal plane pixel size and spacing, will vary from vendor to vendor. This is significant because CAD systems estimate statistical parameters of measurements from the imagery and commonly employ features and/or thresholds that are functions of absolute or relative gray levels and/or morphological characteristics.
The following sections discuss a proposed methodology for image standardization and/or feature normalization, and provide results for three different direct digital image acquisition devices.
Figure 2 shows the image standardization process. The image is initially re-sampled to a common pixel spacing of approximately 45 µm. Following segmentation, the probability density function (pdf) of the breast pixels is estimated. To ensure consistency in amplitude, the breast pixels are normalized and re-quantized to match the desired pdf. The resulting pdfs are aligned and the required mapping is derived based on the current input image.
Sampling Transforms
Many of the features used in the CAD algorithms are based on morphological or shape features and as a result are sensitive to sampling. In general, two sampling effects exist. The first and more problematic is the detector size, since it, along with other factors, determines the sensor modulation transfer function (MTF). In general, the effects of the MTF are difficult to recover, since compensation requires a de-convolution of the system’s spatial frequency response. This in turn requires measurements using well-defined sources that may not be available in a clinical setting. Therefore, although accurate MTF compensation would certainly result in improved performance, it may not be practical in all cases and, as such, is not considered here. We prefer to select features that are as robust as possible to system noise and the system MTF.
The second sampling effect is pixel spacing. Pixel spacing can be compensated for via re-sampling using appropriate interpolation or decimation techniques. In this case, the objective is to ensure physical measurements on objects of interest are consistent across different sensors. Although up-sampling does not recover lost information, it does provide consistent measurements on extended objects such as masses and distributed micro-calcifications. Furthermore, this approach should yield results consistent with an approach that retains physical information on the feature distributions and rescales “on the fly” based on sensor sampling characteristics.
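The pixel-spacing compensation described above can be illustrated with a minimal sketch. It assumes bilinear interpolation and a plain list-of-rows image; the function name `resample_bilinear` and the spacing values in the usage note are hypothetical, and the article does not state which interpolation or decimation technique was actually used.

```python
def resample_bilinear(image, src_spacing_um, dst_spacing_um):
    """Resample a 2-D image (list of rows) from one pixel spacing to another.

    Illustrative only: bilinear interpolation is an assumed choice. When
    decimating (dst spacing larger than src), a low-pass filter should be
    applied first to limit aliasing; that step is omitted here.
    """
    rows, cols = len(image), len(image[0])
    scale = src_spacing_um / dst_spacing_um  # > 1 means up-sampling
    out_rows = max(1, round(rows * scale))
    out_cols = max(1, round(cols * scale))
    out = []
    for r in range(out_rows):
        # Map the output pixel centre back into source coordinates, clamped
        # to the image so edge pixels are replicated rather than extrapolated.
        y = min(max((r + 0.5) / scale - 0.5, 0.0), rows - 1)
        y0 = int(y)
        y1 = min(y0 + 1, rows - 1)
        fy = y - y0
        row = []
        for c in range(out_cols):
            x = min(max((c + 0.5) / scale - 0.5, 0.0), cols - 1)
            x0 = int(x)
            x1 = min(x0 + 1, cols - 1)
            fx = x - x0
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

For example, calling `resample_bilinear(img, 100.0, 50.0)` halves the pixel spacing, doubling each image dimension and so quadrupling the pixel count. An object's size measured in physical units stays the same, which is the consistency the text asks for.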
Pixel Amplitude Transforms
The amplitude response curve provides a mapping from optical density (OD), in the case of film digitizers, or breast transmittance, in the case of direct digital devices, to a numerical pixel value. Consider, for example, the characteristic curves of a Howtek and an RDI film digitizer shown in Figure 3. The mapping from OD to pixel value is approximately linear in OD for the Howtek device, plotted with ‘+’ markers in the upper plot, whereas the characteristic curve of the RDI digitizer, plotted with ‘o’ markers, is approximately linear in transmittance. Given knowledge of these mappings, it is possible to transform pixel values to a desired image space. For example, given knowledge of the characteristic curve, pixel values can be mapped to the film’s optical density, which is one example of a digitizer-invariant amplitude space [1].
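As a rough sketch of mapping pixel values into an optical-density space, the example below interpolates linearly between step-wedge calibration points. The `CALIBRATION` pairs are invented for illustration and do not describe the Howtek or RDI devices. The function names are hypothetical, and piecewise-linear interpolation is an assumption, not the authors' stated method.

```python
from bisect import bisect_right

# Hypothetical step-wedge measurements: (pixel value, optical density),
# sorted by pixel value. Real curves come from scanning a calibrated target.
CALIBRATION = [(0, 0.0), (1000, 1.0), (3000, 2.0), (4095, 3.5)]


def pixel_to_od(pixel, calibration=CALIBRATION):
    """Map a pixel value to optical density by piecewise-linear interpolation."""
    pixels = [p for p, _ in calibration]
    ods = [d for _, d in calibration]
    # Clamp values outside the calibrated range to the end points.
    if pixel <= pixels[0]:
        return ods[0]
    if pixel >= pixels[-1]:
        return ods[-1]
    i = bisect_right(pixels, pixel)  # first calibration point above `pixel`
    p0, p1 = pixels[i - 1], pixels[i]
    d0, d1 = ods[i - 1], ods[i]
    return d0 + (d1 - d0) * (pixel - p0) / (p1 - p0)


def od_to_transmittance(od):
    """Standard relation between optical density and transmittance: T = 10^(-OD)."""
    return 10.0 ** (-od)
```

Once two digitizers' pixel values are both expressed as OD (or as transmittance), downstream features no longer depend on which device produced the image, provided each device's curve is known and stable.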
An alternative to using response curves based on step wedge calibration targets is to specify a desired global pdf and map incoming pixel data to that pdf, a standard technique in image processing [2]. This approach has the advantages of not requiring an amplitude response function a priori and of self-adapting to time-varying sensor responses.
The process for this particular application is shown in Figure 4. A critical first step is to segment the breast pixels to allow proper normalization across varying pixel bit depths. Therefore, the breast segmentation algorithm must be robust to the system response functions. As part of ongoing efforts to develop a CAD approach that is vendor-independent, we have developed breast segmentation algorithms that provide robust segmentation for data with bit depths of 12 or more. Following segmentation, a histogram of the breast pixels is computed. A target histogram is also provided based on the ensemble pixel statistics of a native database used for algorithm development.
In this case, the target histogram is calculated based on breast pixels from training imagery. The mapping G⁻¹(p) is equivalent to a look-up table specifying how to reassign input pixel values such that the histogram of the input image is most like the target histogram. Note that each image input to the method will use a unique mapping, since the transformation is a function of the pixels within that input image.
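The look-up table described above can be sketched as follows. This is a minimal version of the textbook direct histogram specification procedure, which matches the cumulative distribution of the input breast pixels to that of the target histogram. The function names are hypothetical, and the details (tie handling, the way bit depths are normalized) are assumptions that may differ from the authors' implementation.

```python
def cumulative(hist):
    """Return the normalized cumulative distribution of a histogram."""
    total = float(sum(hist))  # assumes at least one pixel was counted
    cdf, running = [], 0.0
    for count in hist:
        running += count
        cdf.append(running / total)
    return cdf


def build_lookup_table(input_hist, target_hist):
    """For each input level p, pick the smallest target level q whose
    target CDF value reaches the input CDF value at p.

    The two histograms may have different lengths, so input and target
    bit depths need not match.
    """
    in_cdf = cumulative(input_hist)
    tgt_cdf = cumulative(target_hist)
    lut, q = [], 0
    for p_cdf in in_cdf:
        # Both CDFs are non-decreasing, so q only ever moves forward.
        while q < len(tgt_cdf) - 1 and tgt_cdf[q] < p_cdf - 1e-12:
            q += 1
        lut.append(q)
    return lut


def standardize(breast_pixels, target_hist, input_levels):
    """Histogram the segmented breast pixels, derive the per-image
    look-up table, and remap the pixels through it."""
    input_hist = [0] * input_levels
    for value in breast_pixels:
        input_hist[value] += 1
    lut = build_lookup_table(input_hist, target_hist)
    return [lut[value] for value in breast_pixels], lut
```

Because the table is built from the input image's own histogram, two images from the same sensor generally receive different tables, which is the per-image behavior noted in the text. Only segmented breast pixels should be passed in; including background pixels would skew the input distribution.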
Experiments were performed to test the feasibility of using the proposed image standardization methods as mechanisms for achieving sensor independence. The baseline CAD algorithms were not modified; rather, imagery from the following sensors was transformed using the techniques discussed previously:
• Fischer digital mammography system
• Trex digital mammography system
• GE digital mammography system
Digital Sensor Normalization
To test the usefulness of the re-sampling and direct histogram specification image standardization methods, a series of experiments was carried out. Usefulness is judged by evaluating CAD performance on standardized imagery. The Second Look™ film CAD system is used to test the standardization methods. Since this CAD system was developed with data acquired with film digitizers, the approach is to convert the digital imagery to the training database distribution.
Sample Intermediate Results
Initial efforts to use the data “as is,” with the exception of re-sampling, proved unsuccessful due to the significant variation in the sensors’ amplitude response functions.
Figure 5 shows an example of the impact of the process on a GE image. The image on the left is a raw GE image that has been multiplied by a segmentation mask, a mask consisting of ones in the breast areas and zeros elsewhere. The image on the right is the image after applying the above transformation process. Note that the image has been up-sampled, resulting in a 4:1 increase in the number of pixels. The additional internal detail is also apparent.
Figure 6 shows typical results before and after application of the transformation. Although detections were achieved at consistent locations, the resulting regions grown are dramatically different.
In the first case, the regions are significantly under-grown, whereas in the latter case, accurate boundary estimates are obtained. Furthermore, after applying the transformations, a correct true positive was labeled, as shown on the right. Prior to the transformation, the region was rejected as a false positive due to the poor boundary estimates.
CAD performance on transformed input imagery is shown in Tables 1 and 2. Table 1 shows system performance after the screening classifiers on the digital mammography datasets. The input data are re-sampled and transformed using the direct histogram specification method. Micro-calcification lesion sensitivity is 86 percent and density sensitivity is 89 percent, at averages of 3.5 CalcMarks™ and 6.6 MassMarks™ per image, respectively.
Table 2 shows system performance after feature classifiers on the digital mammography data. Per-image micro-calcification sensitivity across all vendors is 86 percent (18 of 21) with an average of 0.3 false indications per image. Per-image density sensitivity across all vendors is 74 percent (14 of 19) with an average of 0.6 false indications per image.
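The per-image sensitivities quoted from Table 2 follow directly from the detection counts given in the text. The short check below only reproduces that arithmetic; the function name is hypothetical.

```python
def sensitivity_percent(detected, total):
    """Sensitivity as a percentage: detected cases over all true cases."""
    return 100.0 * detected / total


calc_sensitivity = sensitivity_percent(18, 21)     # micro-calcifications
density_sensitivity = sensitivity_percent(14, 19)  # densities

# Rounded to whole percentages, as reported in the text.
print(round(calc_sensitivity), round(density_sensitivity))
```

With only 21 and 19 cases respectively, a single additional detection or miss shifts each figure by roughly five percentage points, so these values are best read as feasibility indicators.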
Image standardization and/or feature normalization is needed for a CAD system to achieve acceptable performance with multiple digital mammography systems. As part of this effort we have demonstrated the feasibility of developing a common CAD system for three digital sensors. Future efforts will refine this approach to include feature distribution normalization and parameter selection in accordance with sensor parameters.
1. Highnam R, Brady M. Mammographic Image Analysis. New York: Kluwer Academic Publishers; 1999.
2. Gonzalez R, Wintz P. Digital Image Processing. New York: Addison-Wesley; 1987.
The authors are with Qualia Computing Inc., Beavercreek, Ohio.