| Skip navigation |
||||||||||||||||||||||||||||||
| |
||||||||||||||||||||||||||||||
|
|
Identifying image structures for content-based retrieval of digitized spine x-raysL. Rodney Longa,
Daniel M. Krainakb,
George R. Thomaa ABSTRACT We present ongoing work for the computer-assisted indexing of biomedical images at the Lister Hill National Center for Biomedical Communications, a research and development division of the National Library of Medicine (NLM). For any class of biomedical images, a problem confronting the researcher in image indexing is developing robust algorithms for localizing and identifying anatomy relevant for that image class and relevant to the indexing goals. This problem is particularly acute in the case of digitized spine x-rays, due to the projective nature of the data, which results in overlapping boundaries with possibly ambiguous interpretations; the highly irregular shapes of the vertebral bodies, sometimes additionally distorted by pathology; and possible occlusions of the vertebral anatomy due to subject positioning. We present algorithms that we have developed for the localization and identification of vertebral structure and show how these algorithms fit into the family of algorithms that we continue to develop for our general indexing problem. We also review the indexing goals for this particular collection of digitized spine x-rays and discuss the use of the indexed images in a content-based image retrieval system. Keywords: digitized x-rays, content-based image retrieval, spine, image segmentation 1. OVERVIEW 1.1 Indexing digitized spine x-rays Our goal is the indexing of a collection of 17,000 digitized lateral cervical and lumbar spine x-rays collected by the National Center for Health Statistics (NCHS) as part of the second National Health and Nutrition Examination Survey (NHANES II), and the creation of an advanced query capability for accessing the indexed images that supports query by image example. The key biomedical features of interest for indexing have been determined by two workshops conducted at the National Institutes of Health, and include anterior osteophytes, disc space narrowing, and subluxation. For economic reasons alone, computer-assisted indexing is highly desirable. A hierarchical approach to such indexing that we have described elsewhere consists of high-level region classification, spine region localization, vertebra localization and identification, vertebral segmentation, and classification of the vertebrae by presence/absence of the biomedical features above. In this paper we describe approaches to anatomy identification in the images by means of automated and by means of interactive, computer-assisted methods. We compare the approach and results of two new algorithms with previous work1. Some of the same algorithms used for indexing the images may also be used to support query-by-example, since the example image requires the extraction of its relevant index features in order to search the database for images with similar features. 1.2 Image indexing in a content-based image retrieval (CBIR) system The use of effective indexing algorithms is twofold: (1) the algorithms may be directly used to derive indexing information from the set of images in the database; this information then becomes part of the database, and becomes itself part of the data that may be queried; and (2) the algorithms may be used to derive indexing information "on the fly" from images that are input to the system as examples used to define the current query. An example of a CBIR system for a small database of NHANES II x-ray images has been created5 as an initial test of data retrieval by any combination of health survey text data, vertebral dimensional data, and vertebral shape. This system is a small prototype suggesting the interface characteristics and functionality of a CBIR system for digitized x-rays. In this prototype, a MATLAB graphical user interface aids in the query for different aspects of vertebral data. Health survey data may be queried in combination with features such as anterior/posterior height ratio, anterior height, posterior height, and disc space. Query-by-example and query-by-sketch is supported. The query results are ranked by similarity to the example image or sketch. Query result images and other user-selected images may be displayed side-by-side for easy visual comparison. The system allows a sketch of the vertebra or an example vertebra including up to nine critical points: the six common morphometric points, the anterior midpoint, and the locations of anterior osteophytes. Shape similarity is based upon a least squares best fit model independent of rotation, translation, and scaling. Future work will include making the system independent of the nine critical points marked by radiologists and the inclusion of more complex shape matching algorithms. 2. ANALYSIS By automated image content analysis, we refer to computer processing that outputs image content description with zero user intervention; when the user is allowed to input information to guide the analysis of image content, we refer to the processing as semi-automated, computer-assisted, or interactive. 2.1 Automated image content analysis The type of image content of interest in this paper is the high-level semantic content corresponding to the mapping of specific image regions to descriptors used in the domain of biomedicine to label components of human skeletal anatomy. A sucessful automated analyzer would produce results that could be considered comparable to a human viewing the image and saying, "That [image object] is the C1 vertebra", "That [image object] is the spinous process on the back of the C3 vertebra." We take the point of view that labeling may meaningfully precede segmentation in the images, i.e., that we may meaningfully label parts of the image when the regions containing these parts are only localized by coarse boundaries. The uses of such labeling of regions include (1) the localization of regions of interest for further processing to refine object boundaries; and (2) the marking of significant regions to direct the user's attention in image display systems, such as medical education systems to display labeled anatomy. 2.2 Automated approach to anatomy localization As remarked above, the process of anatomy localization for identification may be considered as distinct from, or only a rudimentary type of, segmentation: for our purposes, localization of a vertebral structure S means the computation of a simple closed polygon P that contains the structure, and identification means the association of a label with the polygon that correctly corresponds to the contained structure. Localization corresponds to the understanding of the image objects at a coarser level than segmentation, where object boundaries are determined in detail. Such localization/identification involves overcoming a number of difficulties, including detection of whether the computed P in fact contains the target structure S, contains S partially, contains other target structures partially or completely, and the problem of distinguishing, for the purpose of the correct assignment of anatomical labels, very similar and proximate structures. Our approach consists of (1) an initialization step to identify a beginning search curve fixed to the image anatomy at the back of the skull and extending to the top of the spine, and (2) curve analysis to locate and identify the first spine structure, the spinous process on the C1 vertebra, and (3) sequential localization and identification of the other vertebral structures of interest, including the vertebral faces. Example output from the algorithm is shown in Figure 1 for four test cases. For these tests, the bounding polygon P was taken as a rectangle. Figure 2 is an example of a curve traced along the skull for one image, and illustrates the technique used.
2.3 Interactive image analysis In an interactive image content analysis system, if the required user interaction is frequent, complex, and/or requires expert skills, the value of the computer assistance is diminished. The principle that we have followed is to attempt to mimimize the frequency and the complexity of the user assistance required, so that the algorithm may be successfully used with the only required human intervention being support at the level of a trained technician. 2.4 Interactive approach to anatomy localization This algorithm has the goal of identifying the spinous processes of the vertebra and vertebral faces with the aid of user assistance. The specific instruction that we have for the user is, simply, "Draw a line from the skull area to the shoulder area that passes through the spinous processes." We believe that this instruction with a small number of examples of line drawing that show a line passing through the main part of the bodies of the spinous processes and including the dark gaps between the adjacent processes, will suffice to capture the required data. 2.4.1 Summary of algorithm to locate the spinous processes The algorithm to locate and label the spinous processes is (1) compute the image grayscale profile along the user-defined line; The algorithm input and output are illustrated in Figures 3 and 4. In Figure 3, the image is shown with the user-drawn line. This is the total user interaction with the image content analysis; after getting the input of the user line, the algorithm is completely automatic. In Figure 4, the algorithm output is shown: spinous processes 1-6 have been labeled, along with vertebrae C4-C6.
2.4.2 Details of the algorithm: identification of the inter-spinous process points In this section we provide step-by-step details of the algorithm, beginning at the point that the user line has been input. The algorithm is explained with reference to the sample image shown in Figure 3. (1) The image grayscale profile along the user line is computed. For the image and user line in Figure 3, this profile consists of N grayscale values, where N = 1000. From this profile of raw grayscale values, two smoothed profiles are computed; for each smoothed profile, the ith profile value consists of the mean grayscale value in a σ - neighborhood of i (i.e. 2σ + 1 grayscale values centered at i, where σ is a positive integer). Profile 1 is smoothed with a small σ in order to smooth out the effects of grayscale amplitude variations that occur over very small inter-pixel distances, due to noise, or very small variations in image structures that occur over this small scale, but which are irrelevant to the features that we want to detect, namely, the local minima due to the separation of the spinous processes; profile 2 is smoothed with a large σ in order to smooth out the all of the extrema (local minima and maxima) due to the spinous process themselves: hence profile 2 is intended to capture the large scale trend of the grayscale along the user line, and preserves only the average behavior of the grayscale at this large scale. Figure 5 shows profiles 1 and 2 as they were computed for the user line from Figure 3. (2) The local minima are computed along profile 1. Inspection of the profile 1 curve in Figure 5 shows a number of local minima; some appear in the skull region, some are on the bodies of the spinous processes (see the local minimum on spinous process 5, for example; this is marked with a ** in Figure 5); and some are in the spaces between the spinous processes. Since we are attempting to detect the spaces between the spinous processes, it is this last category that is of interest to us. (3) We use the a priori assumption that the local minima corresponding to the spaces between the spinous processes will lie below the values of the locally-averaged grayscale (profile 2), while the local minima which correspond to grayscale "dips" that occur within the spinous process bodies will lie above this locally-averaged grayscale. Again, this separation is illustrated by spinous process 5 in Figure 5: profile 2 is seen to separate the local minimum within the body of spinous process 5, from the local minima that border spinous process 5. The local minima that remain as candidates for inter-spinous process points are those that lie below profile 2. These are marked with circles in Figure 5. ("Below" profile 2 means having a grayscale value less than profile 2 - Δ1, where Δ1 is a parameter of the algorithm). Two technical difficulties remain: (a) this set of local minima is may include local minima that are not between the spinous processes at all, but are actually in the skull region. (The skull region in these images may have significant grayscale variability, while the shoulder region is homogeneous to a high degree, and we do not have the same phenomenon of spurious local minima occurring there.) (b) also, the local minima that do correspond to inter-spinous process points are not unique: there may be more than one local minimum detected between a given pair of adjacent spinous processes. Both of these situations are illustrated in Figure 5.
(4) To deal with problem (a), a combination gradient and grayscale threshold test using a dynamically-computed grayscale threshold is employed to eliminate local minima in the skull region. First the location of the skull region is estimated by a two-step process. A mean grayscale value on the user line for the skull region is estimated by sampling grayscale on the user line from the region that with high probability corresponds to skull. Relying on the assumption that the user has drawn the line from skull to shoulder, the maximum grayscale value in the first half of the user line is found, and points in a small neighborhood of this maximum are assumed to lie in the skull region. The mean value of the grayscale for these points is the estimate SRG for skull region grayscale. Then, the most negative gradient on the smoothed profile 1 line is found: this should denote a strong edge pixel on profile 1. We search from the point in the direction of the beginning of profile 1 for the first zero gradient point. If this point passes a reasonability test for lying in the skull region, we take this point as the skull boundary. The reasonability test that we use is that the zero gradient pixel should have a grayscale value greater than SRG - Δ2, where Δ2 is a program parameter; if the reasonability test is not passes we iteratively search profile 1 for the next largest gradient, search from that point to the beginning of profile 1 for the next zero gradient point, and test that point for grayscale reasonability. Once the skull boundary point has been determined, we discard any local minima that lie on the "skull side" of this point, as not belonging to an inter-spinous process region. To deal with problem (b) a second test is employed to eliminate multiple local minima between spinous processes, based on a priori assumptions about reasonable bounds on the distances between collinear points lying in distinct inter-spinous process spaces. If the Euclidean distance between two local minima is less than Δ3, also a program parameters of the local minima is eliminated; the local minimum kept is the one for which the quantity that lies the furthest below profile 2. Figure 6 shows the image with the user line and the final set of local minima marked with circles. These are the local minima determined by the algorithm to correspond the inter-spinous process points. Note that spinous process 7 is visible in the image in Figure 6; in Figure 5 the part of profiles 1 and 2 that correspond to this spinous process has been manually marked, but Figures 5 shows that only one border of this spinous process has been detected; the local minimum lying closest to the shoulder is too small to be detected by the algorithm. Hence, spinous process 7 is undetectable by the algorithm for this image.
2.4.3 Details of the algorithm: identification of the vertebral faces Following the identification of the spinous processes, the vertebral faces are identified. The approach that we took was to use the user line to orient a set of grids that we placed across the spine region. Then, for each of the local minima found above to lie between a pair of spinous processes, to search the grid for the paths of minimum grayscale, with the expectation that for the lower cervical spine vertebrae, which typically have visible background in the interveterbral spaces, we would be able to place a path that runs through this interveterbral space. These paths were expected to mark the separation between vertebrae and, since each path originates in an inter-spinous process space where we have labeled the spinous processes, we would be able to use the path information to label the particular vertebra. Figure 7 illustrates the grid concepts for the single grid associate with one inter-spinous process point. Figure 8 shows the grids for each of the inter-spinous process points for the example image from Figure 3. The parameters of the grids (width, height, and density) were set by informal methods, and are the same for all grids. The grid width ("across the spine") parameter was set based on sampling of typical spine widths in the images; similarly, grid height was based on inspecting image samples to determine heights that appeared likely to enclose the target area-the intervertebral space, when the grid is positioned with one edge along the user-defined line, with the center of that edge on an inter-spinous process point. Grid density was selected by running the algorithm over a range of densities and observing the resulting paths. The density used for the illustration in Figure 8 is much coarser than the densities actually used, and is shown here for illustration only. The grid width is set to an a priori value and then dynamically updated with information from the image: the a priori value was obtained by inspecting a small sample of images and manually estimating the distance across the spine, from the mid-spinous process region to the front of the spine. Once the grid is formed, this distance is refined, as follows: If the grid is of size p x m, where the grid rows run parallel to the user-defined line, and the grid columns run normal to the user-defined line, then, for each grid k, and for each row i in that grid, we calculate the sum of gradient values
over the m columns of that row. We then sum Ski over all of the grids to get
Finally, we calculate row i* as the row for which Si is maximized. The gradients are approximated as simple grayscale differences in the direction normal to the rows and computed so that a dark-to-light transition when moving in the direction of the arrow shown in Figure 9 results in a gradient with positive sign, i.e.
where gij is the grid's grayscale value at coordinates (i,j). We expect that the sum of the gradients calculated in this manner will be maximized for the grid line that lies closest to the front edge of the spine, where the dark-to-light grayscale transition is visually the strongest. Figure 9 shows the results of calculating this front edge line for the example image. Knowing this line provides an updated estimate of the search grid width that is based on actually data from the particular image being analyzed, and provides a linear estimate of the location of the front of the spine. For each local minimum on the user line that we have identified as corresponding to an inter-spinous process point, we then calculate a path across the rows of the grid, beginning at the grid row (on which the local minima lie) and ending on row ƒ (the row that we have identified as nearest the front edge of the spine). The calculation of a path is posed as an optimization problem where the objective function is
where the path is specified as an ordered set of coordinates,
P is constrained to begin on the first grid row, at one of the local minima points (inter-spinous process points), i.e. (i1,j1) = (1,jq), where jq is the grid column corresponding to the qth local minimum, and to end on the front edge grid row, i.e. iƒ = ƒ. The goal of the optimization is to minimize O(P)over all possible paths on the grid. The first term on the right in the expression for O(P) is just the sum of grayscale values along the path P; gij is the grayscale value at grid point (i,j) and is the cost associated with that grid point. The term is δij,kl the transition cost associated with going from grid point (i,j) to grid point (k.l). We define this to be the Euclidean distance between the two points, i.e.
The term &lambsa; is a program parameter that we set to weight the relative importance of the two terms. The intent of the formulation is to find the path with the smallest cumulative grayscale while maintaining some degree of smoothness by weighting the second term to restrict the total path length. Figure 7 illustrates the concept of the optimal path determined by minimizing O(P). The optimization problem over this grid is suited for a dynamic programming solution, and that is what we have implemented. Figure 10 shows solution curves computed for the lower inter-spinous process points (points 4-7) of the example image. The final step in the algorithm is the localization of the vertebral faces. To achieve this, the posterior edge boundaries of the vertebrae are estimating by finding the grid row of maximum grayscale value, i.e. the brightest grayscale line. The procedure followed is essentially the same as for finding the grid row corresponding to the front edge of the spine, except that grayscale value, rather than gradient value is maximized. Figure 11 shows the bright edge line (maximum grayscale line) that was computed for the example image. Then, each pair of optimal paths determined to run through the interveteral spaces, the intersection of these paths with the front edge and bright edge lines are calculated; this yields four points that determine "corners" of the coarse vertebral segmentation. The diagonals determined by these corners are calculated, and the interesection of the diagonals is the point at which the vertebral label is placed. Figure 11 illustrates the diagonals and the placement of the intersection point.
3. IMPLEMENTATION AND RESULTS We have implemented the localization/identification algorithms using MATLAB 5.3 on a Windows Pentium-class PC. 3.1 Automatic algorithm The algorithm has currently been tested through steps (1) and (2) on 136 images of the cervical spine; for these images, P was taken as a rectangle. The results were evaluated by displaying the images with the region P as marked by the algorithm and answering the question, "Does P contain most of the body of spinous process 1?" With this procedure, 88% of these cases P were evaluated as containing the target shape, the spinous process on the C1 vertebra. A number of the failure cases were determined to be due to unexpected local minima in the search curve determined by the skull boundary, which in turn were due to small concavities in the skull shape. A difficulty with the current algorithm is that it does not incorporate any self-evaluation capability to estimate its success, and it does not estimate the target orientation within P, information that may be critical in used the results to move to other anatomy within the image. We are now researching smoothing methods to remove these irregularities, and are extending the algorithm to step (3), the full localization and identification of structures in the cervical spine. In this paper we provide complete results on the performance of the algorithm on all three steps. This sequential localization and identification of structures is based on location of previously-determined structures, and a prior statistical data on the relative geometries of the vertebral structures. We have vertebral geometric data recorded under supervision of a board-certified radiologist for computation of the a priori statistics. 3.2 Interactive algorithm For the initial evaluation that we have done, we have approached the evaluation of the interactive algorithm's performance by focusing on the three "trainable" variables below: Δ1 = factor to discriminate minima that lie below the local curve average ("true" inter-spinous points) from minima that lie above the local curve average (points that lie within the body of a spinous process) Δ2 = factor to determine whether a point meets grayscale reasonability criteria for lying in the skull region Δ3 = factor that sets the minimum allowed distance between two neighboring inter-spinous process points The interactive algorithm was "trained" on these three parameters using a set of 20 images by repeatedly processing the 20 images while varying the parameters independently. In the first test run, a user line was manually drawn on the images; in subsequent runs, the same user line was used. The runs were evaluated by displaying the labelled images and visually judging whether the placement of the spinous process and vertrebra locations were within the visual boundaries of those respective objects, and whether the labeling was correct. Only labeling of C4 -C7 was attempted. The results of these training were as follows: 13 (65%) of the images were processed correctly (all expected vertebrae and spinous processes were found and located correctly; 5 (25%) were processed partially correctly (two were correct, except for spinous process 6 being missed; 3 were correct, except for slight misplacements of the vertebrae locations); and 2 (10%) were failures (one failed due to incorrect identification of the skull edge; the other due to mis-identifying a local minimum that lay within the body of a spinous process as an inter-spinous process point). The cases where spinous process 6 were missed were all cases where the shoulder-side boundary of this structure is only weakly visible in the image, resulting in a very small local minimum on the profile 1 curve; the detection of this miminmum is below the sensitivity of the current implementation. The cases where there were slight misplacements of the vertebrae locations are likely due to the characteristics of the grid model used to constrain the search for the coarse vertebral boundaries: we use a simple rectangular grid that does not model the geometry of the user line/spinous process/vertebral body configuration. The two failure cases require a refinement in the skull edge detection procedure and in the process of discriminating inter-spinous process points from local minima that may occur with the bodies of large spinous processes. The major step that we anticipate in improvement of the algorithm is in incorporating a grid model that reflects where the intervertebral spaces lie with respect to the inter-spinous process spaces. If we may then couple the user line with this anatomy-based grid, we expect to locate vertebrae more accurately, and to test the locatability of higher vertebrae (C1-C3) with this method. We have done only informal testing on aspect of the algorithm: the variability of placement of the user line, but in the tests run, the range of variability tolerated on the input line that produces visually invariant results in structure location and labeling are good. The main requirements are that the user line begin in the skull, end in the shoulder, and intersect the spinous processes in such a way that the inter-spinous process spaces are intersected. In all except one image, the lines could be drawn causally, subject to these constraints, with no geometric difficulties; in the one exception, the spine presentation had a curvature that required more careful line placement in order to meet the constraints. Rigorous testing with lines generated at random lengths and orientations, subject to these constraints, should expose any unforeseen difficulties due to user line placement. The interactive algorithm is currently in a training and development phase which we will follow with testing on a set of images on which it has not been trained, to assess its expected performance.
4. CONCLUSION We conclude by placing the work in developing these algorithms in the context of our total research and development work in indexing and retrieving the x-ray images. These algorithms, in combination with algorithms to accurately segment the spine vertebrae and to classify those vertebrae according to specific biomedical features of interest, form the basis for a projected future system that, with the proper biomedical validation, is intended to provide a method for the computer-assisted indexing of the entire 17,000 image collection. To date, our own efforts and those of our collaborators have resulted in (1) the establishment of an on-line, publicly accessible 17,000 image collection of digitized x-rays, (2) the creation of an ongoing research effort into the use of Active Shape Modeling for the vertebral segmentation of these images,2,3 (3) the creation of an ongoing research effort into the classification of the vertebrae by biomedical feature, based on shape characteristics,4 (4) our continuing development of high-level algorithms for image analysis, and anatomical localization and identification, and (5) the creation of a prototype system for the retrieval of the images by vertebral shape, based on query-by-example and query-by-sketch.5 REFERENCES
| |||||||||||||||||||||||||||||
CEB Home | CEB Projects | Related Work | Publications |
Repositories | NHANES | Site Index
URL: http://archive.nlm.nih.gov/pubs/long/spie-sd2002/spie-sd2002.php
|
||||||||||||||||||||||||||||||