D. You, S. Antani, D. Demner-Fushman, G. Thoma
U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, United States
Biomedical images are invaluable in establishing a diagnosis, and they can also significantly improve clinical decision support (CDS) and informatics applications. However, the images needed for CDS appear in biomedical articles and are often not indexed. Authors frequently overlay text labels and pointers on figures and illustrations to highlight regions of interest (ROIs), and these annotations are usually referenced in the caption or in figure mentions within the article text. Identifying the annotations helps locate the image regions that are most likely to be relevant to the discussion. These regions can then be annotated with extracted visual features and with biomedical concepts extracted from the text. The combined text and image features can be indexed for improved hybrid information retrieval to aid CDS and other applications. We have developed a new image annotation detection algorithm based on Markov random fields (MRF) that shows robust recognition performance for various types of pointers used in biomedical images. Results from a pilot image retrieval experiment will be presented.