1. Technical Field
The present invention relates to a method, a system and a computer program product for the classification of breast density from mammographic imagery. Specifically, the invention relates to an automated content-based image retrieval (CBIR) method, system and computer program product for the classification of breast density in mammogram images.
2. Description of the Related Art
The “background” description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description which may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present invention.
In a study covering population data from 1975 to 1988, the US National Cancer Institute (NCI) estimates that the overall lifetime risk of developing invasive breast cancer is approximately one in eight (approximately 12.6 percent) among American women (U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999-2008 Incidence and Mortality Web-based Report. Atlanta: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute; 2012. Available at: www.cdc.gov/uscs; incorporated herein by reference in its entirety). To increase the survival time of women with breast cancer, mass-screening mammography programs have been developed and adopted as an effective method. The integration of Computer-Aided Detection (CAD) tools with these screening programs is an avenue worth exploring. Recent advances in CAD techniques and systems have focused on the detection of calcifications and the detection of mammographic masses. Although various degrees of success have been achieved in these detection problems, the accurate identification of breast cancer from digital mammogram images remains a challenging task. The mammographic appearance of the breast varies widely, which poses a real challenge for the radiologist exploring and/or interpreting a benign mammogram.
There exist various types of radiographically-visible density including: 1) Ducts; 2) Lobular elements; and 3) Fibrous connective tissue. The fibrous connective tissue is further classified into: 1) Intralobular tissue; and 2) Extralobular tissue. The high variability in breast density reported from mammograms is mainly due to the extralobular tissue.
The interpretation of mammogram images depends heavily on the breast density. In fact, breast density affects the early detection of malignancy and of large cancers, especially in cases of considerable density. In such cases, the mammogram background is not uniform and, therefore, it is very difficult to locate ill-defined cancers. The American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) adopts a standard breast density classification system in which breast density is classified into four major categories (U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999-2008 Incidence and Mortality Web-based Report. Atlanta: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute; 2012. Available at: www.cdc.gov/uscs; incorporated herein by reference in its entirety): 1) Extremely dense; 2) Heterogeneously dense; 3) Fat with some fibroglandular tissue; and 4) Predominantly fat.
FIG. 1 illustrates examples of the above-mentioned breast densities.
It is widely accepted that dense tissue indicates a much higher risk of developing breast cancer than fatty tissue (D. Kopans, Breast imaging, 3rd Edition, Lippincott-Raven, Philadelphia, 2006; incorporated herein by reference in its entirety). In addition, the presence of breast cancer is often masked in a mammogram of dense tissue, which increases the likelihood of missing the cancer. Women with dense breasts therefore face a double challenge: a higher risk of the disease and a higher risk of the cancer being missed by mammography. However, a recent study published in the Journal of the National Cancer Institute (U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999-2008 Incidence and Mortality Web-based Report. Atlanta: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute; 2012. Available at: www.cdc.gov/uscs; incorporated herein by reference in its entirety) revealed that, among women with breast cancer, those with fatty breasts do not have a lower risk of dying of the disease than those with denser breasts. In this study, 9000 breast cancer patients were followed for an average period of 6½ years. During that time, 889 of these women died of breast cancer. There was no difference in the death rate of women with the densest breasts on mammography versus those with less dense (fattier) breasts. In some U.S. states, mammography facilities are required by state law to notify their patients if they have dense breasts. In such situations, patients are advised to ask whether they should undergo additional screening with ultrasound or MRI. This additional screening may detect breast cancer cases missed by mammography. However, it should be noted that additional screening will also greatly increase both the likelihood of false alarms, which lead to unnecessary biopsies, and the overall cost of screening.
Automated breast density classification methods can be grouped into: 1) Matrix factorization; 2) Global histogram; and 3) Texture analysis methods. Matrix factorization techniques factorize the mammogram images into a product of several factor images according to specific constraints. Consequently, the mammographic images, known for their high dimensionality, undergo a drastic dimensionality reduction in which only dominant features are kept. Oliver et al. (A. Oliver, X. Lado, E. Perez, J. Pont, J. Denton, E. Freixenet, and J. Marti, “Statistical approach for breast density segmentation,” Journal of Digital Imaging, vol. 23, no. 5, pp. 55-65, 2009; incorporated herein by reference in its entirety) proposed a two-class breast density classification. Image segmentation is used as a pre-processing step. Then, features are extracted using principal component analysis (PCA) and linear discriminant analysis (LDA) techniques to classify the mammogram images into fatty and dense types. LDA is also sometimes known as Fisher Linear Discriminant (FLD). Features extracted using 2D-PCA are proposed by DeOliveira et al. (J. E. E. de Oliveira and A. de Araujo, “Mammosyslesion: A content-based image retrieval system for mammographies,” in 17th International Conference on Systems, Signals and Image Processing (IWSSIP 2010), pp. 408-411, 2010; incorporated herein by reference in its entirety) to build a two-class (fatty and dense) content-based image retrieval (CBIR) system. A support vector machine (SVM) with Gaussian kernels classifies image features represented by the first four principal components (PC). Reported results indicate that 2D-PCA outperforms standard PCA in terms of classification accuracy. Using the same features as proposed in DeOliveira et al. (J. E. E. de Oliveira and A. de Araujo, “Mammosyslesion: A content-based image retrieval system for mammographies,” in 17th International Conference on Systems, Signals and Image Processing (IWSSIP 2010), pp. 408-411, 2010; incorporated herein by reference in its entirety), Deserno et al. (T. M. Deserno, M. Soiron, J. E. E. de Oliveira, and A. de Araujo, “Towards computer-aided diagnostics of screening mammography using content-based image retrieval,” in 24th Conference on Graphics, Patterns and Images (Sibgrapi 2011), pp. 1754-1760, 201; incorporated herein by reference in its entirety) consider four density classes according to the BI-RADS lexicon using a similar classifier. DeOliveira et al. (J. E. E. de Oliveira, G. Camara-Chavez, A. de Araujo, and T. M. Deserno, “Mammosvd: A content-based image retrieval system using a reference database of mammographies,” in 22nd IEEE International Symposium on Computer-Based Medical Systems, pp. 1-4, 2009; incorporated herein by reference in its entirety) propose a CBIR system, called MammoSVD, in which image features are extracted using the singular value decomposition (SVD) algorithm. Notably, the MammoSVD system is a binary classifier (fatty and dense tissue) based on an SVM learning machine. The SVD-based features provide a good characterization of the mammographic texture. The MammoSVD system achieves 90% classification accuracy. In DeOliveira et al. (J. E. E. de Oliveira, G. Camara-Chavez, A. de Araujo, and T. M. Deserno, “Content-based image retrieval applied to BI-RADS tissue classification in screening,” World Journal of Radiology, vol. 3, no. 1, pp. 24-31, 2011; incorporated herein by reference in its entirety), a four-class model, called MammoSVx, is proposed in which features are represented by the largest 25 singular values of the SVD of the mammogram images. Using an SVM learning model with a polynomial kernel on a mammographic database containing 10000 images, MammoSVx achieves a classification accuracy of 82.14%.
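The singular-value feature representation mentioned above can be illustrated with a short sketch. It is not the implementation of MammoSVD or MammoSVx; it only shows, under stated assumptions, how the largest k singular values of a grayscale image can be collected into a fixed-length feature vector. The function name `singular_value_features`, the zero-padding of short vectors, and the synthetic random image standing in for a mammogram are all assumptions made for illustration. NumPy is assumed to be available.

```python
import numpy as np


def singular_value_features(image, k=25):
    """Return the k largest singular values of a 2-D grayscale image.

    Hypothetical helper for illustration only. If the image has fewer
    than k singular values, the vector is zero-padded (an assumption).
    """
    image = np.asarray(image, dtype=float)
    if image.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    # compute_uv=False returns only the singular values,
    # sorted in descending order.
    s = np.linalg.svd(image, compute_uv=False)
    features = np.zeros(k)
    n = min(k, s.size)
    features[:n] = s[:n]
    return features


# Synthetic stand-in for a mammogram (assumption: real data would be
# a segmented, size-normalized breast region).
rng = np.random.default_rng(0)
image = rng.random((64, 48))
features = singular_value_features(image, k=25)
print(features.shape)
```

Feature vectors of this kind could then be passed to a classifier such as an SVM; that step is omitted here because its configuration is not specified beyond the kernel type.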
Disclosed embodiments of the present invention relate to a method, a system and a computer program product for the classification of breast mammographic images according to the breast type, identified on the basis of the underlying texture of the breast, which is highly correlated with the breast density. Then, based on this classification, the disclosed method, system or computer program product generates a new mammogram image that is automatically categorized into one of the density classes. This automation mitigates the subjectivity introduced by the manual process carried out by radiologists. Moreover, further image handling and processing are applied based on this classification. From an image processing viewpoint, processing algorithms are selected according to the breast density of the underlying mammogram images. At the same time, “hard” cases can be singled out for further processing or double screening as per the BI-RADS recommendations (G. L. Gierach, L. Ichikawa, K. Kerlikowske, L. A. Brinton, G. N. Farhat, P. M. Vacek, D. L. Weaver, C. Schairer, S. H. Taplin and M. E. Sherman, “Relationship between mammographic density and breast cancer death in the breast cancer surveillance consortium,” Journal of Natl. Cancer Inst., Vol. 104, No. 16, pp. 1218-1227, August 2012; incorporated herein by reference in its entirety).