The anatomy of the female breast changes with age. During the reproductive years, the breast consists mainly of ductal, glandular, and fat tissue, interspersed with fibrous tissue that provides support and attachment to the chest wall. Glandular and fibrous tissues are jointly called the fibro-glandular tissue. The breast glandular tissue is called the breast parenchyma and consists of 20 to 25 lobules (glands), which are responsible for milk production and are drained towards the nipple by numerous tiny tubes (ducts) that come together to form bigger ducts. Each milk-producing lobule contains a cluster or ring of cells. The lobules and ducts are surrounded by fat for protection and supported by the fibrous tissue. With age, ductal and glandular elements undergo atrophic changes and are increasingly replaced by fatty tissue. The breasts are held in place by ligaments that attach the breast tissue to the muscles of the chest. Breasts are covered by ordinary skin everywhere except the nipple and the areola around it.
The fibrous, ductal and glandular tissues appear as dark or “dense” on the X-ray mammogram. Fat, on the other hand, has a transparent, or lucent, appearance. The terms mammographic density (MD) and mammographic pattern are widely used to describe the proportion of dense/lucent areas in the breast presented on the mammogram. In the past, different methods of classifying mammographic parenchymal patterns have been proposed, such as the Nottingham classification (5 patterns: Normal (N), Glandular (G2, G1, G0), Dysplasia (DS-slight, DM-moderate, DY-severe), prominent ducts (P1, P2) and indeterminate (IND)), the Wolfe classification (4 categories) and the Tabar-Dean classification (5 patterns).
Two major categories of breast cancer are lobular and ductal carcinoma.
Lobular carcinoma in situ (LCIS) is a condition marked by a sharp increase in the number of cells contained in the milk-producing lobules of the breast, together with changes in their appearance and abnormal behaviour. The term “in situ” refers to an early stage of cancer and is used to indicate that abnormal cancer cells are present but have not spread past the boundaries of the tissues where they initially developed. Though LCIS is not considered a cancer, women who are diagnosed with LCIS (also called lobular neoplasia) are at a higher risk of developing breast cancer later in life.
Ductal carcinoma in situ (DCIS) is the most common condition of early cancer development in the breast. Again, “in situ” describes a cancer that has not moved out of the area of the body where it originally developed. With DCIS, the cancer cells are confined to milk ducts in the breast and have not spread into the fatty breast tissue or to any other part of the body (such as the lymph nodes). DCIS may appear on a mammogram as tiny specks of calcium (called micro-calcifications).
Both LCIS and DCIS may develop into invasive cancers (infiltrating lobular carcinoma or infiltrating ductal carcinoma), in which the cancer spreads into the fatty breast tissue or to other parts of the body (such as the lymph nodes). Cancer deposits in other parts of the body are called metastases.
Mammography has become by far the most used and the most successful tool in the detection of early signs of breast cancer, which can often be signalled by the presence of micro-calcifications or masses. However, visual analysis, as performed by radiologists, remains a very complex task, and many Computer-Aided Detection/Diagnosis (CAD) systems have been developed to support the detection and classification of these signs. The impact of a CAD system on the detection efficiency of experienced and inexperienced radiologists has been investigated, e.g. in C. Balleyguier, K. Kinkel, J. Fermanian, S. Malan, G. Djen, P. Taourel, O. Helenon, Computer-aided detection (CAD) in mammography: Does it help the junior or the senior radiologist?, European Journal of Radiology 54 (2005) 90-96. In both cases the CAD system has proven an effective support tool for detection, though its autonomy remains in doubt. Therefore, due to the complexity of the problem, automatic or semi-automatic systems still play only the role of a signalling tool for the radiologist.
In the CAD environment, one of the roles of image processing would be to detect the Regions of Interest (ROI) that need further processing for a given screening or diagnostic application. Once the ROIs have been detected, the subsequent tasks would relate to the characterization of the regions and their classification into one of several categories.
Examples of ROIs in mammograms are (a) calcifications, (b) tumours and masses, (c) the pectoral muscle, (d) the breast outline or skin-air boundary. Segmentation is the process that divides the image into its constituent parts, objects or ROIs. Segmentation is an essential step before the detection, description, recognition or classification of an image or its constituent parts, e.g. mammographic lesions, can take place.
A radiation image such as a mammogram typically consists of three main areas:
The diagnostic area comprises pixels corresponding to patient anatomy, e.g. the breast. In general, the outline of this imaged area may take any shape.
The direct exposure area is the image region that has received un-attenuated radiation. Although this region has constant intensity corrupted by noise only, inhomogeneities in incident energy (e.g. X-ray source Heel effect) and receptor (e.g. varying storage phosphor sensitivity in computed radiography) may distort this pattern. In European patent application 1 256 907 a method is disclosed to estimate these global inhomogeneities retrospectively from the diagnostic image and flatten the response in all image parts in accordance with an extrapolated background signal.
The collimated areas, if any, appear on the image as highly attenuated pixels. The shape of these areas typically is rectilinear, but circular or curved collimation shapes may be applied as well.
Between these main areas in a radiation image, three different area transition types may be considered: diagnostic/direct exposure, diagnostic/collimated area, and direct exposure/collimated area boundaries.
Segmentation algorithms aim at detecting and separating the set of pixels that constitute the object(s) under analysis. These techniques may be broadly classified according to the type of processing applied to the image. Region-based algorithms group pixels in the image according to suitable similarity criteria. In European patent application EP 887 769 a region-based algorithm is disclosed to segment direct exposure areas by grouping pixels according to centroid clustering of the grey value histogram. Edge-based algorithms separate image pixels in high contrast regions in the image according to grey value differences between neighboring regions. In European patent application 610 605 and European patent application 742 536 an edge-based algorithm is disclosed to detect and delineate the boundaries between collimated areas and diagnostic areas on a single or multiply exposed image. In both region-based and edge-based approaches, models may be used to restrict the appearance or shape of the segmented image areas to obey predefined photometric or geometric constraints. Examples of this paradigm are the so-called Active Appearance and Active Shape Models (AAM and ASM).
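The difference between the region-based and edge-based families can be illustrated with a minimal sketch. It is not the method of any of the cited patent applications: the function names, the two-class restriction, the bin count and the assumption that direct exposure pixels have the higher grey values are all illustrative choices.

```python
import numpy as np

def histogram_centroid_threshold(image, bins=256, iters=50):
    """Two-class centroid clustering on the grey-value histogram.

    Returns the grey value midway between the two class centroids.
    (Hypothetical helper, for illustration only.)
    """
    counts, edges = np.histogram(image, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    lo, hi = centers[0], centers[-1]          # initial centroids
    for _ in range(iters):
        low = centers <= 0.5 * (lo + hi)      # assign bins to nearest centroid
        w_lo, w_hi = counts[low].sum(), counts[~low].sum()
        if w_lo == 0 or w_hi == 0:
            break                             # degenerate histogram
        new_lo = (counts[low] * centers[low]).sum() / w_lo
        new_hi = (counts[~low] * centers[~low]).sum() / w_hi
        if new_lo == lo and new_hi == hi:
            break                             # centroids have converged
        lo, hi = new_lo, new_hi
    return 0.5 * (lo + hi)

def gradient_edges(image, threshold):
    """Edge-based view: mark pixels whose gradient magnitude exceeds a threshold."""
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy) > threshold

# Synthetic image: "anatomy" on the left, "direct exposure" on the right.
image = np.full((8, 8), 100.0)
image[:, 4:] = 1000.0
direct_exposure = image > histogram_centroid_threshold(image)  # region-based
boundary = gradient_edges(image, threshold=100.0)              # edge-based
```

The region-based result labels every pixel, whereas the edge-based result only marks the transition; this is why edge-based methods need a further step to close contours into areas.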
Because the aim is to analyse the inner breast region, none of these techniques is applicable, as they either (a) are overall segmentation techniques that are not adapted to the specific content of breast structures and lesions and only yield major entities such as the breast outline, or (b) assume specific geometric or photometric constraints that are not applicable to the large variability of mammographic appearance of breast structures and lesions. Therefore, in the current patent application, the focus is on techniques that reliably segment the structures and densities inside the breast skin outline. In particular, a method to detect and segment candidate mammographic masses is described.
The detection step, which relies on the segmentation step, is generally considered to be a very complex task. First, masses are groups of cells clustered together more densely than the surrounding tissue and can be represented on a mammogram by a relatively small intensity change. Second, a mammographic scan records all the structures present in a breast, and these structures vary in size, homogeneity, position and medical significance. Finally, digital mammography is inherently characterised by the error introduced by the conversion between the real world and its digital representation, i.e. the quantisation noise. All three factors increase the complexity of the task of breast cancer diagnosis.
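To give a feeling for the size of the quantisation error, the following sketch quantises a synthetic signal with a uniform step and compares the spread of the error with the standard result for a uniform quantiser, namely a standard deviation of step/sqrt(12). The signal range, the step size and the function name are assumptions chosen for illustration; they do not describe any particular detector.

```python
import numpy as np

def quantise(signal, step):
    """Uniform quantiser: round each sample to the nearest multiple of `step`."""
    return step * np.round(signal / step)

rng = np.random.default_rng(0)
signal = rng.uniform(0.0, 1000.0, size=100_000)  # synthetic "real-world" values
step = 4.0                                       # assumed quantisation step

error = quantise(signal, step) - signal
measured_std = error.std()
predicted_std = step / np.sqrt(12.0)             # uniform error on [-step/2, step/2]
print(measured_std, predicted_std)
```

A mass whose contrast is of the same order as the quantisation step is therefore hard to separate from this noise floor on intensity alone.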
In EP 6122899 A1, a segmentation method that deals with the general inhomogeneity of a digital mammogram is outlined by applying a clustering algorithm. In general, a clustering scheme divides an input set (an image with its intensity values) into a set of groups—called clusters—by assigning a value, which represents a class, to each pixel. The goal of efficient clustering is to emphasise the boundaries between the objects present in the image, regardless of the size, position or level of visibility. One of the most investigated and well known techniques of clustering is K-means as described in general textbooks such as R. O. Duda, P. E. Hart, D. G. Stork, Pattern classification, John Wiley and Sons, Inc., 2001. In EP 6122899 A1, a Markov Random Field model was developed and applied as an extension to the general K-means method. A spatial dependency term was introduced that reduces the inhomogeneity of the segmentation result. The method however still leaves room for improvement.
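The effect of a spatial dependency term can be sketched as follows. This is not the algorithm of the cited disclosure: it combines plain K-means on grey values with a Potts-style neighbourhood reward applied in synchronous ICM-style sweeps, and the function names, the weight `beta` and the 4-neighbourhood are assumptions made for illustration.

```python
import numpy as np

def kmeans_1d(values, k, iters=20):
    """Plain K-means on scalar grey values; returns the k centroids."""
    centroids = np.quantile(values, (np.arange(k) + 0.5) / k)  # spread-out init
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = values[labels == c].mean()
    return centroids

def icm_potts(image, centroids, beta=0.5, sweeps=5):
    """Relabel pixels by minimising data cost minus beta * (agreeing 4-neighbours).

    All pixels are updated at once in each sweep (a synchronous variant of ICM).
    """
    k = len(centroids)
    data_cost = (image[..., None] - centroids) ** 2          # shape (H, W, K)
    labels = np.argmin(data_cost, axis=2)                    # K-means labelling
    for _ in range(sweeps):
        onehot = np.eye(k)[labels]                           # shape (H, W, K)
        p = np.pad(onehot, ((1, 1), (1, 1), (0, 0)))
        # For every pixel and class: number of 4-neighbours carrying that class.
        agree = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
        labels = np.argmin(data_cost - beta * agree, axis=2)
    return labels

# Two flat regions with one isolated outlier pixel inside the left region.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
image[3, 1] = 0.6
centroids = kmeans_1d(image.ravel(), k=2)
plain = np.argmin((image[..., None] - centroids) ** 2, axis=2)
smoothed = icm_potts(image, centroids, beta=0.5)
```

In this toy case plain K-means assigns the outlier to the bright class because it only looks at the grey value, while the neighbourhood term pulls it back into the surrounding region.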
First, the algorithm operates on the original unprocessed mammographic data. Because of the specific breast composition of fibrous, ductal and glandular tissues interspersed with fatty tissue, the mammographic appearance may be described by fairly large scale structures that account for mammographic densities on black backgrounds, and smaller scaled structures that give the tissue the appearance of texture, into which mammographic lesions are embedded. The model that will be proposed here will be based on the decomposition into a mean background signal and a texture detail image that characterizes the fluctuations of the detail around the mean. So, instead of operating on the original image data, the segmentation step will operate on a texture detail image.
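The decomposition into a mean background signal and a texture detail image can be sketched with a simple local mean. The box filter, its radius and the function names are assumptions chosen for illustration; the text above does not prescribe which smoothing operator produces the background signal.

```python
import numpy as np

def box_mean(image, radius):
    """Local mean over a (2*radius+1)^2 window, computed with an integral image."""
    k = 2 * radius + 1
    h, w = image.shape
    padded = np.pad(image.astype(float), radius, mode="reflect")
    # Integral image with a leading row and column of zeros.
    ii = np.pad(padded.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))
    window_sum = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
                  - ii[k:k + h, :w] + ii[:h, :w])
    return window_sum / (k * k)

def decompose(image, radius=2):
    """Split an image into a mean background and the detail around that mean."""
    background = box_mean(image, radius)
    detail = image - background
    return background, detail

# Slowly varying ramp with one small bright detail on top of it.
image = np.tile(np.arange(32, dtype=float), (16, 1))
image[8, 16] += 5.0
background, detail = decompose(image, radius=2)
```

The ramp ends up in the background, and the small bright spot stands out in the detail image, which is the signal a subsequent segmentation step would operate on.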
Second, the number of classes is an important parameter in the EP 6122899 A1 disclosure. However, no method is given to determine the number of classes either beforehand or automatically. To obtain the correct segmentation of the potential masses, the algorithm has to be run with several settings of the number of classes, which yields several segmentations, only one of which is the segmentation sought. Hence, the different segmentations would have to be further processed in parallel by the CAD algorithm to eliminate the false positive candidates.
A number of prior art disclosures are also directed to techniques for correcting non-uniform breast thickness. These techniques mainly focus on enhancing image quality for visualization in digital mammography. The non-uniform breast thickness arises from the specific compression geometry in mammography. On the one hand, the breast thickness in the vicinity of the breast edge increases slowly when going from the breast edge towards the thorax side, where constant compression thickness is normally reached. On the other hand, due to tilt of the compression plate, the inner breast area may also exhibit a non-flat cross-sectional profile.
In U.S. Pat. No. 7,203,348, a method and apparatus for correction of mammograms for non-uniform breast thickness is disclosed. The method comprises classifying pixels as either likely fat or non-fat. The method further comprises identifying a candidate distortion and calculating a histogram of the likely fat and likely non-fat pixels at the candidate distortion. The method further comprises evaluating a quality of the candidate distortion based on the features of the histograms of pixel values in the fat and dense tissue classes. The distortion may be a two-dimensional tilt of the plates or a three-dimensional deformation of the plates as in a bend. The algorithm is iterative in nature and optimizes among a set of candidate tilts.
A global parameter model of the compressed breast is also fitted to the mammographic data in R. Snoeren, N. Karssemeijer, Thickness correction of mammographic images by means of a global parameter model of the compressed breast, IEEE Trans. on Medical Imaging, vol. 23, no. 7, July 2004, pp. 799-806. Here, virtual tissue is added to the resulting thickness map so as to equalize tissue thickness, resulting in an image that appears more homogeneous and whose pixel values are concentrated in a smaller dynamic range. The proposed method is presented as a useful tool for pre-processing raw digital mammograms that need to be visualized, where gains are expected to be largest for the breast area near the breast outline, where the breast bulges out.
In X. H. Wang et al., Automated assessment of the composition of breast tissue revealed on tissue-thickness-corrected mammography, American Journal of Roentgenology, Vol. 180, January 2003, pp. 257-262, the mammographic data are adjusted for tissue thickness variations before estimating tissue composition. For each pixel inside the breast a multiplicative thickness correction coefficient is computed as a function of the distance to the skin line. Features derived from histograms of corrected pixel values are then used to classify the breast into one of a number of archetypical breast classes (almost fat, scattered fibro-glandular densities, heterogeneously dense and extremely dense). Because of the correction, the histogram is a more accurate representation of the actual attenuation of breast tissue.
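The idea of a multiplicative coefficient that depends on the distance to the skin line can be sketched as below. The coefficient function of the cited paper is not reproduced here: the semicircular thickness profile, the city-block distance, the assumption that pixel values are proportional to tissue thickness, and all function and parameter names are hypothetical choices made for illustration.

```python
import numpy as np

def distance_to_skin(mask):
    """City-block distance (pixels) from each breast pixel to the nearest
    background pixel; image borders are not treated as background."""
    cur = np.asarray(mask, dtype=bool)
    if cur.all():
        raise ValueError("mask must contain at least one background pixel")
    dist = np.zeros(cur.shape, dtype=int)
    while cur.any():
        dist += cur
        p = np.pad(cur, 1, mode="edge")
        # Erode by one pixel using the 4-neighbourhood.
        cur = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
               & p[1:-1, :-2] & p[1:-1, 2:])
    return dist

def assumed_thickness(dist, full_thickness, margin):
    """Hypothetical profile: semicircular bulge that reaches full thickness
    `margin` pixels inside the skin line."""
    x = np.clip(dist / margin, 0.0, 1.0)
    return full_thickness * np.sqrt(1.0 - (1.0 - x) ** 2)

def thickness_correct(image, mask, full_thickness, margin):
    """Multiply each breast pixel by full_thickness / assumed local thickness."""
    dist = distance_to_skin(mask)
    thickness = assumed_thickness(dist, full_thickness, margin)
    coeff = np.ones(image.shape)
    inside = dist > 0
    coeff[inside] = full_thickness / thickness[inside]
    return image * coeff

# Uniform tissue: the uncorrected values fall off towards the skin line.
mask = np.zeros((6, 10), dtype=bool)
mask[:, :7] = True                        # chest wall at column 0
raw = 0.02 * assumed_thickness(distance_to_skin(mask), 50.0, 4)
corrected = thickness_correct(raw, mask, full_thickness=50.0, margin=4)
```

For uniform tissue the corrected values become constant inside the breast, so a histogram of the corrected image reflects tissue composition rather than the thickness fall-off near the skin line.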
A sophisticated model for simulating and calibrating the mammographic imaging process has been described in Highnam and Brady, Mammographic Image Analysis, Dordrecht, The Netherlands, Kluwer Academic, 1999. Using a physics model-based approach, a measure h_int is computed that represents the thickness of ‘interesting’ (non-fat) tissue between the pixel and the X-ray source. The representation allows image enhancement by removing the effects of degrading factors, and also effective image normalization, since changes in the image due to variations in the imaging conditions are removed. The breast thickness turned out to be a key parameter in the computation of h_int.