IHC Color Histograms for Unsupervised Ki67 Proliferation Index Calculation
Automated image analysis tools for Ki67 breast cancer digital pathology images would have significant value if integrated into diagnostic pathology workflows. Such tools would reduce the workload of pathologists, while improving efficiency, and accuracy. Developing tools that are robust and reliable to multicentre data is challenging, however, differences in staining protocols, digitization equipment, staining compounds, and slide preparation can create variabilities in image quality and color across digital pathology datasets. In this work, a novel unsupervised color separation framework based on the IHC color histogram (IHCCH) is proposed for the robust analysis of Ki67 and hematoxylin stained images in multicentre datasets. An “overstaining” threshold is implemented to adjustforbackgroundoverstaining,andanautomatednucleiradiusestimatorisdesigned to improve nuclei detection. Proliferation index and F1 scores were compared between the proposed method and manually labeled ground truth data for 30 TMA cores that have ground truths for Ki67+ and Ki67− nuclei. The method accurately quantified the PI over the dataset, with an average proliferation index difference of 3.25%. To ensure the method generalizes to new, diverse datasets, 50 Ki67 TMAs from the Protein Atlas were used to test the validated approach. As the ground truth for this dataset is PI ranges, the automated result was compared to the PI range. The proposed method correctly classified 74 out of 80 TMA images, resulting in a 92.5% accuracy. In addition to these validations experiments, performance was compared to two color-deconvolution based methods, and to six machine learning classifiers. In all cases, the proposed work maintained more consistent (reproducible) results, and higher PI quantification accuracy.