Objective: Glomeruli are among the most extensively studied structures in kidney histopathology. Researchers and clinicians often depend on quantitative measures to assess glomeruli. Historically, these measures have been obtained through manual counting or classical image processing techniques. These methods have limited reproducibility, are insufficiently robust to inter-laboratory variation, and are notoriously tedious. As an alternative, we trained a convolutional neural network (CNN) to detect, segment, and count healthy and sclerotic glomeruli in digitized periodic acid-Schiff (PAS) stained tissue sections. Methods: A CNN was trained using exhaustively annotated structures in rectangular regions in 50 whole-slide images (WSIs) of renal transplant biopsies. This resulted in annotations of 182 healthy and 18 sclerotic glomeruli. Forty WSIs were used for training and validation. Segmentation was assessed by calculating the Dice coefficient on an unseen test set of 10 WSIs. To assess the network's ability to detect glomeruli among a larger and more varied set of structures, we applied the CNN to 15 fully annotated nephrectomy WSIs. We calculated Pearson's correlation coefficients between the glomerular counts (healthy and sclerotic glomeruli combined) made manually by three renal pathologists in 82 renal transplant biopsies and the counts produced by the CNN. Results: We found a Dice coefficient of 0.95 for healthy glomeruli and 0.62 for sclerotic glomeruli in the renal transplant biopsy test set. The CNN detected 93.4% of 1747 annotated healthy glomeruli in the nephrectomy samples, with 8.4% false positives. The CNN detected 76.4% of 72 annotated sclerotic glomeruli, with 45.5% false positives. Pearson's correlation coefficient for glomerular counting on the 82 transplant biopsies of the CNN versus the pathologists was 0.924, 0.930, and 0.937 for pathologists 1, 2, and 3, respectively. On average, the CNN counted 1.7 glomeruli more than the pathologists.
The pathologists' counts differed by 0.78 glomeruli on average. Conclusion: The network can accurately detect and segment healthy glomeruli. The CNN performs moderately well on segmenting sclerotic glomeruli, most probably because little training data was available for this class. The CNN's higher glomerular count can partly be explained by possible false-positive detections of sclerotic glomeruli. In addition, partially sampled glomeruli located at the biopsy's edges are not counted by the pathologists but are included by the network. More training data for sclerotic glomeruli and additional post-processing techniques are needed to resolve this.
Glomerular detection, segmentation and counting in PAS-stained histopathological slides using deep learning
M. Hermsen, T. de Bel, M. den Boer, E. Steenbergen, J. Kers, S. Florquin, B. Smeets, L. Hilbrands and J. van der Laak
Dutch Federation of Nephrology (NfN) Fall Symposium 2018.
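The abstract reports three kinds of evaluation measure: the Dice coefficient for segmentation, Pearson's correlation coefficient for agreement between CNN and pathologist counts, and an average difference in counts. The sketch below illustrates how such measures are commonly computed. It is not the authors' evaluation code: the function names `dice_coefficient`, `pearson_r`, and `mean_difference` are hypothetical, the masks and counts are toy values made up for illustration, and the handling of two empty masks is an assumed convention.

```python
from math import sqrt


def dice_coefficient(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal shape.

    Masks are given as nested lists of 0/1 values (rows of pixels).
    """
    pred_flat = [bool(v) for row in pred for v in row]
    truth_flat = [bool(v) for row in truth for v in row]
    if len(pred_flat) != len(truth_flat):
        raise ValueError("masks must contain the same number of pixels")
    overlap = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def mean_difference(counts_a, counts_b):
    """Average of (a - b) over paired counts, e.g. CNN minus pathologist."""
    if len(counts_a) != len(counts_b) or not counts_a:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(a - b for a, b in zip(counts_a, counts_b)) / len(counts_a)


# Toy data (hypothetical, not taken from the study).
predicted_mask = [[1, 1, 0],
                  [0, 1, 0]]
annotated_mask = [[1, 0, 0],
                  [0, 1, 1]]
cnn_counts = [12, 9, 15, 7]
pathologist_counts = [10, 9, 14, 6]

print("Dice:", dice_coefficient(predicted_mask, annotated_mask))
print("Pearson r:", pearson_r(cnn_counts, pathologist_counts))
print("Mean difference:", mean_difference(cnn_counts, pathologist_counts))
```

In this sketch, a positive mean difference means the first sequence (here the CNN counts) is higher on average. A high Pearson coefficient alone does not rule out such a systematic offset, which is presumably why the abstract reports both.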