Detect Nuclei in Large Whole Slide Images Using Cellpose
This example shows how to detect cell nuclei in whole slide images (WSIs) of tissue stained using hematoxylin and eosin (H&E) by using Cellpose.
Cell segmentation is an important step in most digital pathology workflows. The number and distribution of cells in a tissue sample can be a biomarker of cancer or other diseases. Digital pathology uses digital images of microscopy slides, or whole slide images (WSIs), for clinical diagnosis or research. To capture tissue- and cellular-level detail, WSIs have high resolutions, and can have sizes on the order of 200,000-by-100,000 pixels. To facilitate efficient display, navigation, and processing of WSIs, the best practice is to store them in a multiresolution format. This example addresses two challenges to segmenting nuclei in a WSI:
Accurately labeling cell nuclei — This example uses a pretrained model from the Cellpose Library [1] [2], which provides deep learning models for cell segmentation. You can configure Cellpose models in MATLAB® by using the
cellpose
object and its object functions.Managing large images — This examples uses a blocked image approach to process the WSI without loading the full image into system memory. You can manage and process images block-by-block by using the
blockedImage
object and its object functions.
This example requires the Medical Imaging Toolbox™ Interface for Cellpose Library. You can install the Medical Imaging Toolbox Interface for Cellpose Library from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. The Medical Imaging Toolbox Interface for Cellpose Library requires Deep Learning Toolbox™ and Computer Vision Toolbox™. For a similar example that uses only Image Processing Toolbox, see Detect and Count Cell Nuclei in Whole Slide Images (Image Processing Toolbox).
Download Sample Image from Camelyon17 Data Set
This example uses one WSI from the Camelyon17 challenge. Run this code to download the tumor_065.zip
file from the MathWorks® website, then unzip the file. The size of the data file is approximately 1.3 GB.
zipFile = matlab.internal.examples.downloadSupportFile( ... "image","data/camelyon17/tumor_065.zip"); filepath = fileparts(zipFile); unzip(zipFile,filepath)
Specify the filename of the TIF file.
fileName = fullfile(filepath,"tumor_065.tif");
Create Blocked Image
Create a blockedImage
(Image Processing Toolbox) object from the sample image. This image has 11 resolution levels. The finest resolution level, which is the first level, has a size of 220672-by-97792 pixels. The coarsest resolution level, which is the last level, has a size of 512-by-512 pixels and easily fits in memory.
bim = blockedImage(fileName); disp(bim.NumLevels)
11
disp(bim.Size)
220672 97792 3 110592 49152 3 55296 24576 3 27648 12288 3 13824 6144 3 7168 3072 3 1577 3629 3 3584 1536 3 2048 1024 3 1024 512 3 512 512 3
Set the spatial referencing for the blocked image by using the setSpatialReferencingForCamelyon16
helper function, which is attached to this example as a supporting file. The helper function sets the WorldStart
and WorldEnd
properties of the blockedImage
object using the spatial referencing information from the TIF file metadata.
bim = setSpatialReferencingForCamelyon16(bim);
Display the blocked image using the bigimageshow
function.
bigimageshow(bim);
Test Cellpose Model in Representative Subregion
In this example, you use a representative region of interest to test a Cellpose model and tune its parameter values. You then apply the final model to all regions of the image that contain tissue.
Identify and display a representative region in the image.
xRegion = [52700 53000];
yRegion = [98140 98400];
xlim(xRegion)
ylim(yRegion)
title("Region of Interest")
Extract the region of interest at the finest resolution level by using the getRegion
(Image Processing Toolbox) function.
imRegion = getRegion(bim,[98140 52700],[98400 53000]);
Measure Average Nuclei Diameter
To ensure accurate results, specify the approximate average diameter of the nuclei, in pixels. For this example, assume a diameter of 20 pixels.
averageCellDiameter = 20;
Alternatively, you can measure the approximate nucleus diameter in the Image Viewer (Image Processing Toolbox) app. Open the app by entering this command.
imageViewer(imRegion)
In the app, measure the nucleus of several cells to estimate the average diameter in the whole image. In the Viewer tab of the app toolstrip, click Measure Distance. In the main display pane, click on the boundary of the nucleus you want to measure and drag to draw a line across the nucleus. When you release the button, the app labels the line with a distance measurement, in pixels. You can reposition the endpoints or move the whole line by dragging the ends or middle portion of the line, respectively. This image shows an example of several distance measurements in the app.
Configure Nuclei Cellpose Model
Segment the image region by using the pretrained nuclei
model from the Cellpose Library. This model is suitable for detecting nuclei. To learn more about other models and their training data, see the Cellpose Library documentation.
The nuclei
model expects input images that have one channel in which the nuclei are bright relative to other areas. To use this model, convert the RGB input image to an inverted grayscale image.
img = im2gray(imRegion); imgc = imcomplement(img);
Create a cellpose
object for the nuclei
pretrained model.
cp = cellpose(Model="nuclei");
Segment the input image using the Cellpose model. Specify the average diameter of the nuclei to the segmentCells2D
function by using the ImageCellDiameter
name-value argument. The function returns a 2-D label image that labels pixels of the background as 0 and the pixels of different nuclei as increasing values, starting at 1.
labels = segmentCells2D(cp,imgc,ImageCellDiameter=averageCellDiameter);
Display the original RGB image, grayscale image, inverted grayscale image, and RGB image overlaid with the predicted labels as a tiled montage. The predicted labels image shows a different color for each detected nucleus, corresponding to the value of the associated pixels in labels
.
figure tiledlayout(1,4,TileSpacing="none",Padding="tight") nexttile imshow(imRegion) title("RGB Image") nexttile imshow(img) title("Grayscale Image") nexttile imshow(imgc) title("Inverted Image") nexttile imshow(labeloverlay(imRegion,labels)) title("Predicted Labels") linkaxes(findobj(gcf,Type="axes"));
Interactively Explore Labels Across WSI
In the previous section, you configured a Cellpose model based on one representative region. Before you apply the model to the full WSI, which takes several minutes, verify that the model works well for other representative regions in the slide.
Copy, paste, and run this code in the Command Window to call the exploreCamelyonBlockedApply
helper function, which is attached to this example as a supporting file. A figure window opens, showing the whole slide with a rectangular ROI on the left and zoomed in views of the ROI on the right. The function applies the model to only the current ROI inside the rectangle, which is faster than processing the full image.
exploreCamelyonBlockedApply(bim, ... @(bs)labeloverlay(bs.Data, ... segmentCells2D(cp,imcomplement(im2gray(bs.Data)), ... ImageCellDiameter=averageCellDiameter)), ... BlockSize=[256 256],InitialXY=[64511 95287])
In the figure window, use the scroll wheel to zoom in on the slide until the rectangular ROI is clearly visible. Drag the rectangle to move the ROI. When you release the button, the zoomed in views automatically update. For this example, you can see that the model works well because it has labelled the dark purple nuclei in the plane of the microscope, and the majority of touching nuclei have been correctly labeled as separate instances. If the model does not produce accurate results for your data, try tuning the model parameters. For more information, see the Refine Cellpose Segmentation by Tuning Model Parameters (Medical Imaging Toolbox) example.
Apply Cellpose Segmentation to Full WSI
Once you are satisfied with the Cellpose model in some representative regions, apply the model to the full WSI. For efficiency, apply the Cellpose model only to blocks that contain tissue, and ignore blocks without tissue. You can identify blocks with tissue using a coarse tissue mask.
Create Tissue Mask
In this example, you create a tissue mask by using the createBinaryTissueMask
helper function, which is attached to this example as a supporting file. The code for the function has been generated by using the Color Thresholder (Image Processing Toolbox) app. This image shows the app window with the selected RGB color thresholds.
Optionally, you can segment the tissue using a different color space, or select different thresholds by creating your own createBinaryTissueMask
function. For example, to select different thresholds, open a low-resolution version of the image in the Color Thresholder app by entering this code in the Command Window. After you select new thresholds, select Export > Export Function from the app toolstrip. Save the generated function as createBinaryTissueMask.m
.
iml9 = gather(bim,Level=9); colorThresholder(iml9)
Use the apply
(Image Processing Toolbox) function to run the createBinaryTissueMask
function on each block and assemble a single blockedImage
. This example creates the mask at the ninth resolution level, which fits in memory and follows the contours of the tissue better than a mask at the coarsest resolution level.
maskLevel = 9; tissueMask = apply(bim,@(bs)createBinaryTissueMask(bs.Data),Level=maskLevel);
Display the mask overlaid on the original WSI.
figure
h = bigimageshow(bim);
showlabels(h,tissueMask,Colormap=[0 0 0; 0 1 0])
title("Tissue Mask")
Select Blocks with Tissue
Select the set of blocks inside the tissue mask by using the selectBlockLocations
(Image Processing Toolbox) function. Include only blocks that have at least one pixel inside the mask by specifying the InclusionThreshold
value as 0
.
bls = selectBlockLocations(bim,BlockSize=[512 512], ...
Masks=tissueMask,InclusionThreshold=0);
Configure Cellpose Model
Create a new cellpose
object to process the full WSI using the nuclei
pretrained model. Use the canUseGPU
function to check whether a supported GPU and Parallel Computing Toolbox™ are available. By default, a cellpose
object processes images on a GPU if one is available. If a GPU is not available, enable OpenVINO acceleration for best performance on the CPU.
if canUseGPU cp = cellpose(Model="nuclei"); else cp = cellpose(Model="nuclei",Acceleration="openvino"); end
Apply Cellpose Model
Use the apply
function to run the labelNuclei
helper function on all blocks in the tissue mask. The helper function, defined at the end of this example, segments each block using the Cellpose model and returns metrics about the segmentation labels. The helper manually normalizes all blocks to the same range before segmenting them. Specify the limits, gmin
and gmax
, as the minimum and maximum intensities from the original representative region img
, respectively. The helper function uses an overlapping border between blocks to ensure it counts nuclei on the edges of a block only once. Specify a border size equal to 1.2 times the average nucleus diameter.
gmin = min(img,[],"all"); gmax = max(img,[],"all"); borderSize = round(1.2*[averageCellDiameter averageCellDiameter]); if(exist("labelData","dir")) rmdir("labelData","s") end [bstats,bcount] = apply(bim, ... @(bs)labelNuclei(bs,cp,gmin,gmax,averageCellDiameter), ... Level=1, ... BorderSize=borderSize, ... BlockLocationSet=bls, ... PadPartial=true, ... OutputLocation="labelData", ... Adapter=images.blocked.MATBlocks, ... UseParallel=true);
WARNING: no mask pixels found
Explore Cellpose Results for WSI
Explore the results by creating a heatmap and histogram of nuclei statistics, and by verifying the results.
Display Heatmap of Detected Nuclei
View the results as a heat map of the nuclei count in each block. For more efficient display, downsample the count data before displaying it.
bcount.BlockSize = [50 50];
hb = bigimageshow(bim);
showlabels(hb,bcount,Colormap=flipud(hot(550)))
title("Heatmap of Nuclei Count Per Block")
Zoom in to examine a local region. The darker heatmap regions indicate higher nuclei density, indicating more cells per unit of area.
xlim([43000 47000]) ylim([100900 104000])
Display Histogram of Nucleus Area
Load the nuclei statistics into memory.
stats = gather(bstats);
Remove empty blocks that are part of the tissue mask, but do not contain cell nuclei.
emptyInd = arrayfun(@(x)isempty(x.Area),stats); stats = stats(~emptyInd);
Concatenate the results from all blocks into vectors that contain the area, centroid coordinates, and bounding box for each nucleus.
areaVec = horzcat(stats.Area); centroidsXY = vertcat(stats.CentroidXY); boundingBoxedXYWH = vertcat(stats.BoundingBoxXYWH);
Create a histogram of nucleus area values, in square pixels.
figure histogram(areaVec); title("Histogram of Nucleus Area, " + "Mean: " + mean(areaVec) + " square pixels")
View Centroids and Bounding Boxes
Verify that nuclei within the overlapping border regions have been counted only one time. First, display a subregion of the WSI at the intersection of multiple blocks.
hb = bigimageshow(bim);
xlim([44970 45140])
ylim([98210 98380])
hold on
Display the grid lines between blocks.
hb.GridVisible = "on"; hb.GridAlpha = 0.5; hb.GridColor = "g"; hb.GridLineWidth = 5;
Plot a marker at the centroid of each nucleus detected by the model. Make the size of each marker proportional to the area of the corresponding nucleus. The display shows that the Cellpose model successfully detects nuclei near block boundaries and does not double count them. If you observe double detections for your own image data, try increasing the value of the borderSize
argument when applying the model using the apply
function.
scatter(centroidsXY(:,1),centroidsXY(:,2),areaVec/10,MarkerFaceColor="w")
Add bounding boxes to validate that each Cellpose label correctly spans the full, corresponding nucleus.
xs = xlim; ys = ylim; inRegion = centroidsXY(:,1) > xs(1) ... & centroidsXY(:,1) < xs(2) ... & centroidsXY(:,2) > ys(1) ... & centroidsXY(:,2) < ys(2); bboxesInRegion = boundingBoxedXYWH(inRegion,:); for ind = 1:size(bboxesInRegion,1) drawrectangle(Position=bboxesInRegion(ind,:)); end
Helper Functions
The labelNuclei
function performs these operations to segment nuclei within a block of blocked image data:
Converts the input image to grayscale using
im2gray
, normalizes the intensities to the range[gmin gmax]
using therescale
function, and inverts the image usingimcomplement
.Segments cells using the
segmentCells2D
function with thecellpose
objectcp
. The helper function disables automatic intensity normalization to avoid applying different normalization limits across blocks.Computes statistics for the nuclei segmentation masks, including area, centroid, and bounding boxes, by using the
regionprops
function.Removes data for nuclei in the border region by using the
clearBorderData
helper function, is defined in this section.
function [stats,count] = labelNuclei(bs,cp,gmin,gmax,ndia) % Convert to grayscale img = im2gray(bs.Data); % Normalize img = rescale(img,InputMin=gmin,InputMax=gmax); % Invert imgc = imcomplement(img); % Detect labels = segmentCells2D(cp,imgc,ImageCellDiameter=ndia,NormalizeInput=false); % Compute statistics. Add additional properties if required. % If you require pixel level masks for each cell, add "Image" to the vector in the second input argument. % Note that this increases the output directory size. rstats = regionprops(labels,["Centroid","Area","BoundingBox"]); % Only retain stats and labels whose centroids are within this block rstats = clearBorderData(rstats,bs); % Flatten out the data into a scalar structure stats.Area = [rstats.Area]; stats.CentroidXY = vertcat(rstats.Centroid); stats.BoundingBoxXYWH = vertcat(rstats.BoundingBox); if ~isempty(rstats) % Move to image coordinates (Start is in YX format, so flip it) offset = fliplr(bs.Start(1:2) - 1); stats.CentroidXY = stats.CentroidXY + offset; stats.BoundingBoxXYWH(:,1:2) = stats.BoundingBoxXYWH(:,1:2) + offset; end % Density count count = numel(rstats); end
The clearBorderData
helper function removes statistics for nuclei in the overlapping border between blocks to prevent double-counting.
function rstats = clearBorderData(rstats,bs) % Clear labels whose centroids are in the border area fullBlockSize = size(bs.Data); % includes border pixels % Go from (x,y) coordinates to (row,column) indices within the block lowEdgeXY = fliplr(bs.BorderSize(1:2)+1); highEdgeXY = fliplr(fullBlockSize(1:2)-bs.BorderSize(1:2)); % Find indices of centroids inside the current block (excludes the % border region) centroidsXY = reshape([rstats.Centroid],2,[]); centroidsXY = round(centroidsXY); inBlock = centroidsXY(1,:) >= lowEdgeXY(1) ... & centroidsXY(1,:) <= highEdgeXY(1) ... & centroidsXY(2,:) >= lowEdgeXY(2) ... & centroidsXY(2,:) <= highEdgeXY(2); % Filter stats to retain only inside block detections rstats = rstats(inBlock); end
References
[1] Stringer, Carsen, Tim Wang, Michalis Michaelos, and Marius Pachitariu. “Cellpose: A Generalist Algorithm for Cellular Segmentation.” Nature Methods 18, no. 1 (January 2021): 100–106. https://doi.org/10.1038/s41592-020-01018-x.
[2] Pachitariu, Marius, and Carsen Stringer. “Cellpose 2.0: How to Train Your Own Model.” Nature Methods 19, no. 12 (December 2022): 1634–41. https://doi.org/10.1038/s41592-022-01663-4.
See Also
cellpose
(Medical Imaging Toolbox) | segmentCells2D
(Medical Imaging Toolbox) | segmentCells3D
(Medical Imaging Toolbox) | trainCellpose
(Medical Imaging Toolbox) | downloadCellposeModels
(Medical Imaging Toolbox)
Related Topics
- Choose Pretrained Cellpose Model for Cell Segmentation
- Refine Cellpose Segmentation by Tuning Model Parameters
- Train Custom Cellpose Model