Get Started with Segment Anything Model for Image Segmentation
Perform image segmentation using the Image Processing Toolbox™ Model for Segment Anything Model support package. The Segment Anything Model (SAM) is a state-of-the-art image segmentation model that uses deep learning neural networks to accurately and automatically segment objects within images without requiring training. The SAM enables you to instantaneously segment objects by providing feedback through visual prompts, such as marked points, a selected ROI, and existing masks. Because the SAM predicts multiple masks for a single prompt, such as a marked point, you can use the SAM to segment ambiguous entities, such as both a person and the shirt they wear.
Using the support package, you can perform various image segmentation tasks using the SAM. This table describes the segmentation options and the corresponding functionality available in Image Processing Toolbox.
Goal | Approach | Get Started |
---|---|---|
Interactively and instantaneously segment objects or regions in images | Segment Anything Model in the Image Segmenter app | To get started, see the Interactively Segment Objects in Image section. For an example, see Segment Objects Using Segment Anything Model (SAM) in Image Segmenter. |
Automatically segment the entire image or all of the objects inside an ROI, and produce a stack of masks without the need to specify prompts | imsegsam | To get started, see the Automatically Segment Full Image or ROI section. |
Segment an image or batch of images using custom visual prompts, such as points, bounding boxes, and mask logits | segmentAnythingModel object and its object functions | To get started, see the Segment Image Using Custom Visual Prompts section. |
Install Support Package
You can install the Image Processing Toolbox Model for Segment Anything Model support package from the Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. The support package also requires Deep Learning Toolbox™. Processing image data on a GPU requires a supported GPU device and Parallel Computing Toolbox™.
Interactively Segment Objects in Image
Interactively segment objects in an image using SAM in the Image Segmenter app. You can instantaneously produce binary object masks, or fine-tune the masked regions.
The Image Segmenter app enables you to perform these segmentation tasks by using the SAM.
- Automatically create object masks by marking points or selecting an ROI.
- Post-process existing masks: refine existing masks using the SAM, or create new masks using the SAM, and add them to existing binary masks.
- Automatically segment the entire image into objects and regions, and select objects or regions to add to a mask. You can interactively specify the segmentation parameters.
This image shows the segmentation of the entire image into regions and the masks, shaded in yellow, created by selecting several of the segmented regions.
For an example of how to use the SAM in the Image Segmenter app, see the Segment Objects Using Segment Anything Model (SAM) in Image Segmenter example.
To get started with the Image Segmenter app, see Getting Started with Image Segmenter.
Automatically Segment Full Image or ROI
Use the imsegsam function to automatically segment all the objects or regions in an image, and return masks as connected component structures along with corresponding confidence scores. The imsegsam function enables you to perform these common segmentation tasks.
- Segment the entire image into all object regions, and optionally tune segmentation parameters such as the confidence score threshold by specifying name-value arguments.
- Segment all objects in an ROI by specifying an ROI using the PointGridMask name-value argument.
- Segment all objects of a particular size range by specifying the maximum and minimum size of the objects to segment using the MaxObjectArea and MinObjectArea name-value arguments, respectively.
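The tasks in this list can be sketched as follows. This is a minimal sketch, assuming the support package is installed. The sample image pears.png ships with Image Processing Toolbox. The threshold, ROI coordinates, and area limits are arbitrary illustrative values, not recommended settings. The sketch also assumes that the ScoreThreshold name-value argument sets the confidence score threshold and that PointGridMask accepts a logical mask of the same size as the image.

```matlab
% Read a sample image that ships with Image Processing Toolbox.
I = imread("pears.png");

% Segment the entire image and keep only high-confidence masks.
% Assumption: ScoreThreshold is the confidence score threshold argument,
% and 0.9 is an arbitrary value chosen for illustration.
[masks,scores] = imsegsam(I,ScoreThreshold=0.9);

% Restrict segmentation to an ROI.
% Assumption: PointGridMask is a logical mask of the same size as the
% image that marks where the function places point prompts.
roiMask = false(size(I,1),size(I,2));
roiMask(100:400,200:600) = true;   % hypothetical rectangular ROI
masksROI = imsegsam(I,PointGridMask=roiMask);

% Keep only objects within a size range, in pixels (illustrative values).
masksSized = imsegsam(I,MinObjectArea=500,MaxObjectArea=20000);

% Each output is a connected component structure.
disp(masks.NumObjects)
```

Each call runs the full automatic segmentation, so in practice you would typically combine the name-value arguments you need into a single call.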
To convert the imsegsam output from a connected component structure to a stack of binary masks, see the Segment Objects as Mask Stack Using Segment Anything Model example.
Alternatively, to visualize the imsegsam output as a label matrix, see the Automatically Segment Full Image Using Segment Anything Model example.
This image shows a visualization of full image segmentation using the label matrix created from the masks output of imsegsam.
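Both conversions can be sketched briefly. This is a minimal sketch, assuming the masks output has the standard connected component fields ImageSize, NumObjects, and PixelIdxList. The variable names are illustrative, and the referenced examples remain the authoritative workflow.

```matlab
% Segment a sample image that ships with Image Processing Toolbox.
I = imread("pears.png");
masks = imsegsam(I);

% Visualize the connected component structure as a label matrix overlay.
L = labelmatrix(masks);
imshow(labeloverlay(I,L))

% Convert the connected component structure to an H-by-W-by-N logical
% stack that holds one binary mask per segmented object.
maskStack = false([masks.ImageSize masks.NumObjects]);
for k = 1:masks.NumObjects
    objectMask = false(masks.ImageSize);
    objectMask(masks.PixelIdxList{k}) = true;   % mark the pixels of object k
    maskStack(:,:,k) = objectMask;
end
```

A mask stack preserves overlapping objects as separate layers, whereas a label matrix assigns each pixel to at most one object.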
Segment Image Using Custom Visual Prompts
Segment an image or batch of images by specifying custom visual prompts, such as points or bounding boxes, to the SAM architecture using the segmentAnythingModel object and its object functions. The SAM architecture consists of an image encoder, a visual prompt encoder, and a mask decoder. Because the image encoder computes the image embeddings independently of the prompts, you can reuse the same image embeddings with different visual prompts. For a given image embedding, the prompt encoder and mask decoder use the visual prompt to predict a mask.
To specify custom visual prompts to the SAM, first create a pretrained SAM using the segmentAnythingModel object. Next, use the extractEmbeddings object function to extract the image embeddings from the SAM image encoder.
To perform the segmentation, use the segmentObjectsFromEmbeddings object function to specify visual prompts and segment objects from the image embeddings using the mask decoder.
For an example, see the Segment Objects in Interactive ROI Using Segment Anything Model example.
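The workflow described in this section can be sketched as follows. This is a minimal sketch, assuming the support package is installed. The point and box coordinates are hypothetical values chosen for the pears.png sample image. The sketch assumes that points are specified as [x y] coordinates through the ForegroundPoints name-value argument and that a box is specified as [x y width height] through the BoundingBox name-value argument.

```matlab
% Create the pretrained SAM and extract the image embeddings once.
I = imread("pears.png");
sam = segmentAnythingModel;
embeddings = extractEmbeddings(sam,I);
imageSize = size(I);

% Segment using a foreground point prompt, assumed to be [x y].
pointPrompt = [350 250];             % hypothetical point on an object
maskFromPoint = segmentObjectsFromEmbeddings(sam,embeddings,imageSize, ...
    ForegroundPoints=pointPrompt);

% Reuse the same embeddings with a bounding box prompt,
% assumed to be [x y width height].
boxPrompt = [250 150 200 200];       % hypothetical box around an object
maskFromBox = segmentObjectsFromEmbeddings(sam,embeddings,imageSize, ...
    BoundingBox=boxPrompt);

% Compare the two predicted masks side by side.
imshowpair(maskFromPoint,maskFromBox,"montage")
```

Extracting the embeddings is the expensive step, so computing them once and reusing them for each new prompt keeps prompt-driven segmentation fast.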
Refine Segmentation Results
To refine segmentation results, use the segmentObjectsFromEmbeddings object function on the same image, but provide the mask logits of the object mask from the previous segmentation as an additional visual prompt input by specifying them using the MaskLogits name-value argument. The mask logits returned in the maskLogits output argument of the segmentObjectsFromEmbeddings function are unthresholded mask logits, not binary masks. If you specify that the function return multiple masks using the ReturnMultiMask argument, the model returns the mask logits corresponding to only the mask with the highest confidence score. The mask logits refinement process enables you to iteratively tune your image segmentation.
This image shows segmentation masks predicted using the SAM at two stages, before and after refinement. For both stages, the same visual prompts have been specified as foreground and background points. The image in the second stage has been refined using the mask logits returned in the first stage of the model as an additional prompt.
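A two-stage refinement of this kind can be sketched as follows. This is a minimal sketch, assuming the support package is installed. The foreground and background point coordinates are hypothetical values for the pears.png sample image, and the sketch assumes that maskLogits is the third output of segmentObjectsFromEmbeddings.

```matlab
% Create the model and extract the image embeddings once.
I = imread("pears.png");
sam = segmentAnythingModel;
embeddings = extractEmbeddings(sam,I);
imageSize = size(I);

% Hypothetical prompts, assumed to be [x y] coordinates.
fgPoints = [350 250];   % point on the object to segment
bgPoints = [100 50];    % point on the background

% Stage 1: predict a mask and keep the unthresholded mask logits.
[masks,scores,maskLogits] = segmentObjectsFromEmbeddings(sam,embeddings, ...
    imageSize,ForegroundPoints=fgPoints,BackgroundPoints=bgPoints);

% Stage 2: repeat with the same prompts, adding the stage 1 logits
% as an additional visual prompt.
refinedMasks = segmentObjectsFromEmbeddings(sam,embeddings,imageSize, ...
    ForegroundPoints=fgPoints,BackgroundPoints=bgPoints, ...
    MaskLogits=maskLogits);

% Compare the masks before and after refinement.
imshowpair(masks,refinedMasks,"montage")
```

To iterate further, request the mask logits from the second call and pass them to a third call in the same way.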
Perform Upstream and Downstream Tasks Using SAM
Use SAM-based techniques as initial or downstream steps in image processing and deep learning tasks. These are examples of applications where you can use a SAM-based segmentation step.
- Segment detected objects: perform instance segmentation by specifying the bounding boxes generated by an object detection network as visual prompt inputs to the SAM. For example, detect objects and create bounding boxes using the detect (Computer Vision Toolbox) object function of the yolov4ObjectDetector (Computer Vision Toolbox) object, and specify them as the boxPrompt input argument to the segmentObjectsFromEmbeddings function.
- Create labeled ground truth masks for deep learning applications using the SAM. Use the Segment Anything tool in the Image Labeler (Computer Vision Toolbox) app to automatically segment and label objects as ground truth for semantic segmentation. To learn more about how to create pixel labels using the SAM, see the Automatically Label Ground Truth Using Segment Anything Model (Computer Vision Toolbox) example.
- Perform unsupervised edge detection: use the automatic mask generation capability of the SAM to develop a custom method which identifies the object and region boundaries within an image.
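The first application in this list, segmenting detected objects, can be sketched as follows. This is a minimal sketch under several assumptions: Computer Vision Toolbox and its YOLO v4 object detection support package are installed, the pretrained detector name "csp-darknet53-coco" and the sample image visionteam.jpg are available, and segmentObjectsFromEmbeddings accepts one [x y width height] box per call through the BoundingBox name-value argument. The variable instanceMasks is an illustrative name.

```matlab
% Detect objects with a pretrained YOLO v4 detector (Computer Vision Toolbox).
detector = yolov4ObjectDetector("csp-darknet53-coco");
I = imread("visionteam.jpg");
bboxes = detect(detector,I);      % one [x y width height] row per object

% Extract the SAM image embeddings once and reuse them for every box.
sam = segmentAnythingModel;
embeddings = extractEmbeddings(sam,I);

% Segment one instance mask per detected bounding box.
numObjects = size(bboxes,1);
instanceMasks = false([size(I,1) size(I,2) numObjects]);
for k = 1:numObjects
    instanceMasks(:,:,k) = segmentObjectsFromEmbeddings(sam,embeddings, ...
        size(I),BoundingBox=bboxes(k,:));
end
```

The loop issues one prompt per detection because the sketch assumes a single box per call. If no objects are detected, instanceMasks is an empty stack.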
References
[1] Kirillov, Alexander, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, et al. "Segment Anything," April 5, 2023. https://doi.org/10.48550/arXiv.2304.02643.
[2] "SA-1B Dataset." Accessed April 28, 2024. https://ai.meta.com/datasets/segment-anything.
See Also
Apps
- Image Segmenter | Image Labeler (Computer Vision Toolbox)
Related Topics
- Segment Objects Using Segment Anything Model (SAM) in Image Segmenter
- Segment Objects in Interactive ROI Using Segment Anything Model
- Get Started with Image Segmentation
- Getting Started with Image Segmenter
- Get Started with Medical Segment Anything Model for Medical Image Segmentation (Medical Imaging Toolbox)
- Get Started with Image Preprocessing and Augmentation for Deep Learning (Computer Vision Toolbox)