What’s your favorite dessert? Does it include strawberries? This popular fruit is increasing in demand, putting more pressure on strawberry producers to know how many fruits their plants are producing.
Why Traditional Strawberry Canopy Measurement Falls Short
The number of strawberries a plant will produce is estimated using canopy size as an indicator. The canopy is the plant's overall leafy growth, and historically its size was measured the old-fashioned way, with rulers, a method that makes it very difficult to define the contours of the strawberry canopy, according to Zijing Huang, a PhD student at the University of Florida.
“Because it’s a three-dimensional structure, previous measurements were very raw, and not as accurate about the actual size or dimensions,” Huang says. “You can’t take a photo of every strawberry plant in a field, because there would be too many images to process and it would take too much time.”
How Researchers Used Machine Learning to Improve Strawberry Yield Estimation
Huang and his fellow researchers decided to use technology to solve the problem, developing a machine learning workflow that labels strawberry plants in images. The goal was to segment the strawberry plants from their background environment as accurately as possible.
Instead of building a canopy segmentation model from scratch, the team combined a pre-trained vision foundation model, the Segment Anything Model (SAM), with supervised YOLOv11 detectors trained on labeled field images. A prompt selection algorithm converts YOLOv11 detections (for example, plant bounding boxes and flower/fruit detections) into prompts that guide SAM, enabling accurate canopy segmentation without fine-tuning SAM on strawberry-specific masks.
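The prompt-selection idea can be illustrated with a small sketch. This is not the authors' code; the box format, function name, and the center-containment rule are assumptions for illustration. It pairs each YOLOv11 plant box (which becomes a SAM box prompt) with the flower/fruit detections that fall inside it.

```python
# Illustrative sketch (not the study's actual code): group detector boxes
# into per-plant prompt bundles for SAM. Boxes are (x1, y1, x2, y2).

def boxes_to_prompts(plant_boxes, fruit_boxes):
    """Pair each plant box (the SAM box prompt) with the flower/fruit
    boxes whose centers fall inside it."""
    prompts = []
    for px1, py1, px2, py2 in plant_boxes:
        contained = [
            (fx1, fy1, fx2, fy2)
            for fx1, fy1, fx2, fy2 in fruit_boxes
            if px1 <= (fx1 + fx2) / 2 <= px2
            and py1 <= (fy1 + fy2) / 2 <= py2
        ]
        prompts.append({"box": (px1, py1, px2, py2), "fruit_boxes": contained})
    return prompts

# Example: two plants; one fruit box centered inside the first plant.
bundles = boxes_to_prompts(
    plant_boxes=[(0, 0, 100, 100), (120, 0, 220, 100)],
    fruit_boxes=[(40, 40, 60, 60)],
)
```

Each bundle would then be handed to SAM as one prompt set per plant, so the segmentation is driven entirely by the detector's output rather than by strawberry-specific training of SAM itself.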
“SAM is a very huge, powerful image segmentation model that gives a more global understanding of the world,” Huang explained. “If you give SAM an image, it will differentiate out objects from the background.”
SAM is promptable, meaning it generates segmentation masks based on prompts such as points, bounding boxes, and masks. In this work, YOLOv11 plant detections were used as box prompts, while additional point prompts were automatically selected (as background/exclusive points) based on overlaps between plant detections, preliminary SAM masks, and flower/fruit detections. The initial SAM output mask was then reused as a mask prompt in a second stage to refine the canopy segmentation.
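One way to pick the background points described above is to place a negative point wherever a neighboring plant's box overlaps the target plant's box, telling SAM to exclude that region. The sketch below is a hedged illustration of that idea, not the paper's algorithm; the function name and the overlap-center heuristic are invented for the example.

```python
# Illustrative sketch: derive background (negative) point prompts from
# overlaps between the target plant's box and its neighbors' boxes.

def background_points(target_box, neighbor_boxes):
    """Return one background point (the overlap-region center) for each
    neighboring box that intersects the target box.
    Boxes are (x1, y1, x2, y2); points are (x, y)."""
    tx1, ty1, tx2, ty2 = target_box
    points = []
    for nx1, ny1, nx2, ny2 in neighbor_boxes:
        ox1, oy1 = max(tx1, nx1), max(ty1, ny1)  # overlap top-left
        ox2, oy2 = min(tx2, nx2), min(ty2, ny2)  # overlap bottom-right
        if ox1 < ox2 and oy1 < oy2:  # boxes actually intersect
            points.append(((ox1 + ox2) / 2, (oy1 + oy2) / 2))
    return points

# One neighbor overlaps the target (x in [80, 100], y in [20, 100]);
# the other does not, so a single negative point comes back.
pts = background_points(
    target_box=(0, 0, 100, 100),
    neighbor_boxes=[(80, 20, 180, 120), (300, 0, 400, 100)],
)
```

In SAM's predictor interface, points like these are passed with label 0 (background) alongside the box prompt, and the first-pass mask can be fed back in as a mask input for the second, refining pass the researchers describe.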
How YOLOv11 Detection Improves Canopy Segmentation Accuracy
Because SAM supports zero-shot transfer, it can segment objects it was not explicitly trained on, as long as the prompts specify the target region. This makes it well suited for field imagery where pixel-level segmentation labels are limited, while still allowing the workflow to benefit from supervised detectors for robust prompt generation.
“For example, SAM isn’t trained specifically on strawberries. If we give it prompts, like a box around the plant and a few ‘background’ points, it can isolate the canopy,” Huang says. “That zero-shot capability can work really well for strawberry canopy estimation.”
The team took it a step further, using YOLOv11 detection to enhance accuracy and efficiency. YOLOv11 is a supervised model that must be trained with labeled data. In this study, they trained two YOLOv11 models, one for plant (and weed) detection and another for flower/fruit detection across ripeness stages, to generate the prompts that guide SAM.
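Training two detectors like this typically means maintaining two labeled datasets with separate class lists. As an illustration only, Ultralytics-style dataset configuration files for the two models might look like the following; the paths and class names here are invented for the example, not taken from the study.

```yaml
# plants.yaml — detector 1: plant (and weed) detection (illustrative)
path: datasets/strawberry_plants   # dataset root (hypothetical)
train: images/train
val: images/val
names:
  0: plant
  1: weed

# fruits.yaml — detector 2: flowers and fruit by ripeness stage (illustrative)
# path: datasets/strawberry_fruits
# names: {0: flower, 1: green_fruit, 2: white_fruit, 3: ripe_fruit}
```

Keeping the two class lists separate lets each detector specialize: one learns to find whole plants against soil and plastic mulch, the other learns the much smaller flower and fruit targets.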