There is heightened interest in artificial intelligence (AI) applications in the medical world in general, and radiation oncology is no exception. The number of PubMed-indexed publications on this topic has expanded tenfold over the last ten years. Correspondingly, AI-based auto-contouring performance has also been increasingly studied. While only two studies were published on the topic in 2019, that number rose to over 30 by 2024. This growing body of research reflects the broader adoption of AI tools in clinical practice, where they are proving both efficient and beneficial for cancer care.
Researchers often evaluate auto-contouring solutions using automated methods that generate numerical metrics, such as Dice scores or Hausdorff distances. This type of evaluation provides exact measurements, which allows for more objective analysis. However, some subjectivity remains because the “gold standard” reference is usually a manual contour, often shaped by personal interpretation of international guidelines or local protocols.
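As a point of reference, the sketch below shows one common way these two metrics are computed on binary segmentation masks using NumPy and SciPy. The function and array names are illustrative assumptions and do not come from any of the studies discussed here.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def dice_score(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks (1.0 = perfect overlap)."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    overlap = np.logical_and(a, b).sum()
    return 2.0 * overlap / (a.sum() + b.sum())

def hausdorff_distance(mask_a: np.ndarray, mask_b: np.ndarray,
                       spacing=(1.0, 1.0, 1.0)) -> float:
    """Symmetric Hausdorff distance between the two voxel sets, in the units of `spacing`."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    # For each voxel, distance to the nearest voxel inside the other mask
    dist_to_b = distance_transform_edt(~b, sampling=spacing)
    dist_to_a = distance_transform_edt(~a, sampling=spacing)
    return float(max(dist_to_b[a].max(), dist_to_a[b].max()))

# Hypothetical usage with two 3D masks on the same grid (3 mm slices, 1 mm pixels):
# dsc = dice_score(manual_mask, auto_mask)
# hd = hausdorff_distance(manual_mask, auto_mask, spacing=(3.0, 1.0, 1.0))
```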
Furthermore, not all deviations in automatically generated contours have the same clinical significance; their impact depends on the spatial direction of the shift. For example, if the contour of an organ at risk moves a few millimeters toward or away from the target volume, the effect on treatment planning, dosimetry, and clinical outcomes can differ. Moreover, a geometric overlap value between manual and automatic contours does not answer practical clinical questions: Should a structure with a high Dice score still be adjusted? How much time will corrections take? Will editing actually affect outcomes?
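To make the directional point concrete, the small sketch below (a hypothetical illustration, not part of any cited study) checks whether an automatic organ-at-risk contour sits closer to the PTV than the manual one, which is the kind of shift that matters most dosimetrically.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def min_distance_to_target(oar_mask: np.ndarray, ptv_mask: np.ndarray,
                           spacing=(1.0, 1.0, 1.0)) -> float:
    """Smallest distance from any OAR voxel to the PTV, in the units of `spacing`."""
    dist_to_ptv = distance_transform_edt(~ptv_mask.astype(bool), sampling=spacing)
    return float(dist_to_ptv[oar_mask.astype(bool)].min())

# Hypothetical usage: the same few-millimetre contour difference is more critical
# when it narrows the OAR-to-PTV gap than when it occurs on the far side of the organ.
# gap_manual = min_distance_to_target(oar_manual, ptv_mask, spacing=(3.0, 1.0, 1.0))
# gap_auto = min_distance_to_target(oar_auto, ptv_mask, spacing=(3.0, 1.0, 1.0))
# shift_toward_target = gap_manual - gap_auto  # positive => auto contour is closer to the PTV
```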
As a result, researchers have explored alternative methods for evaluating auto-generated contours, aiming for a more practical and comprehensive assessment. They applied scoring methods that produced quantitative results while remaining easy for clinicians to interpret. The four studies described below evaluated the clinical acceptability of the structures automatically generated by MVision Contour+. Because they used different scales, a few details on their methods are given to make their results easier to interpret.
Large-scale evaluation across multiple anatomical sites
One comprehensive study published in 2023 by a team from Bologna, Italy, included 7765 volumes of interest delineated on the CT images of 111 patients. The anatomical sites included the head and neck, breast, abdomen, thorax, and male and female pelvis. The evaluation included a grading of clinician satisfaction with the automatically generated contours, with grades ranging from poor (1) to excellent (5). The correspondence with clinical usability was as follows: 1 – rejected, needing complete re-contouring; 2 – major editing needed; 3 – some editing; 4 – minor editing; 5 – no editing needed.
MVision Contour+ performed excellently. Forty-four percent of the evaluated structures received a score of 4 (minor editing) and 43% a score of 5 (no editing needed). The average score was 4.08, 4.23, 4.45, 4.86, 3.7, and 4.79 for the abdomen, H&N, breast, thorax, female pelvis, and male pelvis, respectively. Scores were statistically significantly higher among senior than junior radiation oncologists (p = 0.01) and correlated with greater time savings during manual editing of the automatic contours (p < 0.001). The study has already been cited 19 times by other publications on this topic, reflecting its quality and credibility (1).
High-quality, fast contours improve the workflow
Another 2023 study demonstrated that implementing MVision Contour+ led to significant changes in the radiotherapy workflow for treating breast cancer at a UK clinic. The automatically generated contours enabled improvements to the local planning technique and the delivery of more precise irradiation.
The evaluation of MVision Contour+ followed international recommendations for implementing AI in radiotherapy and took place in sequential phases over the course of a year. In Phase 0, the team evaluated the AI contours offline, using a combination of qualitative and dosimetric metrics to assess their acceptability. Two experienced breast consultant oncologists conducted the qualitative evaluation, applying a different categorization system than the one previously described. The UK researchers used a seven-point scale, ranging from 1 (acceptable “as is”) to 7 (gross error, requiring edits on more than 75% of slices).
Among the patients evaluated during this phase, 76% of all breast lymph node volumes were scored as category 1 or 2 (acceptable as is or with minor edits). During the next phase, which included more challenging cases (patients who had undergone axillary nodal dissection), performance remained good: in 67% of cases, no edits or only minor edits were required.
The resulting plans reflected the high quality of the automated contours. For 25 prospective patients planned with AI-guided field placement for the breast and axilla, lymph node PTV coverage was above the optimal 90% value for all patients (91.7%–99.9%). Mean heart dose and lung V17Gy metrics also met the recommended limits. Nine in ten cases (58 of 65) prospectively planned during Phase 1 had treatment plans and dose distributions accepted without modifications (2).
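For readers less familiar with these plan metrics, the sketch below shows how a mean dose and a VxGy value such as lung V17Gy can be read off a 3D dose grid and a binary organ mask, assuming a uniform voxel grid; the array and function names are illustrative and not taken from the study.

```python
import numpy as np

def mean_dose(dose: np.ndarray, organ_mask: np.ndarray) -> float:
    """Mean dose (Gy) over the voxels belonging to the organ."""
    return float(dose[organ_mask.astype(bool)].mean())

def v_x_gy(dose: np.ndarray, organ_mask: np.ndarray, threshold_gy: float) -> float:
    """Percentage of the organ volume receiving at least `threshold_gy` (uniform voxel grid assumed)."""
    organ_dose = dose[organ_mask.astype(bool)]
    return float(100.0 * (organ_dose >= threshold_gy).sum() / organ_dose.size)

# Hypothetical usage with a dose grid in Gy and binary organ masks on the same grid:
# lung_v17 = v_x_gy(dose_grid, lung_mask, 17.0)   # compare against the protocol limit
# heart_mean = mean_dose(dose_grid, heart_mask)
```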
It is important to underline that this evaluation included target volumes, such as the breast and lymph nodes, not only organs at risk. These results add further evidence of the quality of the MVision Contour+ breast model and clearly show its positive impact on clinical practice.
Compliance with breast cancer ESTRO contouring guidelines
MVision Contour+ offers two auto-contouring models for breast cancer radiotherapy, developed using ESTRO and RTOG guidelines, respectively. This allows clinicians to select the version that best fits their clinical practice.
A team from Heidelberg University, Germany, tested three different auto-contouring models, including the ESTRO-based MVision Contour+ model, on 50 breast cancer patients receiving postoperative radiotherapy with regional nodal irradiation. Importantly, 34% of the patients had residual breast tissue, 38% received chest wall irradiation, and 28% had breast implants.
A radiation oncologist with over eight years of experience manually evaluated the clinical usability of the generated CTVs. Each case was scored for accuracy and usability as: “no adjustments needed,” “minor corrections needed,” “major corrections needed,” or “not usable.”
The ESTRO-based MVision Contour+ model enabled clinical use after only minor or no adjustments in 78% of cases, the highest score among the three models tested (3).
Accurate contours for both adult and pediatric CT scans
Another recently published study involved 20 French radiation oncologists who scored the contours of organs at risk and lymph node areas generated by eight auto-contouring AI programs. The study included CT scans from pediatric patients, a novelty in this field of research.
The study used a simplified scale: 3 points – no corrections needed, major time saving; 2 points – moderate corrections on one or a few slices, taking only a few seconds, moderate time saving; 1 point – major corrections, no time saving, as it would be easier for the radiation oncologist to redo the structure manually.
For adult CT scans, MVision had the best overall average quality score (2.13) and the best score for lymph node areas (2.15). For children, MVision Contour+ was the only program to have an average score higher than 2, which was the threshold set by the authors to characterize high-performing AI software (4).
These results come from independent, non-sponsored evaluations of MVision Contour+ conducted at leading clinical centers across several countries. The positive feedback from experienced clinicians and the observed impact on clinical workflows highlight the potential of AI-driven contouring tools to support meaningful improvements in patient care. Continued refinement of these technologies will help further support radiation oncologists and medical physicists in delivering efficient, high-quality treatment.

References
- Strolin S, Santoro M, Paolani G, et al. How smart is artificial intelligence in organs delineation? Testing a CE and FDA-approved Deep-Learning tool using multiple expert contours delineated on planning CT images. Front Oncol. 2023;13:1089807. Published 2023 Mar 2.
- Warren S, Richmond N, Wowk A, Wilkinson M, Wright K. AI segmentation as a quality improvement tool in radiotherapy planning for breast cancer. IPEM-Translation. 2023;6–8.
- Meixner E, Glogauer B, Klüter S, et al. Validation of different automated segmentation models for target volume contouring in postoperative radiotherapy for breast cancer and regional nodal irradiation. Clin Transl Radiat Oncol. 2024;49:100855. Published 2024 Sep 11.
- Meyer C, Huger S, Bruand M, et al. Artificial intelligence contouring in radiotherapy for organs-at-risk and lymph node areas [published correction appears in Radiat Oncol. 2025 Jan 22;20(1):13. doi: 10.1186/s13014-025-02586-y.]. Radiat Oncol. 2024;19(1):168. Published 2024 Nov 21.