Toronto Metropolitan University

Objective Evaluation of Multiple Sclerosis Lesion Segmentation using a Data Management and Processing Infrastructure

journal contribution
posted on 2022-10-14, 14:17 authored by Olivier Commowick, Audrey Istace, Michaël Kain, Baptiste Laurent, Florent Leray, Mathieu Simon, Sorina Camarasu-Pop, Pascale Girard, Roxana Ameli, Jean-Christophe Ferré, Anne Kerbrat, Thomas Tourdias, Frédéric Cervenansky, Tristan Glatard, Jeremy Beaumont, Senan Doyle, Florence Forbes, Jesse Knight, April Khademi, Amirreza Mahbod, Chunliang Wang, Richard McKinley, Franca Wagner, John Muschelli, Elizabeth Sweeney, Eloy Roura, Xavier Ladó, Michel M. Santos, Wellington P. Santos, Abel G. Silva-Filho, Xavier Tomas-Fernandez, Hélène Urien, Isabelle Bloch, Sergi Valverde, Mariano Cabezas, Francisco Javier Vera-Olmos, Noberto Malpica, Charles Guttmann, Sandra Vukusic, Gilles Edan, Michel Dodjat, Martin Styner, Simon K. Warfield, François Cotton, Christian Barillot

We present a study of multiple sclerosis (MS) lesion segmentation algorithms conducted at the international MICCAI 2016 challenge. The challenge was run on a new open-science computing infrastructure, which allowed a large range of algorithms to be evaluated fairly, independently, and fully automatically. The infrastructure was used to evaluate thirteen MS lesion segmentation methods, covering a broad range of state-of-the-art algorithms, against a high-quality database of 53 MS cases acquired at four centers following a common definition of the acquisition protocol. Each case was manually annotated by seven different experts, an unprecedented number. The results of the challenge highlight that automatic algorithms, including recent machine learning methods (random forests, deep learning, …), still trail human expertise on both detection and delineation criteria. In addition, we demonstrate that a statistically robust consensus of the algorithms performs closer to human expertise on one score (segmentation), although it still trails on detection scores.