MedPerf has already been used in a test of the NIH-funded Federated Tumor Segmentation Challenge, where it supported the testing of 41 different models across 32 healthcare sites on six continents. The results revealed biases in the models, which performed worse at sites whose patient demographics differed from those in their training data. However, the author of the article questions whether MedPerf can truly address the complex issues in AI for healthcare, noting that safely deploying medical models requires ongoing, thorough auditing by both vendors and their customers.
Key takeaways:
- The healthcare industry is increasingly adopting AI, with 80% of organizations having an AI strategy in place, according to a 2020 survey by Optum.
- MLCommons has developed a new testing platform called MedPerf to benchmark and evaluate medical models, aiming to improve effectiveness, reduce bias, and build public trust.
- MedPerf is designed to be used by healthcare organizations to assess AI models on demand, supporting popular machine learning libraries as well as private models and models available only through an API.
- Despite the potential of MedPerf, the article suggests that it may not fully address the complex issues in AI for healthcare, such as integrating the technology into practitioners' daily routines and into existing technical systems.