Philips accelerates development of AI-enabled healthcare solutions with an MLOps platform built on Amazon SageMaker

This is a joint blog with AWS and Philips. Philips is a health technology company focused on improving people’s lives through meaningful innovation. Since 2014, the company has been offering customers its Philips HealthSuite Platform, which orchestrates dozens of AWS services that healthcare and life sciences companies use to improve patient care. It partners with healthcare providers, startups, universities, and other companies to develop technology that helps doctors make more precise diagnoses and deliver more personalized treatment for millions of people worldwide.

One of the key drivers of Philips’ innovation strategy is artificial intelligence (AI), which enables the creation of smart and personalized products and services that can improve health outcomes, enhance customer experience, and optimize operational efficiency. Amazon SageMaker provides purpose-built tools for machine learning operations (MLOps) to help automate and standardize processes across the ML lifecycle. With SageMaker MLOps tools, teams can easily train, test, troubleshoot, deploy, and govern ML models at scale to boost productivity of data scientists and ML engineers while maintaining model performance in production.

In this post, we describe how Philips partnered with AWS to develop AI ToolSuite, a scalable, secure, and compliant ML platform on SageMaker. The platform provides capabilities that include experimentation, data annotation, training, model deployment, and reusable templates. All these capabilities are built to help multiple lines of business innovate with speed and agility while governing at scale with central controls. We outline the key use cases that provided requirements for the first iteration of the platform, the core components, and the outcomes achieved. We conclude by identifying the ongoing efforts to enable the platform with generative AI workloads and rapidly onboard new users and teams to adopt the platform.

Customer context
Philips uses AI in various domains, such as imaging, diagnostics, therapy, personal health, and connected care. Some examples of AI-enabled solutions that Philips has developed over the past years are:

– Philips SmartSpeed – An AI-based imaging technology for MRI that uses a unique Compressed-SENSE based deep learning AI algorithm to take speed and image quality to the next level for a large variety of patients
– Philips eCareManager – A telehealth solution that uses AI to support the remote care and management of critically ill patients in intensive care units, by using advanced analytics and clinical algorithms to process the patient data from multiple sources, and providing actionable insights, alerts, and recommendations for the care team
– Philips Sonicare – A smart toothbrush that uses AI to analyze the brushing behavior and oral health of users, and provide real-time guidance and personalized recommendations, such as optimal brushing time, pressure, and coverage, to improve their dental hygiene and prevent cavities and gum diseases.

For many years, Philips has been pioneering the development of data-driven algorithms to fuel its innovative solutions across the healthcare continuum. In the diagnostic imaging domain, Philips developed a multitude of ML applications for medical image reconstruction and interpretation, workflow management, and treatment optimization. Teams in patient monitoring, image-guided therapy, ultrasound, and personal health have also been creating ML algorithms and applications. However, innovation was hampered by fragmented AI development environments across teams. These environments ranged from individual laptops and desktops to diverse on-premises computational clusters and cloud-based infrastructure. This heterogeneity initially enabled different teams to move fast in their early AI development efforts, but it is now holding back opportunities to scale and improve the efficiency of our AI development processes. It was evident that a fundamental shift towards a unified and standardized environment was imperative to truly unleash the potential of data-driven endeavors at Philips.

Key AI/ML use cases and platform requirements
AI/ML-enabled propositions can transform healthcare by automating administrative tasks done by clinicians. For example:

– AI can analyze medical images to help radiologists diagnose diseases faster and more accurately
– AI can predict future medical events by analyzing patient data, which improves proactive care
– AI can recommend personalized treatment tailored to patients’ needs
– AI can extract and structure information from clinical notes to make record-taking more efficient
– AI interfaces can provide patient support for queries, reminders, and symptom checkers

Overall, AI/ML promises reduced human error, time and cost savings, optimized patient experiences, and timely, personalized interventions.

One of the key requirements for the ML development and deployment platform was the ability of the platform to support the continuous iterative development and deployment process, as shown in the following figure. The AI asset development starts in a lab environment, where the data is collected and curated, and then the models are trained and validated. When the model is ready and approved for use, it’s deployed into the real-world production systems. Once deployed, model performance is continuously monitored. The real-world performance and feedback are eventually used for further model improvements with full automation of the model training and deployment.
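The iterative loop described above can be sketched in a few lines of Python. This is only an illustration, not the AI ToolSuite implementation: the class names, the two thresholds, and the scores are hypothetical, and a real platform would drive these transitions through pipelines and approval workflows rather than in-memory objects.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    LAB = "lab"                # data curation, training, and validation
    PRODUCTION = "production"  # approved, deployed, and monitored


@dataclass
class ModelVersion:
    version: int
    validation_score: float
    stage: Stage = Stage.LAB


@dataclass
class Lifecycle:
    approval_threshold: float  # hypothetical sign-off criterion
    retrain_threshold: float   # hypothetical floor for production performance
    history: list = field(default_factory=list)

    def train_and_validate(self, validation_score: float) -> ModelVersion:
        # Each lab iteration produces a new, numbered model version.
        model = ModelVersion(len(self.history) + 1, validation_score)
        self.history.append(model)
        return model

    def approve_and_deploy(self, model: ModelVersion) -> bool:
        # Only models that meet the approval criterion leave the lab.
        if model.validation_score < self.approval_threshold:
            return False
        model.stage = Stage.PRODUCTION
        return True

    def needs_retraining(self, production_score: float) -> bool:
        # Monitoring feedback closes the loop back to the lab.
        return production_score < self.retrain_threshold


lifecycle = Lifecycle(approval_threshold=0.90, retrain_threshold=0.85)
candidate = lifecycle.train_and_validate(validation_score=0.93)
if lifecycle.approve_and_deploy(candidate) and lifecycle.needs_retraining(0.80):
    candidate = lifecycle.train_and_validate(validation_score=0.95)
```

In this sketch, a drop in monitored performance below the retraining floor triggers a new lab iteration, which mirrors the feedback arrow in the lifecycle.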

The more detailed AI ToolSuite requirements were driven by three example use cases:

1. Develop a computer vision application aimed at object detection at the edge. The data science team expected an AI-based automated image annotation workflow to speed up a time-consuming labeling process.
2. Enable a data science team to manage a family of classic ML models for benchmarking statistics across multiple medical units. The project required automation of model deployment, experiment tracking, model monitoring, and more control over the entire process end to end both for auditing and retraining in the future.
3. Improve the quality and time to market for deep learning models in diagnostic medical imaging. The existing computing infrastructure didn’t allow for running many experiments in parallel, which delayed model development. Also, for regulatory purposes, it’s necessary to enable full reproducibility of model training for several years.
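The reproducibility requirement in the third use case usually comes down to recording everything that determines a training run. The following minimal sketch shows one way to do that with a content-addressed manifest. The function names and manifest fields are assumptions made for illustration, not part of AI ToolSuite or SageMaker.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(payload).hexdigest()


def build_manifest(dataset: bytes, hyperparameters: dict,
                   code_version: str, seed: int) -> dict:
    """Capture the inputs that determine a training run (hypothetical fields)."""
    manifest = {
        "dataset_sha256": fingerprint(dataset),
        "hyperparameters": hyperparameters,
        "code_version": code_version,
        "seed": seed,
    }
    # Serialize with sorted keys so identical inputs always yield the same ID.
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    manifest["manifest_id"] = fingerprint(canonical)
    return manifest


run = build_manifest(
    dataset=b"example training data",
    hyperparameters={"learning_rate": 0.001, "epochs": 20},
    code_version="abc1234",  # for example, a commit hash
    seed=42,
)
print(run["manifest_id"])
```

Storing such a manifest alongside the model artifact makes it possible to check years later whether a retraining run used the same data, code, and settings.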

Non-functional requirements
Building a scalable and robust AI/ML platform requires careful consideration of non-functional requirements. These requirements go beyond the specific functionalities of the platform and focus on ensuring the following:

– Scalability – The AI ToolSuite platform must be able to scale Philips’ insights generation infrastructure more effectively so that the platform can handle a growing volume of data, users, and AI/ML workloads without sacrificing performance. It should be designed to scale horizontally and vertically to meet increasing demands seamlessly while providing central resource management.
– Performance – The platform must deliver high-performance computing capabilities to efficiently process complex AI/ML algorithms. SageMaker offers a wide range of instance types, including instances with powerful GPUs, which can significantly accelerate model training and inference tasks. The platform should also minimize latency and response times to provide real-time or near-real-time results.
– Reliability – The platform must provide a highly reliable and robust AI infrastructure that spans across multiple Availability Zones. This multi-AZ architecture should ensure uninterrupted AI operations by distributing resources and workloads across distinct data centers.
– Availability – The platform must be available 24/7, with minimal downtime for maintenance and upgrades. AI ToolSuite’s high availability should include load balancing, fault-tolerant architectures, and proactive monitoring.
– Security and Governance – The platform must employ robust security measures, including encryption, access controls, dedicated roles, and authentication mechanisms, along with continuous monitoring for unusual activities and security audits.
– Data Management – Efficient data management is crucial for AI/ML platforms, and regulations in the healthcare industry call for especially rigorous data governance. The platform should include features like data versioning, data lineage, data governance, and data quality assurance to ensure accurate and reliable results.
– Interoperability – The platform should be designed to integrate easily with Philips’ internal data repositories, allowing seamless data exchange and collaboration with third-party applications.
– Maintainability – The platform’s architecture and code base should be well organized, modular, and maintainable. This enables Philips ML engineers and developers to provide updates, bug fixes, and future enhancements without disrupting the entire system.
– Resource optimization – The platform should closely monitor utilization to make sure computing resources are used efficiently, and allocate resources dynamically based on demand. In addition, Philips should use AWS Billing and Cost Management tools to make sure teams receive notifications when utilization passes the allocated threshold amount.
– Monitoring and logging – The platform should use Amazon CloudWatch alarms, together with comprehensive monitoring and logging capabilities, to track system performance, identify bottlenecks, and troubleshoot issues effectively.
– Compliance – The platform can also help improve regulatory compliance of AI-enabled propositions. Reproducibility and traceability must be enabled automatically by the end-to-end data processing pipelines, where many mandatory documentation artifacts, such as data lineage reports and model cards, can be prepared automatically.
– Testing and validation – Rigorous testing and validation procedures…
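As an illustration of the resource optimization requirement in the list above, the threshold notification logic can be sketched in plain Python. In practice this would be handled by AWS Billing and Cost Management tooling rather than custom code. The team names, budget amounts, and the 80% alert fraction below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamBudget:
    team: str
    allocated_usd: float
    alert_fraction: float = 0.8  # hypothetical: notify at 80% of the allocation


def teams_to_notify(budgets: list, spend_usd: dict) -> list:
    """Return a message for each team whose spend has passed its alert threshold."""
    notifications = []
    for budget in budgets:
        spent = spend_usd.get(budget.team, 0.0)
        if spent >= budget.allocated_usd * budget.alert_fraction:
            notifications.append(
                f"{budget.team}: {spent:.2f} of {budget.allocated_usd:.2f} USD used"
            )
    return notifications


budgets = [TeamBudget("imaging", 1000.0), TeamBudget("monitoring", 500.0)]
month_to_date = {"imaging": 850.0, "monitoring": 100.0}
for message in teams_to_notify(budgets, month_to_date):
    print(message)
```

Keeping the threshold per team, rather than global, matches the idea of central resource management with team-level allocations.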
