AI
Confidential
A new way of implementing artificial intelligence promises to revolutionize healthcare, while protecting patient privacy.
AI’s ability to analyze and act on massive amounts of data is set to transform numerous industries and help humanity solve some of its biggest problems.
Among the most promising applications of the technology is its potential to drastically improve healthcare outcomes. Purpose-built algorithms can help researchers find new ways of treating serious diseases, and AI-assisted diagnostic equipment can quickly scan imaging results to detect problems that could be missed by the human eye. AI can also aid medical research by scanning millions of studies, journals and patient reports to learn patterns and generate new treatment ideas.
Ideally, the data used to generate such medical insights is diverse, accounting for a large variety of historical cases to cast as wide a net as possible for relevant information. In best-case scenarios, that means pulling data from dozens of hospitals and research institutions.
Sharing data outside of a medical institution can be thorny, however, due to commercial, technical and privacy concerns – including Health Insurance Portability and Accountability Act provisions that protect patient confidentiality. Thanks to pioneering work from technology giant Intel and several partners, there is now a way to address these concerns. Machine learning and privacy experts have developed a new way to train AI models that allows researchers to use data from multiple institutions while letting every participant keep its information inside its own firewall.
Moving AI to the data
In federated learning, as the approach is called, the AI model moves to where the data, such as a collection of lung X-rays, resides. “In traditional AI, you collect data from a bunch of sources into one location and you train the model in that location,” said Micah Sheller, senior research scientist at Intel Corp. “In federated learning, you do it in the opposite direction. You send the model out to all the places where the data lives, they each update it slightly and send it back. Then all those models get aggregated. And you repeat the process until you get the desired performance.”
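To make Sheller’s contrast concrete, here is a minimal sketch assuming nothing beyond NumPy. The “model” is a single weight vector and the “training” a toy update; the point is only the direction of movement. In the traditional version the data travels to the model, while in the federated version the model travels to the data.

```python
import numpy as np

rng = np.random.default_rng(7)
sites = [rng.normal(k, 1.0, size=(100, 4)) for k in range(3)]   # each site's private data

def train_step(model, data):
    """Toy 'training': nudge the model toward statistics of the data it sees."""
    return model + 0.5 * (data.mean(axis=0) - model)

# Traditional AI: collect data from every source into one location, then train there.
pooled = np.concatenate(sites)
central_model = train_step(np.zeros(4), pooled)

# Federated learning: send the model out to where the data lives, let each
# site update its copy, then aggregate the returned copies. The data never moves.
global_model = np.zeros(4)
for _ in range(10):                              # repeat until performance is acceptable
    local_updates = [train_step(global_model, data) for data in sites]
    global_model = np.mean(local_updates, axis=0)
```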
“In traditional AI, you collect data from a bunch of sources into one location. In federated learning, you do it in the opposite direction.”
In the case of healthcare, hospitals and research institutions can contribute their data to a project without it ever leaving their four walls. Because the data never has to move, the restrictions on transferring medical records stop being a bottleneck, and this approach can speed up AI-driven research efforts by months, if not years.
Beyond protecting privacy, another advantage of federated learning is that, because it can draw from institutions across the U.S. and around the world, it’s more likely to include diverse data sets. Past AI implementations have often produced biased results that underrepresent women and ethnic minorities because they were built with limited training data.
Here’s how federated learning works
1. Participants agree on settings that define what model will be used and how it will be trained. This is essentially a contract that ensures all institutions know how their data will be used.
2. The aggregator, which can be thought of as the project coordinator, distributes the training code to all the research participants that make up the federation. The aggregator can run in the cloud or at one of the participants, depending on what the federation agrees on.
3. Each institution in the federation downloads the same initial global model parameters from the aggregator. These are used to start the first round of training.
4. Each institution trains on its own training data to obtain its local model for the round, and collects accuracy measurements using the validation portion of its local data.
5. Each local model, along with its accuracy results, is sent back to the aggregation server, where the local models are averaged to create the next global model and the accuracy results are averaged to produce a global accuracy figure.
6. The global model is then redistributed to the participants for another round.
7. The process repeats until the desired level of accuracy is achieved.
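The loop below is a runnable sketch of those seven steps in plain NumPy, simulating the aggregator and three institutions in a single process. The model is a toy logistic-regression classifier rather than a medical imaging network, and weighting the average by each site’s data size is one common aggregation recipe (the FedAvg algorithm); the description above simply averages. All names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = rng.normal(size=5)                      # ground-truth rule all sites' data follows

def make_institution(n_samples, shift):
    """Build one site's private data and split it into train/validation locally."""
    X = rng.normal(shift, 1.0, size=(n_samples, 5))
    y = (X @ true_w > 0).astype(float)
    cut = int(0.8 * n_samples)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])

# Steps 1-3 (simulated): three institutions join the federation; their data
# stays inside these tuples and is only ever touched by their own local code.
sites = [make_institution(n, s) for n, s in [(400, 0.0), (250, 0.5), (600, 1.0)]]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_train(w_global, train, epochs=5, lr=0.5):
    """Step 4: one site refines the current global model on its own training data."""
    X, y = train
    w = w_global.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def local_accuracy(w, val):
    """Step 4 (continued): accuracy on the site's local validation split."""
    X, y = val
    return float(((sigmoid(X @ w) > 0.5) == y).mean())

w_global = np.zeros(5)                           # step 3: same starting point everywhere
for round_num in range(1, 21):                   # steps 6-7: redistribute and repeat
    local_models, accuracies, sizes = [], [], []
    for train, val in sites:
        w_local = local_train(w_global, train)
        local_models.append(w_local)
        accuracies.append(local_accuracy(w_local, val))
        sizes.append(len(train[1]))
    weights = np.array(sizes) / sum(sizes)       # step 5: aggregate, weighted by data size
    w_global = np.average(local_models, axis=0, weights=weights)
    global_acc = float(np.average(accuracies, weights=weights))
    if global_acc >= 0.95:                       # stop once the desired accuracy is reached
        break

print(f"stopped after round {round_num}, global validation accuracy ~ {global_acc:.3f}")
```

In a real federation each site runs its `local_train` on its own servers and only the model parameters and accuracy numbers cross the network; the raw X-rays or patient records never do.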
From theory to practice
Intel has partnered with a major university to put this approach into practice for research into brain tumors. Through federated learning, researchers from Intel and the school used machine learning and AI to identify malignant brain tumors, drawing on an unprecedented global dataset from 71 institutions across six continents. The project demonstrated the ability to improve brain tumor detection by 33 percent compared to models trained on the largest public dataset at the time of publication.*
“Our work has the potential to positively impact patients across the globe and we look forward to continuing to explore the promise of federated learning,” said Prashant Shah, Head of AI Health and Life Sciences, Intel.
Industry observers say the project shows federated learning’s potential to drastically improve healthcare outcomes. “The inability to analyze data that has already been captured has significantly delayed the massive medical breakthroughs AI has promised. This federated learning study showcases a viable path for AI to advance and achieve its potential as one of the most powerful tools to fight our most difficult ailments,” said Rob Enderle, principal analyst, Enderle Group.
“Our work has the potential to positively impact patients across the globe.”
Beyond healthcare, federated learning could help AI achieve its potential in other industries where sensitive data is closely guarded and siloed, including financial services. “Criminal activity like large-scale fraud often spans institutions, so federated learning could be of potential benefit in detecting it,” said Sheller.
Intel supports federated learning through an internally developed framework called OpenFL, which is now an open source Linux Foundation project. OpenFL acts as a kind of coordinator that sits atop existing machine learning frameworks like TensorFlow, PyTorch or Keras to facilitate learning across multiple data owners. It also includes a number of built-in security features that help protect data and intellectual property, from the aggregator through to the local endpoints.
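OpenFL’s real interfaces are richer than an article can show, so the snippet below is only a conceptual sketch of the design idea in this paragraph, not OpenFL’s actual API. A coordinator can stay framework-agnostic by moving model parameters as plain arrays across the framework boundary; `TinyNet`, `extract`, `load` and `aggregate` are all hypothetical names.

```python
# Illustrative only: NOT OpenFL's API. Sketches how a coordinator can sit atop
# a framework like PyTorch by exchanging plain arrays instead of framework objects.
import numpy as np
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Stand-in for whatever model the federation agreed on."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 2)
    def forward(self, x):
        return self.fc(x)

def extract(model):
    """Framework boundary, outbound: PyTorch tensors -> named NumPy arrays."""
    return {k: v.detach().cpu().numpy() for k, v in model.state_dict().items()}

def load(model, arrays):
    """Framework boundary, inbound: named NumPy arrays -> PyTorch tensors."""
    model.load_state_dict({k: torch.from_numpy(v.copy()) for k, v in arrays.items()})

def aggregate(updates):
    """The coordinator averages arrays; it never imports torch at all."""
    return {k: np.mean([u[k] for u in updates], axis=0) for k in updates[0]}

# One mock round: two collaborators with the same architecture but private data.
site_a, site_b = TinyNet(), TinyNet()
load(site_b, extract(site_a))                # start both from a common global model
new_global = aggregate([extract(site_a), extract(site_b)])
load(site_a, new_global); load(site_b, new_global)
```

A boundary like this is one way a single aggregator can serve collaborators built on different frameworks, so long as the parameter names and shapes line up.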
Intel’s focus on privacy and security is part of what makes it a leader in the emerging field of federated learning. OpenFL supports hardware-based trusted execution, ensuring the confidentiality of data and compute throughout FL workflows. OpenFL’s designers also keep security and privacy top of mind: the code and workflow are easy for IT departments at participating institutions to review, and developers can extend features without compromising security or privacy.
Intel also provides thought leadership in forums on FL for medical imaging and adjacent applications, including through MLCommons, MICCAI and BraTS.
Intel plans to continue to refine its approach to federated learning in the coming months, and is hopeful about the method’s prospects for unleashing the full potential of AI as the field rapidly evolves. “We’re working to overcome several practical barriers that hinder rapid deployment and training models in real-world federated learning systems,” said Sheller, who helped develop OpenFL. “We want to get to a place where data scientists can’t tell the difference between running training on a centralized or federated system.”