Computer-Implemented Training Method, Computer-Implemented Prediction Method, Computer Program, Computer-Readable Medium and Device
Simple Summary
Content extracted from patent full text and abstract with AI.
This invention describes a computer-implemented method for training and using machine learning models so that data from multiple sources (such as hospitals) can contribute to a prediction without the raw data leaving its source. The method divides the measurement data into groups, trains and tests a separate model for each group using a resampling method, and then feeds the outputs of these group-specific models into a meta-model that produces the final prediction. The approach is aimed at situations where sensitive data, such as medical images or genetic information, must remain private but collaborative model improvement is desired.
Use Cases
- Medical imaging analysis (e.g., predicting patient age or disease status from MRI scans) without sharing actual patient data between hospitals.
- Genetic research where data from different laboratories can be combined at the prediction level without exposing raw genetic sequences.
- Satellite or remote sensing applications that require shared analysis across organizations while preserving location data security.
- Web analytics and personalized recommendation systems where user behavior can inform models collaboratively without leaking identifiable information.
- Biomedical studies involving multi-center data pooling (e.g., for rare diseases) while maintaining strict privacy compliance.
Benefits
- Improves privacy by sharing only model predictions (not raw or sensitive data) across different data sites or organizations.
- Enables collaborative and larger-scale machine learning where traditional data centralization is restricted by privacy laws or regulations.
- Aims to improve prediction accuracy by using a meta-model to aggregate group-level predictions and harmonize data from heterogeneous sources.
- Flexible grouping makes it adaptable to a variety of data types, including medical images, genetic data, web behavior, and satellite images.
- Supports interpretability and explainability due to its structured, multi-level modeling approach, aiding in critical areas like healthcare.
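The first benefit, sharing only predictions rather than raw data, can be pictured with a minimal sketch. Everything in it is an illustrative assumption rather than the patented implementation: the two synthetic sites A and B, the linear least-squares models, the k-fold resampling, and the helper names `fit_linear`, `predict_linear`, `make_site` and `site_predictions` are all hypothetical. Each site runs `site_predictions` locally, and only the resulting group-specific predictions and the target variables are passed to the step that trains the meta-model.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear model with intercept (hypothetical stand-in model)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def make_site(n, offset, seed):
    """Synthetic stand-in for one site's private learning data collection."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 4)) + offset  # offset mimics site heterogeneity
    y = X @ np.array([1.0, -1.0, 2.0, 0.5]) + rng.normal(scale=0.1, size=n)
    return X, y

def site_predictions(measurements, targets, groups, n_folds=4, seed=0):
    """Runs entirely inside one site. Returns one out-of-fold prediction
    per group and sample; the raw measurements are never returned."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(targets)), n_folds)
    preds = np.empty((len(targets), len(groups)))
    for g, cols in enumerate(groups):
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(len(targets)), test_idx)
            coef = fit_linear(measurements[train_idx][:, cols], targets[train_idx])
            preds[test_idx, g] = predict_linear(coef, measurements[test_idx][:, cols])
    return preds

# Assumed grouping: columns 0-1 form one group, columns 2-3 the other.
groups = [[0, 1], [2, 3]]

X_a, y_a = make_site(n=120, offset=0.0, seed=1)  # stays at site A
X_b, y_b = make_site(n=80, offset=0.5, seed=2)   # stays at site B

preds_a = site_predictions(X_a, y_a, groups)  # shared by site A
preds_b = site_predictions(X_b, y_b, groups)  # shared by site B

# Meta-model trained only on shared predictions and target variables.
meta_coef = fit_linear(np.vstack([preds_a, preds_b]), np.concatenate([y_a, y_b]))
print("shared array shapes:", preds_a.shape, preds_b.shape)
```

In this sketch each site shares an array with one column per group instead of one column per raw measurement entry. Whether sharing predictions and targets is sufficient for a given privacy requirement depends on the setting and is not something the sketch demonstrates.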
Technical Classifications (CPCs)
Main Classifications
Physics & Measurement
Sub Classifications
Computing & Calculating
Information and Communication Technology for Specific Applications
CPC Codes
Inventors & Applicants
Inventors
Applicants
Forschungszentrum Juelich GmbH
Patent Abstract
The invention relates to a computer-implemented training method for training machine learning models, in which:
A1) at least one learning data collection (L, LA, LB, LC) with learning data sets (LDS) is provided, wherein each learning data set (LDS) comprises a measurement data set (MDS) with measurement data entries (MDE) and a target variable (T);
A2) the measurement data entries (MDE) of the measurement data sets (MDS) of the at least one learning data collection (L, LA, LB, LC) are grouped into multiple groups (G1, G2);
A3) for each group (G1, G2), a separate machine learning model is trained and tested using a resampling method that includes at least one testing process with another portion of the group sub measurement data sets belonging to the respective group (G1, G2), wherein the testing process provides, for the different groups, group-specific predictions (P1, P2, AP1, AP2, BP1, BP2);
A4) with the group-specific predictions (P1, P2, AP1, AP2, BP1, BP2) and the target variables (T), at least one further machine learning model, a cross-group meta learning model (MM, MA+B, MA_B), is trained such that it can provide a cross-group prediction (P) from multiple group-specific predictions (P1, P2, AP1, AP2, BP1, BP2) associated with different groups (G1, G2).
The invention also relates to a computer-implemented prediction method for predicting a characteristic using machine learning models, a computer program, a computer-readable medium and a device.
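Steps A1 to A4 resemble what is commonly called stacked generalization, and a minimal sketch may help make the data flow concrete. It is not taken from the patent text: the synthetic data, the choice of linear least-squares models, the use of k-fold cross-validation as the resampling method, the refit of each group model on all data for later prediction, and the names `fit_linear`, `predict_linear`, `out_of_fold_predictions`, `train` and `predict` are all assumptions made for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear model with intercept (assumed model type)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def out_of_fold_predictions(X, y, n_folds=5, seed=0):
    """A3: train on one portion, test on another, for every fold.
    k-fold cross-validation is an assumed choice of resampling method."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    preds = np.empty(len(X))
    for test_idx in folds:
        train_mask = np.ones(len(X), dtype=bool)
        train_mask[test_idx] = False
        coef = fit_linear(X[train_mask], y[train_mask])
        preds[test_idx] = predict_linear(coef, X[test_idx])
    return preds

def train(measurements, targets, groups):
    """groups maps a group name to the column indices of its entries (A2)."""
    group_models, group_preds = {}, []
    for name, cols in groups.items():
        X_group = measurements[:, cols]
        group_preds.append(out_of_fold_predictions(X_group, targets))
        # Assumption: refit on all data so the model can be reused later.
        group_models[name] = fit_linear(X_group, targets)
    # A4: the meta-model sees only group-specific predictions and targets.
    meta_model = fit_linear(np.column_stack(group_preds), targets)
    return group_models, meta_model

def predict(measurements, groups, group_models, meta_model):
    """Cross-group prediction P from the group-specific predictions."""
    group_preds = np.column_stack([
        predict_linear(group_models[name], measurements[:, cols])
        for name, cols in groups.items()
    ])
    return predict_linear(meta_model, group_preds)

# A1: synthetic learning data collection (purely illustrative).
rng = np.random.default_rng(42)
measurements = rng.normal(size=(200, 6))
weights = np.array([1.0, -2.0, 0.5, 1.5, 1.0, -1.0])
targets = measurements @ weights + rng.normal(scale=0.1, size=200)

groups = {"G1": [0, 1, 2], "G2": [3, 4, 5]}
group_models, meta_model = train(measurements, targets, groups)
cross_group_prediction = predict(measurements, groups, group_models, meta_model)
print("mean squared error:", np.mean((cross_group_prediction - targets) ** 2))
```

Training the meta-model on out-of-fold predictions, rather than on predictions for samples the group models were fitted to, is the usual reason for combining resampling with stacking: it keeps the meta-model from learning on overly optimistic inputs.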
Key Information
Publication No.
WO2023237608A1
Family ID
86899237
Publication Date
2023-12-14
Application No.
EP2023065239W
Application Date
2023-06-07
Priority Date
2022-06-10
Granted
No
Possible Cooperation
For further information, please contact the transfer office.