Pixelwise Positional Embeddings for Medical Images in Vision Transformers
Simple Summary
Content extracted from patent full text and abstract with AI.
This invention relates to an improved method for analyzing medical images with vision transformer networks. It introduces pixelwise positional embeddings, which encode positional information for every pixel of an image rather than for each patch. By incorporating the location and scale metadata recorded by medical imaging devices, the approach produces features that vision transformers can use for medical analysis tasks such as classification, detection, or segmentation. This addresses limitations of conventional patch-wise positional embeddings, which are better suited to natural images.
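As a rough illustration of how location and scale metadata can give every pixel a physical position, the sketch below maps array indices to millimetre coordinates. The function name `physical_coordinates`, the parameters `spacing_mm` and `origin_mm`, and the axis-aligned geometry (no rotation between image axes and scanner axes) are assumptions made for this example and are not taken from the patent text.

```python
def physical_coordinates(shape, spacing_mm, origin_mm):
    """Return grid[row][col] = (y_mm, x_mm), the physical position of each pixel.

    Assumes an axis-aligned image: position = origin + index * spacing.
    """
    rows, cols = shape
    return [
        [
            (origin_mm[0] + r * spacing_mm[0], origin_mm[1] + c * spacing_mm[1])
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Two hypothetical acquisitions that share an origin but differ in resolution.
coarse = physical_coordinates((2, 2), spacing_mm=(2.0, 2.0), origin_mm=(10.0, 20.0))
fine = physical_coordinates((4, 4), spacing_mm=(1.0, 1.0), origin_mm=(10.0, 20.0))

# Pixel (1, 1) of the coarse image and pixel (2, 2) of the fine image lie at
# the same physical location even though their array indices differ.
same_location = coarse[1][1] == fine[2][2]
```

Because the coordinates are expressed in physical units, pixels from images with different spacing can be related to each other without resampling either image to a common grid.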
Use Cases
- Enhanced automated diagnosis or detection of diseases in medical images such as CT, MRI, X-ray, and ultrasound.
- Accurate segmentation of anatomical structures or abnormalities, including tumor or organ boundaries, in 2D/3D medical imaging.
- Longitudinal comparison of patient images from different imaging sessions or devices, increasing consistency and accuracy.
- Multimodal image analysis where images from different modalities (e.g., MRI and CT) with varying spatial relationships can be robustly processed.
- Development and training of advanced AI models for clinical decision support, radiology, and telemedicine applications.
Benefits
- Enables the use of precise spatial information of each pixel or voxel, leading to improved model performance over traditional methods using only patch-wise positional embeddings.
- Allows flexible processing of medical images acquired from different devices or imaging protocols without the need for resampling, reducing information loss.
- Facilitates better alignment and comparison between images from varying modalities and sessions, thus enhancing accuracy in analysis and diagnosis.
- Improves the ability of machine learning models to learn detailed spatial relationships in complex medical data.
- Demonstrated superior performance in medical image analysis tasks (e.g., brain infarction segmentation) compared to established models like UNETR and Swin UNETR.
Technical Classifications (CPCs)
Main Classifications
Physics & Measurement
Sub Classifications
Computing & Calculating
Information and Communication Technology for Specific Applications
CPC Codes
Inventors & Applicants
Applicants
Siemens Healthineers AG
Univ Friedrich Alexander Er
Patent Abstract
Systems and methods for performing a medical imaging analysis task based on pixelwise positionally encoded features are provided. One or more input medical images are received. One or more pixelwise positional embedding images are generated for the one or more input medical images using a spatially varying function. Patches are extracted from the one or more input medical images and the one or more pixelwise positional embedding images. The patches extracted from the one or more input medical images are encoded with corresponding ones of the patches extracted from the one or more pixelwise positional embedding images into pixelwise positionally encoded features. A medical imaging analysis task is performed using a machine learning based network based on the pixelwise positionally encoded features. Results of the medical imaging analysis task are output.
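The steps in the abstract can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the patented implementation: the spatially varying function is assumed to be a sinusoid of the physical pixel position, "encoding" is assumed to be a simple concatenation of intensity and embedding values per patch, and all function and parameter names (`positional_embedding_images`, `extract_patches`, `encode`, `wavelengths_mm`) are hypothetical. The machine learning based network itself is omitted.

```python
import math


def positional_embedding_images(shape, spacing_mm, origin_mm, wavelengths_mm):
    """Build one embedding image per (axis, wavelength, sin/cos) combination.

    Assumption: the spatially varying function is sin/cos of the physical
    position along each axis; the abstract does not fix a specific function.
    """
    rows, cols = shape
    images = []
    for axis in (0, 1):
        for wavelength in wavelengths_mm:
            for fn in (math.sin, math.cos):
                image = []
                for r in range(rows):
                    row = []
                    for c in range(cols):
                        index = (r, c)[axis]
                        position = origin_mm[axis] + index * spacing_mm[axis]
                        row.append(fn(2.0 * math.pi * position / wavelength))
                    image.append(row)
                images.append(image)
    return images


def extract_patches(image, patch):
    """Split a 2D image into non-overlapping patch x patch tiles (row-major)."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r0 in range(0, rows - patch + 1, patch):
        for c0 in range(0, cols - patch + 1, patch):
            patches.append(
                [image[r][c]
                 for r in range(r0, r0 + patch)
                 for c in range(c0, c0 + patch)]
            )
    return patches


def encode(image, embedding_images, patch):
    """Concatenate each intensity patch with its matching embedding patches.

    Assumption: concatenation stands in for the encoding step.
    """
    channels = [extract_patches(ch, patch) for ch in [image] + embedding_images]
    return [
        [value for ch in channels for value in ch[i]]
        for i in range(len(channels[0]))
    ]


# Toy 4 x 4 image with 1 mm spacing and origin at (0, 0).
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
embeddings = positional_embedding_images(
    (4, 4), spacing_mm=(1.0, 1.0), origin_mm=(0.0, 0.0), wavelengths_mm=(8.0, 32.0)
)
tokens = encode(image, embeddings, patch=2)
```

Each entry of `tokens` holds the intensity values of one patch followed by the embedding values for the same pixels, so every pixel keeps its own positional signal. In a vision transformer, such vectors would typically pass through a learned projection before entering the attention layers.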
Key Information
Publication No.
EP4531001A1
Family ID
92925695
Publication Date
2025-04-02
Application No.
EP24202841A
Application Date
2024-09-26
Priority Date
2023-09-26
Granted
No
Possible Cooperation
For further information please contact the transfer office.