The World’s Largest and Most Diverse, Curated Global Data Repository for ILDs
The OSIC Cloud Data Repository (OSIC Cloud) was built with the expertise and help of radiologists, pulmonologists, machine learning specialists, and imaging experts from around the world. It is a data-rich, curated repository of anonymized high-resolution CT (HRCT) scans and clinical information on interstitial lung diseases (ILDs), and it houses real-world clinical and imaging data that are both multi-ethnic and multi-center. One of its key features is a diverse and representative sample of anonymized patient cases. By including global data from multiple centers and ethnicities, the repository offers a more complete and realistic picture of ILDs than individual studies or datasets can provide.
The purpose of the OSIC Cloud is to provide researchers and clinicians with access to high-quality data that can be used to advance our understanding of ILDs and help improve patient care. With it, researchers can analyze and compare imaging findings with clinical data, building a more comprehensive understanding of ILDs.
The data in the OSIC Cloud is carefully curated and quality controlled to ensure its reliability and accuracy. This makes the repository a valuable resource for machine learning and artificial intelligence technologies, which require large amounts of accurately labeled data to build algorithms. By providing access to anonymized, high-quality, multi-center data, the repository has the potential to advance research and improve patient outcomes in this important field of medicine.
We believe this database could be the starting point for finding digital imaging biomarkers that could speed up diagnosis and improve understanding of individual prognosis and response to therapy.
Cloud Platform Highlights
"The future of medical research depends heavily on our ability to collate significant amounts of data, and make that data available for detailed and open scientific investigation. It's a proud moment that OSIC is at the forefront of this movement. Data is the essence of scientific progress and the OSIC repository already contains preliminary data rich enough to better understand the causes of disease, leading to better treatment and patient outcomes." Prof. David Barber, University College London, OSIC Computational Science Lead
The OSIC Cloud Data Overview
Diversity of Regions
Total committed datasets from Northern Europe, Western Europe, Central & Eastern Europe, North America, South America, and Asia-Pacific (APAC): 20,789
Countries in the Northern Europe region are defined as Denmark, Ireland, Finland, Norway, Sweden, or the United Kingdom.
Countries in the Western Europe region are defined as Austria, Belgium, France, Germany, Greece, Israel, Italy, the Netherlands, Spain, or Switzerland.
Countries in the Central and Eastern Europe region are defined as Bulgaria, Czech Republic, Croatia, Estonia, Hungary, Latvia, Lithuania, Poland, Romania, Russia, Slovenia, Serbia, or Ukraine.
Countries in the North America region are defined as Canada, Costa Rica, Guatemala, Mexico, or the United States of America.
Countries in the South America region are defined as Argentina, Brazil, Chile, Colombia, Ecuador, Paraguay, or Peru.
Countries in the Africa region are defined as Egypt, Gabon, Ghana, Kenya, Malawi, Morocco, Nigeria, South Africa, Tanzania, Tunisia, or Zambia.
Countries in the APAC region are defined as Australia, Bangladesh, China, Hong Kong, India, Japan, Korea, Malaysia, New Zealand, Philippines, Singapore, Taiwan, Thailand, or Turkey.
No Information is defined as datasets for which the specific region cannot be released (e.g., clinical trial datasets).
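For readers working with exported metadata, the region definitions above can be written as a simple lookup table. The sketch below is illustrative only: the names `REGIONS`, `COUNTRY_TO_REGION`, and `region_of` are hypothetical and not part of any OSIC Cloud interface, and treating a missing or unlisted country as "No Information" is an assumption made for this example.

```python
# Region definitions transcribed from the list above.
# All names here are hypothetical; this is not an OSIC Cloud API.
REGIONS = {
    "Northern Europe": [
        "Denmark", "Ireland", "Finland", "Norway", "Sweden", "United Kingdom",
    ],
    "Western Europe": [
        "Austria", "Belgium", "France", "Germany", "Greece", "Israel",
        "Italy", "Netherlands", "Spain", "Switzerland",
    ],
    "Central and Eastern Europe": [
        "Bulgaria", "Czech Republic", "Croatia", "Estonia", "Hungary",
        "Latvia", "Lithuania", "Poland", "Romania", "Russia", "Slovenia",
        "Serbia", "Ukraine",
    ],
    "North America": [
        "Canada", "Costa Rica", "Guatemala", "Mexico",
        "United States of America",
    ],
    "South America": [
        "Argentina", "Brazil", "Chile", "Colombia", "Ecuador", "Paraguay",
        "Peru",
    ],
    "Africa": [
        "Egypt", "Gabon", "Ghana", "Kenya", "Malawi", "Morocco", "Nigeria",
        "South Africa", "Tanzania", "Tunisia", "Zambia",
    ],
    "APAC": [
        "Australia", "Bangladesh", "China", "Hong Kong", "India", "Japan",
        "Korea", "Malaysia", "New Zealand", "Philippines", "Singapore",
        "Taiwan", "Thailand", "Turkey",
    ],
}

# Invert the table so each country maps to exactly one region.
COUNTRY_TO_REGION = {
    country: region
    for region, countries in REGIONS.items()
    for country in countries
}


def region_of(country):
    """Return the region for a country name.

    Assumption: a missing or unlisted country is reported as
    "No Information", mirroring the category used for datasets whose
    region cannot be released.
    """
    if country is None:
        return "No Information"
    return COUNTRY_TO_REGION.get(country.strip(), "No Information")
```

For example, `region_of("Turkey")` returns `"APAC"` and `region_of("Israel")` returns `"Western Europe"`, following the definitions as listed.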
New Cohort
Coming In 2025
Lung Cancer Screening – Coming Soon
The presence of interstitial lung abnormalities (ILA) on a lung cancer screening scan is a significant finding: it is associated with an increased risk of developing ILD and may be associated with a higher risk of mortality. Early detection and appropriate management of ILA may be crucial in preventing progression to ILD and improving long-term outcomes.
In a recent study of 1,384 individuals who received a lung cancer screening via CT scan, researchers identified “4% ILA in a lung cancer screening cohort; 37% had radiologic progression of ILA at 1 year and 40% were diagnosed with ILD within 5 years. Fibrotic ILA, defined by the presence of traction bronchiectasis, was a strong predictor of mortality, reduced progression-free survival, and diagnosis of ILD.”
The lung cancer screening cohort in our database will support future research to identify biomarkers that can predict ILA/ILD disease progression.
OSIC AI/Biomarker Innovation Showcase
OSIC invited member radiologists, computational scientists, clinicians, and industry experts to share insights and key learnings from working with our curated, global repository of anonymized HRCT scans and clinical data for interstitial lung diseases (ILDs).
During this collaborative forum, attendees:
- Listened to, learned from, and asked questions of OSIC members who presented their quantifiable AI advances and emerging innovations using the OSIC data.
- Learned about the benefits of the new OSIC Cloud Data Repository, powered by VIDA, including easier contribution of data, robust new filters for cohort selection, and collaboration tools to support data sharing.
- Discussed the robust clinical data curation, global normalization, and advanced data power to drive radical progress in drug development for fibrosing lung diseases.
- Discussed the inclusion of new cohorts, including lung cancer screening scans (to look for ILAs and ILDs), sarcoidosis, and alpha-1.
- Learned about Project OPUS, a new, real-time observational study that applies AI and machine learning technology to spirometry, environmental, and imaging data.
- Participated in a dynamic discussion led by panelists, who shared valuable insights on OSIC’s progress and future direction, early ILD detection, and the impact of clinical data on improving predictions.
Explore what was presented and discussed during the OSIC AI/Biomarker Innovation Showcase:
Released Open-Source
CenTime
Event-conditional modeling of censoring in survival analysis
OSIC provided a grant to University College London (UCL) to create algorithms to help advance digital imaging biomarkers for accurate imaging-based diagnosis, prognosis and prediction of response to therapy. Through this initiative, the UCL team created “CenTime: Event-conditional modeling of censoring in survival analysis.” This work aims to address key limitations in current survival analysis methods.
The algorithm has been trained and evaluated using the combination of real-world, high-resolution lung CT scans (HRCT), associated clinical data, and mortality labels from the OSIC Data Repository. CenTime has shown promising results in more accurately predicting survival probabilities. This has the potential to advance digital imaging biomarkers, not only for lung disease but for other diseases by addressing key limitations in current survival analysis methods and allowing for more accurate and reliable models.
Current Limitations vs. CenTime (comparison graphic)
UCL compared CenTime with standard methods such as the Cox proportional-hazards model and DeepHit. Results indicate that CenTime offers state-of-the-art performance in predicting time-to-death while maintaining comparable ranking performance.
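Ranking performance in survival analysis is commonly summarized with a concordance index: the share of comparable patient pairs in which the patient with the higher predicted risk is the one who dies first. The text above does not say which metric UCL reported, so the following is a generic sketch of Harrell's concordance index and not CenTime's evaluation code; the function and argument names are illustrative.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       observed time for each subject (event or censoring time)
    events:      1/True if death was observed, 0/False if censored
    risk_scores: model output where a higher score means higher risk

    A pair (i, j) is comparable when subject i has an observed event and
    times[i] < times[j]; subject j may be censored or not.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored subject cannot be the earlier member of a pair,
            # because the true event time is unknown.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # correctly ranked
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; index is undefined")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ordering, and 0.0 means every pair is ranked in reverse. Because the index only checks ordering, it says nothing about how close a predicted time-to-death is to the observed one, which is why ranking and time-prediction accuracy are reported separately.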
Learn more about OSIC
We encourage you to contact us to discuss our mission, the OSIC Cloud Data Repository, and how you can help make a difference in the fight against idiopathic pulmonary fibrosis (IPF) and ILDs. Additional partners, collaborators, and contributors are welcome and encouraged.