NIH Clinical Center Provides One of the Largest Publicly Available Chest X-Ray Datasets to Scientific Community

A chest x-ray identifies a lung mass.

The NIH Clinical Center recently released over 100,000 anonymized chest x-ray images and their corresponding data to the scientific community. The release will allow researchers across the country and around the world to freely access the datasets and increase their ability to teach computers how to detect and diagnose disease. Ultimately, this application of artificial intelligence could help clinicians make better diagnostic decisions for patients.

NIH compiled the dataset of scans from more than 30,000 patients, including many with advanced lung disease. Patients at the NIH Clinical Center, the nation’s largest hospital devoted entirely to clinical research, are partners in research and voluntarily enroll to participate in clinical trials. With patient privacy being paramount, the dataset was rigorously screened to remove all personally identifiable information before release.

Reading and diagnosing chest x-ray images may seem a relatively simple task for radiologists, but it is in fact a complex reasoning problem that often requires careful observation and knowledge of anatomical principles, physiology and pathology. Such factors increase the difficulty of developing a consistent and automated technique for reading chest x-ray images while simultaneously considering all common thoracic diseases.

The hope is that, by using this free dataset, academic and research institutions across the country will be able to teach a computer to read and process extremely large numbers of scans, to confirm the results radiologists have found and to potentially identify other findings that may have been overlooked.

In addition, this advanced computer technology may also be able to:

  • help identify slow changes occurring over the course of multiple chest x-rays that might otherwise be overlooked,
  • benefit patients in developing countries who do not have access to radiologists to read their chest x-rays, and
  • create a virtual radiology resident that can later be taught to read more complex images such as CT and MRI.

With an ongoing commitment to data sharing, the NIH research hospital anticipates making a large dataset of CT scans available as well in the coming months.
