32,000 images of analysed CT scans released for AI development

Posted: 23 July 2018

Thousands of images have been released to encourage scientists to improve their analysis of lesions and to support the development of a new universal lesion detector…


32,000 annotated images have been released into the public domain by the National Institutes of Health’s Clinical Centre. The aim is to help the scientific community improve the accuracy of the detection of lesions.

Most publicly available datasets contain only around 1,000 images or fewer, and as such are not as thorough in their coverage of different types of lesions in different areas of the body.

This large-scale dataset comprises images that have been carefully anonymised and represents 4,400 patients.

Images acquired from a CT scanner were sent to a radiologist at the Clinical Centre, who interpreted them and measured clinically meaningful findings with an electronic bookmark tool, mapping each lesion in detail with text, arrows, lines and diameters. The aim was to identify the exact size and location of the lesion, so that any further growth or change can be identified.

The newly available dataset, called DeepLesion, is rich in these types of retrospectively annotated images. It covers more than one type of lesion, as opposed to other datasets, which only distinguish one type of lesion.

Medical image annotations require extensive clinical knowledge and cannot currently be gathered the way a search engine gathers them, by collecting image labels as tags. However, using this dataset, a deep neural network could be trained to identify lesions in CT scan images, which may enable the scientific community to create a universal lesion detector within one framework.

This dataset contains critical radiology details from a vast array of lesions across the body, such as liver tumours, enlarged lymph nodes and lung nodules. Scientists at the Clinical Centre hope that, with the release of this dataset, other scientists will be able to:

  • Accurately measure the size and location of all lesions in a patient, enabling a whole-body assessment
  • Study the relationship between different types of lesions, identifying multiple lesions in one CT examination, and analysing these to make new discoveries
  • Develop a new universal lesion detector, which may serve as an initial screening tool and then pass the data on to a more specialist system designed to deal with particular lesions.
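
The screening idea in the last point can be pictured as a two-stage pipeline: a general detector proposes candidate lesions, and each confident candidate is handed to a system specialised in that lesion type. The short Python sketch below is purely illustrative and is not taken from DeepLesion or from any NIH software. The names `Finding`, `route_findings` and `lung_nodule_specialist`, the confidence scores and the 0.5 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    """One candidate lesion reported by a hypothetical universal detector."""
    lesion_type: str  # illustrative label, e.g. "lung nodule"
    score: float      # assumed detector confidence between 0 and 1


# A hypothetical specialist system: takes a finding, returns a short report.
Specialist = Callable[[Finding], str]


def route_findings(
    findings: List[Finding],
    specialists: Dict[str, Specialist],
    threshold: float = 0.5,  # assumed cut-off, not a published value
) -> List[str]:
    """Pass each sufficiently confident finding to its specialist, if any."""
    reports = []
    for finding in findings:
        if finding.score < threshold:
            continue  # the screening stage drops low-confidence candidates
        handler = specialists.get(finding.lesion_type)
        if handler is None:
            reports.append(
                f"{finding.lesion_type}: no specialist available, flag for review"
            )
        else:
            reports.append(handler(finding))
    return reports


def lung_nodule_specialist(finding: Finding) -> str:
    """Stand-in for a dedicated lung nodule system."""
    return f"lung nodule: sent to lung specialist (score {finding.score:.2f})"


# Made-up detector output for a single CT examination.
findings = [
    Finding("lung nodule", 0.91),
    Finding("liver tumour", 0.72),
    Finding("enlarged lymph node", 0.20),
]
reports = route_findings(findings, {"lung nodule": lung_nodule_specialist})
```

In this made-up run, the lung nodule goes to its specialist, the liver tumour is flagged for review because no specialist is registered for it, and the low-scoring lymph node candidate is dropped at the screening stage.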

NIH’s Clinical Centre aims to continue improving DeepLesion by collecting and uploading more images, which should improve the detection accuracy of systems trained on it. In future it intends to add other types of analysed images, such as MRI, possibly combining data from multiple hospitals.