The NLM-Mayo Image Collection: Common Access to Uncommon Data

Please use this identifier to cite or link to this publication: http://hdl.handle.net/1926/22
For over two decades, the National Library of Medicine (NLM)
has provided support for the collection of biomedical image data for use
throughout the biomedical image, visualization and analysis community.
Data collected during the Visible Human Project has been utilized to
advance development of medical image research, education, and other
ventures. One goal of our research has been to further diversify such
image data and make it openly available to the biomedical imaging community.
The approach is to provide open access to a diverse collection
of biomedical image data that can be used for the development and validation
of new image processing and analysis techniques. With support
from NLM, over 100 datasets were incorporated into the NLM-Mayo
data collection. There is variation in species, anatomy, pathology, scale,
and modality. In addition to providing clinical-quality medical image
data, the collection also includes newly acquired datasets of several animals,
including a whole mouse with both CT and MR data volumes.
This unique collection of data was categorized and organized into an intuitive
web-based browser which allows a user to rapidly access descriptive
information as well as the actual data volumes. The data collection
will be made available by the NLM for distribution through its Visible
Human Project (VHP). Because the landscape of biomedical imaging
continues to change with new, advanced image acquisition systems and
techniques, continuously updating the VHP data collection seems prudent.
Additional new and varied image data will be incorporated into our
collection and disseminated to researchers in the medical image analysis
community.
Data
1 File (397Kb)
Code
There is no code review at this time.

Reviews
Great Idea. Waiting for data to be released for open access. by Tina Kapur on 09-17-2005 for revision #1
expertise: 3 sensitivity: 4.7
Summary:

As previous reviewers have stated, this paper describes a collection of 100 diverse data sets that has already been compiled and will be made available (shortly?) for open access by the NLM via the Visible Human Project. The data sets span species (Human, Canine, Dolphin, Mouse, Rabbit), anatomy (Abdomen, Brain, Cells, Chest, Ear, Hand, Heart, Knee, Larynx, Liver, Prostate), modality (MR, CT, Confocal Microscopy, PET, SPECT, Ultrasound, microCT, microMR), and pathology. The highest-level categories of the collection are human and animal, each further subdivided into anatomical regions, then into modality, and then pathology. Many of the data sets were retrieved from archives at Mayo and its collaborators, while an interesting one acquired specifically for the collection is a full mouse data set acquired as overlapping sections with microCT and microMR at voxel resolutions of 0.02mm^3 and 0.125mm^3, and realigned into a single volume.

Each data set is available in three file formats: Analyze 7.5, Analyze Volumefile, and MetaIO. References for descriptions of these formats are provided.
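
For readers unfamiliar with MetaIO, a MetaImage header (.mhd) is plain text made of "Key = Value" lines, so basic properties of a volume can be checked before loading any voxel data. The following is a minimal sketch under that assumption. The header contents, the file name example.raw, and both function names are hypothetical and are not taken from the collection or the paper.

```python
def parse_metaimage_header(text):
    """Parse the 'Key = Value' lines of a MetaImage (.mhd) header into a dict."""
    header = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        header[key.strip()] = value.strip()
    return header


def voxel_volume_mm3(header):
    """Multiply the per-axis spacings (assumed to be in mm) to get the voxel volume."""
    volume = 1.0
    for spacing in header["ElementSpacing"].split():
        volume *= float(spacing)
    return volume


# Hypothetical header for illustration only; not a real data set from the collection.
EXAMPLE_HEADER = """\
ObjectType = Image
NDims = 3
DimSize = 256 256 128
ElementSpacing = 0.5 0.5 0.5
ElementType = MET_SHORT
ElementDataFile = example.raw
"""

header = parse_metaimage_header(EXAMPLE_HEADER)
dims = [int(n) for n in header["DimSize"].split()]
print(dims, voxel_volume_mm3(header))
```

A check like this makes it easy to confirm dimensions and voxel size for each downloaded volume without a full image-processing toolkit.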

Human data has been anonymized and complies with HIPAA standards.

The data has been organized into a website with a custom browser.

Hypothesis:
The hypothesis is that openly accessible data is useful for advancing science.

Evidence:
This is a widely accepted truth in many fields, including biomedical research. The authors have also provided references on how existing collections of data, such as the Visible Human and the Vanderbilt Registration data, have been successfully used by many researchers in different applications.

Open Science:
Data sets like this are key enablers of open science.

Reproducibility:

Use of Open Source Software:
It is not clear if the web browser and file format conversion tools are open source.

Additional Comments:

- This is very useful work and should enable research in the area as the authors have noted.

- For those interested in additional image collections, the authors have provided an interesting list (from open to restricted to closed): the Visible Korean and Chinese Human projects, MNI Brainweb for brain anatomy, the UC Davis Brain Atlas project for human and animal brains, and the BIRN brain data collection.

- Following a link in the bibliography explains BIR (http://www.mayo.edu/bir): it stands for Biomedical Imaging Resource, the computational group at Mayo where this work was conducted.

- The categories noted in the paper include 5 species, 11 anatomical regions, and 8 modalities. Even if only one data set were available for each bin of this table, 5x11x8x2 (the final factor of 2 covers one normal and one pathological case), or 880 data sets, would be needed. Clearly, there is room to add to this collection. Question for authors: are there guidelines that would allow other researchers to supply data sets that could be added to the collection to help populate the table? Are there NLM plans to solicit entries for this collection the way that algorithm implementations were successfully added to ITK?

- Having a web based browser for the data was a great choice. As useful as downloaded data viewing applications can be, it seems that browsing data in a collection is something that is very well suited for a web based browser. I am waiting to try it out to see if the data sizes will make network lag an issue while browsing.

- File formats: Is there a reason that images in this collection are not provided in the industry-standard DICOM format? Most of the data in this collection seem to be the direct output of imaging scanners. Does converting it to the Analyze/MetaIO formats, rather than keeping it in DICOM (assuming it was once in that format), lose anything?
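
The bin count in the comment above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the anatomy and modality names are copied from the summary, the species count of 5 is the figure stated in the comment, and the figure of 100 data sets is the approximate collection size. The best-case coverage assumes, hypothetically, that every data set falls into a different bin.

```python
from math import prod

# Category lists as given in the review summary.
anatomy = ["Abdomen", "Brain", "Cells", "Chest", "Ear", "Hand",
           "Heart", "Knee", "Larynx", "Liver", "Prostate"]
modality = ["MR", "CT", "Confocal Microscopy", "PET", "SPECT",
            "Ultrasound", "microCT", "microMR"]

category_sizes = {
    "species": 5,                 # count stated in the review
    "anatomy": len(anatomy),      # 11
    "modality": len(modality),    # 8
    "pathology": 2,               # one normal and one pathological case
}

# One bin per combination of species, anatomy, modality, and pathology status.
total_bins = prod(category_sizes.values())

# Assumption: roughly 100 data sets, each in a distinct bin (best case).
datasets_in_collection = 100
best_case_coverage = datasets_in_collection / total_bins

print(total_bins)                   # 880
print(f"{best_case_coverage:.1%}")  # 11.4%
```

Even under that optimistic assumption, only about one bin in nine would be populated, which supports the point that there is room for contributed data.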
Great data for testing by Hans Johnson on 09-16-2005 for revision #1
expertise: 2 sensitivity: 4.7

Summary:
This document describes a repository of publicly available images and the tools developed to allow access to the data.

Hypothesis:
Access to diverse data sets is critical to testing software. These data sets are useful for verifying that an algorithm is robust across data sets that differ from the data
used during the initial development of the algorithm.

Evidence:
Examples of the web-based interface for downloading the data are shown, and the organization of the data is described.

Open Science:
The data in this document are public. The tools described for dissemination of the data do not seem to be publicly available (although that is not the focus of the
paper).


Reproducibility:
NA

Use of Open Source Software:
Unclear, but the data itself is collected from multiple public repositories.

Open Source Contributions:
No code, but the data is very important to have available.

Code Quality:
NA

Suggestions for future work:
Continue to add interesting data sets to this collection.
Great contribution supporting open science (when can we get our hands on it?) by Josh Cates on 09-08-2005 for revision #1
expertise: 4 sensitivity: 4.3
Summary: This paper describes a new, open-access collection of medical
image data assembled by the Mayo Clinic with support from the National Library
of Medicine (NLM). The collection includes over 100 datasets of
varying modalities, anatomy and subject species and is intended to support
medical imaging research. The focus of the collection is on diversity,
multi-modality, and new imaging modalities. A web-based interface to the
collection has been developed and the data will be made available through
the NLM.

Hypothesis:
The work was motivated by the idea that open access data collections facilitate
medical imaging research, and it follows in the footsteps of the original Visible
Human Project initiative. The authors anticipate that this data will be useful
in the development and validation of new imaging algorithms.

Evidence:
The authors clearly state their motivations behind the work and cite some evidence that
similar efforts have been effective, but this paper is not intended to be a rigorous
investigation into the usefulness of open access data collections.

Open Science: This is a true open science initiative. The data is to be
released to the research community through a web interface. The article indicates
that this is an ongoing effort.

Some issues: None of the data seems to be currently available. When will it be
released? It is also not clear from the article what restrictions will be placed
on the use of the data. Will they be similar to those governing use of Visible
Human Data?

Reproducibility:
Not applicable.

Use of Open Source Software:
Not applicable.

Open Source Contributions:
(Will the web interface be open source?)

Code Quality:
Not applicable.

Applicability to other problems:
The data will no doubt be of interest to researchers in domains outside
medical imaging (e.g. computer graphics, anatomy, education, ...).

Suggestions for future work:


Requests for additional information from authors: Will any of the
volumetric datasets include the raw (un-reconstructed) scanner data? This data
would obviously be of great interest to the reconstruction community. It would
also be helpful to include information about the reconstruction algorithms used
to produce the volumes and the scanner geometries.

Additional Comments: I would encourage the authors to include additional
information as it becomes available, i.e. where and how to access, restrictions
on use, etc.

Some minor points:
The article reads fairly well, but another editing pass might be useful. There are,
for example, one or two typos to address (e.g. p. 3 comprhensive->comprehensive).
Also, I believe the acronym 'BIR' is never fully spelled out.

I look forward to getting my hands on some of this data!


Keywords: Open Data, Medical Image Archive