Abstract
An Insight Toolkit (ITK) implementation of our knowledge-based segmentation algorithm applied to brain MRI scans is presented in this paper. Our algorithm is a refinement of the work of Teo, Sapiro, and Wandell. The basic idea is to incorporate prior knowledge into the segmentation through Bayes' rule. Image noise is removed via an affine-invariant anisotropic smoothing of the posteriors, as in Haker et al. We present the results of this code on two different projects. First, we show the effect of applying this code to skull-stripped brain MRI scans. Second, we show the effect of applying this code to the extraction of the dorsolateral prefrontal cortex (DLPFC) from a user-defined subregion of brain MRI data. We present our results on brain MRI scans, comparing the knowledge-based segmentations to manual segmentations on datasets of schizophrenic patients.
Keywords
Source Code and Data
Reviews
Vincent Magnotta
Monday 8 August 2005
Summary: [Short description of the paper. In two or three phrases describe the problem that was addressed by the authors and the approach they took to solve it.] This paper describes a Bayes-based segmentation of cortical regions using MR images, with expert-drawn definitions used to train the Bayes-based classification. This was used to define the DLPFC.
Hypothesis: [If Applicable: Describe the assumptions that the authors have made and the hypothesis of their work. Note that not all papers will fit the model of hypothesis-driven work; for example, the description of an image database or of a toolkit will not be driven by a hypothesis, in which case please simply write "Non Applicable" in this field or delete the subtitle.] Anatomical variability in the DLPFC is not so large as to prevent a Bayes-based classification from being applied to this region without additional image registration.
Evidence: [Describe the evidence that the authors provide in order to support their claims in the paper. This is a key component of Open Science; opinions that are not supported by evidence should be labeled as "speculations" or "author's opinion". The same rule applies to the text of the reviews: claims should be supported by evidence.] The authors present DICE measures of region agreement. It is unclear why only single-slice DICE metrics are provided instead of a 3D measure.
Open Science: [Describe how much the paper and its addendums adhere to the concept of Open Science. Do the authors provide the source code of the programs used in their experiments? Do the authors provide the input images that they used? Or are those images publicly available? Do the authors provide the output images that they show in the paper? Do the authors provide enough details for you to be able to replicate their work?] The authors provide source code. No manually defined ROIs or image data are provided.
Reproducibility: [Did you reproduce the authors' work? Did you download their code? Did you compile it? Did you run it? Did you manage to get the same results that they reported? Was there information missing from the paper that was necessary for you to reproduce the work? Suggest improvements that will make it easier for future readers to reproduce this work.]
Use of Open Source Software: [Did the authors use Open Source software in their work? Do they describe their experience with it, advantages and disadvantages? Do they provide advice for future users of those Open Source packages?]
Open Source Contributions: [Do the authors provide their source code? Is it in a form that is usable? Do they describe clearly how to use the code? How long did it take you to use that code?]
Code Quality: [If the authors provided their source code: Was the code easy to read? Did they use a modern coding style? Did they rely on non-portable mechanism? Was it suitable for multiple-platforms?]
Applicability to other problems: [Do you find that the authors methods can be applied to other image analysis problems? Suggest other disciplines or even other specific projects that could take advantage of this work] This approach has great potential for a variety of applications.
Suggestions for future work : [Suggest to authors future directions for improving their methods, or other domains from which they could learn technique that could help them advance in their research.]
Requests for additional information from authors : [Did you find that information was missing from the paper? Maybe parameters for running the tests? Maybe some images were missing? Would you like to get more details on how the diagrams, or plots were generated?] The authors may want to specify how this algorithm relates to the work of Fischl et al.
Additional Comments: [This is a free-form field]
Gavin Baker
Friday 16 September 2005
Summary:
This paper presents a statistical approach to segmenting MRI brain scans. It extends a method described by Teo et al., and its implementation is built around a K-means classifier.
Hypothesis:
The first stated assumption is that “the value of each voxel intensity in a given class can be considered as a random variable, independent across pixels”. The second assumption is that the voxel intensities are normally distributed. The hypothesis of the paper is that a priori intensity statistics can be derived from an image which can then be used to segment the tissue types.
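Under these two assumptions, the classification reduces to maximum-a-posteriori (MAP) labelling with Gaussian class-conditional densities. A minimal sketch of this rule (illustrative only, not the authors' ITK implementation; the class parameters used below are placeholders):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One tissue class: a Gaussian intensity model plus a prior probability.
struct GaussianClass { double mean, variance, prior; };

const double kPi = 3.14159265358979323846;

// p(x | class) * p(class); the shared normalizer p(x) cancels in the argmax.
double posteriorNumerator(double x, const GaussianClass& c) {
  const double d = x - c.mean;
  return c.prior * std::exp(-d * d / (2.0 * c.variance))
         / std::sqrt(2.0 * kPi * c.variance);
}

// MAP decision: assign the voxel intensity x to the most probable class.
std::size_t mapLabel(double x, const std::vector<GaussianClass>& classes) {
  std::size_t best = 0;
  double bestScore = -1.0;
  for (std::size_t k = 0; k < classes.size(); ++k) {
    const double s = posteriorNumerator(x, classes[k]);
    if (s > bestScore) { bestScore = s; best = k; }
  }
  return best;
}
```

With uniform priors this reduces to nearest-mean classification weighted by each class variance, which is why the K-means estimates of the means and covariances drive the labelling.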
Evidence:
The authors present segmentation results on two different MRI data sets. These results could not be reproduced, as the data and parameters were not provided.
The data used for validation was first from coronal MRI scans. The data was hand-segmented and compared with the classified output of the program. The correspondence in all 10 cases was very good (>0.7). The second test involved a single ROI from an MRI of the prefrontal cortex. The correspondence with manual segmentation was similarly high.
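The Dice overlap referred to here is defined as 2|A ∩ B| / (|A| + |B|) for two segmentations A and B; values above roughly 0.7 are a common rule of thumb for good agreement. A minimal sketch over binary masks (illustrative, not the validation tooling the authors used):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dice coefficient between two binary masks of equal size:
// 2 * |A intersect B| / (|A| + |B|), in [0, 1].
double diceCoefficient(const std::vector<int>& a, const std::vector<int>& b) {
  std::size_t inter = 0, sizeA = 0, sizeB = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i]) ++sizeA;
    if (b[i]) ++sizeB;
    if (a[i] && b[i]) ++inter;
  }
  if (sizeA + sizeB == 0) return 1.0;  // both masks empty: define as perfect
  return 2.0 * static_cast<double>(inter) / (sizeA + sizeB);
}
```

The same formula applies unchanged to 3D label volumes, which is why a volumetric Dice measure would have been straightforward to report alongside the single-slice values.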
Open Science:
The paper includes full source, but no data or parameters.
The full details of the algorithm are not included in this paper, apparently due to space constraints. This makes it difficult to evaluate the algorithm, verify the claims and implementation.
Reproducibility:
No data was provided to reproduce the findings in the paper. At least one data set and accompanying parameters should be supplied, in order to enable people to reproduce the results.
I downloaded, compiled and ran the program. It was built on a Debian GNU/Linux Pentium 4 system with GCC 3.3 and ITK CVS.
In the absence of any test data, I obtained a data set from BrainWeb (http://www.bic.mni.mcgill.ca/brainweb/). The dataset selected was Modality=T1, Protocol=ICBM, Phantom_name=normal, Slice_thickness=1mm, Noise=3%, INU=20%. Since the implementation works in 2D, I selected slice 87 for testing (being close to those shown in the paper, containing bone, white matter, grey matter, and CSF; bone was not removed, as mentioned in the paper). The histogram of this slice reveals 4 peaks (at approximately 3.5, 42.1, 99.0, and 133.2) representing the four classes mentioned above.
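For reference, the kind of 1-D K-means clustering the program applies to these intensity peaks can be sketched as a plain Lloyd's-algorithm loop (an illustration only, not the ITK estimator the program uses; the data and seed means below are made up):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// 1-D Lloyd's k-means: assign each intensity to the nearest mean,
// recompute the means, and repeat until the means stabilize.
std::vector<double> kmeans1d(const std::vector<double>& data,
                             std::vector<double> means, int maxIter = 100) {
  for (int it = 0; it < maxIter; ++it) {
    std::vector<double> sum(means.size(), 0.0);
    std::vector<std::size_t> count(means.size(), 0);
    for (double x : data) {
      std::size_t best = 0;
      for (std::size_t k = 1; k < means.size(); ++k)
        if (std::abs(x - means[k]) < std::abs(x - means[best])) best = k;
      sum[best] += x;
      ++count[best];
    }
    bool changed = false;
    for (std::size_t k = 0; k < means.size(); ++k) {
      if (count[k] == 0) continue;  // keep an empty cluster's mean fixed
      const double m = sum[k] / count[k];
      if (m != means[k]) { changed = true; means[k] = m; }
    }
    if (!changed) break;  // converged
  }
  return means;
}
```

Run on the slice's intensity histogram with 4 seed means, such a loop converges to the 4 peak locations, matching the cluster means the program reports below.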
The program was run, specifying 2 filter passes and 4 classes. The output is shown below:
cluster[0] -- estimated mean : 4.70918   estimated covariance : 27.4318
cluster[1] -- estimated mean : 46.2727   estimated covariance : 136.147
cluster[2] -- estimated mean : 97.0753   estimated covariance : 98.3453
cluster[3] -- estimated mean : 133.666   estimated covariance : 103.385
Prior image in initial section [0.25, 0.25, 0.25, 0.25]
RawData image in initial section 0
Data image in initial section [1.79769e+308, 1.79769e+308, 1.79769e+308, 1.79769e+308]
Initial Posteriors [4.49423e+307, 4.49423e+307, 4.49423e+307, 4.49423e+307]
After renormalizing in initial section [0, 0, 0, 0]
Posteriors after smoothing in initial section [nan, nan, nan, nan]
Label image in initial section 0
Posteriors after decision rule in initial section [nan, nan, nan, nan]
The K-means clustering correctly identified the peaks in the histogram corresponding to the 4 classes to within a small margin. The labelled output was blank, presumably due to the zeros and nans in the results above. It appears the program failed to calculate the initial posteriors correctly. Further results were not pursued.
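One speculative reconstruction of that failure, assuming the per-class data values really were left at DBL_MAX (1.79769e+308) as the log suggests: summing four such values overflows to +infinity, dividing a finite value by infinity drives every posterior to zero, and renormalizing an all-zero vector then produces 0/0 = nan, which the smoothing propagates. A minimal demonstration of that chain (not the authors' code):

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Normalize one per-class value against the sum over nClasses identical
// values, mimicking a posterior renormalization step.  With value ==
// DBL_MAX the accumulation overflows, so the result collapses to zero.
double normalizeAgainstSum(double value, int nClasses) {
  double sum = 0.0;
  for (int k = 0; k < nClasses; ++k)
    sum += value;   // DBL_MAX + DBL_MAX overflows to +inf
  return value / sum;  // finite / inf == 0
}
```

Guarding the normalization (e.g. working with log-probabilities, or checking that the divisor is finite and non-zero) would avoid this failure mode regardless of its exact cause.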
Use of Open Source Software:
The implementation is an extension of ITK, and adds a utility class. The algorithm described is intended to ultimately be contributed also, once it has been rewritten as a proper ITK filter.
Open Source Contributions:
Source is provided to test the algorithm in the form of a command-line program, which appears to be portable. The paper describes how to use the code. However, as it stands, the algorithm is implemented as one monolithic main() function and is not reusable in its current form.
Code Quality:
The code did not build as published. The CMakeLists file needed to be changed to refer to the main driver program, and some corrections needed to be applied to KnowledgeBasedSegmentation.cxx in order to compile and run correctly:
--- orig/KnowledgeBasedSegmentation.cxx 2005-09-16 18:03:11.898713993 +1000
+++ ./KnowledgeBasedSegmentation.cxx 2005-09-16 19:27:30.498662560 +1000
@@ -37,7 +37,7 @@
int main( int argc, char * argv [] )
{
- if( argc < 8 )
+ if( argc < 5 )
{
std::cerr << "Missing command line arguments" << std::endl;
std::cerr << "Parameters: inputFileName outputFileName nSmoothingIterations nClasses" << std::endl;
@@ -46,8 +46,8 @@
char * rawDataFileName = argv[1];
char * labelMapFileName = argv[2];
- int nSmoothingIterations = argv[3]; // USER VARIABLE (DEFAULT = 10)
- unsigned int nClasses = argv[4];
+ int nSmoothingIterations = atoi(argv[3]); // USER VARIABLE (DEFAULT = 10)
+ unsigned int nClasses = atoi(argv[4]);
float timeStep = 0.1; // USER VARIABLE (DEFAULT = 0.1)
float conductance = 3.0; // USER VARIABLE (DEFAULT = 3.0)
The implementation is provided largely in one very large main() function. This does not follow the principles of modular design and prevents the algorithm from being reused in other projects. The coding style is fairly consistent but sparsely commented.
As shown above, the program did not appear to run correctly to completion, so no real results were obtained.
The code uses only standard ITK and C library functions, and should be easily portable to common platforms.
Applicability to other problems:
This technique could be developed further and extended to apply to other k-class segmentation problems.
Suggestions for future work:
The explanation of the algorithm design could benefit from expansion.
Requests for additional information from authors:
Why is the algorithm only working in 2D with individual slices?
Why does “removing” the bone improve the results? Surely it is simply another tissue class?
Why did you choose DICE to compare the segmentation results?
It would be very helpful to provide at least one test data set and the parameters required to reproduce the data you describe in the paper.
Steps 2-4 need further explanation and discussion.
Additional Comments:
Many people view the term “knowledge-based” as a fairly significant claim, often overused, in that “knowledge” is a very high-level concept implying experience, ideas and inferences. In other words, much more than just facts, data or statistics. In this case, the segmentation is ostensibly based on a priori statistics, which is arguably not the same as “knowledge”.
In this instance, a priori information is not gathered from other training data sets. The statistics are gathered from the one image, and the cluster means are used to classify each voxel using a MAP approach. It is probably a stretch to call this “knowledge-based”.
The paper does not discuss other statistical approaches, nor contextualise this research.
As the paper suggests this is part of ongoing work, the newer more developed versions could serve as a useful addition to the ITK codebase.
