Segmentation of Skull-infiltrated Tumors Using ITK: Methods and Validation

Please use this identifier to cite or link to this publication: http://hdl.handle.net/1926/21
Methods for segmentation of skull infiltrated tumors in Computed
Tomography (CT) images using the Insight Segmentation and Registration
Toolkit (ITK, www.itk.org) are presented. Pipelines of filters and
algorithms from ITK are validated on the basis of different criteria: sensitivity,
specificity, Dice similarity coefficient, Chi-squared, and Hausdorff
distance measure. The method to rate segmentation results in relation
to validation metrics is presented together with an analysis of the importance of
different goodness measures. Results for one simulated dataset and three
patient datasets are presented.
Data
1 File (210Kb)
Code
There is no code review at this time.

Reviews
Evaluation of ITK Segmentation Methods by Luis Ibanez on 09-18-2005 for revision #1
expertise: 5 sensitivity: 4.5
Summary:
This paper describes the application of algorithms available in ITK to the segmentation of brain tumors infiltrating the skull. It also implements a methodology for evaluating their results against those of a human expert.

Hypothesis:
This paper hypothesizes that segmentation methods from the Insight Toolkit can be used for delineating skull infiltrated tumors. To test this hypothesis, the authors selected multiple segmentation methods from the toolkit, ran segmentation on test data and implemented an evaluation methodology for comparing the results of the software-based segmentation with the human expert segmentations.

Evidence:
The authors present abundant evidence from the results of their segmentation experiments. The comparison of the segmentation methods reveals that the authors invested a significant amount of work in exploring the applicability of the many segmentation methods compared here. It is unfortunate that the full material of those experiments was not shared with the readers, who would have benefited greatly from being able to apply the same set of experiments to other medical segmentation problems.

Open Science:
The paper describes the segmentation methods evaluated in this case, and the evaluation methodology used for comparing the results. This evaluation methodology seems to be the most valuable contribution of the paper, since it reveals that a wide set of segmentation tools was systematically tested. Unfortunately, the authors did not share the source code used for their tests and for their evaluation methodology. Although their code is based on ITK filters, it would take a reader a significant amount of time to replicate the work of the authors, since the reader would have to reimplement the code for all the tests.

Reproducibility:
The work can hardly be reproduced. Not enough coding details are provided by the authors, and in particular, no details regarding the many numerical parameters of the segmentation methods are mentioned. In its current form it is impossible to replicate the work performed by the authors. A reader could run an equivalent set of experiments using ITK filters, but the lack of the input images and of details regarding the parameters makes it impossible to ensure that the reader's set of tests is equivalent to the one reported by the authors.


Use of Open Source Software:
The authors make extensive use of the Insight Toolkit, and took care to use ITK filters as building blocks. Their study of the applicability of ITK segmentation filters is certainly valuable for the readers, since it guides them towards the segmentation methods that displayed the best performance (compared to a human expert) for segmenting the skull-infiltrated tumors.



Open Source Contributions:
The authors did not provide their source code.


Code Quality:
Reviewer's speculation: Judging from the description in the paper, it seems that the authors made good use of the code available in the Insight Toolkit, by combining ITK filters as building blocks for their segmentation pipelines.

Applicability to other problems:
The generic problem faced by the authors is indeed quite challenging, since the intensity of the image alone is not enough for segmenting the objects. The ITK methods that the authors found to be more appropriate for this problem will probably perform well in other segmentation situations where the intensity of the object to be extracted is similar to the intensity of adjacent structures.


Suggestions for future work:
Many of the methods used in VALMET (Gerig, G., Jomier, M., Chakos, M.) are now available in ITK in the form of filters, in particular the overlap measures, the Hausdorff distance and the surface mean distance. It could be interesting to create a reusable framework for segmentation validation that any author could use during the validation stages of their segmentation work.
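
To make the distance measures named above concrete, here is a minimal pure-Python sketch of the Hausdorff distance and of a symmetric mean surface distance between two sets of boundary points. It is an illustration only and not the ITK or VALMET implementation: the function names are hypothetical, the points are assumed to be coordinate tuples of boundary voxels, and the mean surface distance follows one common definition (definitions vary between tools).

```python
import math


def nearest_distances(a, b):
    """For every point in `a`, the distance to its closest point in `b`."""
    return [min(math.dist(p, q) for q in b) for p in a]


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(nearest_distances(a, b))


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def mean_surface_distance(a, b):
    """Average nearest-point distance, pooled over both directions (assumed definition)."""
    d = nearest_distances(a, b) + nearest_distances(b, a)
    return sum(d) / len(d)


# Hypothetical boundary points of an automatic and a manual contour.
auto_contour = [(0, 0), (1, 0)]
manual_contour = [(0, 0), (4, 0)]
worst_case = hausdorff(auto_contour, manual_contour)
average = mean_surface_distance(auto_contour, manual_contour)
```

The brute-force nearest-point search is quadratic in the number of boundary points, so a reusable framework would likely rely on distance maps or spatial indexing instead.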



Requests for additional information from authors:
It would be very useful if the authors shared their source code with the community, in particular all the parameter settings of the segmentation methods. The source code of their evaluation framework would also be extremely useful for anybody performing a segmentation evaluation study.


Additional Comments:
Since the authors found the Chi-squared value to be a convenient measure for comparing the segmentation results to the human expert delineation, it would be interesting to implement that measure as an ITK filter.
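
The paper's exact definition of the Chi-squared measure is not reproduced on this page. As a rough sketch, the code below assumes the standard Pearson statistic for a 2x2 voxel agreement table between an automatic and a manual labeling. The function names and the toy label lists are hypothetical.

```python
def confusion_counts(auto, manual):
    """Count TP, FP, FN, TN for two equal-length sequences of 0/1 voxel labels."""
    tp = fp = fn = tn = 0
    for a, m in zip(auto, manual):
        if a and m:
            tp += 1
        elif a:
            fp += 1
        elif m:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def chi_squared(tp, fp, fn, tn):
    """Pearson Chi-squared of the 2x2 table (assumed definition, no continuity correction)."""
    n = tp + fp + fn + tn
    denom = (tp + fp) * (fn + tn) * (tp + fn) * (fp + tn)
    if denom == 0:
        return 0.0  # a row or column of the table is empty; statistic undefined
    return n * (tp * tn - fp * fn) ** 2 / denom


# Hypothetical flattened label images: 1 = tumor, 0 = background.
auto_labels = [1, 1, 1, 0, 0, 0, 0, 0]
manual_labels = [1, 1, 0, 1, 0, 0, 0, 0]
value = chi_squared(*confusion_counts(auto_labels, manual_labels))
```

Larger values indicate stronger association between the two labelings. For a fixed table size the statistic is largest when the labelings agree (or disagree) completely.
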
Interesting Validation Study by Sylvain Bouix on 08-24-2005 for revision #1
expertise: 4 sensitivity: 4.5
Summary:
This paper describes a validation study of the segmentation of skull infiltrated tumors.
Two segmentation pipelines implemented in ITK are presented, one based on region growing, one based on level sets.
They are then validated against one expert's segmentations using five similarity measures: sensitivity, specificity, Dice, Chi-squared and Hausdorff distance.

Hypothesis:
Not applicable

Evidence:
Four datasets were used, 3 patients with manual tracings by 1 human expert and 1 phantom with ground truth.
The experiments are clearly explained and sufficient evidence is provided.

Open Science:
There is enough information in the paper to reproduce the experiment, but no data or source code is provided.

Use of Open Source Software:
The authors used ITK for the segmentation pipelines and Hausdorff measurements.

Open Source Contributions:
No code provided.

Applicability to other problems:
The main contribution of the paper is to question certain measurements when validating segmentation techniques, which I think is important for the community in general. They particularly caution the reader against using specificity and sensitivity as the sole validation parameters, as they depend on the relationship between image size and object size. They later show that these measures actually rank region growing higher than level sets even though other measures disagree.
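
The dependence on image size is easiest to see for specificity, whose denominator includes the true-negative (background) count. The sketch below scores the same hypothetical segmentation error in a tightly cropped volume and in a much larger one. All voxel counts and function names are made up for illustration and are not taken from the paper.

```python
def sensitivity(tp, fn):
    """Fraction of reference (expert) tumor voxels that were found."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of reference background voxels that were correctly left out."""
    return tn / (tn + fp)


def dice(tp, fp, fn):
    """Dice similarity coefficient; it does not use the true-negative count."""
    return 2 * tp / (2 * tp + fp + fn)


def scores_for_image(total_voxels, tp=800, fp=400, fn=200):
    """Score one fixed, hypothetical segmentation error in an image of the given size."""
    tn = total_voxels - tp - fp - fn  # every remaining voxel is agreed background
    return {
        "sensitivity": sensitivity(tp, fn),
        "specificity": specificity(tn, fp),
        "dice": dice(tp, fp, fn),
    }


small = scores_for_image(10_000)      # tightly cropped volume
large = scores_for_image(10_000_000)  # much larger volume, same tumor and same errors
for name in ("sensitivity", "specificity", "dice"):
    print(f"{name:12s} cropped={small[name]:.5f} large={large[name]:.5f}")
```

Only specificity changes between the two runs. It moves closer to 1 simply because more background was added, while the segmentation error itself is identical.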

Suggestions for future work:
In the context of open science, I suggest the authors incorporate the validation measurements into ITK and share their data and source code so experiments can be reproduced elsewhere.
I also suggest they investigate other similarity measures found in the statistics literature. The article by Hripcsak and Heitjan on "Measuring Agreement in medical informatics reliability studies" is a good starting point.

Comment by Aleksandra Popovic: Comment by the author
I would like to thank Sylvain for the comprehensive and helpful review. I haven’t previously read the paper you suggested. Hripcsak and Heitjan present an overview of validation metrics in medical reliability studies. They pay the most attention to the kappa statistic, defined as the normalized difference between the observed agreement and the agreement expected by chance. Subtracting the agreement expected by chance ensures that kappa equals zero when agreement is at chance level, and the normalization by (1 - expected agreement by chance) scales perfect agreement to one.
As stated in this paper and in another by Hripcsak and Rothschild (Hripcsak and Rothschild. Agreement, the F-Measure, and Reliability in Information Retrieval, J Am Med Inform Assoc., 2005; 12:296-298 DOI 10.1197), the kappa statistic is:

- prevalence sensitive
- impossible to calculate without a negative case count (requires number of TN)

Hripcsak and Rothschild propose the use of the F-measure, which is a weighted harmonic mean of recall and precision. In the case of balanced data (weight = 1) it is equal to the Dice Similarity Coefficient (DSC).
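
The equivalence for weight = 1 can be checked numerically. The following is a minimal sketch using the usual F-beta formula, not the implementation mentioned in this comment. The function names and voxel counts are hypothetical, and the code assumes at least one true positive so that precision and recall are defined.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta); assumes tp > 0."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def dice(tp, fp, fn):
    """Dice similarity coefficient from voxel counts."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical voxel counts for one segmentation.
tp, fp, fn = 800, 400, 200
balanced_f = f_measure(tp, fp, fn, beta=1.0)
dsc = dice(tp, fp, fn)
```

With beta = 1 the expression 2PR / (P + R) reduces algebraically to 2TP / (2TP + FP + FN), which is the DSC. Other weights shift the score towards precision or recall and no longer match it in general.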

I have implemented a kappa calculation in our classes. I can roughly say that DSC and kappa give very similar results (difference < 0.02) if both are high (i.e. above 0.7). In cases of small DSC and kappa (<0.7) I could notice more significant discrepancies (about 0.1).
I will try to organize the data and possibly add it in a future revision.

This is partially consistent with the conclusion of Hripcsak and Rothschild:
“If d (true negative number) is at least known to be large, however, the probability of chance agreement on positive cases approaches zero; Equation 5 (kappa) approaches Equation 4 (DSC), and kappa approaches the positive specific agreement (i.e. DSC)”.
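
The limiting behaviour in the quotation can be illustrated with a short sketch. It assumes the standard two-rater Cohen's kappa computed from a 2x2 voxel table. The counts are hypothetical and the code is not the implementation mentioned above.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Chance-corrected agreement between two binary labelings (Cohen's kappa)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement: both say "tumor" plus both say "background".
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)


def dice(tp, fp, fn):
    """Dice similarity coefficient (positive specific agreement)."""
    return 2 * tp / (2 * tp + fp + fn)


# Same hypothetical errors, with a growing number of true negatives (d).
tp, fp, fn = 800, 400, 200
dsc = dice(tp, fp, fn)
gaps = {tn: abs(cohen_kappa(tp, fp, fn, tn) - dsc)
        for tn in (8_600, 98_600, 9_998_600)}
```

In this toy setting the gap between kappa and DSC shrinks steadily as the true-negative count grows, which is the behaviour the quotation describes.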

Greetings
Aleksandra
Keywords: ITK, Segmentation, Validation, Level Sets, Region Growing
Linked Publications
A Label Geometry Image Filter for Multiple Object Measurement
by Padfield D., Miller J.
Label object representation and manipulation with ITK
by Lehmann G.
