Keith Baggerly answers a few questions about this month's
new hot paper in the field of Computer Science.
From: July 2005
Field: Computer Science
Article Title: Reproducibility of SELDI-TOF protein patterns in serum: comparing datasets from different experiments
Authors: Baggerly, KA; Morris, JS; Coombes, KR
Journal: BIOINFORMATICS
Volume: 20
Page: 777-U710
Year: MAR 22 2004
* Univ Texas, MD Anderson Canc Ctr, Dept Biostat, 1515 Holcombe Blvd, Box 447, Houston, TX 77030 USA.
* Univ Texas, MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA.
Why do you think your paper is highly cited?
There are a few reasons. First, we looked at a diagnostic test of
great clinical relevance. Second, we gave clear examples of how
things could go wrong, supported by vivid pictures. Third, we
suggested how these problems could be avoided in future studies.
Fourth, we illustrated the need for the occasional use of
"forensic bioinformatics," diagnosing features of the
experiment on the basis of traces left in the raw data. Fifth, the
results have been controversial, and have led to an ongoing debate.
The most recent part of this debate is in Baggerly et al.,
J. Natl. Canc. Inst. 97:307-309, 2005.
Does it describe a new discovery or a new methodology that's useful to others?
Oddly enough, the main thrust of our paper is not to introduce
NEW methodology, but rather to describe the need for careful
application of established methodology to new methods of
measurement. Over the past few years, several high-throughput assays
for biological information have been developed. These assays are of
great utility because of their ability to detect changes. However,
since these assays are not picky about the types of changes they
detect, randomization of the samples needs to be employed to ensure
that the differences we find are due to what we think they are.
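As a minimal sketch of what randomizing the samples can look like in practice, the snippet below shuffles the order in which samples are run, so that run position is not tied to disease status. The sample labels, the function name `randomized_run_order`, and the seed are hypothetical and used only for illustration; this is not the procedure from the paper.

```python
import random

def randomized_run_order(sample_ids, seed=None):
    """Return the given sample IDs in a random processing order."""
    rng = random.Random(seed)   # seeded so the schedule can be reproduced
    order = list(sample_ids)    # copy, so the input list is left untouched
    rng.shuffle(order)
    return order

# Hypothetical sample labels: 5 cancer sera and 5 control sera.
samples = [f"cancer_{i}" for i in range(5)] + [f"control_{i}" for i in range(5)]
run_order = randomized_run_order(samples, seed=0)
print(run_order)
```

Recording the seed alongside the data keeps the run schedule auditable later.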
Could you summarize the significance of your paper in layman's terms?
Over the past few years, dramatic claims have been made about how
proteomic patterns derived from serum samples could be used to
detect ovarian cancer and possibly other types as well. These
patterns are found by data mining of high-dimensional mass
spectrometry profiles, and do not require the identification of the
proteins involved. Since we don't have noninvasive tests at present,
such assays would be of great clinical utility.
In reexamining the raw data, we found evidence that the
differences found and attributed to disease status could be
explained at least as well by systematic biases in sample handling
and preprocessing. As an example, if cancer data are gathered on
Monday and control data on Wednesday, then a drift of the values
reported due to a gradual breakdown of the machine would give rise
to machine differences that could be mistaken for biological
differences. Our paper illustrated the types of problems that could
occur, and pointed out the need for careful experimental design in the
planning of such studies so that biases could be avoided (especially
vital when black box approaches are used). It indicated the importance
of having access to the raw data, and the need to do quality checking
before data mining.
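The Monday/Wednesday example can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis of the actual data: every sample has the same true intensity, the instrument drifts linearly with run position, and the drift rate, noise level, and sample counts are made-up values.

```python
import random
import statistics

def simulate(run_labels, drift_per_run=0.05, noise_sd=0.1, seed=0):
    """Simulate one peak intensity with NO true biological difference.

    Every sample has the same true value (1.0). The instrument adds a
    linear drift with run position plus Gaussian noise (both assumed).
    Returns the cancer-minus-control difference in mean measured value.
    """
    rng = random.Random(seed)
    measured = {"cancer": [], "control": []}
    for position, label in enumerate(run_labels):
        value = 1.0 + drift_per_run * position + rng.gauss(0.0, noise_sd)
        measured[label].append(value)
    return statistics.mean(measured["cancer"]) - statistics.mean(measured["control"])

n = 50
# Confounded design: all cancer samples are run first, then all controls.
confounded = ["cancer"] * n + ["control"] * n
# Randomized design: the same samples in a shuffled run order.
randomized = confounded[:]
random.Random(1).shuffle(randomized)

diff_confounded = simulate(confounded)
diff_randomized = simulate(randomized)
print(f"confounded: {diff_confounded:.2f}  randomized: {diff_randomized:.2f}")
```

In the confounded design the drift alone produces a large apparent group difference even though the simulated biology is identical; with a shuffled run order the drift is spread across both groups and the apparent difference is much smaller.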
How did you become involved in this research?
Because we work at a cancer center, we see a great deal of interest in
developing better diagnostic assays that can catch the disease
early, when treatment is more likely to be effective. Consequently,
when new methods that look promising appear, we are often contacted
with the question "can we do this too?" In this context,
we try to figure out how results were initially achieved, which at
times puts us in the devil's advocate position of asking whether the
results could have been obtained due to causes other than the ones
initially thought. Understanding how the data are acquired helps us
pose these questions, and the learning process is fascinating!
Keith Baggerly
Associate Professor
Biostatistics and Applied Mathematics
MD Anderson Cancer Center
Houston, TX, USA
Jeff Morris
Assistant Professor
Biostatistics and Applied Mathematics
MD Anderson Cancer Center
Houston, TX, USA
Kevin Coombes
Associate Professor
Biostatistics and Applied Mathematics
MD Anderson Cancer Center
Houston, TX, USA
ESI Special Topics, July 2005
Citing URL - http://www.esi-topics.com/nhp/2005/july-05-KeithBaggerly.html