New Hot Paper Comments

By Keith Baggerly

ESI Special Topics, July 2005
Citing URL - http://www.esi-topics.com/nhp/2005/july-05-KeithBaggerly.html

Keith Baggerly answers a few questions about this month's new hot paper in the field of Computer Science.


Field: Computer Science
Article Title: Reproducibility of SELDI-TOF protein patterns in serum: comparing datasets from different experiments
Authors: Baggerly, KA;Morris, JS;Coombes, KR
Journal: BIOINFORMATICS
Volume: 20
Page: 777-U710
Year: MAR 22 2004
* Univ Texas, MD Anderson Canc Ctr, Dept Biostat, 1515 Holcombe Blvd, Box 447, Houston, TX 77030 USA.
* Univ Texas, MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA.

ST:  Why do you think your paper is highly cited?


There are a few reasons. First, we looked at a diagnostic test of great clinical relevance. Second, we gave clear examples of how things could go wrong, supported by vivid pictures. Third, we suggested how these problems could be avoided in future studies. Fourth, we illustrated the need for the occasional use of "forensic bioinformatics," diagnosing features of the experiment on the basis of traces left in the raw data. Fifth, the results have been controversial, and have led to an ongoing debate. The most recent part of this debate is in Baggerly et al., J. Natl. Canc. Inst. 97:307-309, 2005.

ST:  Does it describe a new discovery or a new methodology that's useful to others?

Oddly enough, the main thrust of our paper is not to introduce NEW methodology, but rather to describe the need for careful application of established methodology to new methods of measurement. Over the past few years, several high-throughput assays for biological information have been developed. These assays are of great utility because of their ability to detect changes. However, since these assays are not picky about the types of changes they detect, randomization of the samples needs to be employed to ensure that the differences we find are due to what we think they are.
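As a rough illustration of the randomization described above, the following Python sketch spreads samples across processing batches so that each batch holds a mix of cancer and control samples. It is only one simple way to randomize, not the procedure used in the paper; the function name `assign_batches`, the batch count, and the sample labels are all hypothetical.

```python
import random
from collections import defaultdict

def assign_batches(labels, n_batches, seed=0):
    """Map each sample index to a batch, balancing labels across batches.

    Hypothetical helper: samples are shuffled within each label group and
    then dealt out to batches in turn, so no batch is all cancer or all control.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    batch_of = {}
    for label in sorted(by_label):
        members = by_label[label]
        rng.shuffle(members)  # random order within the group
        for position, idx in enumerate(members):
            batch_of[idx] = position % n_batches  # deal out round-robin
    return batch_of

# Invented example: 6 cancer and 6 control samples run over 2 days.
labels = ["cancer"] * 6 + ["control"] * 6
batch_of = assign_batches(labels, n_batches=2, seed=42)
```

With this layout, any day-to-day change in the instrument affects both groups about equally, so it is less likely to be mistaken for a disease effect.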

ST:  Could you summarize the significance of your paper in layman's terms?

Over the past few years, dramatic claims have been made about how proteomic patterns derived from serum samples could be used to detect ovarian cancer and possibly other types as well. These patterns are found by data mining of high-dimensional mass spectrometry profiles, and do not require the identification of the proteins involved. Since we don't have noninvasive tests at present, such assays would be of great clinical utility.

In reexamining the raw data, we found evidence that the differences found and attributed to disease status could be explained at least as well by systematic biases in sample handling and preprocessing. As an example, if cancer data are gathered on Monday and control data on Wednesday, then a drift of the values reported due to a gradual breakdown of the machine would give rise to machine differences that could be mistaken for biological differences. Our paper illustrated the types of problems that could occur, and pointed out the need for careful experimental design in the planning of such studies so that biases could be avoided (especially vital when black box approaches are used). It indicated the importance of having access to the raw data, and the need to do quality checking before data mining.
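The Monday/Wednesday example can be made concrete with a small simulation. In the Python sketch below, every sample has the same true value, and the only source of variation is a linear instrument drift over run order. The baseline, the drift rate, and the group sizes are invented for illustration and are not taken from the paper's data.

```python
import random
from statistics import mean

def simulate_readings(run_order, baseline=100.0, drift_per_run=0.5):
    """Give every sample the same true value plus a drift that grows with run position."""
    return [(label, baseline + drift_per_run * position)
            for position, label in enumerate(run_order)]

def group_difference(readings):
    """Mean control reading minus mean cancer reading."""
    cancer = [value for label, value in readings if label == "cancer"]
    control = [value for label, value in readings if label == "control"]
    return mean(control) - mean(cancer)

# Blocked design: all cancer samples are run first, then all controls.
blocked_order = ["cancer"] * 10 + ["control"] * 10

# Randomized design: the same samples in a shuffled run order.
randomized_order = blocked_order[:]
random.Random(1).shuffle(randomized_order)

blocked_diff = group_difference(simulate_readings(blocked_order))
randomized_diff = group_difference(simulate_readings(randomized_order))

print("blocked design difference:   ", blocked_diff)
print("randomized design difference:", randomized_diff)
```

In the blocked design the drift alone produces a group difference of 5.0 units even though no biological difference was simulated. Shuffling the run order mixes the drift across both groups, which typically leaves a much smaller apparent difference.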

ST:  How did you become involved in this research?

At a cancer center, there is a great deal of interest in developing better diagnostic assays that can catch the disease early, when treatment is more likely to be effective. Consequently, when promising new methods appear, we are often contacted with the question "can we do this too?" In this context, we try to figure out how results were initially achieved, which at times puts us in the devil's advocate position of asking whether the results could have arisen from causes other than the ones initially thought. Understanding how the data are acquired helps us pose these questions, and the learning process is fascinating!

Keith Baggerly
Associate Professor
Biostatistics and Applied Mathematics
MD Anderson Cancer Center
Houston, TX, USA

Jeff Morris
Assistant Professor 
Biostatistics and Applied Mathematics
MD Anderson Cancer Center
Houston, TX, USA

Kevin Coombes
Associate Professor
Biostatistics and Applied Mathematics
MD Anderson Cancer Center
Houston, TX, USA
