
Fast Breaking Comments

By Sorin Draghici

ESI Special Topics, October 2006
Citing URL - http://www.esi-topics.com/fbp/2006/october06-SorinDraghici.html

Sorin Draghici answers a few questions about this month's fast breaking paper in the field of Computer Science.


From October 2006

Field: Computer Science
Article Title: Ontological analysis of gene expression data: current tools, limitations, and open problems
Authors: Khatri, P; Draghici, S
Journal: BIOINFORMATICS
Volume: 21
Issue: 18
Page: 3587-3595
Year: SEP 15 2005
* Wayne State Univ, Dept Comp Sci, 431 State Hall, Detroit, MI 48202 USA.
* Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA.

ST:  Why do you think your paper is highly cited?

I think the number of citations reflects the importance of this area of research. High-throughput methods have become ubiquitous in modern life sciences research. Our data-gathering capabilities have greatly surpassed the available data analysis techniques. The current challenge is to analyze these vast amounts of data and translate them into biological knowledge. In many gene or protein expression profiling experiments, independently of the platform and the analysis methods used, the result is a list of genes or proteins found to be differentially expressed between two or more conditions under study. The challenge faced by the researcher is to translate such lists into a better understanding of the underlying biological phenomena.


“This paper looked for the first time at the ontological analysis from a larger perspective, trying to unify existing efforts in the field.”

One approach to this is to translate the list of differentially expressed genes or proteins into a functional profile that identifies the biological processes, cellular locations, and molecular functions which are significantly different in the condition under study. However, this functional profiling cannot be performed manually.

The area of ontological analysis includes all computerized techniques and methods developed to perform such profiling. In recent years, as it has become apparent that this type of analysis can be useful in most, if not all, high-throughput experiments, many tools have been developed to perform this task. As more researchers discover the need for and the potential of this type of analysis, they might find this paper very useful.

ST:  Does it describe a new discovery, methodology, or synthesis of knowledge?

The paper attempts to synthesize our current knowledge in this area. This detailed analysis of the capabilities of these tools will hopefully help researchers understand the most important problems associated with this type of analysis: the scope of the analysis, visualization capabilities, the statistical model(s) used, correction for multiple comparisons, the reference gene lists used, some installation issues, and the sources of annotation data.
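One of the issues listed above, correction for multiple comparisons, can be illustrated with a short Python sketch. It applies the Bonferroni adjustment, chosen here only because it is the simplest such correction; the interview does not say which corrections particular tools use. The function name, category names, and p-values are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values for a dict {category: raw p-value}.

    Each raw p-value is multiplied by the number of categories tested
    and capped at 1.0. This is an illustrative choice of correction,
    not the method of any specific tool.
    """
    n_tests = len(p_values)
    return {
        category: min(1.0, p * n_tests)
        for category, p in p_values.items()
    }


# Hypothetical raw p-values for two functional categories.
raw = {"category_a": 0.01, "category_b": 0.5}
adjusted = bonferroni_adjust(raw)
```

With many categories tested at once, a raw p-value that looks significant on its own may no longer be significant after adjustment, which is why the choice of correction matters when comparing tools.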

More importantly, in spite of the fact that this type of analysis has been generally adopted, this approach has several important intrinsic drawbacks. These drawbacks are associated with all tools discussed and represent conceptual limitations of the current state-of-the-art in ontological analysis. We propose these as challenges for the next generation of secondary data analysis tools.

ST:  Could you summarize the significance of your paper in layman's terms?

This paper examined ontological analysis from a broader perspective for the first time, trying to unify existing efforts in the field. We tried to identify the main ideas used in this type of analysis, as well as the most common mistakes and the still unsolved problems. This approach has become immensely popular in recent years, with new tools being published almost every month. However, most tools use the same approach and only a handful of statistical models, which are not that different from each other.

In spite of its popularity and proven usefulness, this approach is also severely limited in certain regards. It would be more beneficial if developers of future tools tried to expand the current ontological analysis approach by addressing some of its limitations, rather than providing endless variations of the same idea. If this paper helps bring about this shift, it will have made a significant contribution. Time will tell. I am hopeful.

ST:  How did you become involved in this research, and were any problems encountered along the way?

Sometime in 2000, a colleague from a different department approached us with this problem: given a list of differentially expressed genes and a database of annotations using gene ontology terms, could we build a software tool that would automatically retrieve all the annotations for all genes and then show the number of genes in each category? My master's student at the time, Purvesh Khatri, managed to implement something very quickly. We then thought this might be useful to more people and we developed a more refined tool, "Onto-Express" (OE), able to perform this analysis over the web (Khatri et al., "Profiling Gene Expression Utilizing Onto-Express," Genomics 79[2]:266-270, February 2002).
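The original problem described here, looking up the annotations for each gene and counting genes per category, can be sketched in a few lines of Python. This is a minimal illustration and not the Onto-Express implementation; the function name, gene identifiers, and category names are hypothetical, and the annotation database is assumed to be a plain in-memory dictionary.

```python
from collections import Counter


def count_genes_per_category(gene_list, annotations):
    """Count how many genes from gene_list fall into each category.

    annotations maps a gene identifier to an iterable of category
    terms (assumed layout). Each gene is counted at most once per
    category, and genes without annotations are skipped.
    """
    counts = Counter()
    for gene in set(gene_list):
        for term in set(annotations.get(gene, ())):
            counts[term] += 1
    return counts


# Hypothetical annotation table and list of differentially expressed genes.
toy_annotations = {
    "gene1": ["process_x", "function_y"],
    "gene2": ["process_x"],
    "gene3": ["location_z"],
}
toy_counts = count_genes_per_category(["gene1", "gene2", "gene4"], toy_annotations)
```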

At the same time, it occurred to me that the mere number of genes in each category is completely insufficient because various categories are represented to different extents in each experiment. In many cases, the number of genes can actually be misleading. Rather than looking at the number of genes in each category, one should compare the observed number of genes with what is expected in each category just by chance.
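The comparison of observed and expected counts can be made concrete with a short Python sketch. It assumes a hypergeometric model, in which the selected genes are treated as a random draw without replacement from a reference list; this is one commonly used choice, and the interview does not state which statistical model any given tool applies. All function and parameter names are hypothetical.

```python
from math import comb


def expected_count(n_selected, n_category, n_reference):
    """Number of selected genes expected in a category by chance alone."""
    return n_selected * n_category / n_reference


def enrichment_p_value(n_observed, n_selected, n_category, n_reference):
    """Probability of seeing at least n_observed category genes by chance.

    Assumes the n_selected genes are drawn without replacement from a
    reference list of n_reference genes, n_category of which belong
    to the category (hypergeometric upper tail).
    """
    upper = min(n_selected, n_category)
    total = comb(n_reference, n_selected)
    tail = sum(
        comb(n_category, i) * comb(n_reference - n_category, n_selected - i)
        for i in range(n_observed, upper + 1)
    )
    return tail / total


# Hypothetical experiment: 100 reference genes, 10 in the category,
# 20 genes selected as differentially expressed, 5 of them in the category.
chance_level = expected_count(20, 10, 100)
p_value = enrichment_p_value(5, 20, 10, 100)
```

The same raw count can be unremarkable for a large category and striking for a small one, which is the reason a bare count per category can mislead.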

We then developed a statistical approach for this type of analysis, which we published in a subsequent paper (Draghici et al., "Global functional profiling of gene expression," Genomics 81[2]:98-104, February 2003). Since then, this analysis approach has become the de facto standard in the second-stage analysis of microarray experiments. Currently, while OE continues to have a devoted user base of several thousand researchers world-wide, over 20 similar tools are available from other groups.

ST:  Are there any social or political implications for your research?

Today, life sciences are at the center of attention from many points of view. Cutting-edge research, in everything from cancer detection and treatment to chronic illnesses and old age afflictions, is performed with high-throughput techniques. The functional profiling approach discussed here is used in most modern high-throughput experiments.

Sorin Draghici, Ph.D. 
Director of the Bioinformatics Core, Karmanos Cancer Institute
and Associate Professor
Dept. of Computer Science
Wayne State University
Detroit, MI, USA
