
Fast Breaking Comments

By Dr. Prakash Nadkarni

ESI Special Topics, June 2002
Citing URL - http://www.esi-topics.com/fbp/comments/june02-PrakashNadkarni.html

Dr. Prakash Nadkarni answers a few questions about this month's fast breaking paper in the field of Social Sciences.


From June 2002

Field: Social Sciences, general
Article Title: "UMLS concept indexing for production databases: A feasibility study"
Authors: Nadkarni, P; Chen, R; Brandt, C
Journal: J AMER MED INFORM ASSOC
Volume: 8
Page: 80-91
Year: JAN-FEB 2001
* Yale Univ, Sch Med, Ctr Med Informat, POB 208009, New Haven, CT 06520 USA.
* Yale Univ, Sch Med, Ctr Med Informat, New Haven, CT 06520 USA.

ST:  Why do you think your paper is highly cited?

I guess the field of concept indexing and the UMLS is currently a hot one. I don't rate this paper as one of my very best, though the work was definitely fun to do and describe, and it did have the side effect of putting a lot of previous work (by other researchers) into proper perspective.

ST:  Does it describe a new discovery or new methodology that's useful to others?

The paper discusses, based on an experiment, the possible pitfalls that the researcher encounters when trying to match phrases in electronic text to terms in a controlled vocabulary, such as the National Library of Medicine's Unified Medical Language System (UMLS). It does describe a computer program that attempts this task.

ST:  Can you give us some background on this research?

Information Retrieval (IR) is the field of computer science that is concerned with general methods of processing text so as to facilitate its subsequent search. Large bibliographic databases such as MedLine (and ISI's own Current Contents) use this technology to allow users to search these databases by keywords. In most cases, keywords (or search terms) are the individual words that are part of the abstract (or the full text of the article). The drawback of using individual words is the problem of synonyms: a user must specify all synonyms of a particular word to make sure the search does not miss articles of interest.
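The synonym problem described above can be illustrated with a minimal sketch of word-level indexing. The document texts, the `build_word_index` and `search_any` helpers, and the tokenizer are all hypothetical and chosen only for illustration; they are not taken from the paper or from any real bibliographic system.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphabetic words (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def build_word_index(docs):
    """Map each individual word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search_any(index, words):
    """Return the IDs of documents containing at least one of the given words."""
    hits = set()
    for word in words:
        hits |= index.get(word.lower(), set())
    return hits


# Two invented titles that describe the same condition with different words.
docs = {
    1: "Renal failure after cardiac surgery",
    2: "Kidney failure in elderly patients",
}
index = build_word_index(docs)

# Searching on one word misses the article that uses its synonym.
print(sorted(search_any(index, ["kidney"])))           # [2]
# The user has to list every synonym to retrieve both articles.
print(sorted(search_any(index, ["kidney", "renal"])))  # [1, 2]
```

In this toy setup the burden of knowing that "renal" and "kidney" are related falls entirely on the person searching.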

Controlled vocabularies are really thesauri, which contain CONCEPTS in a particular domain (e.g., medicine and the life sciences) and their synonyms. If one is able to scan the text of an article and match the text to concepts in the thesaurus, then the IDs of the concepts can be used for indexing, instead of words. This way, the user, having specified a single keyword for searching, can have the thesaurus expand the query by matching the keyword to a concept and then searching for all articles indexed by that concept. In practice, concept indexing is not foolproof. One problem is polysemy: words that have multiple meanings can match multiple concepts. For example, "anesthesia" can be a procedure ancillary to surgery, or a loss of sensation in part of the body (e.g., following a nerve injury). Also, cryptic abbreviations, neologisms, and elisions, all of which occur in dictated medical text, can foil the process of concept recognition. Our work concluded that concept indexing by itself could not substitute for traditional word-indexing, but could be ancillary to the latter.
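A rough sketch of the idea, under stated assumptions: the thesaurus below is a tiny invented table, the concept IDs (`C001` and so on) are made up and are not real UMLS identifiers, and the matching is naive substring search. It is not the program described in the paper; it only shows how indexing by concept ID handles synonyms and how polysemy introduces false hits.

```python
from collections import defaultdict

# Hypothetical toy thesaurus: term -> set of concept IDs.
# The IDs are invented for illustration and are NOT real UMLS identifiers.
THESAURUS = {
    "kidney failure": {"C001"},
    "renal failure": {"C001"},       # synonym of the concept above
    "anesthesia": {"C002", "C003"},  # polysemous: procedure vs. loss of sensation
    "numbness": {"C003"},
}


def find_concepts(text):
    """Return every concept ID whose term occurs in the text (naive substring match)."""
    lowered = text.lower()
    found = set()
    for term, concept_ids in THESAURUS.items():
        if term in lowered:
            found |= concept_ids
    return found


def build_concept_index(docs):
    """Map each concept ID to the set of document IDs in which it was recognized."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept_id in find_concepts(text):
            index[concept_id].add(doc_id)
    return index


def search(index, keyword):
    """Expand a keyword to its concept IDs, then collect documents indexed by them."""
    hits = set()
    for concept_id in THESAURUS.get(keyword.lower(), set()):
        hits |= index.get(concept_id, set())
    return hits


# Invented example titles.
docs = {
    1: "Acute renal failure after surgery",
    2: "Kidney failure in elderly patients",
    3: "General anesthesia for hip replacement",
    4: "Numbness after nerve injury",
}
index = build_concept_index(docs)

# One keyword now retrieves articles that use either synonym.
print(sorted(search(index, "kidney failure")))  # [1, 2]
# Polysemy: document 3 was indexed under both senses of "anesthesia",
# so a search for "numbness" wrongly returns it as well.
print(sorted(search(index, "numbness")))        # [3, 4]
```

The second query shows why word-sense ambiguity has to be resolved at indexing time: without it, the ambiguous term pollutes every concept it could belong to.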

ST:  Could you summarize the significance of your paper in layman's terms?

If the kinks in concept indexing can be removed by very sophisticated natural language processing (no guarantee that this will happen), this technology could benefit all users of bibliographic databases.

Prakash Nadkarni, MD
Associate Professor
Yale University School of Medicine,
New Haven, CT
