Why do you think your paper is highly cited?
It is the first paper presenting mathematical theory for
high-dimensional variable selection. Furthermore, the method
that we analyzed is computationally feasible, and thus our
mathematical results have a direct impact on a huge variety of
practical applications.
Does it describe a new discovery, methodology, or synthesis of knowledge?
We examined the mathematical properties of an existing method
and expanded on the methodology.
Would you summarize the significance of your paper in layman’s terms?
The amount of available data is growing very fast in many
scientific disciplines. The truly interesting pieces of
information are often buried in just a few important variables
(among hundreds or thousands of unimportant variables). A clear
and accurate way of filtering out important variables helps to
manage the information overload and leads to interpretable
models. We have examined the potential and limitations of a
popular variable selection method for high-dimensional data,
where the number of monitored variables is potentially much
larger than the number of observed units.
How did you become involved in this research, and were there any particular problems encountered along the way?
Our mathematical research has been motivated by applications
in molecular biology where a huge number of variables (genes,
proteins) are measured for a few individuals. We wanted to
support our results in practical applications with rigorous
mathematical theory.
Where do you see your research leading in the future?
Modern data analysis has to cope with heterogeneous data
coming from potentially very different sources. New methodology
and mathematical theory are needed.
Are there any social or political implications for your research?
Many discoveries in life and medical sciences rely on
variable selection, for example to discover new biomarkers or
potential causes of a specific disease. Other scientific fields
are also dependent on the analysis of large datasets.
Understanding the potential and limitations of a methodology and
uncovering necessary and sufficient mathematical assumptions
help to support adequate and precise interpretation of results
in practical applications.
Nicolai Meinshausen, Ph.D.
Fellow
Somerville College
Department of Statistics
University of Oxford
Oxford, UK
Peter Bühlmann, Ph.D.
Professor
Swiss Federal Institute of Technology Zürich (ETH)
Zürich, Switzerland