
Peter Bühlmann (in the middle, between Alessio Figalli and Cédric Villani) heads the initiative “Foundations of Data Science”. (Photo: PPR / Christian Merz)
The data-driven methods differ from the classical ones, however, as Peter Bühlmann explains with his characteristic wit: “The classical statistical approach would be for a researcher to start with a scientific hypothesis and then to carefully consider which data he would collect with which method in order to derive the most informative conclusions. You could say that because nowadays the data are automatically dropping right out of the sky, that’s often no longer the case.”
A new dimension
The new approaches use intelligent algorithms to automatically extract interesting information from existing data sets, without relying on planned data collection. The subject of data science has arisen in recent years from these new possibilities. Today it’s an interdisciplinary research and development field at the intersection of statistics, computer science, information technology and mathematics.
“Data science is something new. It’s not just statistics, not simply computer science and not merely information technology, but instead a combined effort of all three fields,” says Bühlmann. ETH Zurich is strengthening the exploration of the fundamentals of data science with the new initiative by bundling existing expertise. Eleven professors from three ETH departments are involved, representing Statistics, Machine Learning, and Information Technology and Electrical Engineering.
The initiative focuses on the foundational issues of mathematical theories and algorithmic methods. There are two programmes, one for postdoctoral students and one for visiting researchers. The initiative complements activities in education (Master in Data Science, DAS in Data Science) and in the transfer of knowledge and technology between the disciplines and industry (Swiss Data Science Center). It started on 1 January 2019 and receives 2.7 million Swiss francs in funding from “ETH+”, the ETH-wide initiative for supporting interdisciplinary projects.
Responsibility and fair algorithms
Because data science developments affect many users throughout scientific research, the economy and society, research on the fundamentals of data science carries a special responsibility, says Bühlmann. He sees one challenge in developing algorithms that can deliver causally correct, stable, reliable and easily interpretable results even in the case of complex data sets. For his research on stability and causality (cause and effect), Bühlmann was awarded the prestigious “Guy Medal in Silver” by the Royal Statistical Society and an “ERC Advanced Grant” in 2018.
Ultimately, the automation doesn’t always work as elegantly as it does in the photo app. The application of intelligent algorithms can sometimes become problematic. If, for example, computers use personal characteristics (age, sex, nationality, health, etc.) to determine who is creditworthy, or provide judges with estimates of the probability that a defendant is guilty, then no one should be disadvantaged as a result.
“Interpretable machine learning” and “fair algorithms” are therefore two major research issues in which Bühlmann takes a great personal interest. “As a researcher of the fundamentals of the subject, I want to produce something useful for society. I want to know when an application delivers reliable results, and when they are less reliable,” says Bühlmann. “That’s my position.”