The overall objective of my group's research is to develop new
techniques, algorithms, and software systems that help (i) engineers
to construct complex systems in less time with higher confidence of
correctness, and (ii) scientists to develop models that can be used to
gain deeper insights into physical and biological systems. The research lies at the intersection of (i) programming language and compiler research, (ii) real-time and cyber-physical systems, and (iii) machine learning with a focus on probabilistic methods.
A central part of this research is to develop programmatic modeling
languages, that is, expressive formal modeling languages where model developers describe what the system should do, not exactly how it is
executed. Models can be constructed for a system before it is
built (engineering perspective), or they may describe abstractions of
an already existing system, such as a biological system, that are then
used for analysis (science perspective).
Overall research questions include, but are not limited to:
- (Q1) How can we formalize the semantics of such languages and prove correctness properties?
- (Q2) How can we develop algorithms and compilation strategies that result in effective and
scalable analysis tools (including inference and simulation)?
- (Q3) How can language fragments and models be modularly defined and then composed
in a sound and efficient manner?
The research concerns several abstraction levels, including
semantic aspects of languages, efficiency of compilers, and
properties of target computer architectures. The targets are either
custom research hardware or standard computer
architectures, such as graphics processing units (GPUs). As a
consequence, from a computer architecture perspective, David's research includes both hardware and software aspects, with a focus on the software.
My group's research is funded by a number of research projects, which include both theory (development of new algorithms and
methods) and practice (development and dissemination of open-source
software). Since we focus on fundamental research on algorithms and
tools, we collaborate closely with domain experts, including both
scientists (e.g., within evolutionary biology) and industry
(engineering companies within telecom and manufacturing technology).
The overall research can be divided into three research areas:
- Research Area 1: Differentiable Probabilistic Programming Languages
- Research Area 2: Modular Meta-Programming Systems
- Research Area 3: Predictable and Timed Systems
These three areas are presented below. Besides the technical research,
David has also been involved in pedagogical research. See, for instance,
our work on the company approach to software engineering projects [IEEE
Transactions on Education 2012] and on
assessment models for large project courses [SIGCSE 2014].
Research Area 1: Differentiable Probabilistic Programming Languages
This research area focuses on combining differentiable
programming (the ability to make use of differentiable functions and
automatic differentiation directly within programming languages) and
probabilistic programming (a generalized approach to Bayesian networks
where programmatic probabilistic models can be encoded within
Turing-complete languages). Some highlights within this area are:
- Probabilistic Programming and Machine Learning.
We develop algorithms, semantics, and compilers for probabilistic programming languages (PPLs) in general. Recent results include
work on delayed sampling [AISTATS 2018],
proving correctness of Sequential Monte Carlo (SMC) inference within
PPLs [ESOP 2021], and efficient compilation
of universal probabilistic programming languages with SMC inference to GPUs [ESOP 2022]. We have earlier worked on
non-PPL-based supervised and unsupervised learning methods in the
context of software engineering and automated bug assignment
[Empirical Software Engineering 2016], and on general methods for
parallelizing latent Dirichlet allocation models [Journal of
Computational and Graphical Statistics 2017]. Recently, we showed for the first
time how universal probabilistic programming offers a new, unique, and
powerful approach to statistical phylogenetics, enabling rapid
modeling that was not possible before [Communications Biology].
- Differentiable and Equation-Based Languages.
We have for a long time
been developing static and dynamic semantics for equation-based
object-oriented (EOO) modeling languages, which are based on
differential-algebraic equations. I have been part of the Modelica
language design group since 2005. Results and contributions include work
on types [Modelica 2006], [GPCE 2006], higher-order models [SNE 2009],
and connection semantics [PADL 2012], as well as some involvement in the open-source tool OpenModelica [SNE 2005]. We are currently working on
incorporating automatic differentiation as a first-class citizen in
languages, in a way that is both efficient and semantically correct.
Key areas where we are actively doing research are: (i) defining a new domain-specific modeling language within phylogenetics, (ii) developing more domain-specific compilation techniques for both automatic differentiation and probabilistic inference, and (iii) developing reinforcement techniques that incorporate differentiable probabilistic programming concepts, including equation-based modeling and simulation.
Research Area 2: Modular and Efficient Meta-Programming Systems
The second research area focuses on theory and practical approaches for constructing modular programming systems, where language fragments, models, and model instances can be composed together in a sound and efficient manner.
- The Miking System.
Central to our group's current activities is the open-source platform Miking, which we use as a research platform (see the vision paper [SLE 2019] or the GitHub repositories). We develop new techniques for composability, including a technique called Resolvable Ambiguity [CC 2021], as well as a model for interactive programmatic modeling [TECS 2021]. A central part of this line of work is the ability to compose language fragments so that new domain-specific languages can be rapidly defined without starting from scratch. For instance, the work on probabilistic programming in Area 1 (published in ESOP 2021 and 2022) is developed on top of the Miking framework. We base many aspects of the Miking framework on experiences and ideas from David's previous system, Modelyze (see [PEPM 2018] and www.modelyze.org).
- Co-simulation. Another aspect of modular systems is co-simulation. We have worked in this area for many years, which has resulted in several visible publications, including determinate composition for co-simulation [EMSOFT 2013], requirements for hybrid co-simulation [HSCC 2015], techniques for hybrid co-simulation [SoSym 2019], and the leading survey on the topic [ACM Computing Surveys 2019].
Key areas where we currently do research are: (i) automatic tuning techniques to improve compilation performance, (ii) acceleration and domain-specific compilation to GPUs, and (iii) theories for language fragment composition, at both the syntactic and semantic levels.
Research Area 3: Predictable and Timed Systems
The third research area includes languages and compilers specifically targeted at real-time systems, as well as hardware designed for predictability. These two abstractions meet at the compiler level, where compilers can be designed to improve predictability and timing aspects.
- Timed Programming Languages and Compilers. A line of work that our group has recently been developing is language primitives for efficient and simple real-time programming in standard programming languages. In particular, we have developed a language extension called Timed C [RTAS 2018], where the standard C language is extended with a small set of timing primitives, together with end-to-end tools for Timed C [RTSS 2019]. We have also investigated WCET-aware code mapping techniques [TECS 2017] and models for approximate synchrony [CAV 2015].
- Predictable Processors and Mixed-Critical Systems. Another line of research has focused on what is called Precision Timed (PRET) machines, mainly developed at UC Berkeley. An essential step in this development is the FlexPRET architecture [RTAS 2014], a predictable RISC-V-based softcore processor for mixed-critical systems. We have also developed predictable DRAM controllers [RTAS 2015] and models for relaxing the synchronous approach for mixed-critical systems [RTAS 2014].
In recent years, the group's research has focused mostly on compiler and language perspectives in general, and especially on research related to Timed C. Of the three research areas, Area 3 has the fewest planned activities. However, one important aspect is connecting predictable and timed programming with co-simulation and differentiable programming, which raises both interesting semantic problems and compiler challenges.