# KTH Royal Institute of Technology

## Combinatorial and Algebraic Statistics (Spring 2021)

Welcome to the course page for the PhD course Combinatorial and Algebraic Statistics that is to take place in the Spring semester of 2021 at KTH. Here you will find the necessary information about the course as it becomes available.

## Course Description:

Combinatorial and algebraic statistics is an emerging field within the mathematical foundations of data science and artificial intelligence. Many of the modern techniques in data science and artificial intelligence aim to identify optimal models or discern useful properties of data-generating distributions that allow practitioners to make informed predictions. Such techniques often make assumptions about the data-generating distribution that have in turn led to the careful study of statistical models whose important features are naturally represented via combinatorial and/or algebraic objects. Such objects commonly include directed acyclic graphs, convex polytopes, or special algebraic varieties. What can be said about the combinatorics and algebra of these objects arising from statistical models of interest? Can the combinatorics and algebra of these objects teach us useful things about their statistical models of origin? These are the types of questions that will be explored in this class.

Topics to be discussed include hypothesis tests for contingency tables, Markov bases for hierarchical and network models, likelihood inference for discrete and Gaussian models, undirected and directed acyclic graphical models, and connections to causality and phylogenetics. Following this course, participants should find themselves familiar with the basic objects of study currently discussed at a typical workshop or seminar on combinatorial and algebraic statistics.

## Teachers:

Kathlén Kohn and Liam Solus (Examiner)

## Time and Location:

Class will take place on Fridays, 13.15-15.00. All course meetings will take place online in the Zoom room here.

## Prerequisites:

A master's-level background in algebra and/or combinatorics is expected. Please contact the teachers of the course if you are unsure whether this course is right for your level of experience.

## Examination:

Registered participants in the course will submit solution sets to two homework assignments, give one lecture on the course material for that day, and write a review of another participant's lecture. The two homework sets and the lecture will each be worth 30 points, and the review will be worth an additional 10 points, for a total of 100 points. A participant must earn at least 70 points to pass the class. If there are more students than available lectures, some students will instead read a paper related to the course material and present it to the other members of the class.

## Schedule:

Most lectures will be based on the free online book "Lectures on Algebraic Statistics." Some of the lectures will be based on chapters from the new book "Algebraic Statistics" by Seth Sullivant. However, it is not necessary to purchase this book just for this course.

| Date | Reading | Topic | Lecturer | Materials | Assignments |
|---|---|---|---|---|---|
| 22 Jan | Sullivant Ch. 1 | What is algebraic statistics? | Liam | notes | |
| 29 Jan | Sullivant Ch. 3 | Some algebra basics | Kathlén | slides | |
| 5 Feb | Sullivant Ch. 2 and 5 | Some probability and statistics basics | Liam | notes | |
| 12 Feb | Lec on Alg Stat Ch. 1.1 | Hypothesis Tests for Contingency Tables | Danai | slides | |
| 19 Feb | Lec on Alg Stat Ch. 1.2 | Markov Bases for Hierarchical Models | Leo | slides | |
| 26 Feb | Lec on Alg Stat Ch. 1.3; Casanellas et al. Sec. 2 | Markov Bases for Network Models and Relational Data | Nils | slides | |
| 5 Mar | Lec on Alg Stat Ch. 2.1 | Likelihood Inference for Discrete and Gaussian Models | Felix | slides | Problem Set 1 handed out |
| 12 Mar | Lec on Alg Stat Ch. 2.3 | Likelihood Ratio Tests | Aryaman | notes | |
| 19 Mar | Sullivant Ch. 6 | Exponential Families | Liam | notes | Problem Set 1 due |
| 26 Mar | Sullivant Ch. 8 | The Cone of Sufficient Statistics | Kathlén | slides | |
| 9 Apr | Lec on Alg Stat Ch. 3.1 and first part of 3.2 | Conditional Independence Models and Undirected Graphical Models | Lukas | slides | |
| 16 Apr | Lec on Alg Stat Ch. 3.2 (second part minus chain graphs) and 3.3 | Directed Acyclic Graphical Models (discrete and Gaussian) and Parametrizations of Graphical Models | Tianfang | slides | |
| 23 Apr | Sullivant Ch. 14.1-14.2 | Mixture Models and Hidden Variable Graphical Models | Petter | slides | |
| 30 Apr | Casanellas et al. Sec. 4 and Sullivant Ch. 15.1-15.2 | Phylogenetics: Trees and Splits, Basic Types of Models | Jiayue | slides | Problem Set 2 handed out |
| 7 May | Sullivant Ch. 15.3 | Group Based Models | Francesca | notes | |
| 21 May | Sections 1-4 of paper 1 and all but Section 4 of paper 2 | Invariant Theory and Maximum Likelihood | Kathlén | slides | |
| 31 May | | | | | Problem Set 2 due |

Liam Solus
Assistant Professor of Mathematics
KTH Royal Institute of Technology
Last updated on 1 June 2021