KTH

SE-100 44 Stockholm, Sweden

Email: solus@kth.se

Welcome to the course page for the PhD course *Combinatorial and Algebraic Statistics* that is to take place in the Spring semester of 2021 at KTH. Here you will find the necessary information about the course as it becomes available.

Combinatorial and algebraic statistics is an emerging field within the mathematical foundations of data science and artificial intelligence. Many of the modern techniques in data science and artificial intelligence aim to identify optimal models or discern useful properties of data-generating distributions that allow practitioners to make informed predictions. Such techniques often make assumptions about the data-generating distribution that have in turn led to the careful study of statistical models whose important features are naturally represented via combinatorial and/or algebraic objects. Such objects commonly include directed acyclic graphs, convex polytopes, and special algebraic varieties. What can be said about the combinatorics and algebra of these objects arising from statistical models of interest? Can the combinatorics and algebra of these objects teach us useful things about their statistical models of origin? These are the types of questions that will be explored in this class. Topics to be discussed include hypothesis tests for contingency tables, Markov bases for hierarchical and network models, likelihood inference for discrete and Gaussian models, undirected and directed acyclic graphical models, and connections to causality and phylogenetics. Following this course, participants should be familiar with the basic objects of study currently discussed at a typical workshop or seminar on combinatorial and algebraic statistics.

Kathlén Kohn and Liam Solus (Examiner)

Class will take place on Fridays, 13.15-15.00. All course meetings will take place online in the Zoom room here.

- Maclagan et al. Computational Algebra and Combinatorics of Toric Ideals.
- Drton et al. Lectures on Algebraic Statistics.
- Casanellas et al. Algebraic Statistics in Practice: Applications to Networks. Annual Review of Statistics and Its Application (2020).
- Sullivant. Algebraic Statistics. Vol. 194. American Mathematical Soc., 2018.

A master's-level background in algebra and/or combinatorics is expected. Please contact the teachers of the course if you are unsure whether this course is right for your level of experience.

Registered participants in the course will submit solution sets to two homework assignments, give one lecture in the course on the course material for that day, and write a review of another participant's lecture. The two homework sets and the lecture will each be worth 30 points; the review will be worth an additional 10 points. A participant must earn at least 70 points to pass the class. If there are more students than possible lectures, some students will instead read a paper related to the course material and give a presentation of their paper to other members of the class.

Most lectures will be based on the free online book "Lectures on Algebraic Statistics." Some of the lectures will be based on chapters from the new book "Algebraic Statistics" by Seth Sullivant. However, it is not necessary to purchase this book just for this course.

Date | Reading | Lecture Content | Lecturer | Notes | Deadlines |
---|---|---|---|---|---|
22 Jan | Sullivant Ch. 1 | What is algebraic statistics? | Liam | notes | |
29 Jan | Sullivant Ch. 3 | Some algebra basics | Kathlén | slides | |
5 Feb | Sullivant Ch. 2 and 5 | Some probability and statistics basics | Liam | notes | |
12 Feb | Lec on Alg Stat Ch. 1.1 | Hypothesis Tests for Contingency Tables | Danai | slides | |
19 Feb | Lec on Alg Stat Ch. 1.2 | Markov Bases for Hierarchical Models | Leo | slides | |
26 Feb | Lec on Alg Stat Ch. 1.3; Casanellas et al. Sec. 2 | Markov Bases for Network Models and Relational Data | Nils | slides | |
5 Mar | Lec on Alg Stat Ch. 2.1 | Likelihood Inference for Discrete and Gaussian Models | Felix | slides | Problem Set 1 handed out |
12 Mar | Lec on Alg Stat Ch. 2.3 | Likelihood Ratio Tests | Aryaman | notes | |
19 Mar | Sullivant Ch. 6 | Exponential Families | Liam | notes | Problem Set 1 due |
26 Mar | Sullivant Ch. 8 | The Cone of Sufficient Statistics | Kathlén | slides | |
9 Apr | Lec on Alg Stat Ch. 3.1 and first part of 3.2 | Conditional Independence Models and Undirected Graphical Models | Lukas | slides | |
16 Apr | Lec on Alg Stat Ch. 3.2 (second part minus chain graphs) and 3.3 | Directed Acyclic Graphical Models (discrete and Gaussian) and Parametrizations of Graphical Models | Tianfang | slides | |
23 Apr | Sullivant Ch. 14.1-14.2 | Mixture Models and Hidden Variable Graphical Models | Petter | slides | |
30 Apr | Casanellas et al. Sec. 4 and Sullivant Ch. 15.1-15.2 | Phylogenetics: Trees and Splits, Basic Types of Models | Jiayue | slides | Problem Set 2 handed out |
7 May | Sullivant Ch. 15.3 | Group Based Models | Francesca | notes | |
21 May | Sections 1-4 of paper 1 and all but Section 4 of paper 2 | Invariant Theory and Maximum Likelihood | Kathlén | slides | |
31 May | | | | | Problem Set 2 due |

Assistant Professor of Mathematics

KTH Royal Institute of Technology

Last updated on 1 June 2021