Causal Discovery in Python
Part of the Scientific Python specialist track
A review and comparison of the software available for causal discovery in Python. Causal discovery means learning "what causes what" from your data. The input is a tabular dataset; the output is a causal graphical model (or a set of candidate models) over your features. If feature A affects feature B, there should be an arrow A-->B in the causal graphical model. Causal discovery is useful for generating hypotheses, choosing which experiments to run, and testing our assumptions about causation.
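To make the "tabular data in, graph out" idea concrete, here is a minimal sketch of the kind of conditional independence check that constraint-based discovery methods rely on. It uses only the Python standard library rather than any of the packages reviewed in the talk. The toy data, the coefficients, and the helper names `pearson` and `partial_corr` are assumptions made for illustration. The data is simulated from a known chain A-->B-->C, so A and C should be strongly correlated, but close to uncorrelated once B is controlled for.

```python
import math
import random
import statistics


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical toy data simulated from a known chain A --> B --> C
# (linear relationships with Gaussian noise; coefficients are arbitrary).
rng = random.Random(0)
n = 5000
A = [rng.gauss(0, 1) for _ in range(n)]
B = [2.0 * a + rng.gauss(0, 1) for a in A]
C = [-1.5 * b + rng.gauss(0, 1) for b in B]

r_ac = pearson(A, C)                  # marginal dependence between A and C
r_ac_given_b = partial_corr(A, C, B)  # dependence left after controlling for B

print(f"corr(A, C)     = {r_ac:.3f}")
print(f"corr(A, C | B) = {r_ac_given_b:.3f}")
```

A near-zero partial correlation is the kind of evidence that would lead a constraint-based method to leave out a direct arrow between A and C. Independence facts alone cannot distinguish A-->B-->C from A<--B<--C or A<--B-->C, which is one reason the output may be a set of candidate models rather than a single graph.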
I'll give a brief intro to causal discovery, then review the following packages: py-tetrad, causal-learn, tigramite, causalnex, and cdt (Causal Discovery Toolbox). The packages overlap somewhat but have different emphases: each one implements at least one algorithm not covered by the others, which makes them useful in different situations. If time permits, I'll finish with a quick demo showing each package learning a model from the same dataset.
See this talk and many more by getting your ticket to PyCon AU now!
Lizzie Silver is a Senior Data Scientist at WSP. She has broad interests in applied data science, and has worked on projects in electricity distribution, water distribution, abandoned mine shaft detection, fish ecology, and arthritis monitoring via wearable devices, among others. She did her PhD in causal discovery at Carnegie Mellon University. Her pastimes include singing in choirs and running both the monthly Melbourne Machine Learning and AI Meetup and the Melbourne chapter of Puzzled Pint.