PyCon AU 2024

Tennessee Leeuwenburg

Tennessee Leeuwenburg is a data scientist and software developer with over 20 years of experience. He has an interest in open source software, machine learning, and forecast verification. His current research work includes the development of scientific machine learning models for weather and environmental prediction. For an overview of his recent publications, please visit https://orcid.org/0009-0008-2024-1967 .


What pronouns do you use?

he/him


Sessions

11-22
10:00
30min
Verifying and evaluating scientific results with the open source package "scores"
Tennessee Leeuwenburg

Verifying, evaluating or interpreting complex data requires specialist tools and methods. Many data scientists, programmers and scientists will be familiar with some evaluation metrics such as accuracy, mean squared error or true positive rate. There are many situations where these scores are insufficient for assessing correctness, accuracy or suitability of a model or prediction. The challenge of verifying models and predictions affects most fields of science, engineering, and many machine learning applications.
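As a minimal sketch of how a familiar metric can fall short, consider a rare event such as rain on 5 days out of 100. The data and the helper functions below are hypothetical and written only for illustration; they are not taken from "scores".

```python
# Hypothetical data: 100 days, with rain observed on only 5 of them.
observed = [True] * 5 + [False] * 95
# A "model" that never forecasts rain.
forecast = [False] * 100


def accuracy(forecast, observed):
    """Fraction of cases where the forecast matches the observation."""
    matches = sum(f == o for f, o in zip(forecast, observed))
    return matches / len(observed)


def true_positive_rate(forecast, observed):
    """Fraction of observed events that were forecast: hits / (hits + misses)."""
    hits = sum(f and o for f, o in zip(forecast, observed))
    events = sum(observed)
    return hits / events


acc = accuracy(forecast, observed)
tpr = true_positive_rate(forecast, observed)
print(f"accuracy = {acc:.2f}")            # accuracy = 0.95
print(f"true positive rate = {tpr:.2f}")  # true positive rate = 0.00
```

The forecast that never predicts rain scores 95% accuracy while detecting none of the rain events, which is why a single headline metric can be misleading.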

This talk will introduce "scores", an open source Python package for verifying and evaluating labelled, n-dimensional (multidimensional) data at any scale. "scores" includes over 50 metrics, statistical techniques and data processing tools. The software repository can be found at https://github.com/nci/scores and the documentation can be found at https://scores.readthedocs.io/ .

This talk is suitable for beginner, intermediate and expert audiences. Developers and data scientists who are familiar mainly with tabular data, such as that supported by the pandas library, may be interested in the additional functionality offered by "scores" (and the xarray library it utilises). For those learning about more advanced methods, every metric and statistical test has a companion Jupyter Notebook tutorial. Expert users already familiar with these ideas may be interested in some of the novel scoring methods not commonly found in other packages.

Come to this talk to hear about:
- The difference between tabular data, n-dimensional data, and labelled n-dimensional data
- Examples of using a common metric from "scores" on labelled, n-dimensional data
- Examples of using "scores" for interrogating data in multiple dimensions
- Examples of where basic methods overlook important considerations
- Examples of using some of the more complex metrics in "scores"
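
To make the first two points concrete, here is a rough sketch of labelled, n-dimensional data. It uses plain xarray and NumPy rather than the "scores" API, and the grid, values and variable names are all hypothetical.

```python
import numpy as np
import xarray as xr

# Hypothetical observations on a small (time, lat, lon) grid.
rng = np.random.default_rng(0)
dims = ("time", "lat", "lon")
coords = {
    "time": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "lat": [-35.0, -30.0],
    "lon": [140.0, 145.0, 150.0, 155.0],
}
obs = xr.DataArray(rng.normal(size=(3, 2, 4)), coords=coords, dims=dims)

# A hypothetical forecast with a constant bias of 0.5.
fcst = obs + 0.5

# Mean squared error computed by hand from the labelled arrays.
squared_error = (fcst - obs) ** 2

# Reduce over every dimension to get a single number...
overall_mse = squared_error.mean()

# ...or reduce over named dimensions only, keeping one value per latitude.
mse_by_lat = squared_error.mean(dim=["time", "lon"])

print(float(overall_mse))
print(mse_by_lat)
```

Because the dimensions are named, the reduction is expressed as "average over time and lon" rather than by axis position, which is the practical difference between labelled and unlabelled n-dimensional data.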

Scientific Python
Eureka 2
11-23
10:40
30min
Making an open source package - lessons learned
Tennessee Leeuwenburg

Making an open source package is pretty hard in 2024. Expectations are high, and there’s a lot to take into account. I recently developed an open source package. This talk covers what worked, what didn’t work, what I would do again and what I would do differently.

I developed an open source package called “scores” ( https://github.com/nci/scores , https://scores.readthedocs.io/ ). This is not a presentation about what “scores” does, but instead covers the lessons I learned. Despite being an experienced software developer and having used lots of open source software, I still had a lot to learn (and a lot to figure out) about open source package maintenance.

Every package is different, but this is what I did and these are the lessons I learned.

  • Technical Matters:
    • How to lay things out on disk
    • Configuration files
    • Automated testing
    • Type hinting
    • Linting and other static analysis tools
    • Code layout and design
  • Documentation:
    • What documentation to produce
    • Picking and using a tech stack
    • Rendering (documentation often renders differently in different locations)
  • Ecosystem Integration:
    • How to fit in well with the tools around you
    • Versioning
    • Publishing to PyPI
    • How and what to automate
    • How to do releases
  • Community Considerations:
    • Code review standards
    • Clear presentation of information
    • Understanding your user base and audience
Main Conference
Goldfields Theatre