From e347c2835ae0283d756e614978704fcb43e4b87f Mon Sep 17 00:00:00 2001 From: Zeitsperre <10819524+Zeitsperre@users.noreply.github.com> Date: Thu, 28 Mar 2024 13:37:14 -0400 Subject: [PATCH] add slides and settings.json --- .vscode/settings.json | 5 + docs/.marprc.yml | 3 + docs/slides.md | 916 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 924 insertions(+) create mode 100644 .vscode/settings.json create mode 100644 docs/.marprc.yml create mode 100644 docs/slides.md diff --git a/.vscode/settings.json b/.vscode/settings.json new file mode 100644 index 0000000..631b49d --- /dev/null +++ b/.vscode/settings.json @@ -0,0 +1,5 @@ +{ + "markdown.marp.themes": [ + "https://cunhapaulo.github.io/style/freud.css" + ] +} \ No newline at end of file diff --git a/docs/.marprc.yml b/docs/.marprc.yml new file mode 100644 index 0000000..f79e3cd --- /dev/null +++ b/docs/.marprc.yml @@ -0,0 +1,3 @@ +allowLocalFiles: true +html: true +watch: true diff --git a/docs/slides.md b/docs/slides.md new file mode 100644 index 0000000..fbdb099 --- /dev/null +++ b/docs/slides.md @@ -0,0 +1,916 @@ +--- + +marp: true +theme: freud +_class: lead +class: default +footer: Building Open Climate Change Information Services in Python +header: PyCon Lithuania 2024 +author: Trevor James Smith +paginate: true +backgroundColor: white +transition: fade +# backgroundImage: url('https://marp.app/assets/hero-background.svg') +size: 16:9 +style: | + footer { + left: 5%; + font-size: 20px; + text-shadow: 0px 0px 10px #fff; + } + header { + right: 10%; + left: 60%; + text-align: right; + font-size: 20px; + text-shadow: 0px 0px 10px #fff; + } + img[alt~="center"] { + display: block; + margin: 0 auto; + } + .container{ + display: flex; + } + .col{ + flex: 1; + } + +--- + + + + + + +# Building Open Climate Change Information Services in Python + + + +![bg width:100% height:100%](img/canada-climate-map.png) +![img top-right](img/logo-ouranos-vertical-couleur.svg) + +- **Trevor James Smith** +**PyCon 
Lithuania** +**April 4th, 2024** +**Vilnius, Lithuania** + +--- + + + + + +![bg left](img/cyclone-extra-tropical-aout-2016.jpg) + +### Presentation Outline + + + +- Who am I? / What is Ouranos? +- What's our context? +- Climate Services? +- `xclim`: climate operations +- `finch`: `xclim` as a Service +- Climate WPS Frontends +- What's next for us? +- Acknowledgements + +--- + + + + + +![bg absolute left:40% 85%](img/profile2.jpg) + +# Who am I? + +**Trevor James Smith** + +![height:35](img/github.png) [**github.com/Zeitsperre**](https://github.com/Zeitsperre) +![height:35](img/mastodon-logo.png) [**Zeit@techhub.social**](https://techhub.social/@zeit) + +- Research software developer/maintainer from Montréal, Canada +- Studied climate change impacts on wine viticulture in Southern Québec +- Making stuff with Python for ~6 years +- 僕は日本語を勉強しています! + +--- + + + + + + +![bg vertical right:50% 95%](img/Ouranos-Website.png) +![bg 85%](img/ice-storm.jpg) + +# What is [Ouranos](https://www.ouranos.ca/en)? + + + +* Not-for-profit climate research consortium established in 2003 in Montréal, Québec, Canada + * Created in response to the [January 1998 North American Ice Storm](https://en.wikipedia.org/wiki/January_1998_North_American_ice_storm) +* Climate change adaptation, climate modelling, and **climate information services** +* Regional Climate Projection Data Provider + +Photo credit: https://www.communitystories.ca/v2/grand-verglas-saint-jean-sur-richelieu_ice-storm/ + +--- + + + + + + +![bg vertical left:55% width:90% height:95%](img/hockey-stick.png) +![bg width:90% Surface air temperature anomaly for February 2024 using ERA5 Reanalysis - Courtesy of C3S/ECMWF](img/ecmwf-sat-anomaly-feb-2024.png) + +# What's the **climate** situation? 
+ + + +- Climate Change is having major impacts on Earth's environmental systems +- IPCC: **Global average temperature has increased > 1.1 °C** over pre-industrial normals + +*"Since systematic scientific assessments began in the 1970s, the influence of human activities on the warming of the climate system has evolved from theory to established fact"* + +\- IPCC Sixth Assessment Report Technical Summary + +--- + + + + + +![bg right:45% 88%](img/overpeck-et-al-2011.png) + +# What's the **climate data** situation? + + + +* Climate data is growing exponentially in size + * Climate models being developed every year + * Simulations being produced every day + * Higher resolution input **and** output datasets + * Specialized analyses and user needs + +--- + + + +![bg left:40% 80%](img/cccs-climate-services.png) + +# **Climate Services** + +## What do they provide? + + + +- Tailoring objectives and information to different user needs +- Providing access to **climate information** +- Building local mitigation/adaptation capacity +- Offering training and support +* Making sense of **Big** ***climate*** **Data** + +--- + + + +# What information do **Climate Services** provide? + + + +
+ +
+ +**Climate indicators**, e.g.: + - **Hot Days** (Days with temperature >= 22 deg Celsius); + - **Beginning / End / Length of the growing season**; + - **Average seasonal rainfall** (3-Month moving average precipitation); + - **Daily temperature range**; + - etc. + +
+ +
+ +**Planning tools** + - Maps + - Point estimates at geographic locations + - Time series estimates + - Gridded values + - Raw data (for experts) + * **Not really sure what they want/need?** + **➔ Guidance from experts!** + +
+ +
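As a rough illustration of what the climate indicators listed above compute, the sketch below derives hot days, daily temperature range, and a 3-month moving average from plain Python lists. The data values and the helper names (`hot_days`, `daily_temperature_range`, `moving_average`) are invented for this example and are not part of `xclim`; a real workflow would operate on `xarray` objects with units and metadata checks.

```python
# Illustrative only: toy data and hypothetical helper names, not the xclim API.

def hot_days(tasmax_c, thresh_c=22.0):
    """Count days whose maximum temperature is at or above a threshold (deg C)."""
    return sum(1 for t in tasmax_c if t >= thresh_c)


def daily_temperature_range(tasmax_c, tasmin_c):
    """Return the daily difference between maximum and minimum temperature."""
    return [tx - tn for tx, tn in zip(tasmax_c, tasmin_c)]


def moving_average(values, window=3):
    """Return the mean of each consecutive `window`-length slice of `values`."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# One toy week of daily temperatures (deg C) and five months of rainfall (mm)
tasmax = [19.5, 22.0, 24.5, 21.0, 26.0, 18.0, 23.0]
tasmin = [10.5, 12.0, 14.5, 13.0, 15.0, 9.0, 12.0]
monthly_pr = [60.0, 90.0, 120.0, 30.0, 90.0]

n_hot = hot_days(tasmax)
dtr = daily_temperature_range(tasmax, tasmin)
pr_3month = moving_average(monthly_pr, window=3)

print(n_hot)
print(dtr)
print(pr_3month)
```

Each helper maps one indicator definition to a few lines of arithmetic; the operational difficulty lies in doing this consistently across large gridded datasets, which is what the rest of the talk addresses.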
+ +--- + +# Why build a **Climate Services** library in **Python**? + + + +* Robust and fast scientific Python libraries +* Growing demand for climate services/products + - Provide access to the community so they can help themselves +* *The timing was right* + - Internal and external demand for common tools +* Less time writing code, more time spent doing research + +--- + +# What are the requirements? + + + +
+ +
+ +**What does it need to perform?** + - **Climate Indicators** + - Units management + - Metadata management + - **Ensemble statistics** + - **Bias Adjustment** + - **Data Quality Assurance Checks** + +
+ +
+ +**Implementation goals?** + - **Operational**: Capable of handling very large ensembles of climate data + - **Foolproof**: Verifies data and metadata validity automatically by default + - **Extensible**: Flexible to use, with custom indicators easy to add as needed + +
+ +
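To make the "Units management" and "Foolproof" requirements above more concrete, here is a minimal sketch of a unit-aware indicator that converts its inputs to a common unit and rejects unknown units before computing anything. The `to_celsius` and `growing_degree_days` helpers are hypothetical stand-ins written for this example; `xclim` handles this through its own units machinery.

```python
# Hypothetical helpers for illustration; xclim's real units handling differs.

def to_celsius(values, units):
    """Convert temperatures to degrees Celsius, refusing unknown units."""
    if units == "degC":
        return list(values)
    if units == "K":
        return [v - 273.15 for v in values]
    if units == "degF":
        return [(v - 32.0) * 5.0 / 9.0 for v in values]
    raise ValueError(f"Unsupported temperature units: {units!r}")


def growing_degree_days(tas, tas_units, thresh, thresh_units):
    """Sum the daily excess of mean temperature above a threshold (deg C days)."""
    tas_c = to_celsius(tas, tas_units)
    thresh_c = to_celsius([thresh], thresh_units)[0]
    return sum(max(t - thresh_c, 0.0) for t in tas_c)


# The same three days expressed in Celsius and in Kelvin
tas_degc = [3.0, 8.0, 12.0]
tas_kelvin = [276.15, 281.15, 285.15]

gdd_c = growing_degree_days(tas_degc, "degC", 5.0, "degC")
gdd_k = growing_degree_days(tas_kelvin, "K", 5.0, "degC")

print(round(gdd_c, 6), round(gdd_k, 6))

# Mislabelled units fail loudly instead of producing a silent wrong answer
error_message = None
try:
    growing_degree_days(tas_degc, "celsius", 5.0, "degC")
except ValueError as err:
    error_message = str(err)
print(error_message)
```

Failing on unrecognized units by default is the design choice worth noting: a user has to opt out of the check explicitly, never opt in.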
+ +--- + + + + + + +![bg right:55% contain](img/xclim-schema.png) + +# **Xclim**: Climate Services library + + + +- **Asynchronous IO** and **fast** +- **Open Source** design +- **Standards-compliant** metadata +- **Extensible** (modular) +- **Operational** + +--- + + + +![bg 80% padding: 0px 20px 0px 0px](img/data-structure.png) +![bg 80% padding: 0px 20px 0px 0px](img/algorithms.png) +![bg 80% padding: 0px 20px 0px 0px](img/metadata-conventions.png) + +## How did we build **Xclim**? + +
+ +
+ +* **Data Structure** + + +
+ +
+ +* **Algorithms** + + +
+ +
+ +* **Data and Metadata Conventions** + + +
+ +
+ +--- + +## Upstream contributions from **Xclim** + + + +- Non-standard calendar (`cftime`) support in `xarray.groupby` +- Quantile methods in `xarray.groupby` +- Non-standard calendar conversion migrated from `xclim` to `xarray` +- Climate and Forecasting (CF) unit definitions inspired by `MetPy` + - Inspiring work in `cf-xarray` +- Weighted variance, standard deviations and quantiles in `xarray` (for ensemble statistics) +- Faster **NaN**-aware quantiles in `numpy` +- Initial polyfit function in `xarray` +* Plus work done by the team in `xESMF`, `intake-esm`, `cf-xarray`, `xncml`, and others for `xclim`-related downstream tools and workflows + +--- + + + +![bg right:45% contain](img/indicator.png) + +## **Xclim** algorithm design + + + +### Two ways of calculating indicators + +* `indices` (**Core algorithms**) + - For users who don't need the standards and quality checks +* `indicators` (**End-User API**) + - Metadata standards checks + - Data quality checks + - Time frequency checks + - Missing-data compliance + - Calendar compliance + +--- + +## What does **Xclim** do? ➔ Units Management + + + +```python +import xclim +from clisops.core import subset + +# Data is in Kelvin, threshold is in Celsius, and other combinations + +# Extract a single point location for the example +ds_pt = subset.subset_gridpoint(ds, lon=-73, lat=44) + +# Calculate indicators with different units + +# Kelvin and Celsius +out1 = xclim.atmos.growing_degree_days(tas=ds_pt.tas, thresh="5 degC", freq="MS") + +# Fahrenheit and Celsius +out2 = xclim.atmos.growing_degree_days(tas=ds_pt.tas_F, thresh="5 degC", freq="MS") + +# Fahrenheit and Kelvin +out3 = xclim.atmos.growing_degree_days(tas=ds_pt.tas_F, thresh="278.15 K", freq="MS") +``` + +--- + + + +## What does **Xclim** do?
➔ Units Management + + + +![img](img/units-example.png) + +```python +import xclim +from clisops.core import subset + +# Data is in Kelvin, threshold is in Celsius, and other combinations + +# Extract a single point location for the example +ds_pt = subset.subset_gridpoint(ds, lon=-73, lat=44) + +# Calculate indicators with different units + +# Kelvin and Celsius +out1 = xclim.atmos.growing_degree_days(tas=ds_pt.tas, thresh="5 degC", freq="MS") + +# Fahrenheit and Celsius +out2 = xclim.atmos.growing_degree_days(tas=ds_pt.tas_F, thresh="5 degC", freq="MS") + +# Fahrenheit and Kelvin +out3 = xclim.atmos.growing_degree_days(tas=ds_pt.tas_F, thresh="278.15 K", freq="MS") +``` + +--- + +## What does **Xclim** do? ➔ Metadata Locales + + + +```python +import xarray as xr +import xclim + + +ds = xr.open_dataset("my_dataset.nc") + +with xclim.set_options( + # Drop timesteps with more than 5% of missing data + check_missing="pct", missing_options=dict(pct={"tolerance": 0.05}), + + metadata_locales=["fr"] # Add French language metadata +): + # Calculate Annual Frost Days (days with min temperature < 0 °C) + FD = xclim.atmos.frost_days(ds.tas, freq="YS") +``` + +--- + + + +## What does **Xclim** do?
➔ Metadata Locales + + + +![img](img/metadata-locales.png) + +```python +import xarray as xr +import xclim + + +ds = xr.open_dataset("my_dataset.nc") + +with xclim.set_options( + # Drop timesteps with more than 5% of missing data + check_missing="pct", missing_options=dict(pct={"tolerance": 0.05}), + + metadata_locales=["fr"] # Add French language metadata +): + # Calculate Annual Frost Days (days with min temperature < 0 °C) + FD = xclim.atmos.frost_days(ds.tas, freq="YS") +``` + +--- + + + +![bg 90%](img/ESPO-animation-EN.gif) + +## What does **Xclim** do? ➔ Climate Ensemble Mean Analysis + + + +**Average temperature from the years 1991-2020 baseline across 14 IPCC climate models at Montréal, Québec** (*extreme warming scenario: SSP3-7.0*) + + + +--- + + + + + +![bg right:60% vertical contain](img/EQM.png) +![bg contain](img/EQM-adjusted.png) + +## What does **Xclim** do? ➔ Bias Adjustment + + + +- Adjusts model bias from projected data using a `train`/`adjust` approach +- Several implementations available: + - Quantile Mapping + - Principal Component Analysis + - Multivariate (MBCn) +- Plugin support for Python package **SBCK** (dOTC, CDFt, and other algorithms) + +--- + +### That's great and all, but what if...
+ + + +* There's just too much data that we need to crunch: + - The data could be spread across servers globally + - Local computing power is just not enough for the analysis + +* We need to run lots of specific workflows regularly + +* The user doesn't know how to write a Python script: + - A biologist who uses `R` for their work + - A city planner who just needs a range of estimates for future rainfall + - An agronomist wondering about average growing conditions in 10 years + +--- + + + +![bg left:50% 95%](img/ms-planetary-computer.png) + +# **Xclim** on Compute Platforms + +## Microsoft Planetary Computer + +* [Computing Climate Indicators with xclim](https://planetarycomputer.microsoft.com/dataset/cil-gdpcir-cc0#Climate-indicators) + +--- + + + +![bg vertical right:50% 90%](img/birdhouse-git.png) +![bg contain](img/finches.png) + +# Finch: **Xclim** as a **Web Service** + + + +#### ![height:35](img/github.png) [github.com/Bird-house/Finch](https://github.com/bird-house/finch) + +- **Web Processing Services** (WPS) + - Built with Python (**PyWPS**) +- Remote scientific analysis platforms +* _Bird-house likes to name their projects after birds_ + +--- + +## Using the **Finch** Web Service from Python (`owslib`) + + + +```python +from owslib.wps import WebProcessingService + +# URL of the running Finch service +finch_url = "https://pavics.ouranos.ca/twitcher/ows/proxy/finch/wps" + +# Connect to the Finch WPS service +finch = WebProcessingService(finch_url) + +# Get a listing of all processes +finch.processes + +print(len(finch.processes)) # --> 430 supported indicators and analyses! +``` + +--- + + + + +## Using the **Finch** Web Service from Python (`birdy`) + + + +```python +from birdy import WPSClient + + +wps = WPSClient(finch_url) + +# Using the OPeNDAP protocol +remote_dataset = "www.exampledata.lt/climate.ncml" + +# The indicator call looks a lot like the one from `xclim` but +# passing a url instead of an `xarray` object.
+response = wps.growing_degree_days( + remote_dataset, + thresh='10 degC', + freq='MS', + variable='tas' +) + +# Returned as a streaming `xarray` data object +out = response.get(asobj=True).output_netcdf + +out.growing_degree_days.plot(hue='location') +``` + +[Bird-house/birdy](https://github.com/Bird-house/birdy) -> PyWPS Helper Library + +--- + + + + + + + +## Using the **Finch** Web Service from Python (`birdy`) ![img](img/location-graphs.png) + + + + +```python +from birdy import WPSClient + + +wps = WPSClient(finch_url) + +# Using the OPeNDAP protocol +remote_dataset = "www.exampledata.lt/climate.ncml" + +# The indicator call looks a lot like the one from `xclim` but +# passing a url instead of an `xarray` object. +response = wps.growing_degree_days( + remote_dataset, + thresh='10 degC', + freq='MS', + variable='tas' +) + +# Returned as a streaming `xarray` data object +out = response.get(asobj=True).output_netcdf + +out.growing_degree_days.plot(hue='location') +``` + +[Bird-house/birdy](https://github.com/Bird-house/birdy) -> PyWPS Helper Library + +--- + + + + + + + +# Making it accessible ➔ Web Frontends + +## [ClimateData.ca](https://climatedata.ca) + +![bg width:100% height:100%](img/climatedataca-screen.png) + +--- + + + + + +![bg width:100% height:100%](img/climate-data-ca-dataset.png) + + + +--- + + + +# Our Experience Adopting Python for **Climate Science/Services** + +
+ +
+ +### Before (circa 2016) + +- `MATLAB`-based in-house libraries (**proprietary**) + - No external libraries; everything built in-house +- Issues with data storage/access/processing + - Small team unable to meet demand +- Lack of uniformity between researchers +- Lots of bugs and human error +- Data analysis/requests served manually +- Software validation/testing??? + +
+ +
+ +### After + +- **Open Source** `Python` libraries (`numpy`, `sklearn`, `xarray`, etc.) +- Multithreading and streaming data formats (e.g. Zarr) +- Common tools built in-house and shared widely (`xclim`) +- Web service-based infrastructure +- Testing (`pytest`), Software CI/CD, and data validation +- Peer-Reviewed Software (**JOSS**) + +
+ +
+ +--- + +![bg contain](img/PAVICS.png) + + + +--- + + + + + +
+ +
+ +## Thanks! + +### Colleagues and collaborators + +- Pascal Bourgault +- David Huard +- Trevor J. Smith +- Travis Logan +- Abel Aoun +- Juliette Lavoie +- Éric Dupuis +- Gabriel Rondeau-Genesse +- Carsten Ehbrecht +- Sarah Gammon +- Long Vu +- David Caron + **and many more!** + +
+ +
+ +# Ačiū! + +**Have a great rest of PyCon Lithuania!** + +## **[Ouranosinc/xclim](https://github.com/Ouranosinc/xclim)** +[![JOSS height:50px](https://joss.theoj.org/papers/10.21105/joss.05415/status.svg)](https://doi.org/10.21105/joss.05415) +[![DOI height:50px](https://zenodo.org/badge/DOI/10.5281/zenodo.10710942.svg)](https://doi.org/10.5281/zenodo.10710942) + +## **[Bird-house/finch](https://github.com/bird-house/finch)** +[![DOI height:50px](https://zenodo.org/badge/DOI/10.5281/zenodo.10870939.svg)](https://doi.org/10.5281/zenodo.10870939) + +
+ +