dianna

Logo_ER10

Tutorials

This folder contains the DIANNA tutorial notebooks. To install the dependencies for the tutorials, run the following command in the main dianna folder:

pip install .[notebooks]

🠊 For a general demonstration of DIANNA, click on the logo or run it in Colab.

🠊 For tutorials on how to convert a Keras, PyTorch, Scikit-learn or TensorFlow model to ONNX, please see the conversion tutorials.

🠊 For specific XAI methods (explainers):

Click on the explainer names to watch explanatory videos for the respective method.
Click on the logos for direct access to a tutorial notebook. Run the tutorials directly in Google Colab by clicking on the Colab buttons.

Datasets and Tasks

Illustrative (Simple)

|*Data modality*|Dataset|*Task*|Logo|
|:------------|:------|:----------------------------------------------------------------------|:----|
|*Images*|Binary MNIST|Binary digit *classification*|mnist_zero_and_one_half_size|
||[Simple Geometric (circles and triangles)](https://doi.org/10.5281/zenodo.5012824)|Binary shape *classification*|SimpleGeometric Logo|
||[ImageNet](https://image-net.org/download.php)|$1000$-class natural image *classification*|ImageNet_autocrop|
|*Time series*|[Coffee dataset](https://www.timeseriesclassification.com/description.php?Dataset=Coffee)|Binary *classification* of Robusta and Arabica coffee beans|Coffee Logo|
||[Weather dataset](https://zenodo.org/record/7525955)|Binary *classification* (warm/cold season) of temperature time series|Weather Logo|
|*Tabular*|[Penguin dataset](https://www.kaggle.com/code/parulpandey/penguin-dataset-the-new-iris)|*Classification* of $3$ penguin species (Adelie, Chinstrap, Gentoo)|Penguin Logo|

Scientific use-cases

|*Data modality*|Dataset|*Task*|Logo|
|:------------|:------|:---|:----|
|*Images*|[Simple Scientific (LeafSnap30)](https://zenodo.org/record/5061353/)|*Classification* of leaves from $30$ tree species|LeafSnap30 Logo|
|*Tabular*|[Land atmosphere dataset](https://zenodo.org/records/12623257)|Prediction of "latent heat flux" (*regression*). The random forest model is used as an [emulator](https://github.com/EcoExtreML/Emulator) to replace the physical model [STEMMUS_SCOPE](https://github.com/EcoExtreML/STEMMUS_SCOPE) to predict global maps of latent heat flux.|Atmosphere Logo|

Models

The ONNX models used in the tutorials are available at dianna/models, or linked from their respective tutorial notebooks.

Summary of all Tutorials

All tutorials can be accessed by clicking on the dataset & task logo in the tables below.

Explainer outputs for models trained on datasets & tasks that are included in the dashboard are marked with the Streamlit logo.

Illustrative (Simple)

|*Modality* \ Method|RISE|[LIME](https://youtu.be/d6j6bofhj2M)|Kernel[SHAP](https://youtu.be/9haIOplEIGM)|
|:-----|:---|:---|:---|
|*Images*|[Binary MNIST](/dianna/tutorials/explainers/RISE/rise_mnist.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/RISE/rise_mnist.ipynb) Streamlit Logo<br>[ImageNet](/dianna/tutorials/explainers/RISE/rise_imagenet.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/RISE/rise_imagenet.ipynb)||[Binary MNIST](/dianna/tutorials/explainers/KernelSHAP/kernelshap_mnist.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/KernelSHAP/kernelshap_mnist.ipynb) Streamlit Logo<br>[Simple Geometric](/dianna/tutorials/explainers/KernelSHAP/kernelshap_geometric_shapes.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/KernelSHAP/kernelshap_geometric_shapes.ipynb)|
|*Text*|[Text](/dianna/tutorials/explainers/RISE/rise_text.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/RISE/rise_text.ipynb) Streamlit Logo|[Text](/dianna/tutorials/explainers/LIME/lime_text.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/LIME/lime_text.ipynb) Streamlit Logo||
|*Time series*|[Weather](/dianna/tutorials/explainers/RISE/rise_timeseries_weather.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/RISE/rise_timeseries_weather.ipynb) Streamlit Logo|[Weather](/dianna/tutorials/explainers/LIME/lime_timeseries_weather.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/LIME/lime_timeseries_weather.ipynb) Streamlit Logo<br>[Coffee](/dianna/tutorials/explainers/LIME/lime_timeseries_coffee.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/LIME/lime_timeseries_coffee.ipynb)||
|*Tabular*|[Penguin](/dianna/tutorials/explainers/RISE/rise_tabular_penguin.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/RISE/rise_tabular_penguin.ipynb) Streamlit Logo|[Penguin](/dianna/tutorials/explainers/LIME/lime_tabular_penguin.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/LIME/lime_tabular_penguin.ipynb) Streamlit Logo<br>[Weather](/dianna/tutorials/explainers/LIME/lime_tabular_weather.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/LIME/lime_tabular_weather.ipynb) Streamlit Logo|[Penguin](/dianna/tutorials/explainers/KernelSHAP/kernelshap_tabular_penguin.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/KernelSHAP/kernelshap_tabular_penguin.ipynb) Streamlit Logo<br>[Weather](/dianna/tutorials/explainers/KernelSHAP/kernelshap_tabular_weather.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/KernelSHAP/kernelshap_tabular_weather.ipynb) Streamlit Logo|

To learn more about how we approach the masking of time-series data, please read our [Masking time-series for XAI](https://blog.esciencecenter.nl/masking-time-series-for-explainable-ai-90247ac252b4) blog post.

Scientific use-cases

|*Modality* \ Method|RISE|[LIME](https://youtu.be/d6j6bofhj2M)|Kernel[SHAP](https://youtu.be/9haIOplEIGM)|
|:-----|:---|:---|:---|
|*Images*||[LeafSnap30](/dianna/tutorials/explainers/LIME/lime_images.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/LIME/lime_images.ipynb)||
|*Text*||[EU law](/dianna/tutorials/explainers/LIME/lime_text_eulaw.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/LIME/lime_text_eulaw.ipynb) Streamlit Logo||
|*Time series*|[FRB](/dianna/tutorials/explainers/RISE/rise_timeseries_frb.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/RISE/rise_timeseries_frb.ipynb) Streamlit Logo|||
|*Tabular*|||[Land atmosphere](/dianna/tutorials/explainers/KernelSHAP/kernelshap_tabular_land_atmosphere.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dianna-ai/dianna/blob/main/tutorials/explainers/KernelSHAP/kernelshap_tabular_land_atmosphere.ipynb)|

IMPORTANT: Hyperparameters

Settings per explainer

The XAI methods (explainers) are sensitive to the choice of their hyperparameters! This [Master's thesis](https://staff.fnwi.uva.nl/a.s.z.belloum/MSctheses/MScthesis_Willem_van_der_Spec.pdf) investigates that sensitivity and draws useful conclusions. The tables below give the default hyperparameters used in DIANNA for each explainer, together with the values chosen in some tutorials, labeled by data modality (*i* - images, *txt* - text, *ts* - time series and *tab* - tabular). The main conclusions (🠊) of the thesis (on images and text) about the effect of the hyperparameters are listed after each table.

#### RISE

| Hyperparameter | Default value | ImageNet_autocrop (*i*) | (*i*) | (*txt*) | (*ts*) | (*ts*) |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| $n_{masks}$ | **$1000$** | default | $5000$ | default | $10000$ | $5000$ |
| $p_{keep}$ | **optimized** (*i*, *txt*), **$0.5$** (*ts*) | $0.1$ | $0.1$ | default | $0.1$ | $0.1$ |
| $n_{features}$ | **$8$** | $6$ | default | default | default | $16$ |

🠊 The most crucial parameter is $p_{keep}$. Lower values of $p_{keep}$ lead to more sensitive explanations (observed for both images and text). Easier classification tasks usually require a lower $p_{keep}$, as this causes more perturbation in the input and therefore a more distinct signal in the model predictions.

🠊 The feature resolution $n_{features}$ exhibited an optimum at a value of $6$. Higher values can offer a finer-grained result but require a (far) larger $n_{masks}$. The best value also depends on the scale of the phenomena in the input data that the explanation should take into account.

🠊 A larger $n_{masks}$ returns more consistent results at the cost of computation time. If two identical runs yield (very) different results, these results likely contain a lot of (or even mostly) noise, and a higher value of $n_{masks}$ should be used instead.

#### LIME

| Hyperparameter | Default value | LeafSnap30 Logo (*i*) | (*ts*) | (*ts*) | [lime_text_eulaw](/dianna/tutorials/explainers/LIME/lime_text_eulaw.ipynb) |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| $n_{samples}$ | **$5000$** | $1000$ | $10000$ | $500$ | $2000$ |
| *Kernel width* | **$25$** | default | default | default | default |
| $n_{features}$ | **$10$** | $30$ | default | default | $999$ |

🠊 The most crucial parameter is the *Kernel width*: low values cause high sensitivity; however, that observation depended on the evaluation metric.

#### KernelSHAP

| Hyperparameter | Default value | mnist_zero_and_one_half_size (*i*) | (*i*) | (*tab*) |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| $n_{samples}$ | **auto/int** | $1000$ | $2000$ | $136588$ |
| $n_{segments}$ | **$100$** | $200$ | $200$ | default |
| $\sigma$ | **$0$** | default | default | default |

🠊 The most crucial parameter is the number of super-pixels $n_{segments}$. Higher values led to higher sensitivity; however, that observation depended on the evaluation metric.

🠊 Regularization had only a marginal detrimental effect; the best results were obtained using no regularization (no smoothing, $\sigma = 0$) or least squares regression.
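The RISE observations above, in particular the advice to compare two identical runs and to raise the number of masks when they disagree, can be illustrated with a small toy sketch. This is not DIANNA's implementation: `toy_model`, `rise_like_saliency` and `max_abs_diff` are hypothetical names, the "model" is a stand-in that only looks at one segment of a 1-D input, and the masking is a simplified version of the RISE idea (random coarse masks, scores averaged per feature).

```python
import random


def toy_model(x):
    """Hypothetical stand-in for a classifier: the score depends only on x[4:8]."""
    return sum(x[4:8]) / 4.0


def rise_like_saliency(x, n_masks, p_keep, n_features, seed):
    """Average the model score over random masks, weighted by what each mask kept."""
    rng = random.Random(seed)
    seg_len = len(x) // n_features  # assumes len(x) is divisible by n_features
    totals = [0.0] * len(x)
    for _ in range(n_masks):
        # Keep each coarse segment with probability p_keep.
        coarse = [1 if rng.random() < p_keep else 0 for _ in range(n_features)]
        mask = [coarse[i // seg_len] for i in range(len(x))]
        score = toy_model([xi * mi for xi, mi in zip(x, mask)])
        for i, mi in enumerate(mask):
            totals[i] += score * mi
    return [t / (n_masks * p_keep) for t in totals]


def max_abs_diff(a, b):
    """Largest per-feature disagreement between two saliency vectors."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))


x = [1.0] * 16
for n_masks in (20, 2000):
    # Two runs that differ only in the random seed.
    run_a = rise_like_saliency(x, n_masks, p_keep=0.5, n_features=4, seed=0)
    run_b = rise_like_saliency(x, n_masks, p_keep=0.5, n_features=4, seed=1)
    print(n_masks, round(max_abs_diff(run_a, run_b), 3))
```

Comparing the printed disagreement for the small and the large mask count gives a quick, if crude, consistency check of the kind described above. The same pattern can be tried with other values of `p_keep` and `n_features`.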
