Commit 766503c3 authored by Harrison, Simeon

Updated agenda in D1_00_WelcomeAndMotivation.ipynb

parent ae016443
%% Cell type:markdown id:196e5bb0-84d4-4f77-be0a-4da617e3a080 tags:
# [Python for HPC](https://events.vsc.ac.at/event/109/)
%% Cell type:markdown id:48392fdf-428a-472f-9188-654bc9632ca2 tags:
**Welcome to the course on Python for High Performance Computing.**
%% Cell type:markdown id:ae5386f1-19b6-4063-8b00-fc9f466142b0 tags:
## Outset
Python is a highly expressive, easy-to-learn and widely popular programming language. There are many built-in libraries and even more made available by third parties.
However, this comes at a cost: Python is notoriously slow.
This is mainly due to the fact that Python is dynamically typed, which means that variable types are determined and checked at runtime rather than during compilation.
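A minimal sketch of what dynamic typing means in practice. The function `add` is a hypothetical name used only for illustration; the point is that the same line of code has to work out at runtime which operation `+` stands for, and that a type mismatch only surfaces when the call is executed.

```python
def add(a, b):
    # No types are declared: the interpreter decides how to apply "+"
    # based on the objects it receives, each time this line runs.
    return a + b


print(add(1, 2))      # integer addition
print(add("1", "2"))  # string concatenation, same code path

x = 1
print(type(x).__name__)
x = "one"  # the same name can be rebound to an object of another type
print(type(x).__name__)

try:
    add(1, "2")
except TypeError as err:
    # The mismatch is detected only when the call is actually executed.
    print("TypeError at runtime:", err)
```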
In addition, Python has the infamous Global Interpreter Lock (GIL). The GIL is a lock that allows only one thread at a time to hold control of the Python interpreter. As a result, only one thread can execute at any given moment. This actually yields a performance boost for single-threaded programs, but poses a crucial bottleneck for parallel code.
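The effect of the GIL can be observed with a rough timing sketch like the one below. It assumes a standard CPython build with the GIL enabled; the function names and the workload size `N` are hypothetical and chosen only for illustration. Under that assumption, the threaded run of this CPU-bound loop is typically not faster than the sequential one, although the exact numbers depend on the machine and the Python version.

```python
import threading
import time


def count_down(n):
    # Pure-Python, CPU-bound loop: a thread needs the GIL to run it.
    while n > 0:
        n -= 1


def run_sequential(n, repeats):
    # Run the workload `repeats` times, one after the other.
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, repeats):
    # Run the same workload in `repeats` threads at once.
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


N = 2_000_000  # hypothetical workload size, for illustration only
print(f"sequential: {run_sequential(N, 4):.2f} s")
print(f"4 threads:  {run_threaded(N, 4):.2f} s")
```

Threads remain useful for I/O-bound work, where a waiting thread releases the GIL.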
So why use Python?
The versatility and ease of use of the language still make it a great starting point for almost any project that involves code. Properly utilized libraries and techniques speed up parallel Python considerably. Overall, we could argue that what you lose in performance compared to languages like C or Fortran is probably negligible compared to what you gain in time by using a simple-to-understand programming language.
%% Cell type:markdown id:74d6a9b1-22a4-45e5-9350-eecea97ba1df tags:
## Contents of this Course
In this course we will look at workarounds and tools to improve Python's parallel performance.
%% Cell type:markdown id:9459e518-bfd5-445e-983f-e9a890e3cd78 tags:
### Day 1 (Monday)
%% Cell type:markdown id:20bd601d-52da-4f65-a7b0-5d74f62916c4 tags:
**Advanced Python**
* **Python developer's Swiss Army knife**
>* IDEs
>* Documentation & Type hints
>* Logging
>* Testing
>* Debugging
* **Python usage on HPC systems**
>* Module & Spack
>* Virtual env
>* Conda
>* Apptainer (Singularity)
* **Python and Slurm**
>* A Slurm refresher
>* Slurm with module/Spack, venv and conda
>* Slurm with Apptainer (Singularity)
>* Slurm and MPI
* **Benchmarking & Profiling**
>* Benchmarking & Time measurements
>* Built-in profiling tools
>* Other profiling tools
>* Slurm
* **Debugging, benchmarking & profiling**
>* Debugging
>* Benchmarking & time measurements
>* Q & A
%% Cell type:markdown id:d8cf5a05-5156-4221-b9e4-285af5b4a423 tags:
### Day 2 (Wednesday)
%% Cell type:markdown id:fa82c08f-8669-4b62-be96-ff9539028333 tags:
**Advanced Python**
* **Python behind the scenes**
>* Python runtimes: CPython, PyPy, IronPython
>* Collections & Caching Overview
>* Python data model: Objects, Special attributes & methods, Slots
>* Python data model: Inheritance, Metaclasses
>* Generators, Built-In Functions, List comprehensions & Lambdas
>* Garbage Collector
>* Integrating native code in Python
**Debugging, Benchmarking & Profiling continued**
>* Built-in profiling tools
>* Other profiling tools
**Single-node parallelization**
* Introduction: Definitions, Batch & Stream processing
* **Useful libraries and their limitations**
>* NumPy
>* Numba
>* Introduction to parallelization
>* Integrating native code & libraries in Python
>* Cython
>* Numba
>* Dask
>* Q & A
%% Cell type:markdown id:d5df334f-7724-47b5-b793-dab57389a0ff tags:
### Day 3 (Friday)
%% Cell type:markdown id:c85b80eb-55b2-42dd-b2b6-4b27e1331f31 tags:
**Single-node parallelization**
* **Useful libraries and their limitations (cont.)**
>* Pandas
>* Dask
>* TensorFlow
>* cuDF & CuPy
**Multi-node parallelization**
>* Introduction to multi-node parallelization
**Frameworks for multi-node parallelization**
>* Introduction to multi-node
>* Slurm & MPI
>* mpi4py
>* Dask distributed & Dask-MPI
>* Horovod: multi-GPU for TensorFlow and PyTorch
>* Dask distributed
**Python on GPUs**
>* Numba on GPUs
>* RAPIDS
>* Benchmarking and profiling code for GPUs
>* Q & A
%% Cell type:markdown id:ed6a4cd3-7b61-41cc-8c5c-8f772a039313 tags:
## Course Materials
All course materials can be accessed *after the course* on our [python4hpc GitLab repository](https://gitlab.tuwien.ac.at/vsc-public/training/python4hpc).
%% Cell type:markdown id:46007f23-cd88-43e5-8cdf-1331f7914f10 tags:
## JupyterLab
**Finding your way around a Jupyter Notebook**
%% Cell type:markdown id:786c30ed-9772-43f4-b687-fd85c6b7fdef tags:
You can hit the plus symbol on the top left to get to the launcher page. From there you can start a Jupyter notebook, a Python console or a standard terminal. If you want to open an existing notebook, double-click on it in the Files tab of the JupyterLab window.
%% Cell type:markdown id:35f4763e-2777-4836-90e2-0b2b175c7a22 tags:
A Jupyter notebook consists of cells. The two main types of cells you will use are **code cells** and **markdown cells**.
A **code cell** contains actual code that you want to run. You can specify a cell as a code cell using the pulldown menu in the toolbar of your Jupyter notebook. Alternatively, you can hit `Esc` and then `y` while a cell is selected to specify that it is a code cell. Note that you will have to hit `Enter` after doing this to start editing it.
If you want to execute the code in a code cell, hit `Enter` while holding down the `Shift` key (denoted `Shift + Enter`). Note that code cells are executed in the order in which you run them with `Shift + Enter`, not in the order in which they appear in the notebook. If you have not explicitly executed a cell that appears early in the document, its results are not known to the Python interpreter.
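As a small illustration of execution order, the block below stands for two hypothetical code cells, with comments marking where each cell would begin. The variable names are made up for this sketch.

```python
# Hypothetical cell 1: defines a variable.
values = [1, 2, 3]

# Hypothetical cell 2: depends on cell 1 having been executed first.
total = sum(values)
print(total)
```

If cell 2 is executed in a fresh kernel before cell 1, Python raises a `NameError`, because `values` does not exist yet in the interpreter's memory.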
**Markdown cells** contain text. The text is written in markdown, a lightweight markup language. You can read about its syntax here. Note that you can also insert HTML into markdown cells, and this will be rendered properly. As you are typing the contents of these cells, the results appear as text. Hitting `Shift + Enter` renders the text in the formatting you specify.
You can specify a cell as being a markdown cell in the Jupyter toolbar, or by hitting `Esc m` in the cell. Again, you have to hit `Enter` after using the quick keys to bring the cell into edit mode.
In general, when you want to add a **new cell**, you can click the `+` icon on the notebook toolbar. The shortcut to insert a cell below is `Esc b` and to insert a cell above is `Esc a`. Alternatively, you can execute a cell and automatically add a new one below it by hitting `Alt Enter`.
Make sure you shut down the kernel in use after completing a notebook. You do this by clicking on the `Stop` symbol on the left and then clicking the `x` next to the notebook whose kernel you want to shut down. This releases the memory you used when running the notebook.