{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9a298b63-cf46-4288-885f-ff767e7da837",
   "metadata": {},
   "source": [
    "# Python built-in profiling tools\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1a2346b-6c27-44e8-bf23-74a92d75647c",
   "metadata": {},
   "source": [
    "When looking into profiling tools, we should first look at what Python provides out of the box, because sometimes it is simply not possible to install and run more complex tools.\n",
    "\n",
    "## Profiling code example\n",
    "\n",
    "Since we need an example to illustrate how the tools work, we are going to use a 2D heatmap calculation written in Python. To keep the code short, comments and safeguards were omitted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14a853ca-db59-4b6a-8a19-b9f1374908a3",
   "metadata": {},
   "outputs": [],
   "source": [
    "%run examples/fdm_2d_heat_equation.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1bf2f941-d160-4ad5-b901-41238dccbba2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create a solver instance\n",
    "solver = HeatEquationSolver.get_default()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a8beab1-ccec-4bb3-82e9-2d7deda78347",
   "metadata": {},
   "source": [
    "## [cProfile and profile](https://docs.python.org/3/library/profile.html)\n",
    "\n",
    "`cProfile` and `profile` are both deterministic profilers for Python programs and part of the standard library. To be precise, these two profilers are actually different implementations of the same profiling interface.\n",
    "\n",
    "As the name suggests, `cProfile` is a C extension with reasonable overhead and is therefore a viable choice for programs with a longer runtime. `profile` is the original pure-Python counterpart and adds significantly more overhead than `cProfile`. It provided the original specification and is still maintained so that the profiler can easily be extended from Python.\n",
    "\n",
    "Both produce statistics which can be formatted into reports using the `pstats` module."
   ]
  },
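  {
   "cell_type": "markdown",
   "id": "3f6c2a10-7b1e-4d55-9a0c-5e8d1b2c4a01",
   "metadata": {},
   "source": [
    "Because both modules implement the same interface, one can be swapped for the other without changing the surrounding code. The following is a minimal sketch of that idea, not part of the heat equation example: `busy_work` is a hypothetical stand-in workload, and the reports are written to in-memory streams so they can be compared.\n",
    "\n",
    "```python\n",
    "import cProfile\n",
    "import io\n",
    "import profile\n",
    "from pstats import Stats, SortKey\n",
    "\n",
    "\n",
    "def busy_work(n):\n",
    "    # hypothetical stand-in for a real workload\n",
    "    return sum(i * i for i in range(n))\n",
    "\n",
    "\n",
    "reports = {}\n",
    "for module in (cProfile, profile):\n",
    "    # both modules expose a Profile class with the same methods\n",
    "    profiler = module.Profile()\n",
    "    # runcall profiles a single function call and returns its result\n",
    "    result = profiler.runcall(busy_work, 10_000)\n",
    "    # collect the report in memory instead of printing it directly\n",
    "    stream = io.StringIO()\n",
    "    Stats(profiler, stream=stream).strip_dirs().sort_stats(SortKey.CUMULATIVE).print_stats()\n",
    "    reports[module.__name__] = stream.getvalue()\n",
    "\n",
    "for name, report in reports.items():\n",
    "    print(name)\n",
    "    print(report)\n",
    "```\n",
    "\n",
    "Both reports list `busy_work`. The times reported by `profile` are typically larger because the profiler itself runs as Python code."
   ]
  },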
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d2f2f8e7-007e-4229-a7c9-aa56efc286f5",
   "metadata": {},
   "outputs": [],
   "source": [
    "import cProfile\n",
    "cProfile.run('solver.calculate()')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8baa79dd-5171-4a9e-b099-2e12d3dac51b",
   "metadata": {},
   "source": [
    "The column headings are:\n",
    "- `ncalls` - number of calls\n",
    "- `tottime` - total time spent in the given function (excluding time in subfunctions)\n",
    "    - `percall` - `tottime`/`ncalls`\n",
    "- `cumtime` - cumulative time spent in this and all subfunctions\n",
    "    - `percall` - `cumtime`/primitive calls (calls that were not induced by recursion)\n",
    "- `filename:lineno(function)` - function identification\n",
    "\n",
    "IPython also provides the magic command `%prun`, which we can use instead of invoking `cProfile` manually or from the terminal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ea98aa03-af80-4c6a-9234-5e346c9d8016",
   "metadata": {},
   "outputs": [],
   "source": [
    "%prun solver.calculate()"
   ]
  },
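  {
   "cell_type": "markdown",
   "id": "7d9e4b22-1c3a-4f80-b6e1-0a2c8f5d3b02",
   "metadata": {},
   "source": [
    "To make the column headings listed above more concrete, here is a small sketch using a recursive function. `countdown` is a hypothetical example function, and `get_stats_profile` requires Python 3.9 or newer. For recursive functions `ncalls` is shown as total calls/primitive calls.\n",
    "\n",
    "```python\n",
    "import cProfile\n",
    "from pstats import Stats\n",
    "\n",
    "\n",
    "def countdown(n):\n",
    "    # hypothetical recursive function: countdown(4) results in 5 calls in total\n",
    "    if n > 0:\n",
    "        countdown(n - 1)\n",
    "\n",
    "\n",
    "profiler = cProfile.Profile()\n",
    "profiler.runcall(countdown, 4)\n",
    "\n",
    "# look up the entry for countdown by its function name\n",
    "entry = Stats(profiler).get_stats_profile().func_profiles['countdown']\n",
    "# total calls / primitive calls: 5/1\n",
    "print(entry.ncalls)\n",
    "# time spent in countdown itself and time including everything it called\n",
    "print(entry.tottime, entry.cumtime)\n",
    "```\n",
    "\n",
    "Only the outermost call is primitive, so the `percall` value next to `cumtime` is computed from one call rather than five."
   ]
  },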
  {
   "cell_type": "markdown",
   "id": "bd4720c1-e68f-40f4-822b-63c9448bca97",
   "metadata": {},
   "source": [
    "### Analysing profile data with [pstats](https://docs.python.org/3/library/profile.html#module-pstats)\n",
    "\n",
    "The `pstats` module works closely together with `profile`/`cProfile`. If `cProfile.run` is used without a file name, a `Stats` object (from `pstats`) is automatically created in the background and a simple profiling report is printed.\n",
    "\n",
    "However, we can also write the results to an intermediate file and create our own profiling report programmatically."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c4f7acf1-1cee-494d-af97-eecd2ec70023",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import cProfile\n",
    "from pstats import Stats, SortKey\n",
    "\n",
    "filename = 'temp/profiling/heatmap.stats'\n",
    "\n",
    "cProfile.run('solver.calculate()', filename)\n",
    "\n",
    "stats = Stats(filename)\n",
    "stats.strip_dirs()\n",
    "stats.sort_stats(SortKey.TIME, SortKey.CALLS)\n",
    "stats.print_stats()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1237aafd-c282-4be0-9d12-8dfad5f2e547",
   "metadata": {},
   "source": [
    "It is also possible to use the profiler as a context manager. This avoids an intermediate file while still allowing for customization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "10753682-8815-4856-809b-07bd39027d4f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import cProfile\n",
    "from pstats import Stats, SortKey\n",
    "\n",
    "with cProfile.Profile() as profile:\n",
    "    # the context manager enables the profiler on entry and disables it on exit,\n",
    "    # so the code is called directly instead of through profile.run()\n",
    "    solver.calculate()\n",
    "\n",
    "stats = Stats(profile)\n",
    "stats.strip_dirs()\n",
    "stats.sort_stats(SortKey.TIME, SortKey.CALLS)\n",
    "# only print top 5\n",
    "stats.print_stats(5)\n",
    "# getting the stats profile programmatically\n",
    "print(stats.get_stats_profile())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ebabe53b-74e5-4714-910d-7e689b8f0ef6",
   "metadata": {},
   "source": [
    "## Memory profiling with [tracemalloc](https://docs.python.org/3/library/tracemalloc.html)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e8a0f8a5-bb26-486d-b7bc-a9971a6212ee",
   "metadata": {},
   "source": [
    "Another important aspect of profiling is the amount of memory used by applications, functions and expressions.\n",
    "\n",
    "`tracemalloc` is part of the Python standard library and traces memory allocations. It can be used, for example, to:\n",
    "- compute statistics on how much memory was allocated by which lines\n",
    "- calculate differences between two snapshots to detect leaks\n",
    "- get the traceback of where objects were allocated\n",
    "\n",
    "and more."
   ]
  },
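  {
   "cell_type": "markdown",
   "id": "c5b8a7d3-2e4f-4a19-8d60-9f1e3c7b5a03",
   "metadata": {},
   "source": [
    "The snapshot workflow described above can be sketched with plain `tracemalloc` calls. This is a minimal, self-contained illustration and not the helper module used in the script below: `payload` is a hypothetical workload that keeps roughly 1 MB referenced.\n",
    "\n",
    "```python\n",
    "import tracemalloc\n",
    "\n",
    "tracemalloc.start()\n",
    "snap_before = tracemalloc.take_snapshot()\n",
    "\n",
    "# hypothetical workload: allocate roughly 1 MB that stays referenced\n",
    "payload = [bytes(1000) for _ in range(1000)]\n",
    "\n",
    "snap_after = tracemalloc.take_snapshot()\n",
    "# currently traced size and peak size in bytes since tracing started\n",
    "current, peak = tracemalloc.get_traced_memory()\n",
    "tracemalloc.stop()\n",
    "\n",
    "# differences grouped by source line, largest change first\n",
    "diff = snap_after.compare_to(snap_before, 'lineno')\n",
    "for stat in diff[:3]:\n",
    "    print(stat)\n",
    "print(f'current={current} B, peak={peak} B')\n",
    "```\n",
    "\n",
    "The first entry of `diff` should point at the line that builds `payload`, since that is where most of the new memory was allocated."
   ]
  },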
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f1e232e3-50d6-431f-b27e-5f3b14db0bef",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "%%writefile temp/profiling/memory_tracemalloc.py\n",
    "import tracemalloc\n",
    "import linecache\n",
    "import os\n",
    "\n",
    "from fdm_2d_heat_equation import HeatEquationSolver, run_repeated\n",
    "from memory_profiling import take_snapshot, display_top, display_diff, display_biggest_diff_traceback\n",
    "\n",
    "# create instance\n",
    "solver = HeatEquationSolver.get_default()\n",
    "\n",
    "# start tracing\n",
    "tracemalloc.start()\n",
    "snap_before = take_snapshot()\n",
    "\n",
    "# call method\n",
    "run_repeated(2, solver.calculate)\n",
    "\n",
    "snap_after = take_snapshot()\n",
    "# stop tracing to release memory\n",
    "tracemalloc.stop()\n",
    "\n",
    "# display some output\n",
    "display_top(snap_after)\n",
    "display_diff(snap_before, snap_after)\n",
    "display_biggest_diff_traceback(snap_before, snap_after)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c10e7da4-dae2-4adb-9006-4955e4ca4169",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "export PYTHONPATH=tooling/:examples/\n",
    "python3 -m temp.profiling.memory_tracemalloc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c74281d1-73b1-4e52-b5b9-29fb3c0bb0d6",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "export PYTHONPATH=tooling/:examples/\n",
    "\n",
    "# to trace allocations from interpreter startup, set PYTHONTRACEMALLOC to a value >= 1\n",
    "# the value is the number of stack frames stored per allocation, e.g. 10 frames\n",
    "export PYTHONTRACEMALLOC=10\n",
    "\n",
    "python3 -m temp.profiling.memory_tracemalloc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5b64ea4e-a476-4e6c-bb76-01604039f1aa",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}