###############
Getting Started
###############

.. TODO: Distinguish between jackknife module and jackknife_adga "script"

Welcome to `Getting Started`. This page is intended for users: it gives a brief
introduction and explains how to install and run the code. The last section shows how to
build the documentation you are currently reading.


Introduction
************

Assume you have a random variable :math:`x` that you can measure and a known, non-linear
function :math:`f`, and that the observable you are actually interested in is given by
:math:`y=f(x)`. Since :math:`f` is non-linear, the transformed sample mean
:math:`f(\bar{x})` is a biased estimator for :math:`y`, and linear error propagation is not
well suited to estimating its error. Jackknife resampling makes it possible to reduce the
bias and to obtain a better estimate of the error of :math:`y`. For more details, see the
source code documentation of :doc:`jackknife`.
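
As an illustration, the textbook delete-one jackknife for a scalar observable can be
sketched in a few lines of plain Python. This is a minimal sketch and not the
implementation used in the :doc:`jackknife` module, whose definitions may differ in
detail. The function name ``jackknife_estimate`` and the toy choice
:math:`f(\bar{x})=\bar{x}^2` are assumptions made for this example only.

.. code-block:: python

   import math
   import random


   def jackknife_estimate(samples, f):
       """Delete-one jackknife for the estimator f(mean(samples)).

       Returns the bias-corrected estimate and its standard error.
       """
       n = len(samples)
       if n < 2:
           raise ValueError("at least two samples are required")
       total = float(sum(samples))
       # Naive estimate: f applied to the mean of all samples.
       full_estimate = f(total / n)
       # Leave-one-out estimates: f applied to the mean without the i-th sample.
       leave_one_out = [f((total - x) / (n - 1.0)) for x in samples]
       loo_mean = sum(leave_one_out) / n
       # The first-order bias cancels in this combination.
       bias_corrected = n * full_estimate - (n - 1.0) * loo_mean
       # Spread of the leave-one-out estimates, rescaled by (n - 1) / n.
       variance = (n - 1.0) / n * sum((t - loo_mean) ** 2 for t in leave_one_out)
       return bias_corrected, math.sqrt(variance)


   random.seed(0)
   x = [random.gauss(2.0, 1.0) for _ in range(200)]
   estimate, error = jackknife_estimate(x, lambda mean: mean ** 2)
   print("naive estimate:    ", (sum(x) / len(x)) ** 2)
   print("jackknife estimate:", estimate, "+/-", error)

For :math:`f(\bar{x})=\bar{x}^2` the bias-corrected value equals
:math:`\bar{x}^2 - s^2/n`, where :math:`s^2` is the sample variance, so the bias of the
naive estimate is removed exactly in this toy case.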

In the special case of the present software, the observables of interest :math:`y` are
self-energies and susceptibilities. Several of them can be jackknifed at once.
The non-linear function :math:`f` is the ADGA_ program and the random input samples
:math:`x` are :abbr:`DMFT (dynamical mean field theory)` Green's functions obtained with
w2dynamics_. Since multiple input samples are needed, you have to use worm sampling to
generate the desired number of 2-particle Green's functions. The more worm samples you
use, the better your statistics will get. For more details see the source code
documentation of :doc:`adga`.

.. _ADGA: https://github.com/AbinitioDGA/ADGA
.. _w2dynamics: https://github.com/w2dynamics/w2dynamics


Installation
************

`jackknife` is written in Python 3 but also works with Python 2.7. It requires the
following packages:

   - NumPy (1.16.2)
   - h5py (1.10.2)
   - configparser (3.7.4)

The numbers in parentheses are not minimum required versions but one set of versions that
has been tested and works. In addition,

   - w2dynamics_ and
   - ADGA_

must also be installed (instructions on how to do that are provided in their respective
documentation). ADGA is the program that actually calculates the self-energies and
susceptibilities. w2dynamics is a DMFT code used to calculate 1- and 2-particle Green's
functions, which are the input quantities of ADGA.


Usage
*****

The jackknife code itself is relatively easy to use, but several input and config files
have to be generated with or for w2dynamics and ADGA beforehand.

Input
=====

In order to estimate the standard error, covariance matrix, etc. of ADGA quantities, the
following input and config files are necessary:

   - an ADGA config file
   - a file containing the momentum-resolved values of the tight-binding Hamiltonian
     :math:`H(k)`
   - a file containing the 1-particle Green's function from w2dynamics
   - a file containing *multiple* 2-particle Green's functions from w2dynamics using worm
     sampling
   - a jackknife config file

In the ADGA config file, the correct ``HkFile`` and ``1PFile`` must be specified, whereas
the ``2PFile`` and ``Outfile`` keys are set automatically by the program according to the
values given in the jackknife config file. For further details on the files regarding
ADGA_ and w2dynamics_, see their respective documentation.

Jackknife config file
---------------------

An example of a jackknife config file is provided in the code repository
(:download:`jackknife_adga.ini <../../jackknife_adga.ini>`). The ``[General]`` section has
three mandatory keys that every jackknife config file must contain:

   - ``adga_root_directory``: the :file:`ADGA` directory which contains the
     :file:`scripts` and :file:`bin` folder
   - ``adga_config_file``: the ADGA config file, usually called :file:`dga.conf`
   - ``two_particle_file``: the file containing the 2-particle Green's functions from
     w2dynamics

More control is possible with the following optional keys:

   - ``output_file_prefix``: default value = ``jackknife``; the current date and time will
     be appended to this prefix of the output file
   - ``store_output_samples``: default value = ``no``; if set to ``yes``, the transformed
     bias-corrected output samples will also be stored in the output file (see
     :class:`~jackknife.Jackknife` for a definition of these)

In the ``[Observables]`` section the group name of at least one HDF5 dataset of the ADGA
output file must be given. The keys of this section will be used as the names of the
observables in the :ref:`jackknife output file <Output>`.

.. note::
   See the documentation of `configparser.ConfigParser.getboolean()`_ for the possible
   values of boolean keys in the jackknife config file.

.. _configparser.ConfigParser.getboolean():
   https://docs.python.org/3/library/configparser.html#configparser.ConfigParser.getboolean
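
Put together, a config file with the keys described above could look like the string
embedded in the following sketch, which also shows how such a file can be read with
``configparser``. All values (paths, file names, the observable name ``my_observable`` and
its group name) are hypothetical placeholders, and the helper ``load_jackknife_config`` is
not part of the jackknife code. Refer to the example file in the repository for real
values.

.. code-block:: python

   import configparser

   # Hypothetical example; every value below is a placeholder.
   EXAMPLE_CONFIG = """
   [General]
   adga_root_directory = /path/to/ADGA
   adga_config_file = dga.conf
   two_particle_file = two_particle_worm.hdf5
   output_file_prefix = jackknife
   store_output_samples = no

   [Observables]
   my_observable = hypothetical/group/name
   """

   REQUIRED_KEYS = ("adga_root_directory", "adga_config_file", "two_particle_file")


   def load_jackknife_config(text):
       """Parse a jackknife config given as a string and apply the documented defaults."""
       parser = configparser.ConfigParser()
       parser.read_string(text)
       general = parser["General"]
       # The three obligatory keys must be present.
       missing = [key for key in REQUIRED_KEYS if key not in general]
       if missing:
           raise KeyError("missing obligatory keys: " + ", ".join(missing))
       # At least one observable (name = HDF5 group name) is required.
       observables = dict(parser["Observables"])
       if not observables:
           raise ValueError("at least one observable must be given")
       return {
           "general": {key: general[key] for key in REQUIRED_KEYS},
           "output_file_prefix": general.get("output_file_prefix", fallback="jackknife"),
           "store_output_samples": general.getboolean("store_output_samples", fallback=False),
           "observables": observables,
       }


   config = load_jackknife_config(EXAMPLE_CONFIG)
   print(config["observables"])

Because ``getboolean()`` is used for ``store_output_samples``, values such as ``yes``,
``true``, ``on`` and ``1`` are all interpreted as true.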

Run
===

If all input and config files are available, execute the following command to start the
actual calculation::

   python main.py [jackknife_config_file] [number_of_processes]

If the number of processes is not specified, it is set to ``1``. If the name of the
jackknife config file is not given, the default value ``jackknife.ini`` will be used.
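
The defaults described above can be mimicked with two optional positional arguments. The
following is a minimal sketch based on ``argparse``. It only illustrates the documented
behaviour and is not claimed to be how :file:`main.py` actually parses its command line;
the function name ``parse_arguments`` is hypothetical.

.. code-block:: python

   import argparse


   def parse_arguments(argv):
       """Resolve the two optional positional arguments and their defaults."""
       parser = argparse.ArgumentParser(prog="main.py")
       # Both arguments may be omitted, in which case the default is used.
       parser.add_argument("jackknife_config_file", nargs="?", default="jackknife.ini")
       parser.add_argument("number_of_processes", nargs="?", type=int, default=1)
       return parser.parse_args(argv)


   print(parse_arguments([]))
   print(parse_arguments(["jackknife_adga.ini", "4"]))

Since both arguments are positional, the number of processes can only be given together
with the name of the config file.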

Output
======

The output is written to an HDF5 file named according to the ``output_file_prefix`` set in
the :ref:`jackknife config file <Jackknife config file>` with the date and time of the
calculation appended to it. For each observable given in that same file there is a group
of datasets containing the jackknife estimator as well as some other common statistics.
For a complete list see :meth:`Jackknife.write_results_to_file
<jackknife.Jackknife.write_results_to_file>`. A description of the statistics can be
found at :meth:`Jackknife.do_estimation <jackknife.Jackknife.do_estimation>`.

The attributes of the group ``.config`` contain all key:value pairs of the ``[General]``
section of the jackknife config file except for the ``output_file_prefix``. In addition,
the following useful metadata is included:

   - ``n_samples``: number of samples used for jackknifing
   - ``qmc.meas``: number of QMC measurements per core used in the calculation of the
     2-particle Green's function
   - ``n_processes``: number of processes used during the ADGA calculations
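
A quick way to inspect such an output file is to list the metadata and the datasets of
each observable with ``h5py``. The sketch below assumes that the observable groups sit at
the top level of the file next to ``.config``. The helper ``summarize_output`` is
hypothetical and not part of the jackknife code, and the dataset names are simply read
from the file rather than assumed.

.. code-block:: python

   import h5py


   def summarize_output(path):
       """Collect the metadata and the dataset names of a jackknife output file."""
       summary = {"config": {}, "observables": {}}
       with h5py.File(path, "r") as output:
           # Attributes of the ``.config`` group: config values and metadata.
           summary["config"] = dict(output[".config"].attrs)
           for name, item in output.items():
               if name == ".config" or not isinstance(item, h5py.Group):
                   continue
               # Names of the datasets stored for this observable.
               summary["observables"][name] = sorted(item.keys())
       return summary

Calling ``summarize_output()`` with the path of an output file returns a dictionary whose
``"config"`` entry holds values such as ``n_samples`` and ``n_processes``.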


Building the documentation
**************************

The documentation you are currently reading as well as all the source code documentation
is built with Sphinx_. It uses Python's docstrings_ and reStructuredText_. To build it on
your local machine, install Sphinx (e.g. with conda) and execute the following
commands::

   cd .../jackknife/doc
   make html

The generated HTML files can be found in :file:`.../jackknife/doc/build/html/`. The main
page is called :file:`index.html`.

.. _Sphinx: http://www.sphinx-doc.org/en/master/
.. _docstrings: https://www.python.org/dev/peps/pep-0257/
.. _reStructuredText: https://docutils.sourceforge.io/rst.html