1. Obtaining and Installing Fleur on the RWTH cluster

NOTE: Section 1 of this tutorial is written specifically for the students of the DFT lecture 2021. These students will use a certain compute cluster, and the section covers setting up the system on that cluster. If you don't belong to this group or want to use Fleur on another machine, please follow the instructions in the user guide on installing Fleur. Note also that the tutorial is based on the MaX 5.1 release of Fleur and assumes that the code will be compiled with linking to the HDF5 library.

1.1. Setting up the system

To start the tutorial log in to your RWTH cluster account with something like

ssh -X yourID@login.hpc.itc.rwth-aachen.de

(Do not forget to replace "yourID" by your login ID and to activate your RWTH VPN connection.)

A typical account on the RWTH cluster is set up to be used with a z shell. Make sure that this is the case by checking that invoking

echo $SHELL

is answered by something like /usr/local_rwth/bin/zsh. If you are using a different shell you have to adapt the following steps to your needs.

The RWTH cluster uses a module system to load certain software environments. The currently loaded modules are listed by the command module list. The available modules are listed by module avail. With module load ... and module unload ... you can load or unload certain modules.

To make Fleur compilable and executable on the cluster we have to load a certain software environment consisting of some modules. Other modules have to be unloaded. The simplest way to make sure that all needed modules are loaded on every startup is to add the respective commands to the file .zshrc in your home directory. We have to add the following lines:

module purge
module load DEVELOP
module load intel/19.0
module load intelmpi/2018
module load cmake/3.16.4
module load LIBRARIES
module load hdf5

If you are not familiar with the Linux operating system you additionally have to get used to a text-based editor available on such systems. Popular examples of such editors are vim and nano. You can find numerous guides on how to use them on the internet.

Please note that the hdf5 library is optional. It is an IO library that is used in Fleur to write out more user-friendly output files. Without this library the user obtains a slightly different set of output files and has to be cautious to always keep a consistent set of files.

For the usage of Fleur we have to add some more lines to .zshrc:

The memory of a computer is typically arranged in a so-called heap and a so-called stack. The size of the stack is typically rather limited and often not enough for running Fleur. To overcome this issue we add the line

ulimit -s unlimited

One specialty of the RWTH cluster is that every program that is started from the login node with MPI (message passing interface) parallelization is moved to a different cluster node. These nodes feature a different architecture. Because we typically compile for the architecture on which the compilation takes place, a Fleur executable built on the login node will not run on them. To avoid this problem one has to use mpirun with the option -host localhost. We will first encounter this problem when we run the tests to check whether the Fleur compilation was successful, since these tests are MPI parallelized. The exact call with which the MPI-parallelized tests are started can be overridden by defining the environment variable juDFT_MPI. For this we add the line

export juDFT_MPI='mpirun -np 2 -host localhost'

to .zshrc.

To invoke the commands we added to .zshrc we can either log out (with exit) and in again or execute the command

source .zshrc

in the home directory.
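
After sourcing .zshrc it can be useful to confirm that the settings are active. The following is a minimal sketch and not part of the required setup; it only prints values and assumes nothing beyond the lines added above.

```shell
# Print the settings that the additions to .zshrc are supposed to establish.
echo "stack size limit : $(ulimit -s)"              # should report: unlimited
echo "juDFT_MPI        : ${juDFT_MPI:-<not set>}"   # should report the mpirun call

# The module command only exists on systems with a module environment,
# so the listing is skipped everywhere else.
if command -v module >/dev/null 2>&1; then
    module list
fi
```

If the stack size limit is reported as a number instead of unlimited, the ulimit line in .zshrc has not taken effect in the current shell.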

1.2. Obtaining Fleur

Different versions of the Fleur code are available at different websites. Official releases can be downloaded from www.flapw.de. You can also use Fleur in a virtual box (e.g. if you want to use it on a Windows computer) as part of the Quantum Mobile package. In this package you also find other freely available DFT codes. For this tutorial we start with the latest release version of Fleur (MaX 5.1) but want to have the option to update it to the newest development version. For this we create a directory fleur (or something similar) and clone the Fleur git repository into this directory with:

git clone https://iffgit.fz-juelich.de/fleur/fleur.git

In general the newest aspects of the Fleur development (development version, open issues, ...) are directly available at the Fleur Gitlab server.

In your fleur directory you now find a new subdirectory fleur in which the Git repository is found. The newest release is stored in the release branch of the repository. To change into this branch invoke:

cd fleur
git checkout release
cd ..
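
To confirm that the checkout worked, the active branch of the repository can be queried. This is a minimal sketch, assuming it is run from the parent directory reached by the cd .. above; repo_branch is a hypothetical helper name introduced only for illustration.

```shell
# Print the branch currently checked out in a Git repository.
# repo_branch is a hypothetical helper name used for illustration.
repo_branch() {
    git -C "$1" symbolic-ref --short HEAD
}

# Run from the parent directory that contains the cloned "fleur" repository.
if [ -d fleur/.git ]; then
    repo_branch fleur    # should print "release" after the checkout above
fi
```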

1.3. Installing Fleur

The general documentation on the installation of Fleur can be found on the Installation of FLEUR pages. For our case the installation is described below.

We assume a directory structure in which the source files are found in a directory .../fleur/fleur. In this directory you find a script configure.sh. Invoking this script with the adequate options will generate a build directory in the current working directory in which a fleur version can be compiled. We want to have this directory in the parent directory .../fleur. To see the available options we first invoke the script with the -h option:

./fleur/configure.sh -h

There are switches that can be used to specify the paths for certain libraries and also a switch to automatically download libraries that are not yet available on the system. Fortunately we don't need them for the RWTH cluster. To generate a build directory the configure script has to be invoked with a specified machine. If you want to install Fleur on a notebook with a gfortran compiler of version >6.3 you can use AUTO as machine. With this the script looks for compilers and libraries itself. For the RWTH cluster we already have a predefined machine specification that works: CLAIX. Invoke the script with

./fleur/configure.sh CLAIX

to finally obtain the build directory. You also get an output informing you about which libraries have been found. Some libraries are mandatory, others are only optional but enhance the capabilities of the Fleur code. If everything works, the output advises you to change to the build directory and invoke the make command to compile Fleur.

make can either be invoked in a serial or in a parallelized (with -j) way. We don't want to block the whole login node by building Fleur so we either only invoke make or at most make -j2 to allow a two-fold parallelization.

If everything compiles you should now have two executable files in the build directory: fleur_MPI and inpgen. If no MPI support was detected an executable fleur is generated instead of fleur_MPI. fleur and fleur_MPI are two versions of Fleur with different degrees of parallelization. fleur only uses an OpenMP parallelization. fleur_MPI additionally features an MPI parallelization. Calculations on complex structures have to be distributed over several nodes of a computing cluster, and for this the MPI parallelization is needed. For most calculations in this tutorial we will not need this but we can use fleur_MPI anyway. inpgen is the Fleur input generator. It is used to convert simple text files describing the structure of a unit cell into a Fleur input file consisting of many parameters that are at first set to default values.

The degree of OpenMP parallelization is on most computers controlled by the environment variable OMP_NUM_THREADS. On the RWTH cluster by default this should be set to 1. We check this by typing:

echo $OMP_NUM_THREADS

Is the answer to this command 1?

If you want to speed up your calculations in later tutorial exercises you may slightly increase this number with the command export OMP_NUM_THREADS=4, where you can replace the 4 with any other meaningful number (less than or equal to the number of CPU cores on that machine). Please consider that we will perform most calculations on the login node and don't want to block the whole node for other users of the RWTH cluster.
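
The upper bound on the thread count can be enforced with a few lines of shell. This is a sketch under assumptions: clamp_threads is a hypothetical helper name, and the number of cores is queried with getconf _NPROCESSORS_ONLN, which is available on typical Linux systems.

```shell
# Limit a requested thread count to the range 1..max.
# clamp_threads is a hypothetical helper name used for illustration.
clamp_threads() {
    _req=$1
    _max=$2
    if [ "$_req" -gt "$_max" ]; then
        echo "$_max"
    elif [ "$_req" -lt 1 ]; then
        echo 1
    else
        echo "$_req"
    fi
}

# Number of CPU cores that are online; fall back to 1 if the query fails.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Request 4 threads but never more than the machine provides.
OMP_NUM_THREADS=$(clamp_threads 4 "$cores")
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

On a shared login node the requested value should stay small even if many cores are available.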

We check if the code works as expected by invoking the tests with:

ctest

You should see that all tests except tests 43-50 (all tests with 'Hybrid' in their name) pass. These 8 tests fail because they use a special internal testing script that is incompatible with our setup of the computer cluster. Don't worry, everything works as expected.
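
The steps of this section can be collected in one place. The following is only a sketch: build_fleur is a hypothetical wrapper name, the function is defined but not called, and it assumes that the configure script names the generated directory build.

```shell
# Configure, compile, and test Fleur on the RWTH cluster.
# Call from the parent directory .../fleur that contains the cloned sources.
# build_fleur is a hypothetical wrapper; it is only defined here, not called.
build_fleur() {
    ./fleur/configure.sh CLAIX || return 1   # generates the build directory
    cd build || return 1                     # assumption: the directory is named "build"
    make -j2 || return 1                     # at most two parallel jobs on the login node
    ctest                                    # the 'Hybrid' tests are expected to fail here
}
```

ctest is deliberately the last command and is not chained to further steps, because the failing 'Hybrid' tests make it report an error even for a working build.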

2. A first calculation

For the first calculation we choose a perfect Si crystal in diamond structure as example system. The inpgen input for such a system is:

Si bulk
&lattice latsys='cF' a0=1.8897269 a=5.43 /
2
14 0.125 0.125 0.125
14 -0.125 -0.125 -0.125

The first line in this input is just a comment line. The &lattice line defines the lattice. Here latsys='cF' specifies a face-centered cubic (fcc) lattice and a=5.43 specifies the lattice constant. inpgen and fleur in general assume atomic units, but in this case we provide the lattice constant in Angstrom. Therefore we need a conversion factor from Angstrom to atomic units, which is specified by a0=1.8897269. Note that the line ends with a /. In an inpgen input every line that starts with & ends with /.

The next line contains the number of atoms in the unit cell. It is followed by a list containing for each atom the atomic number (14 for Si) and the relative position in the unit cell.
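
To make the role of a0 concrete: the lattice constant in atomic units is the product of a0 and a. The following short sketch evaluates this product for the values from the input above; the variable names are chosen only for illustration.

```shell
# Lattice constant of the Si example in atomic units (Bohr radii): a0 * a.
a_angstrom=5.43    # lattice constant in Angstrom, as in the inpgen input
a0=1.8897269       # Bohr radii per Angstrom, as in the inpgen input
awk -v a="$a_angstrom" -v f="$a0" 'BEGIN { printf "%.4f\n", a * f }'
```

This prints 10.2612, the lattice constant of the Si crystal in Bohr radii.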

Documentation of the general layout of inpgen inputs is provided on the respective page.

Create a directory for this calculation and in it a text file inpSi.txt (or similar) with the content above. To generate the Fleur input invoke in the new directory:

pathToInpgen/inpgen -f inpSi.txt
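
Creating the directory and the input file can also be done directly from the command line. This is a minimal sketch; the directory name Si_bulk is an arbitrary choice, and the inpgen call is only shown as a comment because pathToInpgen has to be replaced by the actual path.

```shell
# Create a directory for the calculation (the name Si_bulk is arbitrary).
mkdir -p Si_bulk

# Write the inpgen input for bulk Si exactly as listed above.
cat > Si_bulk/inpSi.txt <<'EOF'
Si bulk
&lattice latsys='cF' a0=1.8897269 a=5.43 /
2
14 0.125 0.125 0.125
14 -0.125 -0.125 -0.125
EOF

# Afterwards, generate the Fleur input inside the new directory:
#   cd Si_bulk
#   pathToInpgen/inpgen -f inpSi.txt
```

The quoted 'EOF' delimiter prevents the shell from interpreting the & and / characters or expanding anything inside the input.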

Several files are created. Of these the input files for the Fleur calculation are inp.xml for the full parametrization of the calculation, the sym.xml file with a list of symmetry operations present in the crystal, and the kpts.xml file containing several sets of Bloch vectors that can be used in the calculation. sym.xml and kpts.xml may also be directly included in the inp.xml file. The file struct.xsf is an XCrySDen structure file that can be used to visualize the unit cell. out is the general text output of inpgen (or of fleur, once fleur has been run in the directory). If something went wrong in the generation of the Fleur input it is a good idea to have a look at that file to see whether there is a hint about what went wrong. FleurInputSchema.xsd is not relevant to the user. It is a general specification of the inp.xml file format in terms of an XML Schema definition.

Have a closer look at inp.xml and identify in it the setup of the unit cell provided to inpgen. We will discuss the contents next week. Documentation of this file is also available.

Next we invoke fleur in the directory with the inp.xml, sym.xml, and kpts.xml files:

pathToFleur/fleur

(Note that depending on your compilation choices in this command fleur may have to be replaced by fleur_MPI.)

In practice DFT is implemented as an iterative algorithm that starts with a first guess for the ground-state electron density and ends after several iterations with a self-consistent ground-state density. In Fleur by default up to 15 iterations of the self-consistency loop are performed (can be changed in inp.xml, parameter itmax). The output of the fleur calculation is available in the out file and also in the out.xml file.

You can observe the development of the distance between the input and output densities of each iteration in the terminal output. Alternatively this can also be obtained after the calculation by invoking grep dist out to find the respective entries in the generated out file.

15 iterations should be more than enough to obtain a self-consistent result for the example system we use here. For this setup Fleur will automatically detect at some point that the charge density is converged and stop before performing 15 iterations.

The terminal output should look similar to:

      Welcome to FLEUR        (www.flapw.de)   
      MaX-Release 5.1          (www.max-centre.eu)
 Running on            1  PE
           1  jobs are distributed on            1  unassigned PE
I/O warning : failed to load external entity "relax.xml"
 Now copying inp_dump.xml

 ========== k-point set info ==========
 Selected k-point list: default
 k-point list type: mesh
 8 x 8 x 8
 Number of k points: 60

 --------------------------------------------------------
 Number of MPI-tasks:             1
 Number of PE/node  :             1
 Number of OMP-threads:           1
 Most efficient MPI parallelization for:
           1           2           3           4           5           6          10          12          15          20          30          60
 Additional eigenvalue parallelization possible
 --------------------------------------------------------
 Iteration:           1  Distance:   8.39787633185374     
 Iteration:           2  Distance:   7.93080614111898     
 Iteration:           3  Distance:  0.923156920868815     
 Iteration:           4  Distance:  0.531937897328569     
 Iteration:           5  Distance:  0.218343559958832     
 Iteration:           6  Distance:  2.038280729896922E-002
 Iteration:           7  Distance:  1.345953943889698E-002
 Iteration:           8  Distance:  1.325079718581890E-003
 Iteration:           9  Distance:  1.892085345610514E-003
 Iteration:          10  Distance:  1.791518425044396E-004
 Iteration:          11  Distance:  4.657252080205842E-005
 Iteration:          12  Distance:  9.497188415732569E-006

 *****************************************
 Run finished successfully
 Stop message:
   all done
 *****************************************
Rank:0 used    0.454   0.137 GB/      492216 kB
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   840  100   101  100   739    351   2575 --:--:-- --:--:-- --:--:--  2574
 Usage data send using curl: usage.json
OK
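
The distances in the output above can also be extracted automatically, for example to check the value reached in the last iteration. The following is a sketch under assumptions: run.log is a hypothetical file holding a saved copy of the terminal output, filled here with three lines copied from above, and last_distance is a hypothetical helper name.

```shell
# Sample lines copied from the terminal output above. In practice run.log
# (a hypothetical file name) would hold a saved copy of the terminal output.
cat > run.log <<'EOF'
 Iteration:          10  Distance:  1.791518425044396E-004
 Iteration:          11  Distance:  4.657252080205842E-005
 Iteration:          12  Distance:  9.497188415732569E-006
EOF

# Print the number and the distance of the last iteration in a log file.
# last_distance is a hypothetical helper name used for illustration.
last_distance() {
    awk '/Iteration:/ && /Distance:/ { it = $2; d = $4 }
         END { if (it != "") printf "%d %.3e\n", it, d }' "$1"
}

last_distance run.log
```

For the sample lines this prints 12 9.497e-06, i.e. the run stopped after 12 iterations with a distance below 1e-5.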

For setups that do not converge within the specified number of iterations fleur can be invoked again in the same directory to continue the calculations starting with the last charge density of the previous fleur run. Note however that this will overwrite the out file and rename the old out.xml file by adding a number to it.

We will discuss the contents of the out and out.xml files next week.

The most important quantity that is directly obtainable from a DFT calculation is the total energy of the unit cell's ground state. Although this quantity cannot be measured, total energy differences can be used to calculate many measurable quantities. It is written out in each iteration of the self-consistency cycle but only meaningful for the self-consistent ground-state density. How large is the total energy for the Si example (grep "total energy" out to obtain the respective values for all iterations)?
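
Since only the value from the last iteration is meaningful, it is convenient to print just the last matching line. This is a minimal sketch that makes no assumption about the exact layout of the lines in the out file; last_match is a hypothetical helper name.

```shell
# Print the last line of a file that contains a given string.
# last_match is a hypothetical helper name used for illustration.
last_match() {
    grep -- "$1" "$2" | tail -n 1
}

# Only meaningful in a directory that contains a Fleur out file.
if [ -f out ]; then
    last_match "total energy" out
fi
```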

You can exit the session on the RWTH cluster by typing exit. Note: On many other systems also logout works (not here).