Unsorted Notes

Using Python in Batch Jobs on Lindgren

A few notes on how to run Python on a Cray system.

Python scripts are a useful complement when organising batch jobs. With the availability of new Python distributions that contain comprehensive and optimized package collections, I even let them do some larger tasks that would otherwise require compiled programs. However, the scripts have to run on the compute nodes. This is not a problem at all on common Linux clusters, but a few things have to be considered on Cray systems like our Lindgren. Here is a brief description of how I arrange my jobs using Python.

Getting started

A simple job script could look like the following example:

 1 #!/bin/bash
 2 #PBS -l mppwidth=48  # == 2 nodes
 3
 4 cd $PBS_O_WORKDIR
 5
 6 NNODES=2 # available nodes
 7 PPN=2    # processes per node
 8
 9 # Mandatory settings:
10 # Allow usage of shared libraries
11 export CRAY_ROOTFS=DSL

20 module load anaconda/py27/1.8  # load Python 2.7.x distro, adapt
21                                # PATH and LD_LIBRARY_PATH
22
23 # Start the processes on the nodes
24 source activate_python
25 aprun -n $(( ${NNODES}*${PPN} )) -N ${PPN} python ./myscript.py
26 source deactivate_python

Python uses shared libraries as the basic mechanism for accessing its own standard library as well as extension packages. We have to define the environment variable CRAY_ROOTFS as shown on line 11 in order to make these libraries available on the compute nodes during the job runtime.

After loading the environment module on line 20, Anaconda's Python executable and its accompanying packages will be used. Finally, it is now possible to start Python processes in parallel on the reserved compute nodes. The call to aprun shown on line 25 starts two Python processes on each of the two nodes.
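
As an illustration, a minimal myscript.py could look as follows. This is just a sketch of my own making, not part of the original setup; it merely reports where each process runs, which is also a convenient way to confirm that the Anaconda interpreter and its shared libraries are actually usable on the compute nodes:

# Minimal sketch of myscript.py: each of the four started processes
# reports the interpreter version, the node it runs on and its PID.
import os
import sys

print("Python %s on node %s, PID %d"
      % (sys.version.split()[0], os.uname()[1], os.getpid()))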

Advanced job control

Sometimes jobs need a more sophisticated configuration. Batch jobs frequently need disk space for temporary files. The space available for them on the compute nodes is limited by default, so one has to use a directory in the normal filesystem instead. Furthermore, there are some configuration files that contain my default configuration for the matplotlib package. Their location can be specified by the environment variable XDG_CONFIG_HOME. So, I add the following lines to the script in order to configure the job appropriately:

12
13 # Optional settings:
14 # Define space for temporary files
15 export TMP=/cfs/klemming/scratch/m/michs
16 # Define place for matplotlib configuration files
17 # at "${XDG_CONFIG_HOME}/matplotlib"
18 export XDG_CONFIG_HOME=/cfs/klemming/nobackup/m/michs
19
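
To check that these settings are really picked up by a Python process on a compute node, a small test of my own along the following lines can be run through aprun (it assumes that matplotlib is part of the loaded Anaconda distribution):

# Sketch: report where matplotlib looks for its configuration files;
# with the export above this should point below the directory given
# in XDG_CONFIG_HOME.
import os
import matplotlib

print("XDG_CONFIG_HOME:       %s" % os.environ.get("XDG_CONFIG_HOME"))
print("matplotlib config dir: %s" % matplotlib.get_configdir())
print("matplotlibrc in use:   %s" % matplotlib.matplotlib_fname())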

Now I could run the Python processes again, and they would use these definitions. However, to go one step further in the configuration, I do not start the Python processes immediately; instead, I start another batch script on each compute node, and this script in turn executes the Python scripts after some further preparation. The index of the node, running from 0 to NNODES-1, is given to the script as a parameter (line 27). It can be used to calculate the work sharing between the nodes. The wait statement after the loop (line 31) is important in order to avoid early termination of the batch job. So, I change my job script from line 23 to the end of the file as follows:

23 # Start the processes on the nodes
24 source activate_python
25 CNT=0
26 while [ $CNT -lt $NNODES ] ; do
27     aprun -n 1 -N 1 ./node_script.sh $CNT &
28     let CNT=CNT+1
29 done
30
31 wait
32 source deactivate_python

Here follows node_script.sh, which is executed on each compute node as shown above (explanations follow below the listing).

 1 #!/bin/bash
 2
 3 # Command line args
 4 NODE_NUM=$1
 5
 6 # The filesystem for temporary space can be shared between the nodes.
 7 # Create a directory for temporary files on each node.
 8 TMP=${TMP:-/tmp}
 9 export TMP=${TMP}/$( \
10             python -c "import os; print os.getuid()")_$( \
11             python -c "import os; print os.uname()[1]")
12 rm -rf ${TMP}                # remove previous content
13 if [ "$?" != "0" ] ; then exit 1 ; fi
14 mkdir ${TMP}                 # create temp. directory
15 if [ "$?" != "0" ] ; then exit 1 ; fi
16
17 # Start the work on the node...
18 python ./myscript.py &
19 python ./myscript.py &
20
21 wait  # ...and wait for its completion
22
23 # Clean temporary space
24 rm -rf $TMP

The node index is retrieved from the command line parameter on line 4. Thereafter, the configuration of the disk space for temporary files is completed. I already defined the directory for temporary files by setting the environment variable TMP in the job script. This definition is exported to each shell script running on the nodes, so every compute node would use the same directory on this shared filesystem. I prefer a setup in which each compute node uses a specific directory for its processes. Here, these directories are created as subdirectories of the location specified earlier. The directory names are composed of the UID of the user and the name of the compute node in order to avoid duplicate names. All this is done on lines 6 to 15 of the script above. Of course, the temporary space is cleaned up at the end of the work, i.e. on line 24 of the script.
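
Inside the Python processes, this node-specific directory is picked up automatically by the tempfile module, because tempfile.gettempdir() consults the environment variables TMPDIR, TEMP and finally TMP (so TMPDIR must not point elsewhere on the compute nodes). A small sketch of how myscript.py could create its scratch files there:

# Sketch: temporary files created with the tempfile module end up in the
# node-specific directory that node_script.sh exported as TMP.
import tempfile

print("temporary files go to %s" % tempfile.gettempdir())

scratch = tempfile.NamedTemporaryFile(prefix="myscript_", suffix=".dat")
scratch.write(b"some intermediate data\n")
scratch.flush()
print("created %s" % scratch.name)
scratch.close()   # the file is removed again on close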

After this preparation of the node, the Python processes are started on lines 18 and 19. I do not need to use aprun here because this script has already been started through it. Nevertheless, I of course have to use the wait statement (line 21) in order to avoid early termination of the script.
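
The node index is not forwarded to myscript.py in the listings above, but if one passes it on, for example as "python ./myscript.py $NODE_NUM 0 &" and "python ./myscript.py $NODE_NUM 1 &", the work sharing mentioned earlier could be implemented along these lines (a sketch with made-up task names):

# Sketch: divide a list of tasks between all started processes, based on
# the node index and a per-node process index given on the command line.
import sys

NNODES = 2   # must match the job script
PPN = 2      # processes per node

node_num = int(sys.argv[1])        # node index, 0 .. NNODES-1
proc_num = int(sys.argv[2])        # process index on the node, 0 .. PPN-1
rank = node_num * PPN + proc_num   # global index of this process

tasks = ["task-%02d" % i for i in range(8)]   # placeholder work items
my_tasks = tasks[rank::NNODES * PPN]          # simple round-robin sharing

for task in my_tasks:
    print("process %d works on %s" % (rank, task))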

Read more

Much more information on how to use Lindgren and about its software installations can be found on PDC's website.


2013-12-22 – Category: hpc – Tags: batch-processing lindgren pdc python