University of Illinois shared computing cluster

User Guide

  1. Connecting
  2. Managing your Account
  3. Storage
  4. Data Transfer
  5. Managing Your Environment (Modules)
  6. Programming Environment
  7. Running Jobs
  8. Job Dependencies
  9. Job Arrays
  10. Running MATLAB Batch Jobs
  11. Running Mathematica Batch Jobs
  12. HPC & Other Tutorials
  13. Investor Specific Information

1. Connecting

The Campus Cluster can be accessed via Secure Shell (SSH) to the head nodes using your official University NetID login and password.

Below is a list of hostnames that provide round-robin access to head nodes of the Campus Cluster instances as indicated:

Access Method  Hostname                             Head Nodes
SSH            cc-login.campuscluster.illinois.edu  Taub & Golub
SSH            taub.campuscluster.illinois.edu      Taub
SSH            golub.campuscluster.illinois.edu     Golub
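
For example, to log in to either head node through the round-robin hostname (replace NetID with your University NetID):

    ssh NetID@cc-login.campuscluster.illinois.edu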

Network Details for UIUC Investors:

The Campus Cluster is interconnected with the UIUC networks via the Campus Advanced Research Network Environment (CARNE) and is addressed out of fully-accessible public IP space, located outside of the UIUC campus firewall. This positioning of the Campus Cluster outside the campus firewall enables access to regional and national research networks at high speeds and without restrictions. This does mean, however, that for some special use cases where it is necessary for Campus Cluster nodes to initiate communication with hosts on the UIUC campus network (e.g., you are hosting a special license server behind the firewall), you will need to coordinate with your department IT Pro to ensure that your hosts are in the appropriate UIUC campus firewall group. Outbound communication from UIUC to the Campus Cluster should work without issue, as well as any communications from the Campus Cluster outbound to regional and national research networks.

2. Managing your Account

When your account is first activated, the default shell is set to bash.

The tcsh shell is also available. To change your shell to tcsh, add the following line:
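
    exec -l /bin/tcsh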

to the end of the file named .bash_profile, located in your home ($HOME) directory. To begin using this new shell, you can either log out and then log back in, or execute exec -l /bin/tcsh on your command line.

The Campus Cluster uses the module system to set up the user environment. See the section Managing Your Environment (Modules) for details.

You can reset your NetID password at the CITES Password Manager page.

3. Storage

4. Data Transfer

Campus Cluster data transfers can be initiated via Globus Online's GridFTP data transfer utility, as well as with the SSH-based tools scp (Secure Copy) and sftp (Secure FTP).

GridFTP

The Illinois Campus Cluster Program recommends using Globus Online for Campus Cluster large data transfers. Globus Online manages the data transfer operation for the user: monitoring performance, retrying failures, auto-tuning and recovering from faults automatically where possible, and reporting status. Email is sent when the transfer is complete.

Globus Online implements data transfer between machines through a web interface using the GridFTP protocol. There is a predefined GridFTP endpoint for the Illinois Campus Cluster Program to allow data movement between the Campus Cluster and other resources registered with Globus Online. To transfer data between the Campus Cluster and a non-registered resource, Globus Online provides a software package called Globus Connect that allows for the creation of a personal GridFTP endpoint for virtually any local (non-Campus Cluster) resource.

Steps to use Globus Online (GO) for Campus Cluster data transfers, covering both data transfer to an existing GO endpoint and creation of a new GO endpoint for data transfers:
  • Type in or select one of your target endpoints from the 1st pull down selection box.
  • Activate the endpoint.
  • Download and install the Globus Connect software for your OS.
    Note: The Globus Connect software should be installed on the machine that you want to setup as an endpoint.
  • Type in your endpoint name that you created during the Globus Connect installation in the 1st endpoint selection box.
  • Type in or select "illinois#iccp" for your other endpoint in the 2nd pull down selection box.
  • Activate the Illinois Campus Cluster endpoint by authenticating using your official University NetID and NetID password.
  • Highlight the data to be transferred and click the appropriate transfer arrow between the two endpoint selection boxes.

SSH

For initiating data transfers from the Campus Cluster, the SSH-based tools sftp (Secure FTP) or scp (Secure Copy) can be used.
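
As a sketch, the following command, run from a Campus Cluster head node, copies a file to a remote system with scp (the remote host, user name, and paths are placeholders):

    scp myfile.dat user@remote.host.edu:/path/to/destination/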

A variety of SSH-based clients are available for initiating transfers from your local system. There are two types of SSH clients: those that support both remote login and data transfer, and those that support data transfer only.

  • MobaXterm: an enhanced terminal with an X server and a set of Unix commands (GNU/Cygwin) packaged in a single application. Remote login: Yes. Data transfer: Yes. Installs on: Windows.
  • SSH Secure Shell: allows you to securely log in to remote host computers, execute commands safely on a remote computer, and provides secure, encrypted and authenticated communications between two hosts in an untrusted network. Remote login: Yes. Data transfer: Yes. Installs on: Windows.
  • Tunnelier: a flexible SSH client which includes terminal emulation, graphical as well as command-line SFTP support, an FTP-to-SFTP bridge, and additional tunneling features including dynamic port forwarding through an integrated proxy. Remote login: Yes. Data transfer: Yes. Installs on: Windows.
  • PuTTY: an open source terminal emulator application which can act as a client for the SSH, Telnet, rlogin, and raw TCP computing protocols and as a serial console client. Remote login: Yes. Data transfer: Yes*. Installs on: Windows, Linux, Mac OS.
  • FileZilla: a fast and reliable cross-platform FTP, FTPS and SFTP client with lots of useful features and an intuitive graphical user interface. Remote login: No. Data transfer: Yes. Installs on: Windows, Linux, Mac OS.
  • WinSCP: a free, open source SFTP, SCP, FTPS and FTP client for Windows. Its main function is file transfer between a local and a remote computer; beyond this, WinSCP offers scripting and basic file manager functionality. Remote login: No. Data transfer: Yes. Installs on: Windows.
  • FireFTP: a free, secure, cross-platform FTP/SFTP client for Mozilla Firefox which provides easy and intuitive access to FTP/SFTP servers. Remote login: No. Data transfer: Yes. Installs on: Firefox (Add-On).
*PuTTY's scp and sftp data transfer functionality is implemented via Command Line Interface (CLI) by default.

5. Managing Your Environment (Modules)

The module command is a user interface to the Modules package. The Modules package provides for the dynamic modification of the user's environment via modulefiles (a modulefile contains the information needed to configure the shell for an application). Modules are independent of the user's shell, so both tcsh and bash users can use the same commands to change the environment.

Useful Module Commands:

Command                              Description
module avail                         lists all available modules
module list                          lists currently loaded modules
module help modulefile               help on module modulefile
module display modulefile            display information about modulefile
module load modulefile               load modulefile into current shell environment
module unload modulefile             remove modulefile from current shell environment
module swap modulefile1 modulefile2  unload modulefile1 and load modulefile2

To include particular software in the environment for all new shells, edit your shell configuration file ($HOME/.bashrc for bash users and $HOME/.cshrc for tcsh users) by adding the module commands to load the software that you want to be a part of your environment. After saving your changes, you can source your shell configuration file or log out and then log back in for the changes to take effect.
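
For example, a bash user who wants the Intel 14.0 compiler and the matching MVAPICH2 build (module names taken from the Programming Environment section below) available in every new shell might append the following to $HOME/.bashrc:

    # Load the compiler first, then the MPI library built against it
    module load intel/14.0
    module load mvapich2/2.0b-intel-14.0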

Note: Order is important. With each module load, the changes are prepended to your current environment paths.

For additional information on Modules, see the module and modulefile man pages or visit the Modules SourceForge page.

6. Programming Environment

The Intel compilers are available on the Campus Cluster. Load the Intel compiler (version 11.1) module with the following command:

    module load intel 

Load the Intel compilers version 14.0 with:

    module load intel/14.0 

[Version 13.1 is also available via: module load intel/13.1]

The GNU compilers (GCC) version 4.4.5 are in the default user environment. Version 4.7.1 is also available - load this version with the command:

    module load gcc 

Compiler Commands

Serial

To build (compile and link) a serial program in Fortran, C, and C++ enter:

GCC:

  gfortran myprog.f
  gcc myprog.c
  g++ myprog.cc

Intel Compiler:

  ifort myprog.f
  icc myprog.c
  icpc myprog.cc

MPI

There are 2 MPI implementations available: MVAPICH2 and OpenMPI. To see all the available MPI implementations and compiler choices, use the following command:

    module avail

MPI Implementation and modulefiles (MPI/Compiler):

MVAPICH2 (Project Home Page)
  mvapich2/2.0b-intel-14.0
  mvapich2/1.6-intel
  mvapich2/1.6-gcc

Open MPI (Project Home Page)
  openmpi/1.6.5-intel-14.0
  openmpi/1.6.5-gcc-4.7.1
  openmpi/1.4-intel
  openmpi/1.4-gcc

Build commands (both implementations):

  Fortran 77: mpif77 myprog.f
  Fortran 90: mpif90 myprog.f90
           C: mpicc  myprog.c
         C++: mpicxx myprog.cc

For example, use the following command to load MVAPICH2 v2.0b built with the Intel 14.0 compiler:

    module load mvapich2/2.0b-intel-14.0

OpenMP

To build an OpenMP program, use the -openmp / -fopenmp option:

GCC:

  gfortran -fopenmp myprog.f
  gcc -fopenmp myprog.c
  g++ -fopenmp myprog.cc

Intel Compiler:

  ifort -openmp myprog.f
  icc -openmp myprog.c
  icpc -openmp myprog.cc

Hybrid MPI/OpenMP

To build an MPI/OpenMP hybrid program, use the -openmp / -fopenmp option with the MPI compiling commands:

GCC:

  mpif77 -fopenmp myprog.f
  mpif90 -fopenmp myprog.f90
  mpicc -fopenmp myprog.c
  mpicxx -fopenmp myprog.cc

Intel Compiler:

  mpif77 -openmp myprog.f
  mpif90 -openmp myprog.f90
  mpicc -openmp myprog.c
  mpicxx -openmp myprog.cc

CUDA

NVIDIA Tesla M2090 GPUs are available as a purchase option in the Golub instance of the campus cluster. CUDA is a parallel computing platform and programming model from NVIDIA for use on their GPUs. The Tesla M2090 GPUs support CUDA compute capability 2.0.

Load the CUDA Toolkit into your environment using the following module:

    module load cuda/5.5
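
With the toolkit loaded, a CUDA source file can be compiled with nvcc. A minimal sketch (the file name is a placeholder; -arch=sm_20 targets the compute capability 2.0 of the Tesla M2090):

    nvcc -arch=sm_20 mykernel.cu -o mykernel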

Libraries

The Intel Math Kernel Library (MKL) contains the complete set of functions from the basic linear algebra subprograms (BLAS), the extended BLAS (sparse), and the complete set of LAPACK routines. In addition, there is a set of fast Fourier transforms (FFT) in single- and double-precision, real and complex data types with both Fortran and C interfaces. The library also includes the cblas interfaces, which allow the C programmer to access all the functionality of the BLAS without considering C-Fortran issues. ScaLAPACK, BLACS and the PARDISO solver are also provided by Intel MKL. MKL provides FFTW interfaces to enable applications using FFTW to gain performance with Intel MKL and without changing the program source code. Both FFTW2 and FFTW3 interfaces are provided as source code wrappers to Intel MKL functions.

Load the Intel compiler module to access MKL.

Use the following -mkl flag options when linking with MKL using the Intel compilers:

Sequential libraries: -mkl=sequential
Threaded libraries:   -mkl=parallel
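
For example, a hedged sketch of building a C program against the threaded MKL with the Intel compiler (the file name is a placeholder):

    icc myprog.c -mkl=parallel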

To use MKL with GCC, consult the Intel MKL link advisor for the link flags to include.

OpenBLAS, an optimized BLAS library based on GotoBLAS2, is also available. Load the library module (version 0.2.8, built with GCC 4.7.1) with the following command:

    module load openblas 
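
With the module loaded, linking typically uses the -lopenblas flag; a sketch, assuming the module sets the necessary compiler search paths (the file name is a placeholder):

    gcc myprog.c -lopenblas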

7. Running Jobs

User access to the compute nodes for running jobs is only available via a batch job. The Campus Cluster uses the Torque Resource Manager with the Moab Workload Manager for running batch jobs. Torque is based on OpenPBS, so the commands are the same as PBS commands. See the qsub section under Batch Commands for details on batch job submission.

Please be aware that the interactive nodes are a shared resource for all users of the system, and their use should be limited to editing, compiling and building your programs, and short, non-intensive runs. The administrators may terminate, without warning, user processes on the interactive nodes that impact the system. An interactive batch job provides a way to get interactive access to a compute node via a batch job; see the qsub -I section for information on how to run an interactive job on the compute nodes. Also, a test queue with a very short time limit provides quick turnaround for debugging purposes.

To ensure the health of the batch system and scheduler, users should refrain from having more than 1,000 of their batch jobs in the queues at any one time.

See the Running MATLAB / Mathematica Batch Jobs sections for information on running MATLAB and Mathematica on the campus cluster.

Running Programs

On successful building (compilation and linking) of your program, an executable is created that is used to run the program. The descriptions below explain how to run different types of programs.

Serial
  To run serial code, specify the name of the executable.
  Example: ./a.out

MPI
  MPI programs are run with the mpiexec command followed by the name of the executable.
  Note: The total number of MPI processes is the {number of nodes} x {cores/node} set in the batch job resource specification.
  Example: mpiexec ./a.out

OpenMP
  The OMP_NUM_THREADS environment variable can be set to specify the number of threads used by OpenMP programs. If this variable is not set, the number of threads used defaults to one under the Intel compiler; under GCC, the default behavior is to use one thread for each core available on the node. To run OpenMP programs, specify the name of the executable.
  Example (in bash):
    export OMP_NUM_THREADS=12
    ./a.out
  Example (in tcsh):
    setenv OMP_NUM_THREADS 12
    ./a.out

MPI/OpenMP
  As with OpenMP programs, the OMP_NUM_THREADS environment variable can be set to specify the number of threads used by the OpenMP portion of a mixed MPI/OpenMP program. The same default behavior applies with respect to the number of threads used. Use the mpiexec command followed by the name of the executable to run mixed MPI/OpenMP programs.
  Note: The number of MPI processes per node is set in the batch job resource specification for number of cores/node.
  Example (in bash):
    export OMP_NUM_THREADS=4
    mpiexec ./a.out
  Example (in tcsh):
    setenv OMP_NUM_THREADS 4
    mpiexec ./a.out

Primary Queues

Each investor group has unrestricted access to a dedicated primary queue with concurrent access to the number and type of nodes in which they invested.

Secondary Queue

One of the advantages of the Campus Cluster Program is the ability to share resources. A shared secondary queue will allow users access to any idle nodes in the cluster. Users must have access to a primary queue to be eligible to use the secondary queue.

While each investor has full access to the number and type of nodes in which they invested, those resources not fully utilized by each investor will become eligible to run secondary queue jobs. If there are resources eligible to run secondary queue jobs but there are no jobs to be run from the secondary queue, jobs in the primary queues that fit within the constraints of the secondary queue may be run on any otherwise appropriate idle nodes. The secondary queue uses fairshare scheduling.

The current limits in the secondary queue are below:

Queue      Max Walltime  Max # Nodes
secondary  4 hours       208

Notes:

  1. Jobs are routed to the secondary queue when a queue is not specified; i.e., the secondary queue is the default queue on the campus cluster.
  2. By default, jobs in the secondary queue will run on either Taub or Golub nodes, but they will not span both Taub and Golub. To restrict a job to Taub or Golub nodes, add the resource specification taub or golub, respectively, to the #PBS -l line in your job batch script, as sketched below. See the qsub section for syntax details.
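
A hedged sketch of such a resource line, restricting a job to Taub nodes (the node count, cores per node, and walltime are illustrative values; confirm the exact syntax against the qsub section):

    #PBS -l nodes=2:ppn=12:taub,walltime=04:00:00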

Test Queue

A test queue is available to provide quick turnaround time for very short jobs. Jobs in the test queue currently run only on Taub nodes.

The current limits in the test queue are:

Queue  Max Walltime  Max # Nodes
test   5 minutes     2

Batch Commands

Below are brief descriptions of the primary batch commands. For more detailed information, refer to the individual man pages.
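
As a hedged illustration of basic batch usage, the sketch below shows a small job script submitted with qsub and checked with qstat; the resource values, job name, and executable are placeholders:

    #!/bin/bash
    #PBS -l walltime=00:30:00
    #PBS -l nodes=1:ppn=12
    #PBS -N myjob
    #PBS -j oe
    # Run from the directory the job was submitted from
    cd $PBS_O_WORKDIR
    mpiexec ./a.out

    [taubh1 ~]$ qsub myjob.pbs
    [taubh1 ~]$ qstat -u $USER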

8. Job Dependencies

PBS job dependencies allow users to control the order in which their queued jobs run. Job dependencies are set by using the -W option, with the syntax -W depend=<dependency type>:<JobID>. PBS places dependent jobs in a Hold state until they become eligible to run.

The following are examples of how to specify job dependencies using the afterany dependency type, which tells PBS that the dependent job should become eligible to start only after the specified job has completed.

On the command line:


    [taubh1 ~]$ qsub -W depend=afterany:<JobID> jobscript.pbs

In a job script:


    #!/bin/bash
    #PBS -l walltime=00:30:00
    #PBS -l nodes=1:ppn=12
    #PBS -N myjob
    #PBS -j oe
    #PBS -W depend=afterany:<JobID>

In a shell script that submits batch jobs:


    #!/bin/bash
    JOB_01=`qsub jobscript1.pbs`
    JOB_02=`qsub -W depend=afterany:$JOB_01 jobscript2.pbs`
    JOB_03=`qsub -W depend=afterany:$JOB_02 jobscript3.pbs`
    ...

Note: Generally, the recommended dependency types are before, beforeany, after, and afterany. While additional dependency types exist, the types that act on batch job error codes may not behave as expected because of the difference between a batch job error and an application error. See the dependency section of the qsub manual page for additional information (man qsub).

9. Job Arrays

If a need arises to submit the same job to the batch system multiple times, users can submit a job array instead of issuing one qsub command for each individual job. Job arrays allow users to submit multiple jobs with a single job script using the -t option to qsub. An optional slot limit can be specified to limit the number of jobs that can run concurrently in the job array. See the qsub manual page for details (man qsub). The file names for the input, output, etc. can be varied for each job using the job array index value defined by the PBS environment variable PBS_ARRAYID.
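
A minimal sketch of a job array script that runs the same executable on ten numbered input files (the array range, resource values, and file-naming scheme are illustrative):

    #!/bin/bash
    #PBS -l walltime=00:30:00
    #PBS -l nodes=1:ppn=12
    #PBS -N myarrayjob
    #PBS -t 1-10
    cd $PBS_O_WORKDIR
    # PBS_ARRAYID holds this task's index (1..10 here)
    ./a.out input_${PBS_ARRAYID}.dat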

A sample batch script that makes use of job arrays is available in /projects/consult/pbs/jobarray.pbs.

Notes:

10. Running MATLAB Batch Jobs

See the Using MATLAB on the Campus Cluster page for information on running MATLAB batch jobs.

11. Running Mathematica Batch Jobs

Standard batch job

A sample batch script that runs a Mathematica script is available in /projects/consult/pbs/mathematica.pbs. You can copy and modify this script for your own use. Submit the job with:

    [taubh1 ~]$ qsub mathematica.pbs

In an interactive batch job
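
A hedged sketch of this approach, assuming an interactive session is started with qsub -I (see the Running Jobs section), that a mathematica module must be loaded (an assumption), and that the command-line Mathematica kernel is invoked as math; resource values and the script name are illustrative:

    [taubh1 ~]$ qsub -I -l walltime=00:30:00,nodes=1:ppn=12
    # ... once the interactive session starts on a compute node:
    module load mathematica    # assumed module name
    math < myscript.m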

12. HPC & Other Tutorials

The NSF-funded XSEDE program offers online training on various HPC topics; see XSEDE Online Training for links to the available courses.

Introduction to Linux, offered by the Linux Foundation (classes start in the third quarter of 2014).

13. Investor Specific Information

See here for the technical representative of each investor group and links to investor web sites (if available).