Introduction to Parallelism

Overview

Teaching: 25 min
Exercises: 0 min
Questions
  • What is parallelism?

  • What types of parallel jobs are there?

  • How do the parallel job types impact scheduling?

Objectives
  • Have a good sense of what parallelism can do for your program

  • Know the types of parallelism

What is Parallelism and why might I want it?

We always want our programs to run faster. One way to do this is to buy faster computers. Another way (which we’ll discuss later) is to optimize the compilation of our programs. Yet another way is to revise the structure of our programs to run faster.

Parallelism is simply the art of performing computational work on multiple CPUs at the same time to boost performance.

Parallelism is good.

Parallel computing choices

There are a number of different ways to use parallelism to perform computational work. The two most common approaches are shared memory parallelism (e.g. OpenMP) and distributed memory parallelism (e.g. MPI).

In brief, OpenMP runs multiple threads that share a single memory space on one machine, while MPI runs multiple processes, each with its own private memory, which can be spread across several machines.

This table summarizes some situations that might drive your choice of parallel programming library:

              1 Node        N Nodes
  1 CPU       Serial job    MPI
  N CPUs      OpenMP        MPI, or Hybrid/advanced

Impact of parallel job type choice on scheduling

Running many serial jobs

In terms of scheduling, this is the best situation to be in, because you can submit each job separately. Each job asks for only a single CPU, so the scheduler can start each one independently as soon as a CPU becomes free; the jobs will generally have different start times.

In your Slurm submission script, you will have a line that looks like:

#SBATCH        --ntasks=1
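A complete serial submission script might look like the minimal sketch below (the time limit and program name are placeholders, and the exact options vary by cluster):

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# Run one copy of a (hypothetical) serial program.
./my_serial_program

Because each job is independent, you can submit this script many times (for example, once per input file), and the scheduler will start each copy whenever a single CPU becomes free.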

Running a distributed memory job (e.g. MPI)

Because each process in this kind of job has its own private memory space, these jobs can run across multiple machines and can scale up in size more easily than other types of parallelism.

In terms of scheduling, all of the processes have to start at the same time, so the job needs a block of many processors, potentially spread across different machines, to be free at once. This makes it harder to schedule than many serial jobs.
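In your Slurm submission script, you would instead ask for multiple tasks. A minimal sketch, assuming 16 MPI processes (the task count and program name are illustrative):

#SBATCH --ntasks=16

# srun starts one copy of the program per requested task,
# and each copy becomes one MPI process.
srun ./my_mpi_program

Slurm is free to place these 16 tasks on one machine or across several, which is what makes distributed memory jobs flexible but harder to fit into the schedule.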

Running a shared memory job (e.g. OpenMP)

On a single machine, all of the threads start at the same time, so the job needs a block of many processors on the same machine to be free at once.
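In a Slurm submission script, a shared memory job is usually requested as one task with several CPUs. A minimal sketch, assuming 8 threads (the CPU count and program name are placeholders):

#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8

# Tell OpenMP to use as many threads as CPUs we were given.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_openmp_program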

Running a mixed, hybrid job (e.g. MPI + OpenMP)

Blocks of processors on multiple machines need to be available at the same time for this job to run.
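A hybrid job combines both kinds of request. A minimal sketch, assuming 4 MPI processes with 8 OpenMP threads each (all counts and the program name are illustrative):

#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8

# Each of the 4 MPI processes runs 8 OpenMP threads.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./my_hybrid_program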

Key Points

  • Parallelism can speed up the execution of your program

  • The structure of your code, and scheduling considerations, will influence how you parallelize your computational work