% Parallel and Functional Programming % Robin Bate Boerop % Fredericton Developer User Group, 22 February 2012
The agenda includes the standard talk (search for "The Free Lunch Is Over") to motivate learning to program with concurrency and parallelism.
Apologies if you have heard it before.
Estimated talk time: 1h15m.
- About this talk
- About the speaker
- Part I
- Moore's Law
- Motivation: why understand parallelism and concurrency?
- Vocabulary: understand writings on the topic
- Part II - Problems with traditional concurrent/parallel programming
- Part III
- Characteristics of functional programming
- Approaches to concurrency and parallelism in functional programming
This highly technical talk will explain the problems encountered by imperative programming languages (C/C++, Java), and will show how functional programming languages and frameworks (Erlang, Node, Haskell) have created elegant solutions. The task of writing a high-performance IRC server will serve as an example.
I am ditching the running example mentioned in the abstract.
Goal 1: To explain why understanding parallel and concurrent programming is more important than ever.
Goal 2: To point you to "ahead-of-the-curve" software development tools and techniques for including parallelism and concurrency in the software that you make.
- Name: Robin Bate Boerop
- Class: Software Developer
- Alignment: Chaotic Good
(D&D joke appropriate for pre-WoW crowd.)
- Is really about the number of components on ICs.
- Component counts double every year (Moore's original observation).
- The popular "performance doubles every 18 months" version is history re-written by Intel.
- How?
- This is the motivation for learning to program parallel and concurrent programs.
Concurrency: part of your problem; the program must manage many things going on at once.
Parallelism: part of your solution; seeking performance with repeated (or simply more) hardware.
Data parallelism:
- Operate simultaneously, often identically, on bulk data
- Single flow of control
- Synchronization is implicit
- Provides massive parallelism
- Easy to obtain correctness
- a.k.a. SIMD
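A small Haskell illustration of this bulk-data, single-control-flow style, using the Repa library (my choice for this sketch, not code from the talk; the array size and the doubling operation are arbitrary, and the program needs GHC's threaded runtime, i.e. build with ghc -threaded and run with +RTS -N, to spread work across cores):

import Data.Array.Repa as R

-- One logical operation over bulk data; computeP evaluates it in
-- parallel across cores and the synchronization is implicit.
main :: IO ()
main = do
  let xs = fromListUnboxed (Z :. (1000000 :: Int)) [1 .. 1000000 :: Double]
  ys <- computeP (R.map (* 2) xs) :: IO (Array U DIM1 Double)
  print (ys ! (Z :. (0 :: Int)))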
Task parallelism:
- Explicit threads
- Synchronization by programmer via locks, messages, or STM
- Provides modest parallelism
- Very difficult to program correctly
- a.k.a. MIMD
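And, for contrast, a minimal sketch of the explicit-thread style in Haskell (not from the talk; the two workers, the MVar used as a lock, and the loop counts are all made up for illustration). The programmer forks the threads and is responsible for every synchronization point:

import Control.Concurrent
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)

-- Two explicit threads share a counter; access is synchronized by hand
-- through modifyMVar_, and main waits for both workers on 'done'.
main :: IO ()
main = do
  counter <- newMVar (0 :: Int)
  done    <- newEmptyMVar
  forM_ [1 .. 2 :: Int] $ \_ -> forkIO $ do
    replicateM_ 10000 $ modifyMVar_ counter (return . (+ 1))
    putMVar done ()
  replicateM_ 2 (takeMVar done)
  readMVar counter >>= print        -- 20000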
Semi-implicit parallelism:
- Evaluation of pure functions in parallel
- Synchronization is implicit
- Provides modest parallelism
- Easy to program (FP langs.)
f list =
  sum (map blah list)
  --   ^^^
becomes
f list =
  sum (parMap rseq blah list)
  --   ^^^^^^^^^^^
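Spelled out as a complete program, the change really is that small. This is a sketch rather than code from the talk: blah is stood in for by an arbitrary expensive pure function, the input list is made up, and parMap/rseq come from the parallel package's Control.Parallel.Strategies. Compile with ghc -threaded -O2 and run with +RTS -N.

module Main where

import Control.Parallel.Strategies (parMap, rseq)

-- Stand-in for "blah": some expensive pure work.
expensive :: Int -> Int
expensive n = sum [1 .. n]

-- Sequential version: f list = sum (map expensive list)
f :: [Int] -> Int
f list = sum (parMap rseq expensive list)

main :: IO ()
main = print (f (replicate 8 50000000))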
- Semi-implicit parallelism not available
- For data parallelism, use libraries
- From Intel and other vendors
- Orc
- Use OpenGL
- OpenCL (for GPUs)
- Most parallelism comes from task (explicit) parallelism (concurrent programming running on multi-core)
- Many threads, most performing I/O
- Shared address space
- E.g. GUIs, web servers, BitTorrent clients
- Non-deterministic by design
- Need (lightweight) threads, way to coordinate
- pthreads, Java locks
- Race condition: forgot to take the right lock
- Deadlock: obtained locks in the wrong order
- ... or never fixed an order at all (see the sketch after this list)
- Lost wakeups: forgot to notify a condition variable
- Error recovery mindfuck: must restore a consistent lock state on every error path
- Future-proofing: must ensure subsequent maintainers of your code know what to do
- Just plain hard to get right
- Must document re-entrancy characteristics of API
- Must understand threading just to use an API, if only to know what you do not understand
- Thread pools
- Thread local storage
- Interaction between GC and multithreading
- Message pumps, event handlers, etc.
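To make the lock-ordering point concrete, here is a deliberately broken sketch in Haskell (my example, with MVars standing in for locks; the worker names and the two-second delay are arbitrary). Worker 1 takes lockA then lockB, worker 2 takes lockB then lockA, so sooner or later each holds one lock and waits forever for the other, and the output stops.

import Control.Concurrent
import Control.Monad (forM_)

-- Classic deadlock: two locks acquired in opposite orders.
main :: IO ()
main = do
  lockA <- newMVar ()
  lockB <- newMVar ()
  let worker name first second =
        forM_ [(1 :: Int) ..] $ \i -> do
          takeMVar first
          takeMVar second
          putStrLn (name ++ " finished iteration " ++ show i)
          putMVar second ()
          putMVar first ()
  _ <- forkIO (worker "worker 1" lockA lockB)
  _ <- forkIO (worker "worker 2" lockB lockA)
  threadDelay 2000000   -- two seconds is more than enough time to wedge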
- Note that large-scale applications of parallel programming are declarative and value-oriented:
- SQL
- LINQ
- Map/Reduce frameworks
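The same value-oriented style in miniature, in Haskell rather than SQL or a Map/Reduce framework (a made-up word-count example, not from the talk): the code says only what to compute, so a framework would be free to partition the map phase and combine the partial counts.

import qualified Data.Map as M

-- "Map" phase: tag each word with a count of 1.
-- "Reduce" phase: fromListWith (+) sums the counts per word.
wordCount :: String -> M.Map String Int
wordCount = M.fromListWith (+) . map (\w -> (w, 1)) . words

main :: IO ()
main = print (wordCount "to be or not to be")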
- Which is better?
- Plan A: start with a language whose computational model is sequential, and then add parallelism.
- Plan B: start with a language whose computational model is parallel by default.
- Core insight: disallow multithreading!
- It is a single event loop.
- Not as stupid as it might seem.
- Makes for a specialist language.
- Great for JavaScript developers who need to write servers.
- Processes (really more like threads) share nothing.
- Message passing by copy (serialization).
- Processes have independent GC, independent failure!
- Communicate over channels.
- Highly reliable systems, phone switches.
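Erlang source is out of scope here, but the flavour of its message passing can be sketched in Haskell using Control.Concurrent.Chan. This is only an analogy: a Chan lives in the shared heap, so it gives none of Erlang's copy-on-send isolation, per-process GC, or independent failure. The tiny request/reply protocol is invented for the example.

import Control.Concurrent
import Control.Concurrent.Chan
import Control.Monad (forM_, replicateM_)

-- A worker that reads requests from one channel and replies on another.
worker :: Chan Int -> Chan Int -> IO ()
worker requests replies = replicateM_ 3 $ do
  n <- readChan requests
  writeChan replies (n * n)

main :: IO ()
main = do
  requests <- newChan
  replies  <- newChan
  _ <- forkIO (worker requests replies)
  forM_ [1, 2, 3] $ \n -> do
    writeChan requests n
    readChan replies >>= print    -- prints 1, 4, 9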
- Strongly typed purely functional programming language.
- Extremely popular among PL enthusiasts.
- GHC is production-grade.
- Is the target for PL researchers.
- Great for writing EDSLs.
Advantages of Functional Programming for writing correct programs which include concurrency or parallelism.
- Side effects are more tightly controlled than in imperative programs.
- DSLs for GPUs or data parallel instructions
- Software transactional memory
- Semi-implicit parallelism
- This is the explicit-threads story: synchronization is up to the programmer
- Haskell has extremely lightweight threads
- Millions of threads per program not a problem
- Overhead per thread: 100 bytes approx.
- I/O via Unix epoll: okay to have millions of outstanding I/O requests
- Threads share an address space
- Synchronization via STM, or more explicitly via MVars
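Before the fuller STM example below, a throwaway sketch of how cheap these threads are (my example; the 100,000 figure and the fake 1 ms "I/O wait" are arbitrary, and you would build with ghc -threaded and run with +RTS -N): fork a hundred thousand threads, have each wait and then report back over a channel, and wait for them all.

import Control.Concurrent
import Control.Concurrent.Chan
import Control.Monad (forM_, replicateM_)

main :: IO ()
main = do
  let n = 100000           -- each forkIO thread costs on the order of 100 bytes
  done <- newChan
  forM_ [1 .. n] $ \_ -> forkIO $ do
    threadDelay 1000       -- stand-in for a blocking I/O call
    writeChan done ()
  replicateM_ n (readChan done)
  putStrLn (show n ++ " threads finished")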
module Main where

import Control.Monad
import Control.Concurrent
import Control.Concurrent.STM

-- Small helpers over a shared transactional variable.
atomRead = atomically . readTVar
dispVar x = atomRead x >>= print
appV fn x = atomically $ readTVar x >>= writeTVar x . fn
milliSleep = threadDelay . (*) 1000

main = do
  shared <- atomically $ newTVar 0
  before <- atomRead shared
  putStrLn $ "Before: " ++ show before
  -- Three threads act on the same TVar: one prints it, one adds 2,
  -- one subtracts 1. Each read/modify/write is a single atomic transaction.
  forkIO $ 25 `timesDo` (dispVar shared >> milliSleep 20)
  forkIO $ 10 `timesDo` (appV ((+) 2) shared >> milliSleep 50)
  forkIO $ 20 `timesDo` (appV pred shared >> milliSleep 25)
  milliSleep 800
  after <- atomRead shared
  putStrLn $ "After: " ++ show after
 where timesDo = replicateM_
Before: 0
0
1
0
1
0
1
1
0
1
0
1
1
0
1
0
1
1
0
1
0
1
1
0
1
0
After: 0
- No silver bullet. Concurrency and parallelism will soon be mandatory. World is complicated. Get used to it!
- Functional programming (Haskell in particular) already way better than traditional programming languages for concurrency (explicit/task parallelism).
- For large scale data parallel programming, we are already using declarative programming (map/reduce, SQL, etc.). More of that is coming.
- Haskell hosts multiple paradigms. Your traditional language does not.
- Beautiful Code book (Google that)
- http://haskell.org
- http://book.realworldhaskell.org/