```elixir
Mix.install([
  {:jason, "~> 1.4"},
  {:kino, "~> 0.9", override: true},
  {:youtube, github: "brooklinjazz/youtube"},
  {:hidden_cell, github: "brooklinjazz/hidden_cell"},
  {:benchee, "~> 1.1"}
])
```
Upon completing this lesson, a student should be able to answer the following questions.

- How can we benchmark our programs using `:timer` and Benchee?
- Why is it important to benchmark our software, rather than relying on theoretical performance with Big O Notation?
Out of the box, Erlang provides a `:timer` library which contains a `tc/1` function to measure the time it takes to run a function. It returns a `{time, result}` tuple.
```mermaid
flowchart LR
  Function --> timer[:timer.tc/1] --> return["{time, result}"]
```
We'll use `Process.sleep/1` to simulate a slow function. `Process.sleep/1` pauses the program's execution for a given number of milliseconds.

```elixir
:timer.tc(fn -> Process.sleep(1000) end)
```
The `time` returned is measured in microseconds. `1000` microseconds equals `1` millisecond, and `1000` milliseconds equals `1` second. Therefore, we can divide the number of microseconds by `1000` to get milliseconds, and the number of milliseconds by `1000` to get seconds.
You'll notice that `:timer` is close to accurate, but not perfectly so. This is to be expected when measuring very small increments of time.

```elixir
{microseconds, _result} = :timer.tc(fn -> Process.sleep(1000) end)

milliseconds = microseconds / 1000
seconds = milliseconds / 1000

IO.inspect(microseconds, label: "microseconds")
IO.inspect(milliseconds, label: "milliseconds")
IO.inspect(seconds, label: "seconds")
```
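If you'd rather not do the division by hand, Elixir's standard library also provides `System.convert_time_unit/3` for converting between time units. Here is a sketch combining it with `:timer.tc/1`; the sleep duration is just an illustration:

```elixir
# Measure a slow function, then convert the elapsed microseconds
# with System.convert_time_unit/3 instead of dividing manually.
{microseconds, _result} = :timer.tc(fn -> Process.sleep(100) end)

milliseconds = System.convert_time_unit(microseconds, :microsecond, :millisecond)
IO.inspect(milliseconds, label: "milliseconds")
```

Note that `System.convert_time_unit/3` returns an integer, truncating any fractional remainder.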
In the Elixir cell below, use `:timer.tc/1` to measure the following `slow_function`.

Example Solution

Using an anonymous function.

```elixir
:timer.tc(fn -> slow_function.() end)
```

Passing the function directly.

```elixir
:timer.tc(slow_function)
```

```elixir
slow_function = fn -> Process.sleep(1000) end
```
In the Elixir cell below, use `:timer.tc/1` to measure the following named function.

Example Solution

Using an anonymous function.

```elixir
:timer.tc(fn -> MyModule.slow_function() end)
```

Using the capture operator.

```elixir
:timer.tc(&MyModule.slow_function/0)
```

```elixir
defmodule MyModule do
  def slow_function do
    Process.sleep(1000)
  end
end
```
Benchee is a popular library for measuring performance and memory consumption. External libraries must be installed before we can use them, and we use Mix to install Benchee. Mix is a build tool that provides tasks for creating, compiling, and testing Elixir projects, as well as managing their dependencies.
```mermaid
flowchart LR
  Benchee
  D[Dependencies]
  I[install/2]
  Mix
  Mix --> I
  I --> D
  D --> Benchee
```
We've installed Benchee for you in this lesson, but be aware that if you try to use Benchee in other projects, it likely isn't installed.

If you want Benchee in any Livebook, you have to add it to the setup section at the top of the Livebook if it is not already available. Include Benchee in the existing `Mix.install/2` call or add one. There should only ever be a single call to `Mix.install/2`.

Make sure to replace `1.1` with the latest version. You can find the latest version of Benchee on Hex.
```elixir
Mix.install([
  {:jason, "~> 1.4"},
  {:kino, "~> 0.9", override: true},
  {:benchee, "~> 1.1"}
])
```
In a Mix project, you'll add the latest Benchee version to the existing `deps/0` function in `mix.exs`.

```elixir
defp deps do
  [
    {:benchee, "~> 1.1"}
  ]
end
```
We can use the `Benchee.run/2` function to measure the performance and memory consumption of a function.

```elixir
Benchee.run(%{
  "example test" => fn -> 10000 ** 10000 end
})
```
Above, the most important part of the output should look like the following, but with different numbers.

```
Name                   ips        average  deviation         median         99th %
example test        154.65        6.47 ms    ±26.50%        6.13 ms       13.17 ms

Memory usage statistics:

Name            Memory usage
example test           600 B
```
The Benchee documentation explains how to read the output:
- average - average execution time/memory usage (the lower the better)
- ips - iterations per second, aka how often can the given function be executed within one second (the higher the better - good for graphing), only for run times
- deviation - standard deviation (how much do the results vary), given as a percentage of the average (raw absolute values also available)
- median - when all measured values are sorted, this is the middle value. More stable than the average and somewhat more likely to be a typical value you see, for the most typical value see mode. (the lower the better)
- 99th % - 99th percentile, 99% of all measured values are less than this - worst case performance-ish
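A quick sanity check on how ips and average relate: ips is simply the reciprocal of the average run time expressed in seconds. Using the example average above (6.47 ms), we can recover roughly the reported 154.65 ips (the small difference comes from rounding in the displayed average):

```elixir
# ips (iterations per second) = 1 / average run time in seconds.
average_ms = 6.47
ips = 1 / (average_ms / 1000)

IO.inspect(Float.round(ips, 2), label: "ips")
```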
Let's use Benchee to compare tuples and lists. Tuples are intended to be fixed-size containers that are fast to access. Lists are intended to be dynamically sized containers that get modified. Therefore, lists should be slow to access, and tuples should be fast to access. That's the theory; let's verify the assumption and prove it true.
```elixir
size = 10000
large_list = Enum.to_list(0..size)
large_tuple = List.to_tuple(large_list)

Benchee.run(%{
  "list" => fn -> Enum.at(large_list, size) end,
  "tuple" => fn -> elem(large_tuple, size) end
})
```
Upon running the above, you should see an output similar to the following. You'll notice that accessing a list was far slower!

```
Name             ips        average  deviation         median         99th %
tuple         1.09 M        0.92 μs  ±2806.65%        0.80 μs           2 μs
list        0.0324 M       30.89 μs    ±93.31%       28.90 μs       67.80 μs

Comparison:
tuple         1.09 M
list        0.0324 M - 33.58x slower +29.97 μs
```
Use Benchee to compare accessing the first element of the list and the tuple, instead of the last element as we did above. You should notice the list is faster this time, or at least close to the same speed. You'll learn why in a future lesson.

```elixir
Benchee.run(%{
  "access first element in list" => fn -> nil end,
  "access first element in tuple" => fn -> nil end
})
```
We can use the `:inputs` option in `Benchee.run/2` to benchmark multiple inputs.

```elixir
Benchee.run(
  %{
    "Enum.count/1" => fn input -> Enum.count(input) end,
    "Kernel.length/1" => fn input -> Kernel.length(input) end
  },
  inputs: [small: [1, 2, 3], medium: Enum.to_list(1..1000), large: Enum.to_list(1..100_000)]
)
```
You should see a benchmark for each input.

```
##### With Input Small #####
Name                      ips        average  deviation         median         99th %
Kernel.length/1        1.34 M      745.94 ns  ±3912.97%         500 ns        1600 ns
Enum.count/1           1.28 M      781.60 ns  ±3327.02%         500 ns        2000 ns

Comparison:
Kernel.length/1        1.34 M
Enum.count/1           1.28 M - 1.05x slower +35.66 ns

##### With Input Medium #####
Name                      ips        average  deviation         median         99th %
Kernel.length/1      274.56 K        3.64 μs   ±627.53%        3.40 μs        6.30 μs
Enum.count/1         265.48 K        3.77 μs  ±1277.99%        3.50 μs        8.20 μs

Comparison:
Kernel.length/1      274.56 K
Enum.count/1         265.48 K - 1.03x slower +0.124 μs

##### With Input Large #####
Name                      ips        average  deviation         median         99th %
Kernel.length/1        5.30 K      188.70 μs    ±63.77%      157.70 μs      897.38 μs
Enum.count/1           4.97 K      201.40 μs    ±75.41%      159.50 μs     1096.38 μs

Comparison:
Kernel.length/1        5.30 K
Enum.count/1           4.97 K - 1.07x slower +12.70 μs
```
This is especially useful for benchmarking how larger data affects our system. It's also useful for creating different situations (i.e. worst-case and best-case scenarios) that may cause our system to behave differently than expected.

It's easy for a benchmark to misrepresent the actual performance of our system. To make our benchmarks more representative, consider the following:

- variety: Use a variety of inputs in your benchmark.
- size: A system may be very fast with small amounts of data, but incredibly slow with large amounts of data. Consider using multiple samples of data of different sizes to test where the system hits its limits.
- consistency: Consider running your benchmark multiple times. Does its performance change frequently due to edge cases?

In general, you can use benchmarks to optimize your systems and see how they perform under pressure. Consider exactly how you want to stress your system to get benchmark metrics that make the most sense for your situation.
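For more consistent results, Benchee also lets you control how long each scenario runs. The `time:`, `warmup:`, and `memory_time:` options (all in seconds) are documented Benchee options; the scenario and values below are arbitrary examples:

```elixir
Benchee.run(
  %{"reverse" => fn -> Enum.reverse(1..1000) end},
  # Warm up for 1 second before measuring (lets the BEAM settle),
  # then measure run time for 5 seconds and memory for 2 seconds.
  warmup: 1,
  time: 5,
  memory_time: 2
)
```

Longer measurement times gather more samples, which makes the median and deviation more stable between runs.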
We also want to consider what we're measuring. For example, if you read the average time instead of the median, it's easy for outliers to skew your data.

For example:

Ben Hill Griffin Stadium, the stadium at the University of Florida, holds 88,548 people. In Florida, the minimum wage is $11.00 per hour. Working eight hours a day, five days a week, fifty-two weeks a year, this comes out to $22,880 a year. If every seat in the stadium were filled by someone making minimum wage, the average annual income would be $22,880, and the total would be just over two billion dollars: $2,025,978,240.

In 2021, Jeff Bezos made roughly 197 billion dollars: $197,000,000,000. If Jeff Bezos enters the stadium, the total annual income becomes $199,025,978,240, and the average income becomes about 2.24 million dollars: $2,247,636. One large outlier can really skew the average. This is often why we use the median, or middle value, of a set: it's less skewed by large outliers.

Example provided by Stephen Fowler.
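The stadium example above can be checked directly. Here is a sketch using only the standard library; the figures come from the text:

```elixir
# 88,548 minimum-wage earners plus one very large outlier.
incomes = List.duplicate(22_880, 88_548) ++ [197_000_000_000]

average = Enum.sum(incomes) / length(incomes)

# With an odd number of sorted values, the median is the middle element.
median = incomes |> Enum.sort() |> Enum.at(div(length(incomes), 2))

IO.inspect(round(average), label: "average") # about 2,247,636
IO.inspect(median, label: "median")          # 22,880
```

A single outlier moves the average by a factor of almost 100, while the median doesn't move at all.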
DockYard Academy now recommends you use the latest Release rather than forking or cloning our repository.

Run `git status` to ensure there are no undesirable changes. Then run the following in your command line from the `curriculum` folder to commit your progress.

```
$ git add .
$ git commit -m "finish Benchmarking reading"
$ git push
```
We're proud to offer our open-source curriculum free of charge for anyone to learn from at their own pace.
We also offer a paid course where you can learn from an instructor alongside a cohort of your peers. We will accept applications for the June-August 2023 cohort soon.