Add a YAML-parsing benchmark #342

Open
wants to merge 3 commits into
base: main
10 changes: 10 additions & 0 deletions pyperformance/data-files/benchmarks/bm_yaml/pyproject.toml
@@ -0,0 +1,10 @@
[project]
name = "pyperformance_bm_yaml"
requires-python = ">=3.8"
dependencies = ["pyperf"]
urls = {repository = "https://github.com/python/pyperformance"}
dynamic = ["version"]

[tool.pyperformance]
name = "yaml"
tags = "serialize"
@@ -0,0 +1 @@
pyyaml==6.0.1
91 changes: 91 additions & 0 deletions pyperformance/data-files/benchmarks/bm_yaml/run_benchmark.py
@@ -0,0 +1,91 @@
"""
Script for testing the performance of YAML parsing, using yaml.

This will dump/load several real-world-representative objects a few thousand
times. The methodology below was chosen to be similar to
real-world scenarios which operate on single objects at a time.

This explicitly tests the pure Python implementation in pyyaml, not its C
extension.
Comment on lines +8 to +9
Member

This is a helpful comment. It may be worth moving it next to the line where we take this action. I'm guessing that's the Loader=yaml.Loader bit on line 70.
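
A minimal sketch of that suggestion, assuming the `Loader=yaml.Loader` argument is indeed where the pure-Python implementation gets selected. The comment wording is illustrative only, not the author's:

```python
import yaml


def bench_yaml(objs):
    for obj in objs:
        # Explicitly use the pure-Python loader (yaml.Loader) rather than
        # the libyaml-backed C extension loader (yaml.CLoader).
        yaml.load(obj, Loader=yaml.Loader)
```

Keeping the note beside the call makes it harder for the docstring and the code to drift apart if the loader is changed later.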


The object structure is copied from the `json_load` benchmark.
"""


import random
import sys


import pyperf
import yaml


DICT = {
    'ads_flags': 0,
    'age': 18,
    'bulletin_count': 0,
    'comment_count': 0,
    'country': 'BR',
    'encrypted_id': 'G9urXXAJwjE',
    'favorite_count': 9,
    'first_name': '',
    'flags': 412317970704,
    'friend_count': 0,
    'gender': 'm',
    'gender_for_display': 'Male',
    'id': 302935349,
    'is_custom_profile_icon': 0,
    'last_name': '',
    'locale_preference': 'pt_BR',
    'member': 0,
    'tags': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    'profile_foo_id': 827119638,
    'secure_encrypted_id': 'Z_xxx2dYx3t4YAdnmfgyKw',
    'session_number': 2,
    'signup_id': '201-19225-223',
    'status': 'A',
    'theme': 1,
    'time_created': 1225237014,
    'time_updated': 1233134493,
    'unread_message_count': 0,
    'user_group': '0',
    'username': 'collinwinter',
    'play_count': 9,
    'view_count': 7,
    'zip': ''}

TUPLE = (
    [265867233, 265868503, 265252341, 265243910, 265879514,
     266219766, 266021701, 265843726, 265592821, 265246784,
     265853180, 45526486, 265463699, 265848143, 265863062,
     265392591, 265877490, 265823665, 265828884, 265753032], 60)
Comment on lines +23 to +61
Member

It may be worth it to also/instead use some real-world data, like we do for bm_tomli_loads?
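
One way this could look, as a rough sketch only: read a YAML document from disk once, then time nothing but the parse. The helper names `read_yaml_text` and `bench_yaml_text` are hypothetical, and no particular data file is assumed to exist in the repository:

```python
from pathlib import Path

import yaml


def read_yaml_text(path):
    """Read a YAML document from disk once, outside the timed region."""
    return Path(path).read_text(encoding="utf-8")


def bench_yaml_text(text):
    # Only parsing is exercised here; file I/O happens beforehand.
    return yaml.load(text, Loader=yaml.Loader)
```

With pyperf this could be wired up as `runner.bench_func('yaml', bench_yaml_text, read_yaml_text(path))`, where `path` points at whichever data file ends up being chosen.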



def mutate_dict(orig_dict, random_source):
    new_dict = dict(orig_dict)
    for key, value in new_dict.items():
        rand_val = random_source.random() * sys.maxsize
        if isinstance(key, (int, bytes, str)):
            new_dict[key] = type(key)(rand_val)
    return new_dict


random_source = random.Random(5) # Fixed seed.
DICT_GROUP = [mutate_dict(DICT, random_source) for _ in range(3)]


def bench_yaml(objs):
    for obj in objs:
        yaml.load(obj, Loader=yaml.Loader)


if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.metadata['description'] = "Benchmark yaml.load()"

    yaml_dict = yaml.dump(DICT)
    yaml_tuple = yaml.dump(TUPLE)
    yaml_dict_group = yaml.dump(DICT_GROUP)
    objs = (yaml_dict, yaml_tuple, yaml_dict_group)

    runner.bench_func('yaml', bench_yaml, objs, inner_loops=20)
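
A side note on `mutate_dict`, sketched from the code in this diff: every key in `DICT` is a `str`, so `type(key)(rand_val)` replaces each value with the string form of a random float. The snippet below reproduces the function with a small stand-in dictionary (`SAMPLE`, hypothetical) to show the resulting value types:

```python
import random
import sys


def mutate_dict(orig_dict, random_source):
    new_dict = dict(orig_dict)
    for key, value in new_dict.items():
        rand_val = random_source.random() * sys.maxsize
        if isinstance(key, (int, bytes, str)):
            # The replacement takes the type of the *key*, not of the value.
            new_dict[key] = type(key)(rand_val)
    return new_dict


# SAMPLE is an illustrative stand-in for DICT, not part of the PR.
SAMPLE = {'age': 18, 'country': 'BR', 'tags': ['a', 'b']}
mutated = mutate_dict(SAMPLE, random.Random(5))
print({key: type(value).__name__ for key, value in mutated.items()})
# prints {'age': 'str', 'country': 'str', 'tags': 'str'}
```

If the intent is for `DICT_GROUP` to keep the mix of integers, strings, and lists found in `DICT`, this may be worth a second look; if it simply mirrors the `json_load` benchmark, it can stay as is.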