Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using dataclasses #40

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

commonism
Copy link

Hi,

I started of my own version of this, but found yours basically the last moment …
My name of choice was objset, but the poll is finished I guess.

I used dataclasses - ported here.
And updated the fields to match current OpenZFS git.

@Freaky
Copy link
Collaborator

Freaky commented Dec 29, 2022

I made a similar effort to use dataclasses shortly after 2.0's release, but decided against it because it reduced performance for a pretty arguable code improvement. Your approach seems to perform even worse than mine, perhaps because of the use of reflection:

Benchmark 1: python3.11 ./ioztat rpool 0.000001 500 >/dev/null
  Time (mean ± σ):     10.006 s ±  0.063 s    [User: 8.456 s, System: 3.738 s]
  Range (min … max):    9.895 s … 10.097 s    10 runs

Benchmark 2: python3.11 /usr/local/bin/ioztat rpool 0.000001 500
  Time (mean ± σ):      5.616 s ±  0.042 s    [User: 4.184 s, System: 3.642 s]
  Range (min … max):    5.543 s …  5.678 s    10 runs

Summary
  'python3.11 /usr/local/bin/ioztat rpool 0.000001 500' ran
    1.78 ± 0.02 times faster than 'python3.11 ./ioztat rpool 0.000001 500 >/dev/null'

@commonism
Copy link
Author

Missing slots - I did not consider this to be optimized for performance.

But nice comparison - what did you use for the numbers?

@Freaky
Copy link
Collaborator

Freaky commented Dec 30, 2022

Results are generated with hyperfine.

I tried __slots__ on baseline and found no measurable difference.

@commonism
Copy link
Author

Dropping reflection and looking on the raw performance to create an object and access the properties for normal/slots/dataclasses:

import dataclasses
import timeit
import datetime
import copy
import typing

class Dataset:
    """
    ZFS dataset statistics over a timespan in seconds
    """
    name = ''
    reads = 0
    nread = 0
    writes = 0
    nwritten = 0
    nunlinks = 0
    nunlinked = 0
    timespan = 0

    def __init__(self, name=''):
        self.name = name

    @classmethod
    def from_dict(cls, data):
        d = cls(data['dataset_name'])
        d.reads = int(data['reads'])
        d.nread = int(data['nread'])
        d.writes = int(data['writes'])
        d.nwritten = int(data['nwritten'])
        d.nunlinks = int(data.get('nunlinks', 0))
        d.nunlinked = int(data.get('nunlinked', 0))
        d.timespan = int(data['timestamp']) / 1e9
        return d


    def __add__(self, other):
        """
        Return a new Dataset with this one's values added to other

        Note timespan is set to the maximum of the two values instead of being
        added.
        """
        d = copy.copy(self)
#        d = Dataset(self.name)
        d.reads += other.reads
        d.nread += other.nread
        d.writes += other.writes
        d.nwritten += other.nwritten
        d.nunlinks += other.nunlinks
        d.nunlinked += other.nunlinked
        d.timespan = max(d.timespan, other.timespan)
        return d

class OptimizedDataset:
    """
    ZFS dataset statistics over a timespan in seconds
    """

    def __init__(self, name='', reads=0, nread=0, writes=0, nwritten=0, nunlinks=0, nunlinked=0, timespan=0):
        self.name = name
        self.reads = reads
        self.nread = nread
        self.writes = writes
        self.nwritten = nwritten
        self.nunlinks = nunlinks
        self.nunlinked = nunlinked
        self.timespan = timespan


    @classmethod
    def from_dict(cls, data):
        d = cls(data['dataset_name'], 
        int(data['reads']), 
        int(data['nread']), 
        int(data['writes']), 
        int(data['nwritten']), 
        int(data.get('nunlinks', 0)), 
        int(data.get('nunlinked', 0)), 
        int(data['timestamp']) / 1e9)
        return d

    def __add__(self, other):
        """
        Return a new Dataset with this one's values added to other

        Note timespan is set to the maximum of the two values instead of being
        added.
        """
        d = OptimizedDataset(self.name, 
        self.reads + other.reads,
        self.nread + other.nread,
        self.writes + other.writes,
        self.nwritten + other.nwritten,
        self.nunlinks + other.nunlinks,
        self.nunlinked + other.nunlinked,
        max(self.timespan, other.timespan))
        return d    


class SlotsDataset:#(Dataset):
    __slots__ = ["name","reads","nread","writes","nwritten","nunlinks","nunlinked","timespan"]
    """
    ZFS dataset statistics over a timespan in seconds
    """

    def __init__(self, name='', reads=0, nread=0, writes=0, nwritten=0, nunlinks=0, nunlinked=0, timespan=0):
        self.name = name
        self.reads = reads
        self.nread = nread
        self.writes = writes
        self.nwritten = nwritten
        self.nunlinks = nunlinks
        self.nunlinked = nunlinked
        self.timespan = timespan

    @classmethod
    def from_dict(cls, data):
        d = cls(data['dataset_name'], 
        int(data['reads']), 
        int(data['nread']), 
        int(data['writes']), 
        int(data['nwritten']), 
        int(data.get('nunlinks', 0)), 
        int(data.get('nunlinked', 0)), 
        int(data['timestamp']) / 1e9)
        return d


    def __add__(self, other):
        """
        Return a new Dataset with this one's values added to other

        Note timespan is set to the maximum of the two values instead of being
        added.
        """
        d = SlotsDataset(self.name, 
        self.reads + other.reads,
        self.nread + other.nread,
        self.writes + other.writes,
        self.nwritten + other.nwritten,
        self.nunlinks + other.nunlinks,
        self.nunlinked + other.nunlinked,
        max(self.timespan, other.timespan))
        return d    


@dataclasses.dataclass #(slots=True)
class DataclassDataset:
    """
    ZFS dataset statistics over a timespan in seconds

    The Dataset fields are "documented" here:
    https://github.com/openzfs/zfs/blob/9681de4657686d0ed19ca18d578513e74395f00f/module/zfs/dataset_kstats.c#L32

    """
    timestamp: int = 0
    dataset_name: str = ''
    writes: int = 0
    nwritten: int = 0
    reads: int = 0
    nread: int = 0

    nunlinks: int = 0
    nunlinked: int = 0
    
    timespan: int = 0

    @classmethod
    def from_dict(cls, data):
        d = cls(int(data['timestamp']),
        data['dataset_name'], 
        int(data['reads']), 
        int(data['nread']), 
        int(data['writes']), 
        int(data['nwritten']), 
        int(data.get('nunlinks', 0)), 
        int(data.get('nunlinked', 0)), 
        int(data['timestamp']) / 1e9)
        return d

    def __add__(self, other):
        """
        Return a new Dataset with this one's values added to other

        Note timespan is set to the maximum of the two values instead of being
        added.
        """

        d = DataclassDataset(self.timestamp, self.dataset_name, 
            self.reads + other.reads,
            self.nread + other.nread,
            self.writes + other.writes,
            self.nwritten + other.nwritten,
            self.nunlinks + other.nunlinks,
            self.nunlinked + other.nunlinked,
            self.timespan + other.timespan )         
        return d


@dataclasses.dataclass(slots=True)
class DataclassSlotsDataset:
    """
    ZFS dataset statistics over a timespan in seconds

    The Dataset fields are "documented" here:
    https://github.com/openzfs/zfs/blob/9681de4657686d0ed19ca18d578513e74395f00f/module/zfs/dataset_kstats.c#L32

    """
    timestamp: int = 0
    dataset_name: str = ''
    writes: int = 0
    nwritten: int = 0
    reads: int = 0
    nread: int = 0

    nunlinks: int = 0
    nunlinked: int = 0
    
    timespan: int = 0

    @classmethod
    def from_dict(cls, data):
        d = cls(int(data['timestamp']),
        data['dataset_name'], 
        int(data['reads']), 
        int(data['nread']), 
        int(data['writes']), 
        int(data['nwritten']), 
        int(data.get('nunlinks', 0)), 
        int(data.get('nunlinked', 0)), 
        int(data['timestamp']) / 1e9)
        return d

    def __add__(self, other):
        """
        Return a new Dataset with this one's values added to other

        Note timespan is set to the maximum of the two values instead of being
        added.
        """

        d = DataclassSlotsDataset(self.timestamp, self.dataset_name, 
            self.reads + other.reads,
            self.nread + other.nread,
            self.writes + other.writes,
            self.nwritten + other.nwritten,
            self.nunlinks + other.nunlinks,
            self.nunlinked + other.nunlinked,
            self.timespan + other.timespan )         
        return d

        
def main():
    data = {i:idx for idx,i in enumerate(SlotsDataset.__slots__)}
    data["dataset_name"] = "test"
    data["timestamp"] = datetime.datetime.now().timestamp()
    del data["name"]
    del data["timespan"]

    cnt = 1000000
    print("create")
    print(timeit.timeit(lambda: Dataset.from_dict(data), number=cnt))
    print(timeit.timeit(lambda: OptimizedDataset.from_dict(data), number=cnt))
    print(timeit.timeit(lambda: SlotsDataset.from_dict(data), number=cnt))
    print(timeit.timeit(lambda: DataclassDataset.from_dict(data), number=cnt))
    print(timeit.timeit(lambda: DataclassSlotsDataset.from_dict(data), number=cnt))


    print("add")
    v = Dataset.from_dict(data)
    print(timeit.timeit(lambda: v+v, number=cnt))

    v = OptimizedDataset.from_dict(data)
    print(timeit.timeit(lambda: v+v, number=cnt))


    v = SlotsDataset.from_dict(data)    
    print(timeit.timeit(lambda: v+v, number=cnt))

    v = DataclassDataset.from_dict(data)    
    print(timeit.timeit(lambda: v+v, number=cnt))


    v = DataclassSlotsDataset.from_dict(data)    
    print(timeit.timeit(lambda: v+v, number=cnt))
        
        
if __name__ == "__main__":
    main()        

Dataset
OptimizedDataset (no copy.copy, no class parameters)
SlotsDataset
DataclassDataset
DatalcassSlotsDataset

create
0.8905068870517425
0.8984799759928137
0.8242627109866589
1.0172464569914155
0.9374575979891233
add
2.3710521850152873
0.7287906949641183
0.629768434970174
0.9442946459748782
0.5963006669771858

python 3.10 dataclasses with slots=True outperforms the alternatives when doing the first __add__
__slots__ is second, normal classes third.

I think order (combined init & add) would be along:

  1. DataclassSlotsDataset
  2. SlotsDataset
  3. OptimizedDataset
  4. DataclassDataset
  5. Dataset

@Freaky
Copy link
Collaborator

Freaky commented Dec 31, 2022

Our Python MSV is 3.7, so Dataclass slots is a non-starter for now.

These long positional argument lists are quite nasty - they're begging for an easy-to-miss mixup and I'd rather avoid them.

I added versions with keyword arguments, plus a version that avoids copy.copy by direct field access, and that plus __slots__. I tested them across the range of supported versions:

3.7
create
Baseline         4.779117083991878
NoCopy           4.819048156961799
NoCopy/Slots     4.610547592048533
Optimized        4.653545118984766
Optimized/KW     5.483340958948247
Slots            4.283310921047814
Slots/KW         4.905152444029227
Dataclass        4.807075573946349
Dataclass/KW     5.597749298904091
add
Baseline         12.492388554033823
NoCopy           3.6168077409965917
NoCopy/Slots     3.1257432530401275
Optimized        3.5399633079068735
Optimized/KW     4.302096275961958
Slots            2.7610619330080226
Slots/KW         3.4788211649283767
Dataclass        3.5082433730131015
Dataclass/KW     4.293326015933417

3.9
create
Baseline         3.406673535006121
NoCopy           3.4032381870783865
NoCopy/Slots     3.3954955079825595
Optimized        3.4104782280046493
Optimized/KW     4.28153379203286
Slots            2.9133199750212952
Slots/KW         3.9084081649780273
Dataclass        3.4400185080012307
Dataclass/KW     4.317347781965509
add
Baseline         11.0603123200126
NoCopy           3.367319536046125
NoCopy/Slots     3.115666083060205
Optimized        3.4051882980857044
Optimized/KW     4.158022553892806
Slots            2.748979698983021
Slots/KW         3.56267236196436
Dataclass        3.41977860708721
Dataclass/KW     4.2807325079338625

3.10
create
Baseline         3.440365262911655
NoCopy           3.4241489080013707
NoCopy/Slots     3.343199662049301
Optimized        3.438087990041822
Optimized/KW     4.328920533065684
Slots            2.9351572120795026
Slots/KW         3.88322276098188
Dataclass        3.4257506920257583
Dataclass/KW     4.289895735913888
add
Baseline         11.33392396196723
NoCopy           3.821642320952378
NoCopy/Slots     2.6333876519929618
Optimized        3.0709942990215495
Optimized/KW     4.024556623073295
Slots            2.2230605570366606
Slots/KW         3.119670856045559
Dataclass        3.8814144630450755
Dataclass/KW     4.832975470926613

3.11
create
Baseline         2.230551384971477
NoCopy           2.233327505993657
NoCopy/Slots     2.2825138609623536
Optimized        2.2936561700189486
Optimized/KW     3.253970602992922
Slots            2.258040190907195
Slots/KW         3.156391634955071
Dataclass        2.301062963088043
Dataclass/KW     3.221311755012721
add
Baseline         10.089828453958035
NoCopy           1.2415083339437842
NoCopy/Slots     1.3665269260527566
Optimized        1.433199180988595
Optimized/KW     2.2370554810622707
Slots            1.4051770099904388
Slots/KW         2.24045382195618
Dataclass        1.3950953249586746
Dataclass/KW     2.2120107349473983

NoCopy/Slots looks like the best of the bunch. It's slightly slower than NoCopy on 3.11, but not by much, it beats all the keyword argument versions, and while some positional argument versions beat it, not by enough to justify themselves.

Also, damn my CPU sucks.

@commonism
Copy link
Author

You've been competing with AMD Ryzen 5 PRO 4650U with Radeon Graphics.

I've updated the PR to use NoCopy/Slots.

ioztat Outdated Show resolved Hide resolved
ioztat Show resolved Hide resolved
ioztat Outdated Show resolved Hide resolved
ioztat Outdated Show resolved Hide resolved
ioztat Outdated Show resolved Hide resolved
@Freaky
Copy link
Collaborator

Freaky commented Jan 2, 2023

CPU: Intel(R) Xeon(R) CPU L5639 @ 2.13GHz (2133.53-MHz K8-class CPU)

They're teenagers now. I knew they sucked, but I didn't quite expect a similarly-clocked 15W laptop part to beat them by that margin.

 - no kwargs in __init__
 - fixes requested & required
@commonism
Copy link
Author

https://www.cpubenchmark.net/compare/1983vs4272vs3547vs4827/Intel-Xeon-L5639-vs-AMD-Ryzen-9-5900-vs-AMD-EPYC-7742-vs-AMD-Ryzen-5-PRO-6650U

due to clock - the numbers for 7742 (ubuntu 23.04/py3.10) are 10-14% worse than my ryzen mobile:

create
Baseline	 1.0522814639998614
NoCopy		 1.0588048629997502
NoCopy/Slots	 1.1171760289998929
Optimized	 1.136160388999997
Optimized/KW	 1.3660570930001086
Slots		 1.0198019910003495
Slots/KW	 1.2555289689998972
Dataclass	 1.1098613029998887
Dataclass/KW	 1.3511358109999492
add
Baseline	 2.889791399999922
NoCopy		 1.0617886190002537
NoCopy/Slots	 0.9156696780000857
Optimized	 0.9038795599999503
Optimized/KW	 1.1309551809999903
Slots		 0.7515657260000808
Slots/KW	 0.9777399929998865
Dataclass	 1.1047035159999723
Dataclass/KW	 1.3352872320001552

And - there is a new revision available.
I've dropped kwargs from Dataset.init alltogether, they are not used and degrade performance.
def __init__(self, name, /): would even be better, but is python3.8+.

@Freaky
Copy link
Collaborator

Freaky commented Jan 3, 2023

Looking like a 5-10% performance uplift on my hardware, plus a 0.5-2% drop in memory use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants