Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding quickTunerStat.py #1583

Open
wants to merge 23 commits into
base: develop
Choose a base branch
from
Open

Adding quickTunerStat.py #1583

wants to merge 23 commits into from

Conversation

ethansaurusrex
Copy link

Adding statistics and testing script quickTunerStat.py, accepts *.qtfiles from quickTunerGen.py. Can either compare against data frame created by quickTunerStat.py or by running rocmlir-tuning-driver (currently disabled).

to 'NormalizedTFlops' for extracting performance information
when validating with --data flag. Cleaned up --tuning code,
still disabled until further testing is done.
Copy link
Collaborator

@krzysz00 krzysz00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do with a better picture of what's going on here and how I'm meant to operate this


Generated statistics and verify quick tuning configs

optional arguments:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why's there a copy of usage here? Shouldn't argparse generate it? Some examples might be good


if match:
tup = match.groups()
transA = True if tup[0].lower() == 'true' else False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GemmConfig class over in tuningRunner already has this and it might be less fragile?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is I still need to create a tuple in order to index the dictionary contained within the class. I switched out the method of converting the transA/B to the same method used in tuningRunner/perfRunner

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm more suggesting using the classes in tuningRunner or perfRunner to parse the perf config - I think parameterSweeps.py might be a good place to look or refactor

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course I will take a look at that

tile_params = tile_params.drop(['param8','param9'], axis=1)

tile_params['performance'] = df['NormalizedTFlops']

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: a bunch of extra newlines that don't have semantic meaning, please remove?

self.output_df = pd.DataFrame(rank_dict)
print(self.output_df)

class tunerValidator(perfConfigValidator):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: upper case class names

rocmlir_path,
rocm_build_script='/share/scripts/build-rocm'):
"""
initializer
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not worth a doc string


self.cpp_file = os.path.join(os.path.dirname(rocmlir_path), self.gridwise_gemm_params)
self.rocm_build_script = rocm_build_script
self.backup = self.cpp_file + ".bu"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~ is the usual suffix for backup files, no?



def __del__(self):
# copy back original file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm?

Copy link
Collaborator

@krzysz00 krzysz00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm rather confused as to when I'd want the "rebuild rocMLIR" mode in a script that's meant to produce statistics.

That being said, ... approved on the assumption that you'll explain this somewhere at some point

from dataclasses import dataclass
from perfCommonUtils import Operation
import perfRunner
from sklearn.preprocessing import MinMaxScaler
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to add sklearn to a Dockerfile or a requirements.txt somewhere?

# to validate file we need data already read,

all_data = []
columns = ['M/block', 'N/block', 'K/block', 'M/wave', 'N/wave', 'kPack', 'splitK', 'forceUnroll', 'bCopyMore']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll note that, for example, the attention perf configs don't have this structure, and, more interestingl, that f32 gemm/conv on Navi doesn't either. But this is probably fine, just wanted to call out the limitation

with open(cpp_filename, 'w') as file:
file.write(cpp_content)

def buildRocm(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildRocmlir, maybe?

Also, why is this here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants