Adding quickTunerStat.py #1583
base: develop
files from quickTunerGen.py. Can either compare against dataframe created by quickTunerStat.py or by running rocmlir-tuning-driver (currently disabled).
to 'NormalizedTFlops' for extracting performance information when validating with --data flag. Cleaned up --tuning code, still disabled until further testing is done.
I could do with a better picture of what's going on here and how I'm meant to operate this
Generated statistics and verify quick tuning configs

optional arguments:
Why's there a copy of usage here? Shouldn't argparse generate it? Some examples might be good.
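As a rough illustration of the suggestion above, argparse can produce the usage text itself, and an epilog can carry examples. The sketch below is not the script's actual interface: `--data`, `--rank`, and `--tuning` are mentioned elsewhere in this PR, but their argument types, the positional `qt_file` argument, the help strings, and the example file names are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose usage and help text are generated by argparse."""
    parser = argparse.ArgumentParser(
        prog="quickTunerStat.py",
        description="Generate statistics and verify quick tuning configs",
        # Preserve the line breaks in the epilog so the examples stay readable.
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog=(
            "examples (hypothetical file names):\n"
            "  quickTunerStat.py --data tuning.tsv configs.qt\n"
            "  quickTunerStat.py --rank configs.qt\n"
        ),
    )
    # Assumed positional argument: one or more *.qt files from quickTunerGen.py.
    parser.add_argument("qt_file", nargs="+", help="*.qt file(s) to check")
    # Assumed semantics for the flags named in this PR.
    parser.add_argument("--data", metavar="FILE",
                        help="tuning data to compare against")
    parser.add_argument("--rank", action="store_true",
                        help="rank the configs")
    parser.add_argument("--tuning", action="store_true",
                        help="validate with the tuning driver (currently disabled)")
    return parser
```

With this in place, `build_parser().print_help()` prints the generated usage line, the option list, and the examples, so no hand-written copy of the usage needs to be kept in sync.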
if match:
    tup = match.groups()
    transA = True if tup[0].lower() == 'true' else False
The GemmConfig class over in tuningRunner already has this and it might be less fragile?
The issue is that I still need to create a tuple in order to index the dictionary contained within the class. I switched the transA/B conversion to the same method used in tuningRunner/perfRunner.
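The reply above mentions needing a tuple to index a dictionary. One way to reconcile the two points, sketched here under assumptions, is a frozen dataclass: it is hashable, so the parsed config can itself serve as the dictionary key. The class `GemmKey`, its fields, the `parse_bool` helper, and the `-transA false ...` command-line shape are hypothetical stand-ins; they are not the actual classes in tuningRunner or perfRunner.

```python
from dataclasses import dataclass


def parse_bool(text: str) -> bool:
    """Convert 'true'/'false' (any case) to a bool, rejecting anything else."""
    lowered = text.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    raise ValueError(f"expected a boolean, got {text!r}")


@dataclass(frozen=True)
class GemmKey:
    """Hypothetical hashable description of a GEMM problem."""
    transA: bool
    transB: bool
    g: int
    m: int
    n: int
    k: int

    @classmethod
    def from_command_line(cls, line: str) -> "GemmKey":
        # Assumed format: whitespace-separated '-flag value' pairs.
        tokens = line.split()
        if len(tokens) % 2 != 0:
            raise ValueError(f"expected '-flag value' pairs, got {line!r}")
        opts = {flag.lstrip("-"): value
                for flag, value in zip(tokens[::2], tokens[1::2])}
        return cls(
            transA=parse_bool(opts["transA"]),
            transB=parse_bool(opts["transB"]),
            g=int(opts["g"]),
            m=int(opts["m"]),
            n=int(opts["n"]),
            k=int(opts["k"]),
        )
```

A key built with `GemmKey.from_command_line("-transA false -transB true -g 1 -m 1024 -n 512 -k 256")` can be used directly in a dictionary, and an unexpected value such as `-transA maybe` fails loudly instead of silently becoming `False`.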
I'm suggesting more that you use the classes in tuningRunner or perfRunner to parse the perf config. I think parameterSweeps.py might be a good place to look or refactor.
Of course, I will take a look at that.
tile_params = tile_params.drop(['param8','param9'], axis=1)

tile_params['performance'] = df['NormalizedTFlops']
Nit: a bunch of extra newlines that don't have semantic meaning, please remove?
self.output_df = pd.DataFrame(rank_dict)
print(self.output_df)

class tunerValidator(perfConfigValidator):
Nit: upper case class names
rocmlir_path,
rocm_build_script='/share/scripts/build-rocm'):
"""
initializer
Not worth a doc string
self.cpp_file = os.path.join(os.path.dirname(rocmlir_path), self.gridwise_gemm_params)
self.rocm_build_script = rocm_build_script
self.backup = self.cpp_file + ".bu"
~ is the usual suffix for backup files, no?
def __del__(self):
    # copy back original file
Hm?
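If the `__del__` method above exists to restore the original file, note that Python does not guarantee that `__del__` runs for objects still alive at interpreter exit. A context manager is one possible alternative; the sketch below is an assumption about how that could look, not code from this PR. The name `restored_after` is hypothetical, and the `~` default follows the earlier comment about backup suffixes.

```python
import contextlib
import os
import shutil


@contextlib.contextmanager
def restored_after(path: str, suffix: str = "~"):
    """Back up `path`, yield it for modification, then restore the original."""
    backup = path + suffix
    shutil.copy2(path, backup)
    try:
        yield path
    finally:
        # Runs even if the body raises, unlike cleanup that relies on __del__.
        os.replace(backup, path)
```

Any edits made to the file inside the `with restored_after(path):` block are discarded when the block exits, whether it finishes normally or raises.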
updated column names, correcting previous splitK mistake.
Added cpp file output on invocation of --rank
perfRunner.GemmConfiguration.fromCommandLine()
I'm rather confused as to when I'd want the "rebuild rocMLIR" mode in a script that's meant to produce statistics.
That being said, approved on the assumption that you'll explain this somewhere at some point.
from dataclasses import dataclass
from perfCommonUtils import Operation
import perfRunner
from sklearn.preprocessing import MinMaxScaler
We might want to add sklearn to a Dockerfile or a requirements.txt somewhere?
# to validate file we need data already read,

all_data = []
columns = ['M/block', 'N/block', 'K/block', 'M/wave', 'N/wave', 'kPack', 'splitK', 'forceUnroll', 'bCopyMore']
I'll note that, for example, the attention perf configs don't have this structure and, more interestingly, that f32 gemm/conv on Navi doesn't either. But this is probably fine; I just wanted to call out the limitation.
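To make the limitation concrete: a parser keyed on the nine column names above only works when a perf config has exactly that many fields. The sketch below assumes a comma-separated config of integers with an optional version prefix such as `v2:`; that string format and the function name `parse_perf_config` are assumptions for illustration, while the column list is copied from the diff.

```python
COLUMNS = ['M/block', 'N/block', 'K/block', 'M/wave', 'N/wave',
           'kPack', 'splitK', 'forceUnroll', 'bCopyMore']


def parse_perf_config(config: str) -> dict:
    """Map a comma-separated perf config onto the COLUMNS names."""
    # Drop an optional version prefix such as 'v2:' (assumed format).
    _, _, body = config.rpartition(":")
    fields = [int(field) for field in body.split(",")]
    if len(fields) != len(COLUMNS):
        # Configs with a different layout are rejected rather than
        # silently mislabeled.
        raise ValueError(
            f"expected {len(COLUMNS)} fields, got {len(fields)}: {config!r}")
    return dict(zip(COLUMNS, fields))
```

Raising on a field-count mismatch means a config with a different layout is reported as an error instead of being loaded under the wrong column names.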
with open(cpp_filename, 'w') as file:
    file.write(cpp_content)

def buildRocm(self):
buildRocmlir, maybe?
Also, why is this here?
Adding statistics and testing script quickTunerStat.py, accepts *.qt files from quickTunerGen.py. Can either compare against data frame created by quickTunerStat.py or by running rocmlir-tuning-driver (currently disabled).