Adding quickTunerStat.py #1583

ethansaurusrex · 2024-07-19T23:42:55Z

Adding statistics and testing script quickTunerStat.py, which accepts *.qt files from quickTunerGen.py. It can either compare against a dataframe created by quickTunerStat.py or validate by running rocmlir-tuning-driver (currently disabled).

krzysz00

I could do with a better picture of what's going on here and how I'm meant to operate this

krzysz00 · 2024-07-22T15:32:20Z

mlir/utils/performance/quickTunerStat.py

+
+Generated statistics and verify quick tuning configs
+
+optional arguments:


Why's there a copy of usage here? Shouldn't argparse generate it? Some examples might be good
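As a rough illustration of the point above, argparse can produce the usage text itself, and an `epilog` can carry examples. This is only a sketch: the `--data` and `--rank` flag names come from this PR's discussion, but their types, metavars, help strings, and the example invocation are assumptions rather than the script's actual interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # argparse derives the usage/help text from the arguments declared below,
    # so the module docstring does not need a hand-maintained copy of it.
    parser = argparse.ArgumentParser(
        prog="quickTunerStat.py",
        description="Generate statistics and verify quick tuning configs",
        # Keep the epilog's line breaks so the examples stay readable.
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog=(
            "examples (hypothetical invocation):\n"
            "  quickTunerStat.py --data results.tsv --rank\n"
        ),
    )
    # Flag names are taken from the PR text; everything else is assumed.
    parser.add_argument("--data", metavar="FILE",
                        help="previously collected tuning data to compare against")
    parser.add_argument("--rank", action="store_true",
                        help="rank the configs found in the input files")
    return parser


help_text = build_parser().format_help()
args = build_parser().parse_args(["--data", "results.tsv", "--rank"])
```

Printing `help_text` (or running the script with `-h`) shows the generated usage line followed by the examples, so the two cannot drift apart.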

krzysz00 · 2024-07-22T15:33:26Z

mlir/utils/performance/quickTunerStat.py

+
+        if match:
+            tup = match.groups()
+            transA = True if tup[0].lower() == 'true' else False


The GemmConfig class over in tuningRunner already has this and it might be less fragile?

ethansaurusrex · Jul 30, 2024

The issue is that I still need to create a tuple in order to index the dictionary contained within the class. I switched the transA/B conversion to the same method used in tuningRunner/perfRunner.

krzysz00 · Jul 30, 2024

I'm more suggesting using the classes in tuningRunner or perfRunner to parse the perf config - I think parameterSweeps.py might be a good place to look or refactor.

ethansaurusrex · Jul 30, 2024

Of course, I will take a look at that.
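On the tuple-as-dictionary-key concern in this thread, one possible approach is a frozen dataclass, which is hashable and can be used as a key directly. The sketch below is self-contained and does not use the real perfRunner or tuningRunner classes; `GemmKey`, its fields, and `parse_bool` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class GemmKey:
    """Hypothetical stand-in for a parsed GEMM problem description."""
    trans_a: bool
    trans_b: bool
    m: int
    n: int
    k: int


def parse_bool(text: str) -> bool:
    # Reject anything other than true/false instead of silently treating
    # unexpected strings as False.
    lowered = text.strip().lower()
    if lowered not in ("true", "false"):
        raise ValueError(f"expected 'true' or 'false', got {text!r}")
    return lowered == "true"


# frozen=True (with the default eq=True) makes instances hashable,
# so they can index a dictionary without building a tuple by hand.
results = {}
key = GemmKey(parse_bool("True"), parse_bool("false"), 1024, 1024, 512)
results[key] = 12.5  # made-up performance number

same_key = GemmKey(True, False, 1024, 1024, 512)
as_tuple = astuple(key)  # still available if a plain tuple is required
```

Two keys built from equal field values compare equal and hash identically, so `results[same_key]` finds the entry stored under `key`.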

krzysz00 · 2024-07-22T15:34:04Z

mlir/utils/performance/quickTunerStat.py

+            tile_params = tile_params.drop(['param8','param9'], axis=1)
+
+            tile_params['performance'] = df['NormalizedTFlops']
+


Nit: a bunch of extra newlines that don't have semantic meaning, please remove?

krzysz00 · 2024-07-22T15:35:27Z

mlir/utils/performance/quickTunerStat.py

+        self.output_df = pd.DataFrame(rank_dict)
+        print(self.output_df)
+
+class tunerValidator(perfConfigValidator):


Nit: upper case class names

krzysz00 · 2024-07-22T15:35:46Z

mlir/utils/performance/quickTunerStat.py

+                 rocmlir_path,
+                 rocm_build_script='/share/scripts/build-rocm'):
+        """
+        initializer


Not worth a doc string

krzysz00 · 2024-07-22T15:36:26Z

mlir/utils/performance/quickTunerStat.py

+
+        self.cpp_file = os.path.join(os.path.dirname(rocmlir_path), self.gridwise_gemm_params)
+        self.rocm_build_script = rocm_build_script
+        self.backup = self.cpp_file + ".bu"


~ is the usual suffix for backup files, no?
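For illustration, a backup using the `~` suffix can be paired with a context manager so the original file is restored even if the work in between raises. This is a sketch and not the PR's implementation; `backed_up` and the file name `params.cpp` are hypothetical. A context manager is used here because Python does not guarantee when, or in every case whether, `__del__` runs.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def backed_up(path: str, suffix: str = "~"):
    """Copy `path` to `path + suffix`, then restore it on exit."""
    backup = path + suffix
    shutil.copy2(path, backup)  # copy2 also preserves file metadata
    try:
        yield backup
    finally:
        # os.replace overwrites the modified file with the backup
        # and removes the backup in a single step.
        os.replace(backup, path)


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "params.cpp")
    with open(target, "w") as f:
        f.write("original\n")

    with backed_up(target):
        with open(target, "w") as f:
            f.write("patched\n")  # temporary modification

    with open(target) as f:
        restored = f.read()
    backup_left = os.path.exists(target + "~")
```

After the `with` block exits, the file holds its original contents again and no backup file is left behind.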

krzysz00 · 2024-07-22T18:53:31Z

mlir/utils/performance/quickTunerStat.py

+
+
+    def __del__(self):
+        # copy back original file


krzysz00

I'm rather confused as to when I'd want the "rebuild rocMLIR" mode in a script that's meant to produce statistics.

That being said, ... approved on the assumption that you'll explain this somewhere at some point

krzysz00 · 2024-07-31T23:14:50Z

mlir/utils/performance/quickTunerStat.py

+from dataclasses import dataclass
+from perfCommonUtils import Operation
+import perfRunner
+from sklearn.preprocessing import MinMaxScaler


We might want to add sklearn to a Dockerfile or a requirements.txt somewhere?

krzysz00 · 2024-07-31T23:16:27Z

mlir/utils/performance/quickTunerStat.py

+        # to validate file we need data already read,
+
+        all_data = []
+        columns = ['M/block', 'N/block', 'K/block', 'M/wave', 'N/wave', 'kPack', 'splitK', 'forceUnroll', 'bCopyMore']


I'll note that, for example, the attention perf configs don't have this structure, and, more interestingly, that f32 gemm/conv on Navi doesn't either. But this is probably fine; I just wanted to call out the limitation.
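One way to make this limitation visible at run time is to check the field count before labelling the values. The sketch below reuses the column names from the diff above, but assumes the tuning parameters arrive as a comma-separated string of integers; that input format and the `split_perf_config` helper are assumptions for illustration, not the script's actual code.

```python
# Column names copied from the diff hunk above.
COLUMNS = ['M/block', 'N/block', 'K/block', 'M/wave', 'N/wave',
           'kPack', 'splitK', 'forceUnroll', 'bCopyMore']


def split_perf_config(fields_text: str) -> dict:
    """Label the fields of one config, assuming comma-separated integers."""
    values = [int(v) for v in fields_text.split(",")]
    if len(values) != len(COLUMNS):
        # Configs with a different layout would land here instead of
        # being silently mislabelled.
        raise ValueError(
            f"expected {len(COLUMNS)} fields, got {len(values)}: {fields_text!r}"
        )
    return dict(zip(COLUMNS, values))


row = split_perf_config("64,64,8,32,32,4,1,1,1")

try:
    split_perf_config("64,64,8,32")  # wrong number of fields
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Failing loudly on a mismatched layout keeps unsupported configs from ending up in the dataframe under the wrong column names.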

krzysz00 · 2024-07-31T23:18:08Z

mlir/utils/performance/quickTunerStat.py

+        with open(cpp_filename, 'w') as file:
+            file.write(cpp_content)
+
+    def buildRocm(self):


buildRocmlir, maybe?

Also, why is this here?

ethansaurusrex added 7 commits July 18, 2024 19:09

Changing input format

056d3bd

Check

8883c72

rewrite for use of rocmlir-tuning-driver

68374cc

Working data verifier

14ed3bd

Fixing tuning runner portions

d0958e3

Checking in, rewrote some of the base class functions

aa80b68

Adding statistics and testing script quickTunerStat.py, accepts *.qt files from quickTunerGen.py. Can either compare against dataframe created by quickTunerStat.py or by running rocmlir-tuning-driver (currently disabled).

9d64096

ethansaurusrex requested review from jerryyin and sjw36 as code owners July 19, 2024 23:42

jerryyin requested review from djramic and giuseros July 22, 2024 13:43

Updated script with debug information and switched indexing to 'NormalizedTFlops' for extracting performance information when validating with --data flag. Cleaned up --tuning code, still disabled until further testing is done.

1b17529

krzysz00 reviewed Jul 22, 2024

ethansaurusrex added 9 commits July 25, 2024 14:19

Fixed capitalized class names

de2612e

Removed unneeded newlines, comments, and methods

303c9d5

Fixed backup filename

c2db65f

Changed method for converting transA/B string to boolean. Also updated column names, correcting previous splitK mistake.

5615df1

Adding usage README.md and fixed some issues with column naming. Added cpp file output on invocation of --rank

f0db872

Fixed dtype lookup bug

4fd2d8c

Added tflops back in to df

1b87e63

Replaced manual parsing of config string with perfRunner.GemmConfiguration.fromCommandLine()

c62628a

Updated README.md to include data collection method using tuna-script.sh

fe715f9

krzysz00 approved these changes Jul 31, 2024

ethansaurusrex added 6 commits August 2, 2024 16:05

Changed rocmlir building during rocmlir tuner validation

21330a3

Quick checkin

f79f64f

Fixed tuner method build method

3b7c662

Fixed single file and directory input ranking

9ee03fa

Changed format for input to match quickTunerGen.py

c1f449b

Added convolution

7123a72

jerryyin assigned djramic Sep 10, 2024
