Add analysis tool for nsight reports #3428

Open: wants to merge 1 commit into base: main
6 changes: 6 additions & 0 deletions .buildkite/analysis/Project.toml
@@ -0,0 +1,6 @@
[deps]
ArgParse = "c7e460c6-2fb9-53a9-8c5b-16f535851c63"
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
UnicodePlots = "b8865327-cd53-5732-bb35-84acbb429228"
VegaLite = "112f6efa-9a02-5b7d-90c0-432ed331239a"
61 changes: 61 additions & 0 deletions .buildkite/gpu_pipeline/pipeline.yml
@@ -31,6 +31,11 @@ steps:
- julia --project=perf -e 'using CUDA; CUDA.precompile_runtime()'
- julia --project=perf -e 'using Pkg; Pkg.status()'

- echo "--- Instantiate analysis"
- julia --project=.buildkite/analysis -e 'using Pkg; Pkg.instantiate(;verbose=true)'
- julia --project=.buildkite/analysis -e 'using Pkg; Pkg.precompile()'
- julia --project=.buildkite/analysis -e 'using Pkg; Pkg.status()'

- echo "--- Download artifacts"
- julia --project=examples artifacts/download_artifacts.jl

@@ -55,6 +60,9 @@ steps:
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}target_gpu_implicit_baroclinic_wave.yml
--job_id target_gpu_implicit_baroclinic_wave

- nsys stats --report cuda_gpu_trace target_gpu_implicit_baroclinic_wave/output_active/report.nsys-rep --output target_gpu_implicit_baroclinic_wave/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir target_gpu_implicit_baroclinic_wave/output_active/
artifact_paths: "target_gpu_implicit_baroclinic_wave/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -72,6 +80,9 @@ steps:
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_hs_rhoe_equil_0M.yml
--job_id gpu_hs_rhoe_equil_55km_nz63_0M

- nsys stats --report cuda_gpu_trace gpu_hs_rhoe_equil_55km_nz63_0M/output_active/report.nsys-rep --output gpu_hs_rhoe_equil_55km_nz63_0M/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_hs_rhoe_equil_55km_nz63_0M/output_active/
artifact_paths: "gpu_hs_rhoe_equil_55km_nz63_0M/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -90,6 +101,10 @@ steps:
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_hs_rhoe_equil_0M.yml
--job_id gpu_hs_rhoe_equil_55km_nz63_0M_4process

# TODO: add analysis for all gpu devices
- nsys stats --report cuda_gpu_trace gpu_hs_rhoe_equil_55km_nz63_0M_4process/output_active/report-0.nsys-rep --output gpu_hs_rhoe_equil_55km_nz63_0M_4process/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_hs_rhoe_equil_55km_nz63_0M_4process/output_active/
artifact_paths: "gpu_hs_rhoe_equil_55km_nz63_0M_4process/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -110,6 +125,10 @@ steps:
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}target_gpu_implicit_baroclinic_wave.yml
--job_id target_gpu_implicit_baroclinic_wave_4process

# TODO: add analysis for all gpu devices
- nsys stats --report cuda_gpu_trace target_gpu_implicit_baroclinic_wave_4process/output_active/report-0.nsys-rep --output target_gpu_implicit_baroclinic_wave_4process/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir target_gpu_implicit_baroclinic_wave_4process/output_active/
artifact_paths: "target_gpu_implicit_baroclinic_wave_4process/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -131,6 +150,9 @@ steps:
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_dyamond_diag_1process/output_active/report julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_diag_1process.yml
--job_id gpu_aquaplanet_dyamond_diag_1process

- nsys stats --report cuda_gpu_trace gpu_aquaplanet_dyamond_diag_1process/output_active/report.nsys-rep --output gpu_aquaplanet_dyamond_diag_1process/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_aquaplanet_dyamond_diag_1process/output_active/
artifact_paths: "gpu_aquaplanet_dyamond_diag_1process/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -152,6 +174,9 @@ steps:
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ss.yml
--job_id gpu_aquaplanet_dyamond_ss_1process

- nsys stats --report cuda_gpu_trace gpu_aquaplanet_dyamond_ss_1process/output_active/report.nsys-rep --output gpu_aquaplanet_dyamond_ss_1process/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_aquaplanet_dyamond_ss_1process/output_active/
artifact_paths: "gpu_aquaplanet_dyamond_ss_1process/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -169,9 +194,16 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ss_2process
- >
srun --cpu-bind=threads --cpus-per-task=4
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_dyamond_ss_2process/output_active/report-0.nsys-rep,gpu_aquaplanet_dyamond_ss_2process/output_active/report-1.nsys-rep
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ss.yml
--job_id gpu_aquaplanet_dyamond_ss_2process

# TODO: add analysis for all gpu devices
- nsys stats --report cuda_gpu_trace gpu_aquaplanet_dyamond_ss_2process/output_active/report-0.nsys-rep --output gpu_aquaplanet_dyamond_ss_2process/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_aquaplanet_dyamond_ss_2process/output_active/
artifact_paths: "gpu_aquaplanet_dyamond_ss_2process/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -189,9 +221,17 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ss_4process
- >
srun --cpu-bind=threads --cpus-per-task=4
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_dyamond_ss_4process/output_active/report-0.nsys-rep,gpu_aquaplanet_dyamond_ss_4process/output_active/report-1.nsys-rep,gpu_aquaplanet_dyamond_ss_4process/output_active/report-2.nsys-rep,gpu_aquaplanet_dyamond_ss_4process/output_active/report-3.nsys-rep
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ss.yml
--job_id gpu_aquaplanet_dyamond_ss_4process

# TODO: add analysis for all gpu devices
- nsys stats --report cuda_gpu_trace gpu_aquaplanet_dyamond_ss_4process/output_active/report-0.nsys-rep --output gpu_aquaplanet_dyamond_ss_4process/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_aquaplanet_dyamond_ss_4process/output_active/
artifact_paths: "gpu_aquaplanet_dyamond_ss_4process/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -227,9 +267,13 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ws_1process
- >
srun --cpu-bind=threads --cpus-per-task=4
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_dyamond_ws_1process/output_active/report
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ws_1process.yml
--job_id gpu_aquaplanet_dyamond_ws_1process

- nsys stats --report cuda_gpu_trace gpu_aquaplanet_dyamond_ws_1process/output_active/report.nsys-rep --output gpu_aquaplanet_dyamond_ws_1process/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_aquaplanet_dyamond_ws_1process/output_active/
artifact_paths: "gpu_aquaplanet_dyamond_ws_1process/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -247,9 +291,13 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ws_2process
- >
srun --cpu-bind=threads --cpus-per-task=4
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_dyamond_ws_2process/output_active/report-0.nsys-rep,gpu_aquaplanet_dyamond_ws_2process/output_active/report-1.nsys-rep
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ws_2process.yml
--job_id gpu_aquaplanet_dyamond_ws_2process

- nsys stats --report cuda_gpu_trace gpu_aquaplanet_dyamond_ws_2process/output_active/report-0.nsys-rep --output gpu_aquaplanet_dyamond_ws_2process/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_aquaplanet_dyamond_ws_2process/output_active/
artifact_paths: "gpu_aquaplanet_dyamond_ws_2process/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -267,9 +315,16 @@ steps:
- mkdir -p gpu_aquaplanet_dyamond_ws_4process
- >
srun --cpu-bind=threads --cpus-per-task=4
nsys profile --delay 100 --trace=nvtx,mpi,cuda,osrt --output=gpu_aquaplanet_dyamond_ws_4process/output_active/report-0.nsys-rep,gpu_aquaplanet_dyamond_ws_4process/output_active/report-1.nsys-rep,gpu_aquaplanet_dyamond_ws_4process/output_active/report-2.nsys-rep,gpu_aquaplanet_dyamond_ws_4process/output_active/report-3.nsys-rep
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${GPU_CONFIG_PATH}gpu_aquaplanet_dyamond_ws_4process.yml
--job_id gpu_aquaplanet_dyamond_ws_4process

- nsys stats --report cuda_gpu_trace gpu_aquaplanet_dyamond_ws_4process/output_active/report-0.nsys-rep --output gpu_aquaplanet_dyamond_ws_4process/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_aquaplanet_dyamond_ws_4process/output_active/
artifact_paths: "gpu_aquaplanet_dyamond_ws_4process/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -311,6 +366,9 @@ steps:
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${MODEL_CONFIG_PATH}aquaplanet_diagedmf.yml
--job_id gpu_aquaplanet_diagedmf

- nsys stats --report cuda_gpu_trace gpu_aquaplanet_diagedmf/output_active/report.nsys-rep --output gpu_aquaplanet_diagedmf/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_aquaplanet_diagedmf/output_active/
artifact_paths: "gpu_aquaplanet_diagedmf/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"
@@ -345,6 +403,9 @@ steps:
julia --threads=3 --color=yes --project=examples examples/hybrid/driver.jl
--config_file ${MODEL_CONFIG_PATH}aquaplanet_progedmf.yml
--job_id gpu_aquaplanet_progedmf

- nsys stats --report cuda_gpu_trace gpu_aquaplanet_progedmf/output_active/report.nsys-rep --output gpu_aquaplanet_progedmf/output_active/ --format csv
- julia --project=.buildkite/analysis .buildkite/nsight_analysis.jl --out_dir gpu_aquaplanet_progedmf/output_active/
artifact_paths: "gpu_aquaplanet_progedmf/output_active/*"
env:
CLIMACOMMS_DEVICE: "CUDA"