[WIP] Initial PR for generating and loading state dict #1329

Jack-Khuu · 2024-10-25T00:14:22Z

No description provided.

pytorch-bot · 2024-10-25T00:14:25Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1329

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 3 Cancelled Jobs

As of commit 8ade3c4 with merge base 70260eb:

NEW FAILURES - The following jobs have failed:

pull / runner-aoti (macos-14-xlarge) (gh)
torch._dynamo.exc.InternalTorchDynamoError: AttributeError: freqs_cis
pull / runner-et (16-core-ubuntu) (gh)
TypeError: 'NoneType' object is not subscriptable
pull / test-cpu-aoti (aarch64, stories15M) (gh)
torch._dynamo.exc.InternalTorchDynamoError: AttributeError: freqs_cis
pull / test-cpu-aoti (x86_64, stories15M) (gh)
torch._dynamo.exc.InternalTorchDynamoError: AttributeError: freqs_cis
pull / test-gpu-aoti-bfloat16 (cuda, stories15M) / linux-job (gh)
RuntimeError: Command docker exec -t ec5173b29d322828c0094f18add992729dbdf8d7c20c00ee0f9f495b56a207fa /exec failed with exit code 1
pull / test-gpu-aoti-float16 (cuda, stories15M) / linux-job (gh)
RuntimeError: Command docker exec -t 9d8fd50dab38b0e48c72d36fe17351c34000ac010b0b351407b1e8f23a618d49 /exec failed with exit code 1
pull / test-gpu-aoti-float32 (cuda, stories15M) / linux-job (gh)
RuntimeError: Command docker exec -t e8bd9ac3615250f3f60d724f6c53d4233f2057d2d08bb2ca1babd093938cae51 /exec failed with exit code 1
pull / test-tinystories-executorch (macos-14-xlarge) (gh)
TypeError: 'NoneType' object is not subscriptable
Run the aoti runner with CUDA using stories / test-runner-aot-cuda / linux-job (gh)
RuntimeError: Command docker exec -t 9027be63efb5d9ec980fb7d08d1c765efb245165e12f49c01703837473ffcf5f /exec failed with exit code 1

CANCELLED JOBS - The following jobs were cancelled. Please retry:

pull / runner-aoti (16-core-ubuntu) (gh)
##[error]The operation was canceled.
pull / runner-et (macos-14-xlarge) (gh)
##[error]The operation was canceled.
pull / test-tinystories-executorch (16-core-ubuntu) (gh)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2024-10-25T04:11:16Z

torchchat/cli/cli.py

@@ -148,6 +148,12 @@ def _add_model_config_args(parser, verb: str) -> None:
            help="Whether to compile the prefill. Improves prefill perf, but has higher compile times.",
        )

+    model_config_parser.add_argument(


After the migration, we shouldn't need anything special for quantized models, right?

Jack-Khuu · 2024-10-25

Not sure I follow. The things I'm testing out should work for tensor subclasses, right?

jerryzh168 · 2024-10-25

Oh, I meant that with the tensor subclass API, a quantized checkpoint should be loadable the same way as a normal checkpoint.

Keeping separate code paths for loading a quantized model vs. quantizing a model on the fly might still make sense, though; maybe we just need to change the naming or something.

Add initial PR for generating and loading state dict

c333a78

facebook-github-bot added the CLA Signed label Oct 25, 2024

Add None check

f3fbc91

jerryzh168 reviewed Oct 25, 2024

Jack-Khuu added 2 commits October 25, 2024 12:16

Generalize to any state_dict

763a9ce

Merge branch 'main' into add_quant_saving

8ade3c4
