
False Positive on CI tests: test-readme #1315

Open
Jack-Khuu opened this issue Oct 18, 2024 · 2 comments
Labels: actionable (Items in the backlog waiting for an appropriate impl/fix), bug (Something isn't working)

Comments

@Jack-Khuu (Contributor)

🐛 Describe the bug

In CI, there are a few tests that should be flagged as failing, but are currently marked as green.

Specifically, they seem to revolve around the test-readme unit tests; see the surfacing PR (#1309) for examples.

Versions

NA

Jack-Khuu added the bug (Something isn't working) and actionable (Items in the backlog waiting for an appropriate impl/fix) labels on Oct 18, 2024
@mikekgfb (Contributor) commented Nov 5, 2024

@seemethere @malfet @kit1980 can you please take a look at why these tests are not flagged as failing?

Ditto from yesterday: https://github.com/pytorch/torchchat/actions/runs/11634438558/job/32413340991?pr=1339 is shown as passed even though commands in the test failed and aborted with an error indication.

@mikekgfb (Contributor) commented

This is still happening.

One possible explanation is that somewhere the code is catching the exception, pretty-printing it, and then exiting with a non-error code. Failures such as https://github.com/pytorch/torchchat/actions/runs/12243820522/job/34154220414?pr=1404 show that the script continues executing (see the test run dump below) even though the first command in the generated test is set -eou pipefail, which instructs the shell to abort immediately on the first error and report a failure to the caller. (The workflows also use this setting recursively, so the first test error should cascade all the way to the top and fail the test.)

+ python3 torchchat.py generate stories15M --prompt 'write me a story about a boy and his bear'
## Running via PyTorch 
  Downloading https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.pt...
  Downloading https://github.com/karpathy/llama2.c/raw/master/tokenizer.model...
  NumExpr defaulting to 6 threads.
  PyTorch version 2.6.0.dev20241013 available.
  Moving model to /Users/runner/.torchchat/model-cache/stories15M.
  
  Downloading builder script:   0%|          | 0.00/5.67k [00:00<?, ?B/s]
  Downloading builder script: 100%|██████████| 5.67k/5.67k [00:00<00:00, 5.30MB/s]
  Traceback (most recent call last):
    File "/Users/runner/work/torchchat/torchchat/torchchat.py", line 96, in <module>
  Using device=mps 
  Loading model...
      generate_main(args)
    File "/Users/runner/work/torchchat/torchchat/torchchat/generate.py", line 1235, in main
      gen = Generator(
    File "/Users/runner/work/torchchat/torchchat/torchchat/generate.py", line 293, in __init__
      self.model = _initialize_model(self.builder_args, self.quantize, self.tokenizer)
    File "/Users/runner/work/torchchat/torchchat/torchchat/cli/builder.py", line 603, in _initialize_model
      model = _load_model(builder_args)
    File "/Users/runner/work/torchchat/torchchat/torchchat/cli/builder.py", line 465, in _load_model
      model = _load_model_default(builder_args)
    File "/Users/runner/work/torchchat/torchchat/torchchat/cli/builder.py", line 427, in _load_model_default
      checkpoint = _load_checkpoint(builder_args)
    File "/Users/runner/work/torchchat/torchchat/torchchat/cli/builder.py", line 412, in _load_checkpoint
      checkpoint = torch.load(
    File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/serialization.py", line 1359, in load
      return _load(
    File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/serialization.py", line 1856, in _load
      result = unpickler.load()
    File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/_weights_only_unpickler.py", line 388, in load
      self.append(self.persistent_load(pid))
    File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/serialization.py", line 1820, in persistent_load
  Time to load model: 0.10 seconds
      typed_storage = load_tensor(
    File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/serialization.py", line 1792, in load_tensor
      wrap_storage=restore_location(storage, location),
    File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/serialization.py", line 1693, in restore_location
      return default_restore_location(storage, map_location)
    File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/serialization.py", line 601, in default_restore_location
      result = fn(storage, location)
    File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/serialization.py", line 467, in _mps_deserialize
      return obj.mps()
    File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/storage.py", line 260, in mps
      return torch.UntypedStorage(self.size(), device="mps").copy_(self, False)
  RuntimeError: MPS backend out of memory (MPS allocated: 1.02 GB, other allocations: 0 bytes, max allowed: 15.87 GB). Tried to allocate 256 bytes on shared pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
+ echo ::group::Completion
Completion
  + echo 'tests complete'
  tests complete
  + echo '*******************************************'
  *******************************************
  + echo ::endgroup::
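The suspected failure mode can be reproduced with a minimal sketch (hypothetical script, not torchchat's actual code): under `set -eou pipefail` a directly failing Python command aborts the script, but a wrapper that catches the exception and then falls off the end of the program exits 0, so the shell never sees an error.

```shell
#!/bin/bash
# Sketch: why `set -eou pipefail` cannot catch an error that Python swallows.

# A step that fails the way torch.load does: an uncaught exception
# makes the interpreter exit with a non-zero status.
failing_step() {
  python3 -c 'raise RuntimeError("MPS backend out of memory")' 2>/dev/null
}

# A wrapper that catches the exception and pretty-prints it, but never
# re-raises or calls sys.exit(1): the interpreter exits 0, so from the
# shell's point of view the step succeeded.
swallowing_wrapper() {
  python3 -c '
try:
    raise RuntimeError("MPS backend out of memory")
except RuntimeError as e:
    print(f"error: {e}")  # printed to the log, but exit status stays 0
'
}

# Collect both exit statuses without aborting this demo script itself.
set +e
failing_step;       direct_status=$?
swallowing_wrapper; wrapped_status=$?
set -e

echo "direct: $direct_status, wrapped: $wrapped_status"
```

Running this prints a non-zero status for the direct failure and 0 for the wrapped one; under `set -eou pipefail` only the first would abort the test, which would explain why the traceback appears in the log while the job is still marked green.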
