
M1/M2 performance fix: use Apple MPS (metal performance shaders) if available #26

Open
wants to merge 2 commits into master

Conversation

@sghael commented Jul 17, 2023

Use Apple MPS (metal performance shaders) to move work to M1/M2 GPUs when available. The MPS backend support is part of the PyTorch 1.12 official release.

[Screenshot: July 17 2023 17:59:38]
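The change boils down to a version-tolerant device pick. A minimal sketch of the idea (not the PR's exact diff; the `hasattr` guard keeps pre-1.12 PyTorch builds from crashing on the missing `torch.backends.mps` attribute):

```python
import torch

# Prefer CUDA, then Apple MPS (available in PyTorch >= 1.12), then CPU.
# hasattr() guards against older PyTorch builds that lack torch.backends.mps.
device = ('cuda' if torch.cuda.is_available()
          else 'mps' if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available()
          else 'cpu')
print(device)
```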

@karpathy (Owner) commented

Will this code fail for older PyTorch versions?

@sghael (Author) commented Jul 18, 2023

Good call-out. The first version would trigger an exception on pre-1.12 versions of PyTorch.

I've modified the code to degrade gracefully on older versions of PyTorch (prior to 1.12). Unfortunately, this adds 2 lines to the code 😄 .

PyTorch 2.x on M1 silicon uses mps:

❯ python
Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:41:52) [Clang 15.0.7 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'2.0.0'
>>> device = ('cuda' if torch.cuda.is_available()
...           else 'mps' if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available()
...           else 'cpu')
>>> print(device)
mps

PyTorch 1.11 on M1 silicon now reverts to cpu:

❯ python
Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:41:52) [Clang 15.0.7 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.11.0.post2'
>>> device = ('cuda' if torch.cuda.is_available()
...           else 'mps' if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available()
...           else 'cpu')
>>> print(device)
cpu
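Picking the device string is only half the change; the model and each batch also need to be moved onto it. A minimal, self-contained sketch (the `Linear` model and random tensors are illustrative stand-ins, not the PR's actual gpt.py code):

```python
import torch

# Same version-tolerant selection as above.
device = ('cuda' if torch.cuda.is_available()
          else 'mps' if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available()
          else 'cpu')

# Move the model's parameters to the selected device, and create
# the input batch directly on that device.
model = torch.nn.Linear(4, 2).to(device)
x = torch.randn(8, 4, device=device)
y = model(x)  # forward pass runs on cuda/mps/cpu depending on availability
```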

Running gpt.py with mps on Apple M1 Pro:

❯ /Users/sghael/mambaforge/envs/ng-video-lecture/bin/python /Users/sghael/Developer/ng-video-lecture/gpt.py
mps
10.788929 M parameters
step 0: train loss 4.2221, val loss 4.2306

step 500: train loss 1.7446, val loss 1.9065
step 1000: train loss 1.3895, val loss 1.5960
...
step 4999: train loss 0.8609, val loss 1.5705

ESCALUS:
Enough of his very proper time;
Death of the poor little.

FRODH:
I cannot weither to staught the benefit of you.

ESCALUS:
Desperate your belly: when he hath infiss your joints
Throw his most mountsey: therefore, to say I hear,
And take your habby friends. Whence there I live?
An obstance of your brother have so heard that
You hope that slaughteer'd his sister than marry?

DUKE VINCENTIO:
'Come on; what's that's the princton; but that
I may know in require out absent;
That will run and

Running bigram.py with mps on Apple M1 Pro:

❯ /Users/sghael/mambaforge/envs/ng-video-lecture/bin/python /Users/sghael/Developer/ng-video-lecture/bigram.py
mps
step 0: train loss 4.7305, val loss 4.7241
step 300: train loss 2.8110, val loss 2.8249
...
step 2700: train loss 2.4738, val loss 2.4911

Foasthaprse tize herst el
O u fZEie hy:


Hak, CORineg aggell thrr Masearor charnge?
Tyoucre thy, chouspo in mppry way avend oubur'er sickes bokecard dhiceny

He tw el fe oupise he, lbustselownthous;
I m w
T:
The at;
I m hofaruk mondrn itheland's oe, oghithet f, badogienthofBRI'sey &CleDWeer'dsureisold array n
ICoyockind m murs, in mamybalorenyongmyooe, d Vofetthindy st
Hefqu brveseay alsteanerm to, oupomp rede d pre h, gavitYOfrrerean apsts lathind my d erouerse IOLUED d ngKE hicerire.
II IS:
I

@yihaoye commented Sep 30, 2023

Tried the method, but it seems to cause issue #32.
Thanks for sharing anyway.
