Here are some efficiency numbers comparing a Monarch-based MLP with a vanilla nn.Linear-based MLP. I found that Monarch is best suited to MLPs in Transformer architectures, which generally have a large hidden size and batch size. In recommendation-focused models, the MLP is usually small (e.g., 10000x1024x512, where the first number is the feature input dimension) and, importantly, a small batch size (say 10) is often used when serving concurrent online requests. The numbers below are provided as a reference for anyone with similar tasks.
| Batch size | Model | Train (Fwd+Bwd), GPU-P100 | Test (Fwd only), GPU-P100 | Test (Fwd only), CPU |
| --- | --- | --- | --- | --- |
| 1000 | MLP(10000x1024x512) | 2.95ms | 0.16ms | 26.57ms |
| 1000 | Monarch(nblk=4) | 1.85ms | 0.57ms | 10.29ms |
| 1000 | Monarch(nblk=16) | 1.37ms | 0.55ms | 5.67ms |
| 10 | MLP(10000x1024x512) | 0.48ms | 0.13ms | 0.59ms |
| 10 | Monarch(nblk=4) | 1.34ms | 0.54ms | 1.16ms |
| 10 | Monarch(nblk=16) | 1.31ms | 0.52ms | 1.37ms |
| 10000 | MLP(1024x1024x512) | 4.86ms | 0.13ms | 46.99ms |
| 10000 | Monarch(nblk=4) | 6.87ms | 0.53ms | 47.55ms |
| 10000 | Monarch(nblk=16) | 6.04ms | 0.51ms | 39.66ms |
| 1000 | MLP(1024x1024x512) | 0.74ms | 0.16ms | 5.35ms |
| 1000 | Monarch(nblk=4) | 1.42ms | 0.53ms | 4.17ms |
| 1000 | Monarch(nblk=16) | 1.38ms | 0.52ms | 3.84ms |
| 10 | MLP(1024x1024x512) | 0.46ms | 0.13ms | 0.27ms |
| 10 | Monarch(nblk=4) | 1.29ms | 0.53ms | 1.15ms |
| 10 | Monarch(nblk=16) | 1.27ms | 0.51ms | 0.84ms |
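To make the comparison easier to read, the timings reported above can be turned into speedup ratios (dense MLP time divided by Monarch time, so values above 1 mean Monarch is faster). The sketch below covers only the two 10000x1024x512 configurations; the dictionary layout and the `speedups` helper are illustrative names of my own, not part of any library.

```python
# Timings in ms copied from the numbers above, in the order
# (train on GPU, test on GPU, test on CPU). Keys are (MLP shape, batch size).
TIMINGS = {
    ("10000x1024x512", 1000): {
        "MLP": (2.95, 0.16, 26.57),
        "Monarch(nblk=4)": (1.85, 0.57, 10.29),
        "Monarch(nblk=16)": (1.37, 0.55, 5.67),
    },
    ("10000x1024x512", 10): {
        "MLP": (0.48, 0.13, 0.59),
        "Monarch(nblk=4)": (1.34, 0.54, 1.16),
        "Monarch(nblk=16)": (1.31, 0.52, 1.37),
    },
}

COLUMNS = ("train_gpu", "test_gpu", "test_cpu")


def speedups(config):
    """Return {model: {column: dense_time / model_time}} for one config."""
    rows = TIMINGS[config]
    dense = rows["MLP"]
    result = {}
    for model, times in rows.items():
        if model == "MLP":
            continue  # the dense MLP is the baseline
        result[model] = {
            col: base / t for col, base, t in zip(COLUMNS, dense, times)
        }
    return result


for config in TIMINGS:
    for model, ratios in speedups(config).items():
        pretty = ", ".join(f"{col}={r:.2f}x" for col, r in ratios.items())
        print(config, model, pretty)
```

With these inputs, Monarch(nblk=16) at batch size 1000 comes out ahead for GPU training and CPU inference, while at batch size 10 every ratio is below 1, which matches the observation that small serving batches do not benefit.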
I will post the numbers for pixelfly later.
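For anyone who wants to collect comparable numbers on their own models, here is a minimal timing sketch. It is not the script used for the measurements above; `time_ms` and its parameters are hypothetical names. It reports the median over several runs after a warmup, and takes an optional `sync` callback so that asynchronous GPU work can be waited on before the clock stops (with PyTorch on CUDA, `torch.cuda.synchronize` would be the natural thing to pass).

```python
import statistics
import time
from typing import Callable, Optional


def time_ms(fn: Callable[[], object],
            warmup: int = 10,
            iters: int = 100,
            sync: Optional[Callable[[], None]] = None) -> float:
    """Return the median wall-clock time of fn() in milliseconds."""
    # Warmup runs absorb one-off costs (allocation, caching, lazy init).
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()  # make sure warmup work has finished before timing starts

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for queued device work before stopping the clock
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


# Stand-in CPU workload; replace the lambda with a forward (or
# forward + backward) pass of the model being measured.
elapsed = time_ms(lambda: sum(range(10_000)), warmup=3, iters=20)
print(f"median: {elapsed:.3f} ms")
```

The median is used rather than the mean so that a few slow outlier runs do not dominate sub-millisecond measurements like the ones in this thread.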