This post is a notepad where I'll record tokens-per-second stats for running Mixtral on different hardware.
1 RTX A2000:
6GB VRAM
Cost: $0.12/h
81%/19% CPU/GPU
Tokens per second: 1.31
Tokens per dollar: 39,300
Cost per token: 0.0025 cents
1 RTX 3090 Ti:
12GB VRAM
Cost: $0.18/h
58%/42% CPU/GPU
Tokens per second: 3.25 (usable for small text)
Tokens per dollar: 65,000
Cost per token: 0.0015 cents
I'm still adding more
1 RTX 4090:
24GB VRAM
Cost: $0.34/h
15%/85% CPU/GPU
Tokens per second: 10.28
Tokens per dollar: 108,847
Cost per token: 0.00092 cents
1 RTX A6000:
48GB VRAM
Cost: $0.44/hr
100% GPU
Tokens per second: 69.47
Tokens per dollar: 568,390
Cost per token: 0.00018 cents
2 RTX A5000:
24GB VRAM per card, 48GB total
Cost: $0.44/hr
100% GPU
Tokens per second: 62.76
1 V100 SXM2 32GB:
32GB VRAM
Cost: $0.33/hr
100% GPU
Tokens per second: 46.02
Tokens per dollar: 502,036
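
For anyone who wants to check or extend the numbers, here is a rough Python sketch of how the derived figures follow from tokens per second and hourly cost. The function names and the `rigs` table are my own illustration, not part of any tool; the inputs are just the figures recorded in this post, and it assumes the card is generating at the measured rate for the whole billed hour.

```python
SECONDS_PER_HOUR = 3600


def tokens_per_dollar(tokens_per_second: float, cost_per_hour: float) -> float:
    """Tokens generated for one dollar of rental time."""
    return tokens_per_second * SECONDS_PER_HOUR / cost_per_hour


def cents_per_token(tokens_per_second: float, cost_per_hour: float) -> float:
    """Rental cost of a single token, in US cents."""
    return 100 * cost_per_hour / (tokens_per_second * SECONDS_PER_HOUR)


# (tokens per second, cost in $ per hour), copied from the entries in this post
rigs = {
    "1 RTX A2000": (1.31, 0.12),
    "1 RTX 3090 Ti": (3.25, 0.18),
    "1 RTX 4090": (10.28, 0.34),
    "1 RTX A6000": (69.47, 0.44),
    "2 RTX A5000": (62.76, 0.44),
    "1 V100 SXM2 32GB": (46.02, 0.33),
}

if __name__ == "__main__":
    for name, (tps, cost) in rigs.items():
        print(
            f"{name}: {tokens_per_dollar(tps, cost):,.0f} tokens/$, "
            f"{cents_per_token(tps, cost):.5f} cents/token"
        )
```

Running it on the A2000 figures gives 39,300 tokens per dollar, matching the entry in this post. It also fills in the 2 RTX A5000 setup, which comes out to roughly 513,000 tokens per dollar.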