
This post is meant to be a notepad where I'll record some stats on tokens per second for different hardware running Mixtral.

1 RTX A2000:
6GB VRAM
Cost: $0.12/h
81%/19% CPU/GPU
Tokens per second: 1.31
Tokens per dollar: 39,300
Cost per token: 0.0025 cents

1 RTX 3090 TI:
12GB VRAM
Cost: $0.18/h
58%/42% CPU/GPU
Tokens per second: 3.25 //Usable for small text
Tokens per dollar: 65,000
Cost per token: 0.0015 cents

I'm still adding more

1 RTX 4090:
24GB VRAM
Cost: $0.34/h
15%/85% CPU/GPU
Tokens per second: 10.28
Tokens per dollar: 108,847
Cost per token: 0.00092 cents

1 RTX A6000:
48GB VRAM
Cost: $0.44/hr
100% GPU
Tokens per second: 69.47
Tokens per dollar: 568,390
Cost per token: 0.00018 cents

2 RTX A5000:
24GB VRAM each, 48GB total
Cost: $0.44/hr
100% GPU
Tokens per second: 62.76

1 V100 SXM2 32GB:
32GB VRAM
Cost: $0.33/hr
100% GPU
Tokens per second: 46.02
Tokens per dollar: 502,036
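
For anyone who wants to check or extend the numbers: the tokens per dollar and cost per token figures above appear to follow from tokens per second times 3600, divided by the hourly price. Here is a rough Python sketch of that arithmetic. The names (RUNS, tokens_per_dollar, cents_per_token) are my own and purely illustrative; the prices and speeds are copied from the entries in this post.

```python
# Sketch: derive tokens per dollar and cost per token from the measured
# tokens/s and hourly rental price listed in this post.
# Assumption: the derived figures use a plain 3600 seconds per billed hour.

SECONDS_PER_HOUR = 3600

# (label, cost in $/hour, measured tokens per second), copied from the post
RUNS = [
    ("1x RTX A2000", 0.12, 1.31),
    ("1x RTX 3090 TI", 0.18, 3.25),
    ("1x RTX 4090", 0.34, 10.28),
    ("1x RTX A6000", 0.44, 69.47),
    ("2x RTX A5000", 0.44, 62.76),
    ("1x V100 SXM2 32GB", 0.33, 46.02),
]


def tokens_per_dollar(cost_per_hour, tokens_per_second):
    """Tokens generated for one dollar of rental time."""
    return tokens_per_second * SECONDS_PER_HOUR / cost_per_hour


def cents_per_token(cost_per_hour, tokens_per_second):
    """Rental cost of a single token, in cents."""
    return 100 / tokens_per_dollar(cost_per_hour, tokens_per_second)


# Print one summary line per hardware configuration.
for label, cost, tps in RUNS:
    tpd = tokens_per_dollar(cost, tps)
    cpt = cents_per_token(cost, tps)
    print(f"{label:<18} {tpd:>10,.0f} tokens/$  {cpt:.5f} cents/token")
```

By the same formula, the 2 RTX A5000 setup works out to roughly 513,000 tokens per dollar. That is a derived figure, not one I recorded separately.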
