Commit 0c5c68a by gregtatum · Parent(s): f2fce80

Add a precisions table

Files changed (1): README.md (+17 -5)

README.md CHANGED
@@ -42,11 +42,23 @@ to end to test in Firefox on some vectors here was the cosine similarity for the
 mean pooled result. Note that the vector math happens in the f32 space, but storage
 for the embeddings is in a lower precision.
 
-f32 vs f16: cosine similarity = 1.00000000<br/>
-→ They are essentially identical in direction.
-
-f32 vs f8: cosine similarity = 0.99956375<br/>
-→ Very close, only tiny quantization effects.
+> f32 vs f16: cosine similarity = 1.00000000<br/>
+> → They are essentially identical in direction.
+>
+> f32 vs f8: cosine similarity = 0.99956375<br/>
+> → Very close, only tiny quantization effects.
 
 Note that this was done on `torch.float8_e4m3fn`, while `torch.float8_e5m2` generally
 has more loss.
+
+Precision also affects download size. For instance, with the larger
+[minishlab/potion-multilingual-128M/](models/minishlab/potion-multilingual-128M/README.md)
+model, `fp32` is 228M compressed while `fp8_e4m3` is only 51M, with competitive
+quantization quality.
+
+| precision    | dimensions | size    |
+| ------------ | ---------- | ------- |
+| fp32         | 128        | 228M    |
+| fp16         | 128        | 114M    |
+| **fp8_e4m3** | 128        | **51M** |
+| fp8_e5m2     | 128        | 44M     |
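The cosine-similarity comparison described in the diff above can be reproduced with a short sketch. This is an illustrative approximation, not the repo's actual measurement code: numpy stands in for torch, and because numpy has no 8-bit float type, fp8 storage is simulated by rounding the mantissa (3 bits for e4m3, 2 bits for e5m2) while ignoring the reduced exponent range and overflow behavior.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # The vector math happens in f32 space, matching the README's note;
    # only the *storage* of the embedding is lower precision.
    a = a.astype(np.float32)
    b = b.astype(np.float32)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def round_mantissa(x: np.ndarray, mantissa_bits: int) -> np.ndarray:
    # Crude fp8 stand-in: keep only `mantissa_bits` bits of mantissa.
    # e4m3 has 3 mantissa bits, e5m2 only 2, hence e5m2's larger loss.
    m, e = np.frexp(x)                     # x == m * 2**e, with |m| in [0.5, 1)
    scale = 2.0 ** (mantissa_bits + 1)
    return np.ldexp(np.round(m * scale) / scale, e).astype(np.float32)

rng = np.random.default_rng(0)
embedding = rng.standard_normal(128).astype(np.float32)  # 128 dims, as in the table

sim_f16 = cosine_similarity(embedding, embedding.astype(np.float16))
sim_e4m3 = cosine_similarity(embedding, round_mantissa(embedding, 3))
sim_e5m2 = cosine_similarity(embedding, round_mantissa(embedding, 2))
```

The float16 round-trip leaves the direction essentially unchanged, while the coarser e5m2 mantissa drifts more than e4m3, matching the loss ordering noted in the README.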