vsro200 committed
Commit c3f3cdf · verified · 1 Parent(s): 5613182

Update README.md

Files changed (1)
  1. README.md +6 -16
README.md CHANGED
@@ -21,25 +21,15 @@ To assess the representational quality of our trained VSR encoder independently
 
  For training code, preprocessing pipelines, and evaluation scripts, please refer to the [GitHub repository](https://github.com/vsro200/vsro200).
 
- ## Configurations
-
- We trained four MLP variants that differ only in the visual preprocessing applied before the encoder:
-
- | Variant | Crop size | Region of interest |
- |:---|:---:|:---|
- | MLP v1 | 96 × 96 | Full-face resize |
- | MLP v2 | 64 × 64 | Center-Middle |
- | MLP v3 | 64 × 64 | Center-Bottom |
-
  ## Results
 
- Top-1 and Top-5 word classification accuracy (%) on the LRRo `Lab` (controlled studio recordings) and `Wild` (in-the-wild) test sets. Higher is better.
+ We trained four MLP variants that differ only in the visual preprocessing applied before the encoder. Top-1 and Top-5 word classification accuracy (%) on the LRRo `Lab` (controlled studio recordings) and `Wild` (in-the-wild) test sets. Higher is better.
 
- | Variant | Lab Acc@1 | Lab Acc@5 | Wild Acc@1 | Wild Acc@5 |
- |:---|:---:|:---:|:---:|:---:|
- | MLP v1 | 90.6 | 98.5 | 64.5 | 87.6 |
- | MLP v2 | 91.4 | 99.0 | 68.6 | 89.3 |
- | MLP v3 | **95.0** | **99.4** | **72.7** | **92.6** |
+ | Variant | Crop size | Region of interest | Lab Acc@1 | Lab Acc@5 | Wild Acc@1 | Wild Acc@5 |
+ |:---|:---:|:---|:---:|:---:|:---:|:---:|
+ | MLP v1 | 96 × 96 | Full-face resize | 90.6 | 98.5 | 64.5 | 87.6 |
+ | MLP v2 | 64 × 64 | Center-Middle | 91.4 | 99.0 | 68.6 | 89.3 |
+ | MLP v3 | 64 × 64 | Center-Bottom | **95.0** | **99.4** | **72.7** | **92.6** |
 
  Restricting the visual input to the lower half of the face (Center-Bottom crops) consistently outperforms full-face resizing, with the 64 × 64 crop (MLP v3) yielding the largest improvement on both Lab and Wild data.
 
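
The closing claim of the README (that Center-Bottom crops outperform full-face resizing) can be checked against the table with a little arithmetic. The sketch below is illustrative only and is not part of the repository: the names `METRICS`, `RESULTS`, and `gains_over_baseline` are hypothetical, and the accuracy values are copied from the merged results table in this diff.

```python
# Accuracy values (%) copied from the merged results table in this diff.
# All names below are hypothetical helpers, not part of the vsro200 codebase.
METRICS = ("Lab Acc@1", "Lab Acc@5", "Wild Acc@1", "Wild Acc@5")

RESULTS = {
    "MLP v1": (90.6, 98.5, 64.5, 87.6),  # 96 x 96, full-face resize
    "MLP v2": (91.4, 99.0, 68.6, 89.3),  # 64 x 64, Center-Middle
    "MLP v3": (95.0, 99.4, 72.7, 92.6),  # 64 x 64, Center-Bottom
}


def gains_over_baseline(results, baseline="MLP v1"):
    """Return the absolute gain, in percentage points, of each variant over the baseline."""
    base = results[baseline]
    return {
        name: {
            metric: round(value - base_value, 1)
            for metric, value, base_value in zip(METRICS, values, base)
        }
        for name, values in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, gains in gains_over_baseline(RESULTS).items():
        print(name, gains)
```

With these numbers, MLP v3 improves on MLP v1 by 4.4 points in Lab Acc@1 and 8.2 points in Wild Acc@1, so the gain is larger on the in-the-wild test set.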