KVAE 2.0
KVAE 2.0 is a family of video tokenizers with a temporal compression ratio of 4 and spatial compression ratios of 8 and 16.
The KVAE-3D-2.0-t4s8 model has a temporal compression ratio of 4 and a spatial compression ratio of 8x8.
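To make the compression ratios concrete, here is a minimal sketch of how the latent grid size follows from the input size. The function name `latent_shape` and its parameters are hypothetical and not part of the KVAE code. The `causal` option reflects a convention used by some video VAEs, where the first frame is encoded on its own; whether KVAE 2.0 follows that convention is not stated here, so treat it as an assumption.

```python
def latent_shape(frames, height, width, t_ratio=4, s_ratio=8, causal=False):
    """Return (latent_frames, latent_height, latent_width) for a video clip.

    Hypothetical helper for illustration only; it is not part of KVAE.
    """
    if height % s_ratio or width % s_ratio:
        raise ValueError("height and width must be divisible by s_ratio")
    if causal:
        # Assumed convention: the first frame is encoded alone,
        # the remaining frames are grouped by t_ratio.
        if (frames - 1) % t_ratio:
            raise ValueError("frames - 1 must be divisible by t_ratio")
        latent_frames = 1 + (frames - 1) // t_ratio
    else:
        if frames % t_ratio:
            raise ValueError("frames must be divisible by t_ratio")
        latent_frames = frames // t_ratio
    return latent_frames, height // s_ratio, width // s_ratio


# A 16-frame 1280x720 clip at 4x8x8 compression.
print(latent_shape(16, 720, 1280))
# The same clip at a spatial ratio of 16.
print(latent_shape(16, 720, 1280, s_ratio=16))
```

Under these assumptions, a 16-frame 1280x720 clip maps to a 4x90x160 latent grid at 4x8x8 and to 4x45x80 at a spatial ratio of 16. The number of latent channels is not covered by this sketch.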
For evaluation, the open datasets MCL-JCV (1280x720 video) and BVI-DVC were used. Wan-2.1 and HunyuanVideo-1.0 were considered as alternatives in the 4x8x8 format. The results below compare reconstruction quality using the PSNR, SSIM, and LPIPS metrics (LPIPS with AlexNet features).
Reconstruction comparison of KVAE 2.0, Hunyuan 1.0 and Wan 2.1
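For readers unfamiliar with the metrics, the following is a minimal NumPy sketch of PSNR between an original frame and its reconstruction. It uses the standard definition, PSNR = 10 * log10(MAX^2 / MSE). The function name `psnr` and the 8-bit default for `max_value` are assumptions, and the exact evaluation protocol behind the comparison (for example, per-frame versus per-video averaging) is not specified here. SSIM and LPIPS need dedicated implementations and are not shown.

```python
import numpy as np


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally shaped arrays."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstruction, dtype=np.float64)
    if ref.shape != rec.shape:
        raise ValueError("inputs must have the same shape")
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        # Identical inputs: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Toy example: a 720p frame and a copy with every pixel off by one level.
frame = np.zeros((720, 1280, 3), dtype=np.uint8)
shifted = frame + 1
print(round(psnr(frame, shifted), 2))
```

Higher PSNR means the reconstruction is closer to the original. For a whole video, one simple option is to average the per-frame values.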
