mcshao/LAT-Audio

mcshao committed 16 days ago

Commit bb040a3 (verified) · 1 Parent(s): cb1d6af

Update README.md

Files changed (1): README.md

@@ -5,7 +5,7 @@ language:
 - en
 base_model:
 - Qwen/Qwen3-Omni-30B-A3B-Instruct
----
 # LAT-Audio
@@ -21,7 +21,7 @@ During reasoning, LAT-Audio iteratively incorporates audio evidence through a **
 - temporal hallucination (invalid timestamps)
 - timestamp drift (progressive misalignment over time)
----
 ## Model Description
@@ -38,7 +38,7 @@ LAT-Audio formulates long-form audio understanding as a structured reasoning pro
 This design enables robust temporal reasoning under long-context settings, where conventional direct modeling approaches often fail.
----
 ## Model Variants
@@ -48,7 +48,7 @@ We provide two model variants:
 | **LAT-Audio** | ✅ Yes | LAT-Chronicle | Tool-augmented multi-step reasoning model with global-to-local temporal inference |
 | **LAT-Audio-Base** | ❌ No | LAT-Chronicle + in-house | Direct modeling baseline fine-tuned from Qwen3-Omni with more in-house data, offering faster and simpler inference |
----
 ## Quick Start
@@ -58,10 +58,11 @@ pip install -U "huggingface_hub[cli]"
 huggingface-cli download mcshao/LAT-Audio --local-dir ./LAT-Audio
 huggingface-cli download mcshao/LAT-Audio-Base --local-dir ./LAT-Audio-Base
 For detailed inference methods and examples, please refer to the official repository:
 👉 https://github.com/alanshaoTT/LAT-Audio-Repo
----
 ## Contact
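
The Quick Start hunk above downloads both variants with `huggingface-cli`. As a minimal sketch of the same step from Python, assuming the `huggingface_hub` package installed by the `pip` command in that hunk, the snippet below uses `snapshot_download`. The helper names `VARIANTS`, `local_dir_for`, and `download_variant` are hypothetical and chosen only for illustration; they are not part of the LAT-Audio repository.

```python
from pathlib import Path

# Repo IDs copied from the Quick Start commands in the README.
VARIANTS = {
    "LAT-Audio": "mcshao/LAT-Audio",
    "LAT-Audio-Base": "mcshao/LAT-Audio-Base",
}


def local_dir_for(name: str, root: str = ".") -> Path:
    """Mirror the --local-dir layout of the CLI commands (<root>/<name>)."""
    if name not in VARIANTS:
        raise KeyError(f"unknown variant: {name!r}")
    return Path(root) / name


def download_variant(name: str, root: str = ".") -> str:
    """Download one variant and return the local folder path."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(name, root)
    return snapshot_download(repo_id=VARIANTS[name], local_dir=str(target))
```

Calling `download_variant("LAT-Audio-Base")` should fetch the same files as the second CLI command into `./LAT-Audio-Base`. For inference itself, the README points to the official repository rather than to this download step.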
