Update broke existing functionality

#5
by Eloque - opened

Whatever was done in this update, it broke things.
Suddenly, when downloading the model and tokenizer, I get this error:

"You are using a model of type higgs_audio_v2 to instantiate a model of type higgs_audio. This is not supported for all configurations of models and can yield errors."

Also, I might be dense, but why is there suddenly an 11 GB model.safetensors with this tokenizer?

Okay: for the examples in https://github.com/boson-ai/higgs-audio (and, for those of us with modest memory, https://github.com/sorbetstudio/faster-higgs-audio) to keep working, it is apparently necessary to manually download the previous revisions of both the model and the tokenizer.

Just putting the commit revisions in (i.e. 9d4988fbd4ad07b4cac3a5fa462741a41810dbec and 10840182ca4ad5d9d9113b60b9bb3c1ef1ba3f84) isn't enough.

Download them via the CLI:
huggingface-cli download bosonai/higgs-audio-v2-tokenizer --revision 9d4988fbd4ad07b4cac3a5fa462741a41810dbec --local-dir ./tokenizer_old --local-dir-use-symlinks False
huggingface-cli download bosonai/higgs-audio-v2-generation-3B-base --revision 10840182ca4ad5d9d9113b60b9bb3c1ef1ba3f84 --local-dir ./model_old --local-dir-use-symlinks False

And then point the loaders at them, something like this:
audio_tokenizer = load_higgs_audio_tokenizer("./faster-higgs-audio/models/tokenizer_old", device=get_device("cpu"))

model_client = HiggsAudioModelClient(
        model_path="./faster-higgs-audio/models/model_old",
        audio_tokenizer=audio_tokenizer,
        device_id=device_id,
        max_new_tokens=4096,
        use_static_kv_cache=False,
        device=device,
        use_quantization=True,
        quantization_bits=8,
    )

Those hacks kept my pipeline working, and by the way, they make it possible to run this stuff on a 3080. Still a fan, but this caused some head-scratching.
There was also something about a new transformers version in the commit; I guess we'll be checking that out.

What an awful way to start the day lol

This is how I fixed it on my end, without changing any code in the examples.

stable_snapshot_download is now my go-to function for downloading models.

It pins the exact snapshot and writes it as main, so the rest of the code can keep loading from main as usual. No need to pass revision anywhere else, and no changes to the loading logic. It just works.

def download_model():
    """Downloads the model weights during the image build step so it boots instantly."""
    from pathlib import Path

    from huggingface_hub import snapshot_download

    def use_snapshot_as_main(repo_path):
        # The cache stores snapshots under <repo>/snapshots/<commit_hash>;
        # writing that hash to <repo>/refs/main makes "main" resolve to it.
        repo_path = Path(repo_path)
        commit_hash = repo_path.name
        repo_ref = repo_path.parent.parent / "refs"
        repo_ref.mkdir(parents=True, exist_ok=True)
        with open(repo_ref / "main", "wt") as f:
            f.write(commit_hash)

    def stable_snapshot_download(repo_id: str, revision: str | None = None):
        repo_path = snapshot_download(repo_id=repo_id, revision=revision)
        use_snapshot_as_main(repo_path)

    stable_snapshot_download("bosonai/higgs-audio-v2-generation-3B-base", revision="10840182ca4ad5d9d9113b60b9bb3c1ef1ba3f84")
    stable_snapshot_download("bosonai/higgs-audio-v2-tokenizer", revision="9d4988fbd4ad07b4cac3a5fa462741a41810dbec")
    stable_snapshot_download("bosonai/hubert_base", revision="b4b85f1652c16ad63fdc818221b215b79ff55934")
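To see what use_snapshot_as_main actually does, here is a self-contained simulation of the hub cache layout (models--org--name/snapshots/commit) in a temporary directory; nothing is downloaded and the real cache is untouched:

```python
import tempfile
from pathlib import Path

def use_snapshot_as_main(repo_path):
    # Same helper as above: record the snapshot's commit hash as refs/main.
    repo_path = Path(repo_path)
    commit_hash = repo_path.name
    repo_ref = repo_path.parent.parent / "refs"
    repo_ref.mkdir(parents=True, exist_ok=True)
    (repo_ref / "main").write_text(commit_hash)

# Simulate the hub cache layout, models--<org>--<name>/snapshots/<commit>/,
# in a throwaway directory.
cache = Path(tempfile.mkdtemp())
commit = "10840182ca4ad5d9d9113b60b9bb3c1ef1ba3f84"
snapshot = cache / "models--bosonai--higgs-audio-v2-generation-3B-base" / "snapshots" / commit
snapshot.mkdir(parents=True)

use_snapshot_as_main(snapshot)
print((snapshot.parent.parent / "refs" / "main").read_text())  # → the pinned commit hash
```

Anything that later resolves "main" for this repo from the cache now lands on the pinned snapshot.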

You can also add this at the very start of the script (before transformers or huggingface_hub are imported, since they read these variables at import time) so it never checks for model updates at runtime:

import os
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_DATASETS_OFFLINE"] = "1"

This keeps the current script fully locked to the downloaded snapshots. It boots faster, does not try to hit the network, and stays reproducible.

I have tried both @Eloque's and @wonderboy's solutions.
I just don't know enough to make either of these gifts work.
Can someone provide additional information on how to implement one of them?

For instance, @wonderboy's code is clearly a Python function, but he states: "This is how I fixed it on my end, without changing any code in the examples." That leaves me with no clue as to where to put it.

What I have tried:
My program imports a modified generation.py. The only change is that I renamed main in generation.py to mainly, to avoid the main functions conflicting on import.
From my program I call generation with
result = runner.invoke( mainly, [ "--transcript", f"{the_line}", "--ref_audio", f"{the_voice}", "--out_path", f"{output_file}" ] )
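That runner.invoke call matches Click's testing API (CliRunner), which I believe generation.py's entry point uses. A self-contained sketch of the pattern, with mainly replaced by a minimal stand-in command so the wiring is visible (the real command and its options live in generation.py):

```python
import click
from click.testing import CliRunner

# Minimal stand-in for the renamed `mainly` command from generation.py
# (hypothetical; the real command has many more options).
@click.command()
@click.option("--transcript")
@click.option("--ref_audio")
@click.option("--out_path")
def mainly(transcript, ref_audio, out_path):
    click.echo(f"would generate {out_path} from {ref_audio}")

runner = CliRunner()
result = runner.invoke(
    mainly,
    ["--transcript", "Hello.", "--ref_audio", "voice.wav", "--out_path", "out.wav"],
)
if result.exit_code != 0:
    # invoke() catches exceptions; surface them instead of failing silently
    print(result.output, result.exception)
print(result.output.strip())  # → would generate out.wav from voice.wav
```

Checking result.exit_code and result.exception after invoke is worth doing, since CliRunner swallows exceptions by default.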

So download_model has no context in my program, and when I insert it into the unmodified generation.py, call it before main, and then run a command-line example, I get an absurdly quick download (which I understand to be part of the point) and still get
HiggsAudioTokenizer.__init__() got an unexpected keyword argument 'acoustic_model_config'. So that clearly isn't the way to use it.

With @Eloque's solution...
The first thing you have to realize is that they are using a non-standard install, which means you have to modify the paths in both HiggsAudioModelClient and load_higgs_audio_tokenizer.
With a default install, the models end up stored in higgs-audio/model_old and higgs-audio/tokenizer_old.

Search generation.py for HiggsAudioModelClient. You will get two hits: the first is the class definition, the second is the one to modify.

model_client = HiggsAudioModelClient(
    model_path="../model_old",
    audio_tokenizer=audio_tokenizer,
    device=device,
    device_id=device_id,
    max_new_tokens=max_new_tokens,
    use_static_kv_cache=use_static_kv_cache,
)

Regarding load_higgs_audio_tokenizer, searching gives 3 hits: the import, a loader for the class, and a loader in main.
Both the class loader and the main loader have to be modified, because they have to point to the same tokenizer. But that change results in the error:
huggingface_hub.errors.HFValidationError: Repo id must use alphanumeric chars, '-', '_' or '.'. The name cannot start or end with '-' or '.' and the maximum length is 96: '../tokenizer_old'.
So it isn't looking for a folder; it's looking for a repo, or the path is too long.
The path inside the tokenizer folder is 42 characters. Moving it up one level anyway results in:
huggingface_hub.errors.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '../../tokenizer_old'. Use repo_type argument if needed.
The max-length complaint is gone, but it still complains about the repo name.
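One possible explanation: such loaders usually only treat the string as a local directory when the path exists from the current working directory, and otherwise hand it to huggingface_hub, which validates it as a repo id (hence the complaint about '../tokenizer_old'). Resolving the folder to an absolute path that definitely exists sidesteps the ambiguity. A sketch, with as_local_path as a hypothetical helper:

```python
import tempfile
from pathlib import Path

def as_local_path(p: str) -> str:
    """Resolve to an absolute path so downstream code treats it as a local
    folder rather than validating it as a hub repo id (hypothetical helper)."""
    path = Path(p).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(f"Not a directory: {path}")
    return str(path)

# Usage sketch, mirroring the earlier loader call:
# audio_tokenizer = load_higgs_audio_tokenizer(as_local_path("../tokenizer_old"), device="cpu")

demo = tempfile.mkdtemp()  # stand-in for a real tokenizer folder
print(Path(as_local_path(demo)).is_absolute())  # → True
```

This also fails fast with a clear message when the folder is simply missing, instead of a confusing repo-id validation error.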

@TBWAIGenerated

First of all, this problem has already been solved on my end. I run these models in the cloud, and I cannot afford to recreate the issue, so my feedback is based on memory and may not be fully accurate.

HiggsAudioTokenizer.__init__() got an unexpected keyword argument 'acoustic_model_config'

This clearly shows that the issue is caused by using the updated model files while the code still relies on the old parameters. So the fix is either to update the code or revert to the older models.

My method involved using the older models, as this maintained the stability and compatibility I had previously.


The key is to run my download_model() function inside the environment where you are using Higgs. My claim about not needing code changes comes from the fact that all I had to do was call this function at the beginning of my new environment setup, before running the usual code, and everything worked as expected.

My function should also fix compatibility with the updated models in Transformers, but I never tested that. As I mentioned, I ran it in a new environment, meaning it was the first time those models were downloaded.

If you run this in an already configured environment, the newly downloaded models will persist and consume unnecessary storage. So deleting them would be beneficial either way.


Recommendation

Delete the local models and/or delete the current environment. Then create a new Python script that you run once during the initial setup:

# initial_download_setup.py

def download_model():
    """Downloads the model weights during the image build step so it boots instantly."""
    from pathlib import Path

    from huggingface_hub import snapshot_download

    def use_snapshot_as_main(repo_path):
        # The cache stores snapshots under <repo>/snapshots/<commit_hash>;
        # writing that hash to <repo>/refs/main makes "main" resolve to it.
        repo_path = Path(repo_path)
        commit_hash = repo_path.name
        repo_ref = repo_path.parent.parent / "refs"
        repo_ref.mkdir(parents=True, exist_ok=True)
        with open(repo_ref / "main", "wt") as f:
            f.write(commit_hash)

    def stable_snapshot_download(repo_id: str, revision: str | None = None):
        repo_path = snapshot_download(repo_id=repo_id, revision=revision)
        use_snapshot_as_main(repo_path)

    stable_snapshot_download("bosonai/higgs-audio-v2-generation-3B-base", revision="10840182ca4ad5d9d9113b60b9bb3c1ef1ba3f84")
    stable_snapshot_download("bosonai/higgs-audio-v2-tokenizer", revision="9d4988fbd4ad07b4cac3a5fa462741a41810dbec")
    stable_snapshot_download("bosonai/hubert_base", revision="b4b85f1652c16ad63fdc818221b215b79ff55934")

download_model()

Then run:

python3 initial_download_setup.py

Finally, add the following at the beginning of
https://github.com/boson-ai/higgs-audio/blob/main/examples/generation.py
(or wherever your script first runs):

import os
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_DATASETS_OFFLINE"] = "1"
