PersonaPlex-7B-v1 ONNX

ONNX-exported components of NVIDIA PersonaPlex-7B-v1 for use with the ElBruno.PersonaPlex C# library.

Files

| File | Size | Description |
|------|------|-------------|
| mimi_encoder.onnx | 178 MB | Mimi audio encoder (24 kHz audio -> discrete tokens) |
| mimi_decoder.onnx | 170 MB | Mimi audio decoder (discrete tokens -> 24 kHz audio) |

Architecture

These are the Mimi audio codec components of PersonaPlex, based on the Moshi architecture:

  • Encoder: SEANet convolutional encoder + ProjectedTransformer + SplitResidualVectorQuantizer

    • Input: [batch, 1, samples] float32 (24kHz mono audio)
    • Output: [batch, 8, frames] int64 (8 codebooks, ~12.5 frames/sec)
  • Decoder: SplitResidualVectorQuantizer + ProjectedTransformer + SEANet convolutional decoder

    • Input: [batch, 8, frames] int64
    • Output: [batch, 1, samples] float32
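The shapes above imply a fixed ratio between audio samples and token frames: 24,000 samples/sec divided by 12.5 frames/sec gives 1,920 samples per frame. The following is a minimal sketch of that arithmetic, useful for sanity-checking buffer sizes before calling the encoder. `MimiShapes` and its members are hypothetical names, not part of ElBruno.PersonaPlex, and the rounding rule (a partial trailing frame counts as one frame) is an assumption; the exported model may pad or truncate differently.

```csharp
using System;

// Hypothetical helper for illustration; not part of ElBruno.PersonaPlex.
public static class MimiShapes
{
    public const int SampleRate = 24_000;  // 24 kHz mono, per the model card
    public const double FrameRate = 12.5;  // ~12.5 frames/sec, per the model card
    public const int Codebooks = 8;        // 8 codebooks, per the model card

    // Audio samples covered by one token frame: 24000 / 12.5 = 1920.
    public static int SamplesPerFrame => (int)(SampleRate / FrameRate);

    // Expected encoder output shape [batch, 8, frames] for a given input length.
    // Assumption: a partial trailing frame is rounded up to a full frame.
    public static long[] ExpectedTokenShape(int batch, int samples)
    {
        int frames = (samples + SamplesPerFrame - 1) / SamplesPerFrame;
        return new long[] { batch, Codebooks, frames };
    }
}

public static class Program
{
    public static void Main()
    {
        // Two seconds of audio: 48,000 samples.
        long[] shape = MimiShapes.ExpectedTokenShape(1, 2 * MimiShapes.SampleRate);
        Console.WriteLine(string.Join(", ", shape)); // prints "1, 8, 25"
    }
}
```

Compare the computed frame count against the actual encoder output for a known clip before relying on it, since the true count depends on how the export handles padding.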

Usage with C#

```csharp
using ElBruno.PersonaPlex.Pipeline;

// Models download automatically on first run
using var pipeline = await PersonaPlexPipeline.CreateAsync();
```
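This card does not list the tensor names used inside the ONNX files. One way to find them is to open a file directly with ONNX Runtime and print its input and output metadata. The sketch below assumes the `Microsoft.ML.OnnxRuntime` NuGet package is installed and that `mimi_encoder.onnx` has been downloaded locally. It bypasses ElBruno.PersonaPlex entirely, and `InspectMimi` and `FormatShape` are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.ML.OnnxRuntime;

// Hypothetical inspection tool; not part of ElBruno.PersonaPlex.
public static class InspectMimi
{
    // ONNX Runtime reports dynamic axes as -1; render them as "?".
    public static string FormatShape(int[] dims) =>
        "[" + string.Join(", ", dims.Select(d => d < 0 ? "?" : d.ToString())) + "]";

    public static void Main(string[] args)
    {
        // Assumption: the model file sits in the working directory unless a path is given.
        string modelPath = args.Length > 0 ? args[0] : "mimi_encoder.onnx";

        using var session = new InferenceSession(modelPath);

        foreach (var entry in session.InputMetadata)
            Console.WriteLine(
                $"input  {entry.Key}: {entry.Value.ElementType.Name} {FormatShape(entry.Value.Dimensions)}");

        foreach (var entry in session.OutputMetadata)
            Console.WriteLine(
                $"output {entry.Key}: {entry.Value.ElementType.Name} {FormatShape(entry.Value.Dimensions)}");
    }
}
```

If the export matches the Architecture section, the encoder should report one float32 input of rank 3 and one int64 output of rank 3, with the batch and length axes shown as dynamic.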

Export Details

  • Exported from PyTorch using torch.onnx.export (opset 17)
  • Source: nvidia/personaplex-7b-v1
  • Mimi model: 79.3M parameters
  • Dynamic axes enabled for batch size and sequence length

License

MIT (same as the export tooling). The base model weights are under the NVIDIA Open Model License.
