NeuTTS Air – On-device TTS model by Neuphonic (github.com/neuphonic)

2 points by maxloh 18 hours ago | 1 comment

  • jzebedee 14 hours ago
    It's good to see more open models approaching on-device inference. We need more stops on the fast<>good quality spectrum than just Piper and VITS.

    My first impressions of it:

    * The cloning is decent at imitating voices, but the prosody is quite bad

    * There's noticeable crackling in the GGUF models, and the quality drop from the base model to Q8 is significant

    * Q4 models are apparently bugged on platforms outside of Linux

    * The speed is nowhere near realtime. Even with all the latency reductions (Q4 backbone, pre-encoding, ONNX codec decoder), it was lucky to hit a real-time factor of 4x
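
    For anyone who wants to reproduce the speed numbers, here is a rough sketch of how a real-time factor can be measured. It assumes the common convention that RTF is synthesis time divided by the duration of the audio produced, so anything above 1.0 is slower than realtime. The names `real_time_factor` and `measure_rtf`, and the `synthesize` callable, are hypothetical placeholders and not part of the NeuTTS Air API.

```python
import time
from typing import Callable, Sequence


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Synthesis time divided by audio duration (assumed convention).

    Values above 1.0 mean synthesis is slower than realtime.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def measure_rtf(
    synthesize: Callable[[str], Sequence[float]],  # hypothetical TTS callable
    text: str,
    sample_rate: int,
) -> float:
    """Time one synthesis call and return its real-time factor."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    # Audio duration is the number of samples divided by the sample rate.
    return real_time_factor(elapsed, len(samples) / sample_rate)
```

    Under that definition, 8 seconds of compute for 2 seconds of audio gives `real_time_factor(8.0, 2.0) == 4.0`.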

    > Optimised for on-device deployment - provided in GGML format, ready to run on phones, laptops, or even Raspberry Pis

    All of this testing was on a beefy 24-core AMD CPU with 64 GiB of RAM. There's no way this model would even come close to realtime on any Pi I know.