OuteTTS-0.1-350M

OuteTTS-0.1-350M is a text-to-speech synthesis model that takes a pure language modeling approach, with no external adapters or complex architectures. Built on the LLaMA architecture and using the Oute3-350M-DEV base model, it demonstrates that high-quality speech synthesis can be achieved with a straightforward method: crafted prompts combined with discrete audio tokens.
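To make the "crafted prompts and audio tokens" idea concrete, here is a minimal sketch of how text and audio might share one token stream in a pure language modeling setup. It is an illustration under stated assumptions, not the model's actual prompt format: the special-token names (`<|text_start|>`, `<|audio_start|>`, `<|a_N|>`) and the helper functions are hypothetical, and a real pipeline would also need an audio codec to turn the recovered indices back into a waveform.

```python
import re
from typing import List

# Hypothetical special-token names; the model's real vocabulary may differ.
TEXT_START, TEXT_END = "<|text_start|>", "<|text_end|>"
AUDIO_START, AUDIO_END = "<|audio_start|>", "<|audio_end|>"


def audio_token(code: int) -> str:
    """Map one discrete audio-codec index to a (hypothetical) vocabulary token."""
    return f"<|a_{code}|>"


def build_prompt(text: str) -> str:
    """Build the prompt a language model would continue:
    the normalized text, followed by an opened audio segment."""
    return f"{TEXT_START}{text.strip().lower()}{TEXT_END}{AUDIO_START}"


def parse_audio_codes(generated: str) -> List[int]:
    """Recover codec indices from generated text, ignoring anything
    after the first audio end marker."""
    audio_part = generated.split(AUDIO_END, 1)[0]
    return [int(m) for m in re.findall(r"<\|a_(\d+)\|>", audio_part)]


if __name__ == "__main__":
    prompt = build_prompt("Hello world")
    # Stand-in for model output; a real run would sample this from the model.
    fake_generation = "".join(audio_token(c) for c in [12, 507, 33]) + AUDIO_END
    print(prompt)
    print(parse_audio_codes(fake_generation))
```

The point of the sketch is that nothing beyond ordinary next-token prediction is required: the model continues the prompt with audio tokens, and those tokens are parsed back into codec indices for decoding.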