Create your own tailored podcast using your documents
This blueprint demonstrates how you can use open-source models & tools to convert input documents into a podcast featuring two speakers. It combines document pre-processing, language model-powered script generation, and text-to-speech synthesis. Designed to run on most local setups, it requires no external API calls or GPU access, making it both more accessible and privacy-friendly by keeping all processing local.
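The three stages named above (pre-processing, script generation, speech synthesis) can be pictured as a simple chain of functions. The sketch below is illustrative only and is not the Blueprint's actual code: every name in it (`Turn`, `preprocess`, `generate_script`, `synthesize`, `document_to_podcast`) is hypothetical, and the language-model and text-to-speech steps are replaced by trivial stand-ins so that the data flow can be run without any models.

```python
import re
from dataclasses import dataclass


@dataclass
class Turn:
    """One line of dialogue in the script (hypothetical structure)."""
    speaker: str
    text: str


def preprocess(raw: str) -> str:
    """Stand-in cleaning step: drop HTML-like tags and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", no_tags).strip()


def generate_script(clean_text: str) -> list[Turn]:
    """Stand-in for the language model: alternate sentences between two speakers."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", clean_text) if s]
    speakers = ("Speaker 1", "Speaker 2")
    return [Turn(speakers[i % 2], s) for i, s in enumerate(sentences)]


def synthesize(turn: Turn) -> str:
    """Stand-in for text-to-speech: return a label instead of audio samples."""
    return f"[{turn.speaker} audio] {turn.text}"


def document_to_podcast(raw: str) -> list[str]:
    """Chain the three stages: clean the text, script it, then 'voice' each turn."""
    return [synthesize(turn) for turn in generate_script(preprocess(raw))]


if __name__ == "__main__":
    for clip in document_to_podcast("<p>Hello   world.</p> <p>How are you?</p>"):
        print(clip)
```

In a real pipeline, `generate_script` would prompt a local language model and `synthesize` would return audio for each turn; only the shape of the hand-off between stages is meant to carry over.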
If you encounter any issues with the hosted demo below, try the Blueprint in the GPU-enabled Google Colab Notebook available here.
Trusted open-source tools used for this Blueprint
Detailed guidance on GitHub walks you through installing this project.
Get involved in improving the Blueprint by visiting the GitHub Blueprint issues.
OS: Windows, macOS, or Linux. Python 3.10 or higher. Minimum RAM: 10 GB. Minimum disk space: 32 GB.
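Two of the requirements above can be checked from Python's standard library before installing. This is a minimal sketch under stated assumptions: the helper names `python_ok` and `disk_ok` are hypothetical, the 32 GB figure is interpreted as free space on the drive containing the current directory, and 1 GB is treated as 1024³ bytes. RAM is not checked because the standard library has no portable way to query it.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)   # "Python 3.10 or higher"
MIN_DISK_GB = 32       # assumed to mean free disk space


def python_ok(version=None) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def disk_ok(path=".", free_bytes=None) -> bool:
    """Return True if free space at `path` meets the assumed minimum."""
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    return free_bytes >= MIN_DISK_GB * 1024**3


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Disk space OK:", disk_ok())
```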
Explore how the Blueprint configuration parameters have been adjusted to create solutions that fit your specific needs.
A proof of concept for using open-source models & tools to convert input story context into a radio drama featuring multiple speakers.
This version is optimized for making technical documentation more accessible through audio explanations.
Explore how the Blueprint components have been extended to expand its scope and unlock new capabilities.
An extension of the Document-to-Podcast Blueprint that shows how to enable support for languages other than English by integrating additional text-to-speech (TTS) models.
Insights into our motivations and key technical decisions throughout the development process.