Create your own tailored podcast using your documents

Last updated

1/17/2025

Document

Podcast

Create your own tailored podcast using your documents

This blueprint demonstrates how you can use open-source models & tools to convert input documents into a podcast featuring two speakers. It combines document pre-processing, language model-powered script generation, and text-to-speech synthesis. Designed to run on most local setups, it requires no external API calls or GPU access, making it both more accessible and privacy-friendly by keeping all processing local.

If you encounter any issues with the hosted demo below, try the Blueprint in the GPU-enabled Google Colab Notebook available here.

Time

10 min

Complexity

Medium

Status

Stable

Contributors

Tags

Local AI

Text-to-Speech

License

Apache 2.0

Hosted Demo

Tools used to create

Trusted open source tools used for this Blueprint

Llama.cpp

Use llama.cpp to load GGUF-type models, enabling efficient local generation of podcast scripts using a text-to-text model.

OuteTTS

Use OuteTTS to initialize a text-to-speech model to generate your podcast audio.

‍

Streamlit

Use Streamlit to build an interactive app for the full document-to-podcast pipeline.

‍

Help Documentation

Detailed guidance on GitHub walking you through this project installation.

Discussion Points

Get involved in improving the Blueprint by visiting the GitHub Blueprint issues.

Join in

Requirements

OS: Windows, macOS, or Linux. Python 3.10 or higher. Min RAM: 10 GB. Disk space: 32 GB min.

Learn More

Step by step walkthrough

Use Cases

Explore how the Blueprint configuration parameters have been adjusted to create solutions that fit your specific needs.

Use Cases

Radio Drama Generator

A proof of concept for using open-source models & tools to convert input story context into a radio drama featuring multiple speakers.

Audio

LLM

Text

Use Cases

README-to-Podcast

This version is optimized for making technical documentation more accessible through audio explanations.

Audio

LLM

Text

Extensions

Explore how the Blueprint components have been extended to expand its scope and unlock new capabilities

Extensions

Multilingual Document-to-Podcast

An extension of the Document-to-Podcast Blueprint that shows how to enable support for languages other than English by integrating additional text-to-speech (TTS) models.

Choices

Insights into our motivations and key technical decisions throughout the development process.

Choices

Motivations and Technical Decisions

Create your own tailored podcast using your documents

Related content