
AI Podcast & NPC Generation
From database to MP3 - generative podcast on edge compute with AI personas.
So I made an app, as one does. I usually create lots of different web apps or mobile apps, game jams or dev tools. It's explorative and fun. For me that's just as much artwork as making games or painting. It can be interesting to see where software creation leads if you just let it lead.
I had this simple idea: what if we make a database of yes-or-no questions? Binary answers are fun and easy to build and design for, and easy to answer... Then it spun away into the AI token-munching podcast land of edge-compute MP3s.
I called it Open Question.
The design rules were simple.
- Someone creates a question - the question is stored in the database.
- Someone answers the question - either yes or no.
- All answers are stored in the database.
- We get some nice data to analyze and make some nice views and draw insights from.
- You can create your own questions and get answers.
- We pick one question per day as the main one and push it to the users.
Apple's push notification service is used to send notifications to users.
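The rules above boil down to two small operations: counting yes/no answers and choosing one question per day. Here is a minimal TypeScript sketch, assuming hypothetical `Answer` and `Question` shapes (the real schema isn't shown in this post) and a simple rotate-by-day pick, which may not be how the app actually curates its daily question:

```typescript
// Hypothetical shapes; the app's real database schema is not shown in this post.
interface Question { id: string; title: string }
interface Answer { questionId: string; yes: boolean }

// Count yes/no answers for one question and derive the share of "yes".
function tallyAnswers(answers: Answer[], questionId: string) {
  const relevant = answers.filter((a) => a.questionId === questionId);
  const yes = relevant.filter((a) => a.yes).length;
  const no = relevant.length - yes;
  const yesShare = relevant.length === 0 ? 0 : yes / relevant.length;
  return { yes, no, yesShare };
}

// Assumed selection rule: rotate through the questions by UTC day number.
function pickDailyQuestion(questions: Question[], date: Date): Question | undefined {
  const n = questions.length;
  if (n === 0) return undefined;
  const dayNumber = Math.floor(date.getTime() / 86400000); // milliseconds per day
  const index = ((dayNumber % n) + n) % n; // keep the index non-negative
  return questions[index];
}
```

A rotation like this is deterministic, so every device that asks for the question of the day gets the same one without extra state.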
A super simple SwiftUI view example of the voting:
ScrollView {
    QuestionImage(questionId: q.id, questionTitle: q.title)
    Text(q.title)
        .font(.title2)
        .padding(.horizontal, 32)
        .multilineTextAlignment(.center)
        .padding(.bottom, 8)
        .padding(.top, 8)
        .minimumScaleFactor(0.5)
    Text(q.explanation)
        .font(.footnote)
        .padding()
        .padding(.bottom, 16)
        .lineLimit(nil)
        .fixedSize(horizontal: false, vertical: true)
    HStack(spacing: 48) {
        // randomFlip decides which side shows "Yes" and which shows "No"
        FloatingButton(
            action: {
                answerButtonTapped(randomFlip)
            }, label: randomFlip ? "Yes" : "No",
            positive: true)
        FloatingButton(
            action: {
                answerButtonTapped(!randomFlip)
            }, label: randomFlip ? "No" : "Yes")
    }
}
AI Assisted question generation
So wouldn't it be cool if we could use an LLM for [insert anything at this point]? Yes.
Curating a good yes-or-no question is one thing, but we generally also want some more metadata, so for each question suggestion we generate a few data points.
// Zod schema for AI and OpenAPI SDKs, confusingly also usable by the OpenAI SDK.
error: z.boolean(), // true if the question is invalid, we are out of tokens, etc.
reason: z.string().optional(), // why is this an invalid question
message: z.string().optional(), // friendly message
title: z.string().optional(),
category: z.string().optional(),
keyword: z.string().optional(),
explanation: z.string().optional(), // explain and reason about the question
tags: z.array(z.string()).optional(),
More data input, nice!
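Before a suggestion goes anywhere near the database, it is worth checking its shape. Zod handles that in the schema above; the sketch below is only a dependency-free stand-in that mirrors the same fields, with hypothetical names (`Suggestion`, `parseSuggestion`):

```typescript
// Mirrors the schema fields above; hand-rolled only to keep the sketch dependency-free.
interface Suggestion {
  error: boolean;
  reason?: string;
  message?: string;
  title?: string;
  category?: string;
  keyword?: string;
  explanation?: string;
  tags?: string[];
}

const optionalStrings = ["reason", "message", "title", "category", "keyword", "explanation"] as const;

// Returns the typed suggestion, or null when the LLM output doesn't match the shape.
function parseSuggestion(raw: string): Suggestion | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not even JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const obj = data as Record<string, unknown>;
  if (typeof obj["error"] !== "boolean") return null;
  for (const key of optionalStrings) {
    if (obj[key] !== undefined && typeof obj[key] !== "string") return null;
  }
  const tags = obj["tags"];
  if (tags !== undefined && !(Array.isArray(tags) && tags.every((x) => typeof x === "string"))) {
    return null;
  }
  return obj as unknown as Suggestion;
}
```

Returning `null` instead of throwing keeps the calling code simple: a malformed reply just means asking the model again.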
Fully AI generated questions
Okkkaaaay, but this app lives on my phone, and the AI is nice to me and all.
And all my deeply philosophical questions have been asked:


So wouldn't it be cool if we could use an LLM for [insert anything at this point]? Yes.
OK! So for an LLM to continuously generate interesting questions we can use a little something from the foundation models playbook.
- Set up a web scraper to go hunting for data.
- Pick some good sources. News sites are a positive source of daily fun things, right (right? help)?
- Use an LLM to generate questions from the scraped data.
- Post the question as an AI persona into the API and DB and show it in the app.
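The scrape-to-question step might look roughly like this. Everything here is an assumption for illustration: the `ScrapedItem` shape, the prompt wording, and the `looksLikeYesNoQuestion` sanity check. The actual LLM call is left out.

```typescript
// Hypothetical shape of one scraped news item.
interface ScrapedItem { source: string; headline: string; summary: string }

// Build the instruction we would send to an LLM (the call itself is omitted).
function buildQuestionPrompt(item: ScrapedItem): string {
  return [
    "Write one question that can only be answered with yes or no.",
    "Base it on this news item and keep it under 120 characters.",
    `Source: ${item.source}`,
    `Headline: ${item.headline}`,
    `Summary: ${item.summary}`,
  ].join("\n");
}

// Cheap sanity check on the model's reply before it is posted to the API.
function looksLikeYesNoQuestion(text: string): boolean {
  const trimmed = text.trim();
  const starters = /^(is|are|was|were|do|does|did|can|could|should|would|will|has|have)\b/i;
  return trimmed.length <= 120 && trimmed.endsWith("?") && starters.test(trimmed);
}
```

The check is deliberately crude. It only filters out obvious misses such as open-ended "why" questions before a persona posts the result.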
But what's an AI persona in the app anyway?
Okay let's just make up some dudes and dudettes. Eeeh.. here is a game dev!
(Thanks Black Forest Labs, I ran Flux offline for this and heated my house from the ).
Sara Hjort
An independent game developer who thrives on creative storytelling, technical experimentation, and pushing artistic boundaries through interactive experiences. Prefers narrative-driven design and self-expression as the key drivers of meaningful worlds. #PrettyGenericButFine
OK, let's give this persona some reasoning, personality traits, likes and dislikes, cultural background and motivation. That's a human, right?
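A persona record along those lines could be a plain object that gets flattened into a system prompt. The field names and the prompt wording below are made up for illustration; the real persona records are not shown in this post.

```typescript
// Hypothetical persona record; field names follow the traits listed above.
interface Persona {
  name: string;
  bio: string;
  personalityTraits: string[];
  likes: string[];
  dislikes: string[];
  culturalBackground: string;
  motivation: string;
}

// Flatten the record into a system prompt that asks the model to stay in character.
function personaSystemPrompt(p: Persona): string {
  return [
    `You are ${p.name}. ${p.bio}`,
    `Personality: ${p.personalityTraits.join(", ")}.`,
    `Likes: ${p.likes.join(", ")}. Dislikes: ${p.dislikes.join(", ")}.`,
    `Background: ${p.culturalBackground}.`,
    `Motivation: ${p.motivation}.`,
    "Answer as this person would, starting with Yes or No, then one sentence of reasoning.",
  ].join("\n");
}
```

Keeping the record structured, rather than storing one big prompt string, means the same data can feed question generation, answering, and chat.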
I generated and hand-prompted 50 of these NPCs.
I assume this atheist artist will bring in some great yes-and-no questions from the Christian Belive site and other questionable sources.
But more importantly, from regularly updated websites: anything from Reggae news and Trans rights to the BBC and other reputable sources.
And then we match a question with one of the personas.
Remember we pick a question for each day.
But now that we've gone down this road we can just make the AI personas answer questions based on their beliefs and values.
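Matching could be as simple as tag overlap. This is a guess at one workable rule, not necessarily how Open Question does it; `interests` and `matchPersona` are hypothetical names.

```typescript
// Hypothetical minimal shapes for matching.
interface TaggedQuestion { title: string; tags: string[] }
interface PersonaProfile { name: string; interests: string[] }

// Pick the persona sharing the most tags with the question; ties go to the first one.
function matchPersona(q: TaggedQuestion, personas: PersonaProfile[]): PersonaProfile | undefined {
  const wanted = new Set(q.tags.map((tag) => tag.toLowerCase()));
  let best: PersonaProfile | undefined;
  let bestScore = -1;
  for (const persona of personas) {
    const score = persona.interests.filter((i) => wanted.has(i.toLowerCase())).length;
    if (score > bestScore) {
      best = persona;
      bestScore = score;
    }
  }
  return best;
}
```

The matched persona's record can then go into the prompt so the answer reflects that persona's beliefs and values.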
Now that's an app
Open Question Podcast
So now we've got an app, and we've got personas generating and answering questions. We also generate blog posts from the results. And you could chat with the personas for (reasons unknown)?
For a multimedia empire something was missing. How about a podcast? How do you do that in the cloud without a dedicated server? Turns out you can use WebAssembly and Rust on Cloudflare Workers to combine audio sources and generate audio files.
Engineering cooking podcast recipe
import {
  WorkflowEntrypoint,
  WorkflowStep,
  WorkflowEvent,
} from "cloudflare:workers";
export class WorkflowPodcast extends WorkflowEntrypoint<Env, Params> {
  async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
    // Go on an adventure (the method names are placeholders)
    await this.doSomeWork();
    await this.doEvenMoreStuff();
    await this.callTheRustRPCService();
    await this.finish();
  }
}
- Get the week's questions from the database.
- Force an LLM to make an intro, highlights of the questions, and anything else you want to direct the script with.
- Generate the podcast scripts for each chunk.
- Pick your favorite LLM that generates audio and voice with your personas.
- Generate an MP3 background tune with another LLM.
- Write some timeline combination code that works on edge compute with WebAssembly and Rust.
- Double check that ffmpeg indeed still doesn't work on edge.
- Generate metadata XML and the final MP3.
- Generate an RSS feed.
- Upload to Apple.
- Do all this in Cloudflare Workers workflows.
- Profit?
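The RSS step is mostly string assembly. Here is a minimal sketch of one feed item, assuming a hypothetical `Episode` shape. A feed that podcast directories accept also needs channel-level tags, which are left out here.

```typescript
// Hypothetical episode metadata produced at the end of the workflow.
interface Episode {
  title: string;
  description: string;
  mp3Url: string;
  byteLength: number; // size of the MP3 file in bytes
  publishedAt: Date;
}

// Escape the five characters that are special in XML text and attributes.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Render one RSS 2.0 <item> with an <enclosure> pointing at the MP3.
function rssItem(e: Episode): string {
  return [
    "<item>",
    `  <title>${escapeXml(e.title)}</title>`,
    `  <description>${escapeXml(e.description)}</description>`,
    `  <enclosure url="${escapeXml(e.mp3Url)}" length="${e.byteLength}" type="audio/mpeg"/>`,
    `  <guid isPermaLink="false">${escapeXml(e.mp3Url)}</guid>`,
    `  <pubDate>${e.publishedAt.toUTCString()}</pubDate>`,
    "</item>",
  ].join("\n");
}
```

Escaping matters more than usual here, since titles and descriptions come from an LLM and can contain `&`, `<`, or quotes.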
// Some WASM-compatible Rust on edge compute, in the name of audio fillers.
let sample_rate = 44100; // 44.1 kHz
let padding_duration = 5 * sample_rate; // 5 seconds of padding before and after speech
let fade_out_duration = 5 * sample_rate; // 5-second fade-out at the end
let bgm_length = bgm_samples.len();
let outro_duration = 16 * sample_rate;
// Step 3: prepend 5 s of background music before the speech
let mut bgm_index = 0;
for _ in 0..padding_duration {
    let bgm_sample = if bgm_length > 0 {
        (bgm_samples[bgm_index] as i32) / 4 // lower the volume to a quarter
    } else {
        0 // no background music: pad with silence
    };
    mixed_samples.push(bgm_sample as i16);
    if bgm_length > 0 {
        // loop the music; the guard avoids a remainder-by-zero panic
        bgm_index = (bgm_index + 1) % bgm_length;
    }
}
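The fade-out at the end follows the same sample-by-sample pattern. Here is that idea sketched in TypeScript rather than Rust, with 16-bit PCM in an `Int16Array`. The linear ramp and the function name `applyFadeOut` are assumptions, since the original fade code isn't shown.

```typescript
// Linearly fade the last `fadeSamples` samples of 16-bit PCM audio down to silence.
function applyFadeOut(samples: Int16Array, fadeSamples: number): Int16Array {
  const out = new Int16Array(samples); // copy so the input stays untouched
  const n = Math.min(Math.max(0, Math.floor(fadeSamples)), out.length);
  const start = out.length - n;
  for (let i = 0; i < n; i++) {
    const gain = (n - 1 - i) / n; // just under 1 at the start of the fade, 0 on the last sample
    out[start + i] = Math.round((out[start + i] ?? 0) * gain);
  }
  return out;
}
```

At 44.1 kHz, a 5-second fade like the one in the Rust snippet would be `applyFadeOut(samples, 5 * 44100)`.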
The secret spice is how good your LLM is at interpreting and generating the persona's psychology versus its own censors and training data. You want the persona to actually embody their character traits, not just parrot safe corporate responses. I think most of the modern flagship models do a great job at interpreting the persona records.
Behold an AI podcast worthy of perplexity:
Anyway, I went outside and touched some grass.
I have some more fun ideas, so I'm going to do a part 2 of this exploration with some twists and turns later. Think , because this podcast generation was fully automated all the way from database to audio podcast app.