Text to Speech and Podcast Ads
Hey Myke, Please Use Your Code, JustForMyke
There are two types of podcast ads — static and dynamic.
With static ads, the ad doesn’t change; it’s the same for everyone. These are the podcast ads we know and love: the host we’ve listened to for dozens of hours makes an authentic recommendation for a product they use and then implores you to use their promo code. Because of the host’s relationship with the audience, these ads are effective and convert well. The right podcaster with a podcast about the right topic can support themselves full time with an audience of just thousands. But because the ads can only be targeted at the podcast’s listeners as a whole, the product usually needs mass appeal, e.g. Casper mattresses or Squarespace web hosting. As a result, static podcast ads might not work for smaller companies or niche products. This is doubly true for more popular podcasts: because every listener gets the same ad, only large advertisers can afford to pay to reach the entire audience.
With dynamic ads, the ad is customized for you, either at download time or at stream time. For download-time ads, when you download an episode, the podcast’s ad network selects what it thinks is the best ad for you based on your IP address, your location, when you are downloading the episode, and any other information it can determine about you. The ad is then inserted into the audio file that you download. It’s usually recorded by someone other than the host who is trying to sell a product, similar to how ads work on YouTube. Dynamic insertion is most obvious when you download the back catalog of an old podcast and every episode has the same ads, even though the episodes are several years old.
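To make the download-time flow concrete, here is a minimal sketch in Python of how an ad network might pick an ad and slot it into an episode. Everything in it is a hypothetical illustration rather than any real ad network’s API: the `Ad` and `DownloadRequest` types, the targeting rules, and the ad names are all assumptions, and the country is assumed to come from an IP geolocation lookup that isn’t shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    name: str
    countries: frozenset   # empty means "any country"
    hours: range           # local hours of day when the ad may run


@dataclass(frozen=True)
class DownloadRequest:
    country: str      # assumed to come from an IP geolocation lookup
    local_hour: int   # hour of day (0-23) at the listener's location


def select_ad(request, ads, fallback):
    """Return the first ad whose targeting rules match the request."""
    for ad in ads:
        country_ok = not ad.countries or request.country in ad.countries
        if country_ok and request.local_hour in ad.hours:
            return ad
    return fallback


def build_episode(intro, ad, outro):
    """List the audio segments in the order they would be stitched together."""
    return [intro, f"ad:{ad.name}", outro]


# Hypothetical inventory; a real network would have far richer targeting.
ADS = [
    Ad("morning-coffee", frozenset({"US", "CA"}), range(5, 11)),
    Ad("meal-kit", frozenset(), range(16, 21)),
]
FALLBACK = Ad("house-ad", frozenset(), range(0, 24))
```

Calling `select_ad(DownloadRequest("US", 8), ADS, FALLBACK)` would pick the coffee ad, while a listener in another country at the same hour would fall through to the house ad. Because the choice happens at download time, every old episode you fetch today goes through the same selection, which is why a back catalog ends up carrying the same ads.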
Stream-time dynamic ads work like download-time ads, except that the ad is picked while you are listening and the company that runs the ad network also runs your podcast player (e.g. Spotify). Because your podcast player is inserting the ad, it knows a lot more about you: what types of products and ads you respond to and, critically, whether you listened to the ad.
In the past, the trade-off between static and dynamic ads was that you could have either high-performing, authentic but untargeted static ads or better-targeted but more corporate-sounding dynamic ads. But modern text-to-speech (TTS) removes that trade-off. TTS has improved rapidly: from a small audio sample, modern TTS can generate audio almost indistinguishable from the podcaster’s real voice. Audio production was once expensive, but with AI it can be cheap.
In the near future, dynamically inserted ads will not be generic corporate ads but ads targeted and customized to you, read in the podcast host’s voice. If generating audio is cheap, the ad network won’t just choose the product and pitch that resonate most with you; it could also generate a custom promo code just for you. For example, if you’re streaming your podcast or if the podcast uses individualized feeds, you could be hearing, “Hey Myke, use your promo code, JustForMyke, to get 10% off beard oil.”
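As a rough illustration of the personalized promo code idea, here is a small Python sketch that builds a per-listener code and the ad script a TTS system would then read in the host’s voice. The function names, the `JustFor` prefix as a default, and the script template are assumptions made for this example; the TTS step itself is not shown, and a real system would also need to register each code with the advertiser and keep codes unique across listeners who share a name.

```python
import re


def make_promo_code(first_name, prefix="JustFor"):
    """Build a promo code from the listener's first name (hypothetical scheme)."""
    # Drop anything that is not a letter or digit so the code is easy to type.
    cleaned = re.sub(r"[^A-Za-z0-9]", "", first_name)
    return f"{prefix}{cleaned}"


def make_ad_script(first_name, product, discount_percent):
    """Return the text that would be handed to a TTS voice for this listener."""
    code = make_promo_code(first_name)
    return (
        f"Hey {first_name}, use your promo code, {code}, "
        f"to get {discount_percent}% off {product}."
    )
```

With these assumptions, `make_ad_script("Myke", "beard oil", 10)` produces the line from the example above. The ad network only has to generate a short string per listener; the expensive part used to be recording it, which is exactly the cost that TTS removes.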