PlayAI (formerly PlayHT) founder here. This is a native multi-turn voice model built for conversational use cases such as real-time agents and podcasts. Try it through our playground (https://play.ai/playground) or API (https://docs.play.ai/). Feel free to ask anything.
This is impressively low latency. Also, it's cool to see another option for TTS with real-time streaming.
Ouch. If you know Arabic or Hebrew, try selecting those languages and typing something in—it’s hilarious.
Looks like they’re “testing in production.”
The currently deployed model is English-only; we are rolling out a multilingual version later this week!
Love the idea, but this is not good yet. Mine had random changes in the pace/cadence of speech and was basically uncanny valley territory.
Overall impressive. I noticed a weird quirk: it reads $100 million as "one hundred dollar million" instead of "one hundred million dollars".
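For anyone hitting the same quirk, one possible workaround is to spell out currency amounts before sending the text to a TTS model. Below is a minimal sketch of that idea, not PlayAI's actual text normalization. The names `small_number_to_words` and `expand_dollars` are hypothetical, and the sketch only handles whole dollar amounts from 0 to 999 with an optional scale word.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def small_number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 (hypothetical helper)."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + small_number_to_words(rest) if rest else "")

# "$" + up to three digits, not followed by more digits or a decimal/thousands
# part, then an optional scale word such as "million".
CURRENCY = re.compile(
    r"\$(\d{1,3})(?![.,]?\d)(?:\s+(thousand|million|billion|trillion))?\b",
    re.IGNORECASE,
)

def expand_dollars(text: str) -> str:
    """Rewrite '$100 million' as 'one hundred million dollars'."""
    def repl(match: re.Match) -> str:
        amount = int(match.group(1))
        scale = match.group(2)
        parts = [small_number_to_words(amount)]
        if scale:
            parts.append(scale.lower())
        # The currency word goes last, after any scale word.
        parts.append("dollar" if amount == 1 and not scale else "dollars")
        return " ".join(parts)
    return CURRENCY.sub(repl, text)

print(expand_dollars("They raised $100 million last year."))
```

Amounts the pattern does not cover, such as $1,000 or $100.50, are left untouched rather than guessed at.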
did you listen to the output of your own demo?
> Speaker 1: Dang man, I’d come find you for sure.
that part sounds like a broken robot
Hey, can you give an example? The model is not perfect and this is our first version, so it will get better and faster for sure. Still, I generated the full prompt you referenced and it sounds good to me. It adds some laughter, but that makes it sound less robotic in my mind.
https://drive.google.com/file/d/1JzfweTdvCWzJ6Wwv0KdgfaxZcyn...
Speaker 1: Oh yes, the deep sea, nature’s basement. Home to creatures so bizarre, even nightmares are like, “Nah, I’ll pass.”
Speaker 2: Right? It's like the ocean was running a clearance sale on leftover parts. “Hey, who wants a fish with a lightbulb head? No one? Alright, let’s just drop this bad boy in the Mariana Trench.”
Speaker 1: Oh man, let’s start with a classic: the anglerfish. It’s a fish that decided it was uh, tired of chasing its food and thought, “What if I just dangle a glow stick on my head and let dinner come to me?”
Speaker 2: Honestly, I respect that. Can you imagine if we had that? Like, I’m sitting on my couch with a glowing Dorito on my forehead, waiting for snacks to find me.
Speaker 1: Dang man, I’d come find you for sure.
If you are a developer, there's an API: https://docs.play.ai/tts-api-reference/endpoints/v1/tts/stre...
i don't think anyone has done real-time multi-speaker dialog generation before
[flagged]
Fake account / bot meant to promote the company. Look at comment history.
Please email hn@ycombinator.com to report these kinds of things.