I Added AI Voice Acting to Delores: A Thimbleweed Park Mini-Adventure

Hey y’all!

I spent the last ~1.5 weeks working on a project I thought you might be interested in.

I loved Thimbleweed Park and was very excited when Delores was released (I finished the game in one sitting after I had seen the announcement on Twitter). I really liked how much the voice acting added to the original game, and I had recently seen people using neural networks for vocal synthesis. So I extracted the files from both games, stared at the assembly code for a while and started torturing my poor graphics card.

Here you can see the result (Delores sounds like a drunk robot sometimes):

You might be asking yourself: Why did I not use the source code that was released a few days ago?
*sigh*
I had never used IDA before or done much reverse engineering. On my second sleepless night of stepping through the assembly and injecting my own code I was finally able to capture the dialogue in real time. I took a break and scrolled through my Twitter feed - that’s when I saw the announcement that the source code was released :tired_face:
But since I had spent so much time working on my hacky method I decided to finish it (which was nice because I ended up learning a bunch of new things on the way).

Let me know if you have any questions!

14 Likes

Scary.

1 Like

Since you say you used assembly I suppose there won’t be much overlap, but this is somewhat similar to what I want to do with cat noises. Was the sound generated in advance here?

We’re sisters, can’t we be freinds? [sic]

(at 5:15) :rofl:

1 Like

Porca tro*a!
:delores:
It’s impressive, man.

Question #1: How did you do that?
Question #2: Is your last name Einstein?

1 Like

Yup! You can see an example at the end of the video. I loved the “friends” part too, I thought it sounded like Delores got a southern accent all of a sudden :sweat_smile:
I saw your project, it looks really cool!

2 Likes

This can all be explained away with a few lines about how the whole town is recovering from laryngitis.

12 Likes

Thanks!

#1: I might make a detailed video soon. I basically extracted the audio and text files for the voiced lines from the original game and gave them to a neural network so it could learn how to speak like Delores. I extracted the Delores files as well and let the AI read them. I then looked through the assembly code, found the spot where the dialogue is drawn onto the screen and wrote my own library that I injected into that function. The library takes the dialogue line, looks up the correct audio file for it (previously generated by the neural network) and plays it. You can see the lines that it didn’t find audio for in the command window on the left (commands and other actors’ lines).

#2: nope, same country of origin though :thinking:
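
For anyone curious about the lookup step from #1, here is a minimal Python sketch of the idea. It is not the injected library itself (that lives inside the game process and is not shown in this thread). The file naming scheme (a hash of the normalized dialogue text), the function names and the `missing` list are all assumptions made up for illustration; actual playback is left out.

```python
import hashlib
import re
from pathlib import Path
from typing import List, Optional


def normalize(line: str) -> str:
    """Lowercase and collapse whitespace so small formatting differences still match."""
    return re.sub(r"\s+", " ", line).strip().lower()


def clip_name(line: str) -> str:
    """Hypothetical naming scheme: a short hash of the normalized text plus .wav."""
    digest = hashlib.sha1(normalize(line).encode("utf-8")).hexdigest()[:16]
    return f"{digest}.wav"


def find_clip(line: str, clip_dir: Path) -> Optional[Path]:
    """Return the pre-generated clip for this dialogue line, or None if there is none."""
    path = clip_dir / clip_name(line)
    return path if path.is_file() else None


def handle_line(line: str, clip_dir: Path, missing: List[str]) -> Optional[Path]:
    """Look up a captured dialogue line; remember lines that have no audio."""
    clip = find_clip(line, clip_dir)
    if clip is None:
        # Assumed behaviour: commands and other actors' lines end up here.
        missing.append(line)
    return clip
```

The synthesized clips would be saved under `clip_name(text)` when they are generated, so at runtime the captured line only needs to be hashed the same way to find its file.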

4 Likes

Oh, I admit I didn’t watch the whole 14 minute thing. The end was cool. :slight_smile: I was thinking along the lines of doing something similar to what I did with the text, basically in parallel with it.

1 Like

Yes, please, do that.
And what about the neural network? Where does she come from? Does she live with you? Do you feed her? Does she grow well? What about her political ideas? Does she like pizza?

I mean, the scary part is that soon ANYBODY can make YOU say WHATEVER they want.

3 Likes

She’s really hungry most of the time; she usually eats 5 GB of my GPU memory…

Yup, that is pretty scary… While making this I kept wondering whether this is morally wrong, but I decided that it was ok for demonstration purposes. What scares me is that if I were a big company and had the resources to actually make this sound good, I could just not hire actors anymore and make money off their voices.

1 Like

That’s fine, I can spare that. :cat: (Total of 8.)

1 Like

I guess it will eventually happen within a few years. Right now it’s still easier to get the real actor to say the lines with all required nuances.

Hmmm… Definitely scary. An actor could probably sue you for unauthorized use of his/her voice, at least if you make money out of it. Or it will be possible in the near future. It’s a more advanced form of piracy, and increasingly difficult to fight.

1 Like

You only say that because you fear that we’ll use AI to put the Sheriff’s voice tone onto your acting.

2 Likes

Is this possible? :grimacing: You should adopt that little neural monster and try to make it learn Italian :sweat_smile:

1 Like

Unfortunately I know nothing about neural networks (that’s something I’d love to learn, but who has time?), but I suppose there might be a way of changing the tone of a recording so that it sounds like the training.

I mean, there are algorithms that are able to replicate voices, but I’ve always seen them replicate in the same language as the training data.

But here it would be something different, it’s not replicating the speech, it’s changing the tone.

Anyway we’ll just use your voice :laughing:

20 years from now, there will be no need for voice acting anymore.

It all depends on what rights you have to the actor’s work. Sometimes you have a “buyout”, which means you can do anything you want with the recordings. Most commonly, you can do anything you want, but only for that game, movie or TV show. Actors know that AI is coming and the contracts are starting to deal with that.

All that legal mumbo-jumbo aside, it’s not something I would ever do, out of respect for their talent, any more than I would train AI on someone’s music or art.

If you were going to record them to make a canon of audio to run AI against, you’d have to be upfront and pay them accordingly.

7 Likes

Interesting, I need to look at this topic. I’m a sound designer and I would love to mess with this stuff. Do you need to know much about coding or are there tools for this like for video fakes? I’m asking about the AI speech synthesis, not putting it in the game.

1 Like

You don’t really need to know much about coding but the process is still a bit more difficult than it was with the deepfake tools. I used Tacotron-2, you can find some implementations online.

3 Likes