Wanted: Adventure games with European voice acting

I’m working on a proof of concept for Rhubarb Lip Sync 2, where I plan to implement full support for multiple European languages. To train the machine learning model, I need a large collection of voice recordings with their dialog text. And since those recordings should be representative of the files Rhubarb will actually be used on, I was thinking of ripping the recordings from a number of adventure games.

Here’s the problem, though: Most games only have English voice acting. So it would be great if you could point me to some games that have good voice acting in other languages. These are the features I’m looking for:

  • Full voice acting in German, French, Russian, Spanish or Italian
    (Ideally: several of those)
  • Studio-quality voice acting. No pops or audible compression artifacts.
    (Sadly, the latter rules out all the 1990s classics.)
  • There should be some program to rip the dialogue recordings. Bonus points if the dialog text can be ripped, too. :wink:

I have too little time these days to keep on top of current games. I’d appreciate any tips!

We could provide you with our TWP dubbing files. They are not studio quality, but there are no hisses or pops.

One of the more recent adventure games dubbed in Italian, German, and Spanish is “Deponia” from Daedalic. It was released around 2012, and maybe there are some tools to rip the voice recordings and text.
There’s a huge discount on Deponia via Steam:

… and on a Daedalic bundle:

Are you developing a dangerous A.I. ? :delores:

Daedalic was my first thought too. With the exception of “1954 Alcatraz”, German is the native language of those games, so nothing is lost in translation there.
Here is a tool able to extract the voice files: https://oezmen.eu/gameresources/

Secret Files has that. I assume the follow-up games do too. Btw, is Syberia new enough not to count as a '90s classic, audio-quality-wise?

But I actually switched to German not necessarily due to a great desire to play the original but just because the English audio quality was… questionable.

Now you’re just being demanding. :stuck_out_tongue:

I sure hope so! :wink:

I must withdraw the “studio quality” aspect. What’s important is that there are no audible compression artifacts, because they screw up the model. The goal is for these recordings to be as close as possible to what people would really feed through Rhubarb. And since most of its input is not studio quality, this point didn’t really make sense.

I would love to have your Italian TWP recordings!

Thanks for the tip! I actually thought of Deponia, too, but I was certain it only had English and German voice acting. Italian and Spanish is great!

Thank you! I’ll try it out as soon as possible! And I’ll check for other Daedalic games, too.

I must confess that I haven’t played any of these games. I’ll check them out!

The (later) LucasArts games also had voice acting in different languages. (Sam and Max, The Dig, Grim Fandango, MI3, etc.)

The later LucasArts and Sierra adventures were my first thought, too. The problem is that (as far as I know) all of them use DPCM compression, which creates very specific audio artifacts. Training on these files would result in a model that works great on audio from old games, but not on current productions. :pensive:

Most DOS games have 8-bit PCM samples and no data reduction. After all, decoding lossy files put a heavy load on the CPUs of the time. The quality varies, though, which is due to different sample rates and different mastering quality.
Most studios started to use lossy codecs when they moved to Windows. Bitrates differ hugely there, which has a great impact on audio quality. The early Telltale games are probably among the worst offenders, which they even cite as a reason for the announced remastered version of Sam & Max.
Unfortunately, old DOS games were sometimes re-released for download, making use of ScummVM’s ability to use compressed files.

I can’t speak for other companies, but as far as I’m aware, Sierra has been using DPCM compression from the moment they started using digitized speech. Games like King’s Quest V, Dagger of Amon Ra and others use DPCM to compress 8-bit PCM samples to 4 bits. Later games like Leisure Suit Larry 7 again use DPCM to compress 16-bit PCM samples to 8 bits.

DPCM is a very simple form of codec that yields a fixed 2:1 compression ratio while requiring very little processing time. In fact, old CD drives were so slow that it was actually faster to read and decompress DPCM data than to read twice the amount of uncompressed PCM data without decompression.

The problem with DPCM (as far as my project is concerned) is that it introduces a characteristic noise pattern not found in other kinds of compression.
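To make the idea concrete, here is a minimal Python sketch of a DPCM-style codec that stores each 8-bit sample as a 4-bit code and packs two codes into one byte, which is where the fixed 2:1 ratio comes from. This is not Sierra's actual format: the step table `STEPS` and the function names are made up for illustration, and real codecs use their own tables.

```python
# Hypothetical step table: one signed delta per 4-bit code (16 entries).
# These values are invented for illustration, not taken from any real game.
STEPS = [-64, -32, -16, -8, -4, -2, -1, 0, 1, 2, 4, 8, 16, 32, 64, 127]


def clamp(value):
    """Keep a value inside the unsigned 8-bit range."""
    return max(0, min(255, value))


def dpcm_encode(samples, start=128):
    """Turn unsigned 8-bit samples into 4-bit codes (ints in 0..15)."""
    codes = []
    predicted = start
    for sample in samples:
        wanted = sample - predicted
        # Pick the table entry closest to the delta we would like to store.
        code = min(range(16), key=lambda i: abs(STEPS[i] - wanted))
        codes.append(code)
        # Track exactly what the decoder will reconstruct.
        predicted = clamp(predicted + STEPS[code])
    return codes


def dpcm_decode(codes, start=128):
    """Rebuild unsigned 8-bit samples by accumulating the deltas."""
    samples = []
    value = start
    for code in codes:
        value = clamp(value + STEPS[code])
        samples.append(value)
    return samples


def pack_nibbles(codes):
    """Store two 4-bit codes per byte; pad odd input with the zero-delta code."""
    if len(codes) % 2:
        codes = codes + [7]  # STEPS[7] == 0
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


original = [128, 130, 135, 150, 200, 90]
codes = dpcm_encode(original)
print(len(pack_nibbles(codes)), "bytes for", len(original), "samples")
print(dpcm_decode(codes))
```

With this table, small changes are reproduced closely, but the decoded signal lags behind large jumps such as the last two samples. That is one kind of distortion a delta codec can introduce; the exact noise pattern of a real game depends on its own table.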

Yes, that’s regrettable. People took the already-compressed game audio, decompressed it, recompressed it to MP3 or OGG Vorbis and shipped it with ScummVM. The resulting game files are smaller and thus faster to download. But those modern codecs introduce artifacts of their own on top of the original ones.

I never had any trouble understanding the voices in old games, perhaps with the exception of King’s Quest V. But I blame that on the music and sound effects being too loud compared to the voices, not on any compression artifacts.

Then again, I was trained by these.

Perhaps you should train RhubarbLipSync2 with these too? Who knows, maybe it can build some immunity to varying quality?

Ever played Wolfenstein 3D? Besides the really bad German, some samples are only 6000 Hz, so they don’t even meet telephony standards and are therefore hard to understand.
For comparison, most DOS games went with 11025 or 22050 Hz samples.
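As a rough illustration of why the sample rate matters: a sampled signal can only represent frequencies up to half its sample rate (the Nyquist limit). The small Python sketch below compares the rates mentioned above against the 8000 Hz used by classic narrowband telephony; the function and constant names are just for this example.

```python
# Classic narrowband telephony (e.g. G.711) samples at 8000 Hz.
TELEPHONY_RATE_HZ = 8000


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def meets_telephony_rate(sample_rate_hz):
    """True if the sample rate is at least the narrowband telephony rate."""
    return sample_rate_hz >= TELEPHONY_RATE_HZ


for rate in (6000, 8000, 11025, 22050):
    verdict = "ok" if meets_telephony_rate(rate) else "below telephony rate"
    print(f"{rate} Hz -> content up to {nyquist_hz(rate):.1f} Hz ({verdict})")
```

At 6000 Hz, everything above 3000 Hz is gone, and that range carries much of what distinguishes consonants like "s" and "f".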

These are noteworthy examples indeed, since those are assumed to be the first video games to use a lossy codec. You can hear the codec’s artefacts along with the 4-bit quantization noise, but the speech is still easy to understand.
It is also apparent that the decoder in those games used all the CPU time available, i.e. you get a still image until the sample ends.

Here’s also a nice collection of C64 speech samples in games:

The samples are implemented in different ways and the quality is all over the place. Impossible Mission and Ghost Busters are still among the highest quality.
Significantly better sample quality exists in demos, but those use 8-bit samples, which weren’t a thing when those games were made.

Of course! I wanted to mention that one as an example of non-English speech, but then decided that would be in poor taste, even if meant as a joke.

Some of the samples were reused from Castle Wolfenstein by the sound of it (last one in that C64 video)

Like “hoompappel” (probably Schutzstaffel). But 22k was always perfectly fine. :slight_smile:

I did some more research and it seems like the compression artifacts of the later LucasArts games shouldn’t be a problem after all. :grinning: This should give me a large corpus of recordings in English, German, Spanish, French, and Italian.

Syberia looks like a great tip: It comes in seven (!) languages, including Russian.

Does anybody know of other games that come with Russian voice acting?

The Longest Journey does too, doesn’t it?

At least on GOG and Steam, it doesn’t seem to have Russian voice acting.

The localizations are owned by the local publisher or some such, and therefore they’re not available on GOG and Steam.

Proof that I’m not misremembering: https://steamcommunity.com/app/6310/discussions/0/626329821115109091/

The Longest Journey is available in English, German, French, Italian, Norwegian, Russian and Swedish languages.

Discworld Noir is also available in French and German btw — dunno about Russian.