Official Thimbleweed Park Forums

Text Version of Podcasts

We had better make one post to keep track of who is working on which podcast, so that two or more people don’t do the same job in parallel. I will create two posts later today for the first two podcasts and then I’ll start on the third.

1 Like

@Someone Okay, that’s true. I agree with that.

I’m not sure if everyone would always check this post before starting to work. For this reason, regular threads might be better than a wiki, so that no one could interfere with someone else’s work. If two people were working on the same paragraphs in the same transcript at the same time, they would at least not be able to mess it up.

OK, so I created two new posts, one per podcast transcript.
As you can see, the formatting tries to take both the closed captions extraction and the transcript text into account.
Unfortunately, some parts should be removed from one but not the other and vice versa. So currently, they contain the sum of both. Once we have smoothed out the extraction scripts AND worked out how to share the common source file, I will edit the transcript post to replace it with the extracted transcript (i.e. removing all the distracting text between < > and all curly brackets { }, preserving the text in between).
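The extraction described above could be sketched roughly as follows. This is only a minimal illustration, not the actual script used for the transcripts. It assumes that everything between `<` and `>` is dropped together with the brackets, and that curly brackets are removed while the text inside them is kept. The function name `extract_transcript` and the sample line are hypothetical.

```python
import re


def extract_transcript(master: str) -> str:
    """Turn a 'master' text into a plain transcript (illustrative sketch)."""
    # Assumption: <...> spans are caption-only markup, so drop them entirely.
    without_angle = re.sub(r"<[^<>]*>", "", master)
    # Assumption: curly brackets only mark text; remove the brackets, keep the text.
    without_braces = without_angle.replace("{", "").replace("}", "")
    # Collapse runs of spaces or tabs left behind by the removals.
    return re.sub(r"[ \t]{2,}", " ", without_braces).strip()


# Hypothetical sample line; the real master format may differ.
sample = "<00:01> (Ron): {Hello} there"
print(extract_transcript(sample))
```

The closed captions extraction would need the opposite set of rules, which depends on how the master file marks caption-only text.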

@besmaller, could you try to upload the captions once more to see if we have covered everything? If you want, I can write a small Linux script to extract what we need (especially since I have the original txt files, while copying and pasting from the forum might introduce some symbol characters, such as the single-character ellipsis (…) in place of three dots).
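If text does have to be copied from the forum, the substituted symbol characters could be mapped back to plain ASCII before further processing. The sketch below is an assumption about what such a cleanup might look like. Only the ellipsis is mentioned in this thread; the curly quotes in the table are an added guess, and the name `normalize_symbols` is hypothetical.

```python
# Characters a forum may substitute for plain ASCII (assumed list, extend as needed).
REPLACEMENTS = {
    "\u2026": "...",  # single-character ellipsis
    "\u2018": "'",    # left single curly quote
    "\u2019": "'",    # right single curly quote
    "\u201c": '"',    # left double curly quote
    "\u201d": '"',    # right double curly quote
}

# str.maketrans accepts a dict mapping single characters to replacement strings.
_TABLE = str.maketrans(REPLACEMENTS)


def normalize_symbols(text: str) -> str:
    """Replace typographic symbols with their plain ASCII equivalents."""
    return text.translate(_TABLE)


print(normalize_symbols("Well\u2026 it\u2019s fine"))
```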

I’ll get cranking on the next podcasts in the meantime

1 Like

But that would mean someone still has to go and merge everything…

What I was rather suggesting is that 1 single podcast transcript is done by 1 single person. If the quality is good enough, minor corrections could be handled through regular post comments. The risk with a wiki is that it needs perpetual monitoring to avoid some joker messing it up. Note that we can still wiki-fy an existing post (or unwiki-fy if necessary).

So I’m also in favor of regular posts, but each with just 1 owner.
As a better alternative to a dedicated post to keep track of who’s transcribing what, I’d propose this way of working: when you commit yourself to transcribing a certain podcast, you create the new topic first (category: Development/Design, title: Transcript Podcast #n) with an initial entry “coming soon”, which you later replace.

How does that sound?

Google has posted recommendations for how to format transcripts to upload and use for captioning (I didn’t find this until today): https://support.google.com/youtube/answer/2734799?hl=en

If you can give me a version of the files like that, I can upload them. Alternatively, I can write and run the cleanup script myself, but then we’d need a way to exchange the formatted text files, which, as you point out, isn’t easily possible within the Discourse forum. We could explore cloud shares or something similar as an option too.

I know who you mean, but as I wrote above: I would do that/maintain the texts if the work isn’t too much. :slight_smile:

I don’t think that this is a problem. My version history is a wiki post and no one has messed it up - yet. :slight_smile:

The problem is that you can’t edit posts after a while. So what happens if we have to rewrite parts of a text? After the post is locked, we would only be able to track the edits in the comments below. And …

… this could lead to a situation where the person can’t edit “coming soon” anymore. So the person has to keep “coming soon” in the first post and post the podcast text in the comments below. And shouldn’t we publish the auto-generated texts as a help?

I think it would be feasible this way. There is no need to edit the first post, even though doing so would make things clearer. We could just publish everything in the respective thread.

Our transcription threads here would be just for development purposes anyway. Ron mentioned that he would like to add the final transcriptions to the podcast entries in the dev blog, once they are done.

In my opinion, uploading the modified transcripts to YouTube would be nice to have (in addition to the text versions on Ron’s dev blog). We could also create our own videos with a different software tool that would display the subtitles in a more attractive way - maybe with a TWP style font.

I agree with this. People messing up wiki posts are not a real issue in my opinion, for several reasons:

  • existing users of this forum have always been very respectful of wiki posts
  • new users cannot edit wiki posts. Only users with Trust Level 1 can edit them, and this level can be changed in the Discourse configuration to make it even more restrictive
  • the number of active users of this forum is steadily decreasing; there are fewer and fewer users who pay attention to what’s happening here
  • even if somebody manages to mess with a wiki post, anyone with TL1 can easily revert it to its previous version
3 Likes

Sadly.
I hope that the transcriptions will nonetheless be interesting for many readers.

1 Like

But isn’t that usual? When a new forum opens, there are many interested users. After a while only a core team remains active. Can’t we see that as a positive? We are the hardcore TWP fans. :wink:

Okay, let’s see it as a positive - even though I wonder what the other 19,000 backers are doing now. Well, maybe they just don’t have enough spare time, or are already active on other forums, or they spend a lot of time playing TWP or another adventure game. :slight_smile:

A good group always has a limited number of members.

“meglio pochi, ma buoni.”
(A few, but good, is better)

2 Likes

I would assume that most of them just backed the project to get the game. :wink:

1 Like

I want to publicly thank all of you who are putting effort into this, but especially you, Sushi, for having answered my call about the transcription project in such an efficient way!

I hope you’re importing from @besmaller’s captions, and you’re not doing it manually! Anyway, I envy your listening comprehension. I’ve always had problems with it.

Here are some comments/suggestions for the transcripts:

  1. I still feel the need for a section (or a subsection) entirely dedicated to the transcriptions. Right now, every transcription has its own topic, but they are in random order and mixed up with all the other posts of the development section. Maybe @eviltrout could consider it?
  2. It would be nice, in every post, to link the original podcast (as @cvalenti suggested) AND the YouTube video, so that anyone can choose: audio only, subtitled audio or plain text
  3. The references are nice and useful. Isn’t there a way to make them active links, so that they can be consulted without disturbing the reading flow by scrolling up and down?
1 Like

Your suggestions 2. and 3. are under investigation, but should both be feasible. I am fine-tuning the formatting and the scripts to extract both the transcript and the closed captions from my “master” text while working through new episodes of the podcast. I want to keep the momentum and finish at least 1 or 2 transcripts per week, rather than debating endlessly about formatting (which is also fun, but less important).

For 2., I will need to check with @besmaller once he has uploaded the captions.

I am indeed transcribing based on the automatic captions now (thanks to @Someone), which goes considerably faster, except for some hard-to-understand mumblings - which are often funny in their own right (for example “Wrong Dilbert talking about wok boxes”).

1 Like

Thanks, I do consider them to be the only real added value of what I do.
One side note though: by making an active link, you can jump to the footnote, but how do you jump back without scrolling? Edit: never mind, @LowLevel gave me a suggestion for how to do that.

1 Like

The reference hyperlinks are in (and working now!)
Also, I changed the “best of” sections to <b> bold text </b> (although personally I don’t think it stands out well enough), but in all 3 transcripts I have uploaded, there are some issues. I was able to trace it back to this: only the first line after <b> is put in bold, but not the subsequent ones up to </b>

For example:
<b>
(Ron): Yeah, I really like what Ken has been doing with the cover. It kind of has this Maniac Mansion-feel too, which i think is really nice.

(Gary): Well if anybody can do Maniac Mansion-cover-feel, it would be Ken Macklin.
</b>

ends up as:

(Ron): Yeah, I really like what Ken has been doing with the cover. It kind of has this Maniac Mansion-feel too, which i think is really nice.

(Gary): Well if anybody can do Maniac Mansion-cover-feel, it would be Ken Macklin.

Any thoughts?

The fact that I can’t even format the whole thing as a single code block makes me think there are some hidden characters messing things up.

That’s strange. It may be a bug, since HTML isn’t aware of new lines. It ends the formatting at the first end-tag it encounters.

Anyway, if there is no way to get rid of this, you can place a 《b》《/b》 pair (in single angle brackets) around each line.
I like the bold style for highlighting the BestOf part.
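The per-line workaround suggested above could be automated instead of being applied by hand. A minimal sketch, assuming the goal is simply to wrap every non-empty line of a “best of” block in its own `<b>`…`</b>` pair and to leave blank lines untouched. The function name `bold_each_line` is hypothetical, and how Discourse renders the result has not been checked here.

```python
def bold_each_line(block: str) -> str:
    """Wrap each non-empty line in its own <b>...</b> pair."""
    wrapped = []
    for line in block.splitlines():
        if line.strip():
            # Each line gets its own tags, so no tag has to span a line break.
            wrapped.append(f"<b>{line}</b>")
        else:
            # Keep blank lines as paragraph separators.
            wrapped.append(line)
    return "\n".join(wrapped)


best_of = "(Ron): Yeah, I really like the cover.\n\n(Gary): Well, it would be Ken Macklin."
print(bold_each_line(best_of))
```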

Discourse interprets the whole(?) of both BBCode and Markdown. That may lead to strange results.

Maybe you want to have a look at:
Markdown reference at Wikipedia
BBCode reference at Wikipedia
