Brin ([personal profile] brin_bellway) wrote 2021-06-28 09:13 pm

The awkward transitional period of history we are currently living through

In which *some* but not *all* podcasts use auto-transcripts, so you can't immediately disregard something for being a podcast.

I was hoping Just Plain Wrong had a text version. :(

---

Listen Notes informs me that Google is willing to run their auto-transcriber on anything played through Google Chrome (not just YouTube videos), but 1: fuck Google Chrome, and 2: it sounds like it only *captions* rather than transcribing per se, so a 36-minute episode would require 36 minutes of hanging around watching for each new word/line to pop up (as opposed to dumping the audio into a processing queue, going off to do other things, and getting all the text at once later).
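
For the record, the "dump it in a queue and come back later" workflow I'm wishing for looks roughly like the sketch below. This is only an illustration of the shape of the thing: `transcribe` is a hypothetical stand-in that returns placeholder text (no real speech-to-text engine is being called), and `submit_batch`, `collect`, and the file names are likewise made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe(audio_path):
    # Hypothetical stand-in: a real version would call an actual
    # speech-to-text engine here and could take a long time per file.
    return f"[transcript of {audio_path}]"


def submit_batch(audio_paths, transcriber=transcribe):
    """Queue every file for background processing and return immediately."""
    executor = ThreadPoolExecutor(max_workers=1)
    futures = {path: executor.submit(transcriber, path) for path in audio_paths}
    # Stop accepting new jobs without waiting; already-queued jobs still run.
    executor.shutdown(wait=False)
    return futures


def collect(futures):
    """Block until every queued job is done, then return all the text at once."""
    return {path: future.result() for path, future in futures.items()}


if __name__ == "__main__":
    pending = submit_batch(["episode1.mp3", "episode2.mp3"])
    # ...go off and do other things here...
    for path, text in collect(pending).items():
        print(path, text)
```

The point of the contrast: live captioning makes you sit there for the whole runtime, while this shape only makes you wait at `collect`, and only if the work isn't already finished by then.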

[personal profile] contrarianarchon 2021-07-01 09:57 am (UTC)
If you've got the long-term memory needed to keep that workflow organised then that seems like a pretty viable plan, yeah.

... I wonder how much trouble it would be to make a FLOSS auto-transcriber; it feels like (bad) spoken-language models are a well-solved problem these days. It depends how much fidelity you actually need (probably more than a bad model can get you), I guess, plus any bit of software is fairly costly to make in practice because of the need for options and bug-testing and stuff. ... Also, this kind of thing gets easier the narrower the scope, which means the thing that can do specifically "two guys talking" podcasts is probably orders of magnitude simpler than a general-purpose speech parser. You could even train it on podcasts that *do* offer transcriptions!