Today's the day, part ~3
Oct. 8th, 2022 11:59 am
[arguably cw: embarrassment squick]
(part 2)
---
The desktop part that Brother replaced a few months back and still has lying around is a CPU, not a GPU. Probably not worth figuring out a Minimum Viable Desktop setup for.
---
The speed penalty for using this computer instead of the...actually, hang on, I think I may have misinterpreted. The speeds listed in the readme might actually be *relative to the speed at which the large model runs*, not relative to real-time, in which case the devs offer no opinion on how quickly you can expect Whisper to run.
Anyway, as I was saying, the speed at which Whisper runs on my laptop seems to be very dependent on how much is going on in the clip: a stretch of silence sped along at nearly real-time, while 45 minutes of lecture took 12 hours. On the one hand, that makes me feel better about leaving some stretches of silence in for timestamping reasons; on the other hand, that makes it hard to predict how much compute I need. We'll have to find out empirically whether my computer can keep up or not.
(there *are* some rumours of further CPU optimizations to come, but nothing's quite shaken out to a form I can use yet)
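For a rough sense of scale, the lecture figure above (45 minutes of audio, 12 hours of compute) can be turned into a slowdown factor and a daily throughput. This is only a back-of-envelope sketch: the function names are made up for illustration, and it assumes every clip is as dense as that lecture and that parallel runs don't slow each other down, neither of which is guaranteed.

```python
def real_time_factor(audio_minutes: float, processing_minutes: float) -> float:
    """Minutes of compute spent per minute of audio (1.0 would be real-time)."""
    if audio_minutes <= 0:
        raise ValueError("audio_minutes must be positive")
    return processing_minutes / audio_minutes


def daily_capacity_hours(rtf: float, workers: int = 1, uptime_hours: float = 24.0) -> float:
    """Hours of audio that `workers` parallel runs can get through per day.

    Assumes each run proceeds at the same slowdown `rtf`, which is optimistic
    if the runs compete for the same CPU.
    """
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return workers * uptime_hours / rtf


# Figures from the lecture clip: 45 minutes of audio took 12 hours.
lecture_rtf = real_time_factor(audio_minutes=45, processing_minutes=12 * 60)
capacity = daily_capacity_hours(lecture_rtf, workers=1)

print(f"lecture slowdown: {lecture_rtf:.0f}x real-time")
print(f"one run, 24/7: about {capacity:.1f} h of lecture-density audio per day")
```

Under those assumptions, anything recorded beyond that daily capacity piles up as backlog. Quieter audio should go faster, so the real number sits somewhere above this floor.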
---
I had a few moments there of "wait, what the fuck, why is this transcript about murder, what did I miss". I thought Whisper was hallucinating at first, but I scrolled back through the log and it...keeps talking about murder, with the same names and everything. What does it know that I don't??
Then I realised: it's transcribing *the movie playing in the background*.
Whew.
---
Postscript:
I left this in my drafting notepad for a few days, and I notice that in that time, even running two Whispers 24/7, they've fallen further behind. It's looking like empirically, current software on current hardware *cannot* keep up with the incoming data.
I'll set the project aside for a little while, but I'll be keeping a close eye out for available improvements to said software and hardware. I don't think it will be long now.
---
(edit: part 4)