AVSubtitles - Forum

Open AI Whisper for all

2023-03-02 05:49:45

Welcome to this topic !

You want to use Whisper for your transcriptions, but it's too complicated to install, or maybe your computer isn't powerful enough to run it.
Don't worry, you can do it online, and it's free.
First log in to your Google account, open this link: Click me !, and just follow the instructions.
That's it, you can now use Whisper.

However, I recommend adding a few options to the command that transcribes your mp3, for better accuracy:

For English (medium, because there is no English-only large model; temperature for accuracy; output_format to get just an srt file):

!whisper "your.mp3" --model medium --temperature 0.2 --output_format srt

For other languages :

!whisper "your.mp3" --model large --temperature 0.2 --output_format srt

You can see other options with :
!whisper --help
(you can add this code in your notebook page at the end)
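
To give a feel for what the --temperature option does, here is a small sketch of temperature-scaled softmax, the general idea behind temperature in sampling. It is only an illustration and not Whisper's actual decoding code; the function name and the three scores are made up for the example.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        # Temperature 0 is usually treated as "always pick the top score".
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate words (made-up numbers).
logits = [2.0, 1.0, 0.5]
for t in (0.2, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

With the low temperature almost all the probability goes to the top candidate, while at 1.0 the other candidates keep a real chance of being picked.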

For more info, see the official GitHub repository: https://github.com/openai/whisper

What if you don't have, or don't want, a Google account?
OK, no problem, try Whisper here: https://replicate.com/openai/whisper

Now make your own subtitles!!
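
For reference, the srt file you get from --output_format srt is plain text: a cue number, a line with start and end times, then the subtitle text. Here is a minimal sketch of that layout. The helper names and the sample sentence are made up for illustration; this is not code taken from Whisper.

```python
def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm (SRT puts a comma before the milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def srt_cue(index, start, end, text):
    """Build one SRT cue: number, time range, then the text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

# Hypothetical cue, just to show the layout.
print(srt_cue(1, 0.0, 2.5, "Hello and welcome."))
```

Knowing the layout helps when you fix timings or typos by hand in a text editor afterwards.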

Add GoogleDrive files to your colab

Lionfacial

2023-04-01 05:03:19

Because Colab runs on a virtual machine, you can get disconnected while uploading a big file.
The solution is to upload the files to your Google Drive and then access them by mounting Google Drive in Colab. Here is the code for that:

from google.colab import drive

drive.mount('/content/gdrive')

Follow the instructions and the files are ready to use.
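
Once the drive is mounted, a quick check that your files are really visible can save a failed run. This is a small sketch, assuming your files sit directly in /content/gdrive/MyDrive; the helper name find_media_files is made up, and the extension list is only a starting point.

```python
from pathlib import Path

# Extensions to look for; adjust to your own files.
MEDIA_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}

def find_media_files(folder):
    """Return the media files directly inside `folder`, sorted by name."""
    root = Path(folder)
    if not root.is_dir():
        return []  # drive not mounted yet, or wrong path
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
    )

# Assumed location of "My Drive" under the mount point used above.
for path in find_media_files("/content/gdrive/MyDrive"):
    print(path)
```

If nothing is printed, the mount probably did not finish or the files are in a subfolder.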

HowTo Whisper AI

truc1979

2023-04-08 10:54:20

Lionfacial has created a PDF with screenshots. You can download it: PDF How to use AI Whisper.
All credits to him!

Re: HowTo Whisper AI

Lapoiar

2023-04-14 20:17:32

Thanks

Re: HowTo Whisper AI

Lionfacial

2023-06-06 03:40:04

OK everybody, I haven't had much time to test Whisper properly or to understand all the parameters. But I wrote some code for Google Colab that makes it easier to change some of them. If you want to try it, replace the existing code cell or create a new one with +Code, and paste this:

#@markdown <h2>Whisper's folder parameters :</h2>



import os, sys, re



#@markdown <h><i><font size=2 color="#AF9B60">The path and name of the file to be transcribed.<br/> File formats can be mp3, mp4, mpeg, mpga, m4a, wav, or webm:</i></h>

input_file = "/content/gdrive/MyDrive/File.wav" #@param {type:"string"}

output_folder = "/content/gdrive/MyDrive/" #@param {type:"string"}

#@markdown <h><i><font size=2 color="#AF9B60"><br/>Format of the final transcription text:</i></h>

output_format = "srt" #@param 

#@markdown <h2><br/>Whisper's language parameters :</h2>

#@markdown <h><i><font size=2 color="#AF9B60">Model to use ("large-v2" gives better results for multilingual audio, including English; models ending in ".en" are English-only, with good results from "medium.en"):</i></h>

ai_model = "large-v2" #@param 

#@markdown <h><i><font size=2 color="#AF9B60"><br/>Used for providing the language of the audio, which improves both performance (reduced latency) and accuracy (reduced error rate): <br/>For the language codes, see this page (https://github.com/openai/whisper/blob/main/whisper/tokenizer.py)</i></h>

language_output = "English" #@param 

#@markdown <h2><br/>Whisper's refinement parameters :</h2>

#@markdown <h><i><font size=2 color="#AF9B60">Used for narrowing down the focus of AI model and make it more deterministic or the opposite.<br/> Values near 0 will result in highly focused AI output, values near 1 will introduce randomness to the results:</i></h>

temperature = 0.2 #@param {type:"number"}

#@markdown <h><i><font size=2 color="#AF9B60"><br/>Number of candidates when sampling <b><font color="#E42217">with non-zero temperature</font></b> (default:5, maybe for more or less randomness):</i></h>

best_of = 5 #@param {type:"number"}

#@markdown <h><i><font size=2 color="#AF9B60"><br/>Number of beams in beam search, only applicable <b><font color="#E42217">when temperature is zero</font></b> (default: 5; raising it may detect more speech, lowering it may skip some speech):</i></h>

beam_size = 5 #@param {type:"number"}

#@markdown <h><i><font size=2 color="#AF9B60"><br/>Optional patience value to use <b><font color="#E42217">in beam decoding</font></b> (default: 1; I think 2 is a sensible maximum, but it can be higher and accepts floats like 0.5; it may give the AI more time before giving up):</i></h>

patience = 1.0 #@param {type:"number"}



# The paths are quoted so that file or folder names containing spaces still work.

if language_output != "Automatic" :

  if temperature == 0 :

    !whisper "$input_file" --model $ai_model --language $language_output --output_format $output_format --temperature $temperature --beam_size $beam_size --patience $patience --output_dir "$output_folder"

  else:

    !whisper "$input_file" --model $ai_model --language $language_output --output_format $output_format --temperature $temperature --best_of $best_of --output_dir "$output_folder"

else:

  if temperature == 0 :

    !whisper "$input_file" --model $ai_model --output_format $output_format --temperature $temperature --beam_size $beam_size --patience $patience --output_dir "$output_folder"

  else:

    !whisper "$input_file" --model $ai_model --output_format $output_format --temperature $temperature --best_of $best_of --output_dir "$output_folder"



#@markdown <h2><br/>Now you can run the cell!</h2>
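
The four near-identical commands in the cell above can also be built once from a list of arguments. Here is a minimal sketch of that idea in plain Python. The function name build_whisper_args is made up, and it only uses the options already shown in the cell: beam_size and patience when the temperature is zero, best_of otherwise, and no --language when the language is "Automatic".

```python
import shlex

def build_whisper_args(input_file, output_folder, model="large-v2",
                       language="Automatic", output_format="srt",
                       temperature=0.2, best_of=5, beam_size=5, patience=1.0):
    """Build the whisper command as a list of arguments, mirroring the cell above."""
    args = ["whisper", input_file,
            "--model", model,
            "--output_format", output_format,
            "--temperature", str(temperature),
            "--output_dir", output_folder]
    if language != "Automatic":
        args += ["--language", language]
    if float(temperature) == 0:
        # Beam search options only apply when the temperature is zero.
        args += ["--beam_size", str(beam_size), "--patience", str(patience)]
    else:
        # best_of only applies when sampling with a non-zero temperature.
        args += ["--best_of", str(best_of)]
    return args

args = build_whisper_args("/content/gdrive/MyDrive/File.wav",
                          "/content/gdrive/MyDrive/", language="English")
print(shlex.join(args))  # shows the command with safe shell quoting
```

In a notebook, the list could then be passed to subprocess.run(args, check=True) instead of using the ! syntax; file names with spaces need no extra quoting that way.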

Re: HowTo Whisper AI

Lionfacial

2023-06-06 03:52:58

Arf! Too many line breaks in the code, sorry about that. Just delete the useless line breaks if it doesn't work.

Need your help on this!

tmvvmt

2023-12-07 16:13:26

https://www.avsubtitles.com/bbs_posts.php?bbs_topic_id=2441&bbs_msg_id=2442#2442

@Lionfacial @truc1979

Need your help on this! Whisper

Gift from Santa Lionfacial

Lionfacial

2023-12-25 05:39:18

Hello and Merry Christmas.

I'm sharing my Google Colab notebook, "Transcribe with OpenAI Whisper", with you. Here is the link:
https://colab.research.google.com/drive/1Hxu1rVydDU53ADS3CY58oEekbTsX0yVV?usp=sharing

Follow the instructions and enjoy the interface to play with Whisper !

Have a nice day.