Whisper is a speech-to-text model that transcribes audio files; this service runs it through the Hugging Face Transformers library. The Whinter web application was created to give users easy access to Whisper without prior programming knowledge. It is free to use and requires no sign-up or registration.
Currently, Whinter supports three Whisper model sizes: Tiny, Base, and Small. Larger models (Tiny < Base < Small) are likely to transcribe more accurately, but they also increase the computational load and the transcription time. It is therefore advisable to start with the smallest model and check whether the result is sufficient for your needs. The smallest model usually works well for high-resource languages such as English and for audio files with clear, standard speech.
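For readers who want to reproduce the model choice outside the web interface, the following is a minimal sketch, not Whinter's actual source code. It assumes the three sizes correspond to the public Hugging Face checkpoints openai/whisper-tiny, openai/whisper-base, and openai/whisper-small; the names WHISPER_MODELS, model_id_for, and transcribe are hypothetical.

```python
# Assumed mapping from the size labels to public Hugging Face checkpoints.
# The dictionary and function names are illustrative, not taken from Whinter.
WHISPER_MODELS = {
    "tiny": "openai/whisper-tiny",
    "base": "openai/whisper-base",
    "small": "openai/whisper-small",
}


def model_id_for(size: str) -> str:
    """Return the checkpoint name for a size label such as 'Tiny'."""
    key = size.strip().lower()
    if key not in WHISPER_MODELS:
        raise ValueError(f"Unsupported model size: {size!r}")
    return WHISPER_MODELS[key]


def transcribe(audio_path: str, size: str = "tiny") -> str:
    """Transcribe one audio file with the chosen model size.

    Requires the `transformers` package, a backend such as PyTorch, and
    ffmpeg for decoding; the model is downloaded on first use.
    """
    # Imported here so the helpers above work without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id_for(size),
        chunk_length_s=30,  # split long recordings into 30-second chunks
    )
    return asr(audio_path)["text"]
```

Calling transcribe("interview.wav") would start with the smallest model; passing size="base" or size="small" trades speed for likely higher accuracy, as described above.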
Please only use this transcriber if you require the result to follow the VARIAGE transcription guidelines. If not, the regular Whinter interface will return the finished transcription faster.
Choose an audio file to upload and your desired Whisper model. Please be patient! It might take a while for your file to be fully transcribed.
Compatible file formats: WAV, MP3, MP4, FLAC, M4A, OGG, WMA, AAC
Compatible file size: Max. 50 MB
Languages: Whisper automatically detects the language of your audio file. Available languages and more information can be found here.
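If you want to check a file against the format and size limits before uploading, a small script along these lines may help. It is only a sketch: it assumes the last listed format is AAC, that 50 MB means 50 × 1024 × 1024 bytes, and that the format is judged by file extension alone; the name check_upload is hypothetical and not part of Whinter.

```python
from pathlib import Path

# Extensions taken from the list of compatible formats (last entry read as AAC).
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".mp4", ".flac", ".m4a", ".ogg", ".wma", ".aac"}

# Assumption: "50 MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_BYTES = 50 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"Unsupported file format: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"File is larger than 50 MB ({size_bytes} bytes)")
    return problems
```

For a file on disk, the size can be read with Path("recording.mp3").stat().st_size and passed as size_bytes.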
This service is funded through Prof. Dr. Simone Pfenninger at the English Department of the University of Zurich. The web application was created and is currently maintained by Sarah Baur. For any questions or inquiries, please contact sarah.baur@uzh.ch.
The source code of this application is available on GitHub.
If you use this application or its source code in your research, please cite as:
Baur, S., & Pfenninger, S. (2025). Whinter: A Whisper user interface for automatic transcriptions [Computer software]. GitHub. Retrieved from https://github.com/your-username/whisper-audio-transcription