Uncloudy voice assistant
How can we get automatic subtitling for an interview, a conference call, a gathering, or a conversation in a counter-cloud way?
For example, in the case of counter-cloud video conferencing software like BBB (BigBlueButton): what happens when somebody complains about the lack of seamlessness?
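One counter-cloud approach is to run the recognition locally, on the machine that joins the call, instead of streaming audio to a vendor's cloud. A minimal live-captioning sketch, assuming the vosk and sounddevice Python packages and a Vosk model downloaded into a local folder called model (the folder name and the 16 kHz settings are assumptions, not anything BBB provides):

```python
import json
import queue

import sounddevice as sd
from vosk import Model, KaldiRecognizer

q = queue.Queue()

def callback(indata, frames, time, status):
    # push raw microphone bytes into a queue for the recognizer
    q.put(bytes(indata))

model = Model("model")  # path to a downloaded Vosk model (assumption)
rec = KaldiRecognizer(model, 16000)

# capture 16 kHz mono audio from the default input device
with sd.RawInputStream(samplerate=16000, blocksize=8000, dtype="int16",
                       channels=1, callback=callback):
    while True:
        if rec.AcceptWaveform(q.get()):
            # a finished phrase; it could be pasted by hand into the
            # session's chat or shared notes
            print(json.loads(rec.Result())["text"])
```

Pasting the printed lines into the chat by hand is exactly the kind of hybrid, not-quite-seamless workflow the rest of these notes circle around.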
Some thoughts on machine listening:
Machine listening has been technically supported and developed through legal, state and military cases and initiatives. Clouds and voice databases are also rooted in this history.
We can unpack the processes of speech recognition (interviewing, transcribing, listening, training, listening for patterns, predicting) and follow hybrid ways of automated subtitling that combine computational and non-computational practices.
Some algorithms
Speech recognition tools
- Vosk: https://alphacephei.com/vosk/install (see the transcription sketch below)
- otter.ai: https://otter.ai/
- Speech_recognition with Pocketsphinx: https://gitlab.com/nglk/speaking_with_the_machine/-/blob/master/ttssr2_transcribe.py
- Livespeech with Pocketsphinx: https://pypi.org/project/pocketsphinx/
```python
from pocketsphinx import LiveSpeech

for phrase in LiveSpeech():
    # how to pause the mic occasionally so the live speech print appears more often?
    # check timer thread
    print(phrase)
```
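As referenced above, Vosk can also transcribe a recording after the fact, e.g. an interview exported from a BBB session. A sketch assuming a 16 kHz mono PCM WAV named interview.wav and a model folder named model (both names are assumptions):

```python
import json
import wave

from vosk import Model, KaldiRecognizer

wf = wave.open("interview.wav", "rb")  # 16 kHz mono PCM WAV (assumption)
model = Model("model")                 # path to a downloaded Vosk model
rec = KaldiRecognizer(model, wf.getframerate())

while True:
    data = wf.readframes(4000)
    if len(data) == 0:
        break
    if rec.AcceptWaveform(data):
        # a completed chunk of recognised speech
        print(json.loads(rec.Result())["text"])

# flush whatever the decoder is still holding
print(json.loads(rec.FinalResult())["text"])
```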
Experiments
- Gossip Booth: recognise words and keep only the ones with no meaning, like vocal expressions (see the filtering sketch below).
- Transcribe only phonemes with Coqui STT and CMU Sphinx (see the phoneme sketch below).
- Radioactive Monstrosities: voice upload and vocal transformation in the browser, using the Web Audio API.
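A possible shape for the Gossip Booth filter from the list above: drop every recognised token that has a dictionary meaning and keep the rest. The filler set and the toy vocabulary are assumptions; a real run could load a system word list instead:

```python
FILLERS = {"uh", "um", "mm", "hmm", "ah", "oh", "haha", "tsk"}

def keep_meaningless(transcript, vocabulary):
    """Keep only tokens with no dictionary meaning:
    fillers and out-of-vocabulary vocal expressions."""
    kept = []
    for token in transcript.lower().split():
        if token in FILLERS or token not in vocabulary:
            kept.append(token)
    return kept

# toy vocabulary; a real run could read /usr/share/dict/words
vocabulary = {"the", "weather", "is", "nice", "today"}
print(keep_meaningless("um the weather is uh pfff nice today hmm", vocabulary))
# -> ['um', 'uh', 'pfff', 'hmm']
```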
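For the phoneme experiment, CMU Sphinx has an allphone mode that decodes straight to phonemes instead of words. A sketch assuming the older pocketsphinx Python wrapper (the same package that provides LiveSpeech above) and its US-English phonetic language model; the file names and tuning values follow that wrapper's phoneme-recognition example and may need adjusting:

```python
import os

from pocketsphinx import Pocketsphinx, get_model_path

# allphone swaps the word-level search for a phoneme-level one;
# en-us-phone.lm.bin is the phonetic language model shipped with
# the upstream pocketsphinx models (path is an assumption)
ps = Pocketsphinx(
    allphone=os.path.join(get_model_path(), 'en-us-phone.lm.bin'),
    lw=2.0,
    beam=1e-10,
    pbeam=1e-10,
)
ps.decode(audio_file='speech.raw')  # 16 kHz mono raw PCM (assumption)
print(ps.segments())  # e.g. ['SIL', 'hh', 'ax', 'l', 'ow', 'SIL']
```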
Older threads:
- https://pzwiki.wdka.nl/mediadesign/User:Angeliki/Ttssr-Speech_Recognition_Iterations
- classes with Amy
- workshop in Leipzig
- scripts with WordMord
- workshop and references with Jon
- Radioactive
References:
- https://github.com/lowerquality/gentle/pulse
- Meeting/presentation/discussion with Xiaochang Li at De Krook, Ghent. Notes documentation: https://mensuel.framapad.org/p/algolit_documentation_190524 ; notes on Xiaochang Li: https://mensuel.framapad.org/p/algolit_xiaochang_li_190524 ; https://pubmed.ncbi.nlm.nih.gov/31231075/