Python voice recognition library - always listen?

I have recently been using a speech recognition library in Python to launch applications. Eventually I intend to build a voice-activated home automation library using the Raspberry Pi GPIO.

It works: it detects my voice and launches the application. The problem is that it seems to hold onto one word that I say (for example, I say "Internet" and it starts Chrome over and over, without end).

This is unusual behavior compared with what I've seen from loops before, and I cannot figure out how to stop it. Do I need to do something inside the loop for it to work correctly? See the code below.

http://pastebin.com/auquf1bR

import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
        audio = r.listen(source)

def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction():
        user = r.recognize(audio)
        print(user)
        if user == "Excel":
                excel()
        elif user == "Internet":
                internet()
        elif user == "music":
                media()
while 1:
        mainfunction()

      



3 answers


The problem is that you only listen for speech once, at the beginning of the program, and then repeatedly call recognize on the same stored audio. Move the code that actually listens for speech into the while loop:



import pyaudio, os
import speech_recognition as sr


def excel():
    os.system("start excel.exe")

def internet():
    os.system("start chrome.exe")

def media():
    os.system("start wmplayer.exe")

def mainfunction(source):
    # Record a fresh utterance on every call instead of reusing old audio
    audio = r.listen(source)
    # Note: SpeechRecognition 3.x renamed recognize() to recognize_google()
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()

if __name__ == "__main__":
    r = sr.Recognizer()
    # Keep the microphone open for the lifetime of the loop
    with sr.Microphone() as source:
        while 1:
            mainfunction(source)
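
As a rough sketch of the same idea (not the only way to do it), the loop can be separated from the microphone so that it keeps running when speech is not understood, and so that matching does not depend on capitalization. Everything named here (COMMANDS, command_for, listen_loop, and the injected listen, recognize and launch callables) is hypothetical and only for illustration. The default of ignoring LookupError assumes the older recognize() behavior; with newer SpeechRecognition releases you would likely pass sr.UnknownValueError instead.

```python
# Hypothetical command table: spoken phrase (lower-cased) -> shell command.
COMMANDS = {
    "excel": "start excel.exe",
    "internet": "start chrome.exe",
    "music": "start wmplayer.exe",
}


def command_for(phrase, commands=COMMANDS):
    """Return the shell command for a recognized phrase, or None."""
    if not phrase:
        return None
    # Ignore surrounding whitespace and capitalization when matching
    return commands.get(phrase.strip().lower())


def listen_loop(listen, recognize, launch, max_rounds=None,
                ignored_errors=(LookupError,)):
    """Capture fresh audio on every pass, then recognize and dispatch it.

    listen, recognize and launch are injected callables (assumptions of
    this sketch) so the loop can be exercised without a microphone.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        audio = listen()  # new recording each iteration, never reused
        try:
            phrase = recognize(audio)
        except ignored_errors:
            continue  # unintelligible speech: just listen again
        command = command_for(phrase)
        if command is not None:
            launch(command)
```

With the code from this answer, the wiring would be roughly listen_loop(lambda: r.listen(source), r.recognize, os.system) inside the with sr.Microphone() as source: block; max_rounds is left as None to loop forever.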

      



Just in case, here is an example of how to listen continuously for a keyword with pocketsphinx. It is easier than continuously sending audio to Google, and it may give you a more flexible solution.



import os
import pyaudio
from pocketsphinx import *

modeldir = "/usr/local/share/pocketsphinx/model"
# Create a decoder with a specific acoustic model and dictionary
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'hmm/en_US/hub4wsj_sc_8k'))
config.set_string('-dict', os.path.join(modeldir, 'lm/en_US/cmu07a.dic'))
# Keyphrase to spot, and its detection threshold
config.set_string('-keyphrase', 'oh mighty computer')
config.set_float('-kws_threshold', 1e-40)

decoder = Decoder(config)
decoder.start_utt('spotting')

# Open the default microphone: 16 kHz, mono, 16-bit samples
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    hyp = decoder.hyp()
    if hyp is not None and hyp.hypstr == 'oh mighty computer':
        print("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt('spotting')

      



I've spent a lot of time on this topic.

I am currently developing an open-source, cross-platform virtual assistant program for Python 3 called Athena Voice: https://github.com/athena-voice/athena-voice-client

Users can use it in the same way as Siri, Cortana, or Amazon Echo.

It also uses a very simple "modular" system in which users can easily create their own modules to enhance its functionality. Let me know if this might be helpful.

Otherwise, I recommend looking into Pocketsphinx and the Google Python packages for speech-to-text and text-to-speech.

In Python 3.4, Pocketsphinx can be installed with:

pip install pocketsphinx

      

However, you have to install the PyAudio dependency separately (unofficial download): http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio

Both Google packages can be installed using the command:

pip install SpeechRecognition gTTS

      

Google STT: https://pypi.python.org/pypi/SpeechRecognition/

Google TTS: https://pypi.python.org/pypi/gTTS/1.0.2

Pocketsphinx should be used for offline wake-up recognition, and Google STT should be used for active listening.
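
One way to picture that split is a two-stage loop: block on an offline wake-word detector, then capture and transcribe a single command online. The sketch below only shows the control flow. The names assistant_loop, wait_for_wake_word, transcribe_command and handle are hypothetical, and the callables are stand-ins for whatever Pocketsphinx and Google STT code you actually use.

```python
def assistant_loop(wait_for_wake_word, transcribe_command, handle,
                   max_wakes=None):
    """Alternate between offline wake-word spotting and active listening.

    All three callables are assumptions of this sketch:
      wait_for_wake_word() blocks until the keyphrase is heard (offline),
      transcribe_command() records one utterance and returns its text,
      handle(text) acts on the transcribed command.
    """
    wakes = 0
    while max_wakes is None or wakes < max_wakes:
        wait_for_wake_word()         # cheap, local, always running
        wakes += 1
        text = transcribe_command()  # online STT only after a wake-up
        if text:                     # skip empty or failed transcriptions
            handle(text)
```

The point of this design is that audio only leaves the machine after the wake-up phrase, so the online recognizer is called once per command rather than continuously.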
