Connecting to CMU Sphinx using PHP

I am studying speech recognition and the ways it can be implemented on a website. I've found many examples of using it with Python and even Node.js, but I want to be able to use PHP for this.

Is there any way I can access CMUSphinx on a Linux server using PHP to handle my inputs?

Thanks

+3




2 answers


It can be done, but use Asterisk as the audio capture and processing mechanism. See http://www.voip-info.org/wiki/view/Sphinx

Sample code, for use after the server is set up:

    /* Meant to live inside an AGI class: it relies on $this->request,
       $this->stream_file() and $this->record_file(). */
    function sphinx($filename = '', $timeout = 3000, $service_port = 1069, $address = '127.0.0.1') {

        /* If a recording has not been passed in, we create one. */
        if ($filename == '') {
            $filename = "/var/lib/asterisk/sounds/sphinx_" . $this->request['agi_uniqueid'];
            $extension = "wav";
            $this->stream_file('beep', 3000, 5);
            $this->record_file($filename, $extension, '0', $timeout);
            $filename = $filename . '.' . $extension;
        }

        /* Create a TCP/IP socket. socket_create() returns false on failure. */
        $socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
        if ($socket === false) {
            return false;
        }

        /* socket_connect() also returns false on failure. */
        if (socket_connect($socket, $address, $service_port) === false) {
            socket_close($socket);
            return false;
        }

        /* Read the recording into memory. */
        $data = file_get_contents($filename);
        if ($data === false) {
            socket_close($socket);
            return false;
        }

        /* Send the size on its own line, then the raw audio. */
        socket_write($socket, strlen($data) . "\n");
        socket_write($socket, $data);

        /* Read the recognition result (up to 2048 bytes). */
        $response = socket_read($socket, 2048);

        socket_close($socket);

        /* Remove the recording once it has been processed. */
        unlink($filename);
        return $response;
    }

      



Another thought after browsing the website: Sphinx 4 appears to allow the recognizer to run as a daemon (it's Java!). You could then open a socket as shown above and feed the .wav file directly into it, using a modified version of the code above. Instead of having the Asterisk server record the audio, you would record it some other way, for example with HTML5.

Another thing to consider is that Chrome has speech recognition built in through HTML5.
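As a rough sketch of that modification, the function below drops the Asterisk calls and sends an already recorded .wav file to a daemon. It assumes the daemon speaks the same framing as the sample above (the byte length on its own line, then the raw audio, then a short text reply). The names `frameAudio` and `recognizeWav` and the default host and port are illustrative, not part of Sphinx.

```php
<?php
// Frame a payload as "<byte length>\n<raw bytes>", the framing used by the sample above.
function frameAudio(string $data): string
{
    return strlen($data) . "\n" . $data;
}

// Send a WAV file to a recognition daemon and return its reply, or null on failure.
// Host, port, and the reply format are assumptions about the daemon.
function recognizeWav(string $path, string $host = '127.0.0.1', int $port = 1069, float $timeout = 5.0): ?string
{
    $data = @file_get_contents($path);
    if ($data === false) {
        return null;
    }

    $conn = @stream_socket_client("tcp://$host:$port", $errno, $errstr, $timeout);
    if ($conn === false) {
        return null;
    }
    stream_set_timeout($conn, (int) ceil($timeout));

    // fwrite() on a network stream may write only part of the buffer, so loop.
    $payload = frameAudio($data);
    $length = strlen($payload);
    for ($sent = 0; $sent < $length; $sent += $written) {
        $written = fwrite($conn, substr($payload, $sent));
        if ($written === false || $written === 0) {
            fclose($conn);
            return null;
        }
    }

    // Read up to 2048 bytes of reply, as the sample above does.
    $reply = fread($conn, 2048);
    fclose($conn);

    return ($reply === false || $reply === '') ? null : trim($reply);
}
```

A page that receives an upload could then call something like `recognizeWav($_FILES['audio']['tmp_name'])`, where the field name `audio` is again only an example.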

+1




The architecture of such a system depends on the kind of audio you want to process. If the recording is long, you can simply save it to a temporary file and call pocketsphinx_continuous as an external tool to handle it:

http://php.net/manual/en/function.shell-exec.php



You call

    pocketsphinx_continuous -infile file.wav > decode-result.txt

and it gives you the result to display. The problem with this approach is that decoder initialization takes time, so it is not a good fit for short files.
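From PHP, that call could look roughly like the sketch below. It assumes `pocketsphinx_continuous` is on the server's PATH and that it prints the recognized text to standard output; the function names are made up for this example, and log output on standard error is simply discarded.

```php
<?php
// Build the shell command. escapeshellarg() keeps unusual file names from breaking the shell.
function buildDecodeCommand(string $wavPath, string $binary = 'pocketsphinx_continuous'): string
{
    return escapeshellcmd($binary) . ' -infile ' . escapeshellarg($wavPath) . ' 2>/dev/null';
}

// Run the decoder on a file and return whatever it printed, or null on failure.
function decodeFile(string $wavPath): ?string
{
    if (!is_readable($wavPath)) {
        return null;
    }
    // shell_exec() returns the command's standard output as a string.
    $output = shell_exec(buildDecodeCommand($wavPath));
    return is_string($output) ? trim($output) : null;
}
```

Capturing the output directly through `shell_exec` avoids the intermediate decode-result.txt file, but the start-up cost of the decoder is paid on every request.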

If you want to process short samples or want to stream audio, you need some kind of server that loads the models once and then waits for requests. There are various ways to implement it, from a simple hand-written server that listens on a TCP port and accepts the data over a simple protocol, to more complex solutions like http://unimrcp.org
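To make "a simple protocol" concrete, here is a sketch of the receiving side of one possible framing: the byte length on its own line, followed by the raw audio. This framing is an assumption, not something Sphinx defines, and the function name and size limit are illustrative. The decoder itself is left out; a real server would load the models once and hand each received buffer to it.

```php
<?php
// Read one request framed as "<byte length>\n<raw bytes>" from an open stream.
// Returns the audio bytes, or null if the frame is malformed, too large, or truncated.
function readFramedAudio($stream, int $maxBytes = 10485760): ?string
{
    // The header line carries the payload size as decimal digits.
    $header = fgets($stream, 32);
    if ($header === false || !ctype_digit(trim($header))) {
        return null;
    }

    $length = (int) trim($header);
    if ($length > $maxBytes) {
        return null;
    }

    // Keep reading until the announced number of bytes has arrived.
    $audio = '';
    while (strlen($audio) < $length && !feof($stream)) {
        $chunk = fread($stream, min(8192, $length - strlen($audio)));
        if ($chunk === false || $chunk === '') {
            break;
        }
        $audio .= $chunk;
    }

    return strlen($audio) === $length ? $audio : null;
}
```

The stream could come from `stream_socket_accept()` on a socket opened with `stream_socket_server()`; any other readable stream works the same way.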

0








