PyAudio

root@edison:~# pip install --upgrade SpeechRecognition
root@edison:~# nano main.py
import os
import subprocess
import wave

import mraa

# Audio capture and PocketSphinx decoding
import pyaudio
import pocketsphinx as ps
import sphinxbase

# PocketSphinx parameters
LMD = "configuration/4842.lm"
DICTD = "configuration/4842.dic"

# Recording parameters
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
RECORD_SECONDS = 3
PATH = 'vcreg'

# LED on pin 13
led = mraa.Gpio(13)
led.dir(mraa.DIR_OUT)

print("Starting")
while True:
    p = pyaudio.PyAudio()
    speech_rec = ps.Decoder(lm=LMD, dict=DICTD)

    # Record audio
    stream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE,
                    input=True, frames_per_buffer=CHUNK)
    print("* recording")
    frames = []
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    print("* done recording")
    stream.stop_stream()
    stream.close()
    p.terminate()

    # Write .wav file
    fn = "test.wav"
    #wf = wave.open(os.path.join(PATH, fn), 'wb')
    wf = wave.open(fn, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(pyaudio.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()

    # Decode speech
    #wav_file = os.path.join(PATH, fn)
    with open(fn, 'rb') as wav_file:
        wav_file.seek(44)  # skip the 44-byte WAV header
        speech_rec.decode_raw(wav_file)
    result = speech_rec.get_hyp()
    recognised = result[0]

    print("* LED section begins")
    print(recognised)
    if recognised == 'ON.':
        led.write(1)
        print("Servo 1")
    else:
        led.write(0)
        print("Servo 0")
    # Speak the result; an argument list avoids shell quoting problems.
    # Assumption: the hypothesis may be empty when nothing is recognised.
    if recognised:
        subprocess.call(['espeak', recognised])
root@edison:~# python main.py
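
The script skips the WAV header with a fixed `seek(44)` before decoding. As a minimal sketch, the standard `wave` module can return the raw PCM frames directly, so the code does not rely on the header being exactly 44 bytes. The helper name `read_pcm_frames` and the file name `demo.wav` are hypothetical. Whether a given PocketSphinx binding accepts these bytes is not covered here, so only the file handling is shown.

```python
import os
import tempfile
import wave


def read_pcm_frames(path):
    """Return (raw PCM bytes, sample rate) read from a WAV file."""
    wf = wave.open(path, 'rb')
    try:
        return wf.readframes(wf.getnframes()), wf.getframerate()
    finally:
        wf.close()


# Build a short silent 16 kHz, mono, 16-bit file to demonstrate.
demo_path = os.path.join(tempfile.gettempdir(), "demo.wav")
wf = wave.open(demo_path, 'wb')
wf.setnchannels(1)
wf.setsampwidth(2)
wf.setframerate(16000)
wf.writeframes(b'\x00\x00' * 1600)  # 0.1 s of silence
wf.close()

pcm, rate = read_pcm_frames(demo_path)
print(len(pcm), rate)
```

For a plain PCM file written by `wave`, the bytes returned here are the same bytes that follow the 44-byte header, so the two approaches agree on files like `test.wav`.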

Errors

>>> buf = stream.read(1024)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/pyaudio.py", line 608, in read
    return pa.read_stream(self._stream, num_frames, exception_on_overflow)
TypeError: function takes exactly 2 arguments (3 given)
>>> buf = stream.read(1024)
INFO: ngram_search_fwdtree.c(338): after: 3 root, 4 non-root channels, 11 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
* recording
Traceback (most recent call last):
  File "ts.py", line 34, in <module>
    data = stream.read(CHUNK)
  File "/usr/lib/python2.7/site-packages/pyaudio.py", line 608, in read
    return pa.read_stream(self._stream, num_frames, exception_on_overflow)
IOError: [Errno -9981] Input overflowed
INFO: ngram_search_fwdtree.c(430): TOTAL fwdtree 0.00 CPU -nan xRT
INFO: ngram_search_fwdtree.c(433): TOTAL fwdtree 0.00 wall -nan xRT
INFO: ngram_search_fwdflat.c(174): TOTAL fwdflat 0.00 CPU -nan xRT
INFO: ngram_search_fwdflat.c(177): TOTAL fwdflat 0.00 wall -nan xRT
INFO: ngram_search.c(317): TOTAL bestpath 0.00 CPU -nan xRT
INFO: ngram_search.c(320): TOTAL bestpath 0.00 wall -nan xRT
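
The second traceback ends in `IOError: [Errno -9981] Input overflowed`, raised from `stream.read`. The traceback also shows `pyaudio.py` forwarding an `exception_on_overflow` argument, so one possible workaround, sketched below under that assumption, is to pass `exception_on_overflow=False` and to drop any buffer that still raises. The helper `record_chunks` and the `FakeStream` class are hypothetical names used so the example runs without a microphone. The source does not confirm that this resolves the error on the Edison.

```python
def record_chunks(stream, n_chunks, chunk=1024):
    """Read n_chunks buffers from stream, tolerating input overflows."""
    frames = []
    for _ in range(n_chunks):
        try:
            # Assumption: the installed PyAudio accepts this keyword.
            data = stream.read(chunk, exception_on_overflow=False)
        except IOError:
            # Fallback: drop the overflowed buffer and keep recording.
            continue
        frames.append(data)
    return frames


class FakeStream(object):
    """Hypothetical stand-in for a PyAudio input stream."""

    def __init__(self, fail_on=()):
        self.calls = 0
        self.fail_on = set(fail_on)

    def read(self, num_frames, exception_on_overflow=True):
        self.calls += 1
        if self.calls in self.fail_on:
            # Simulate an overflow to exercise the fallback path.
            raise IOError(-9981, "Input overflowed")
        return b'\x00\x00' * num_frames  # 16-bit mono silence


frames = record_chunks(FakeStream(fail_on=[2]), n_chunks=4, chunk=8)
print(len(frames), len(frames[0]))
```

With a real stream, the recording loop in `main.py` would become `frames = record_chunks(stream, int(RATE / CHUNK * RECORD_SECONDS), CHUNK)`.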

Reboot


Last updated 7 years ago