SpeechRecognition


Last updated 7 years ago


SpeechRecognition is a Python library for performing speech recognition, with support for several engines and APIs, both online and offline.

Speech recognition engine/API support:

  • CMU Sphinx (works offline)

  • Google Speech Recognition

  • Google Cloud Speech API

  • Wit.ai

  • Microsoft Bing Voice Recognition

  • Houndify API

  • IBM Speech to Text

Install the library and the Wit.ai client with pip:

root@edison:~# pip install SpeechRecognition
root@edison:~# pip install wit

Examples

See the examples/ directory in the repository root for usage examples:

  • Recognize speech input from the microphone

  • Transcribe an audio file

  • Save audio data to an audio file

  • Show extended recognition results

  • Calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details)

  • Listening to a microphone in the background

  • Various other useful recognizer features
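Two of the items above, saving audio to a file and transcribing an audio file, can be sketched together. This is a minimal sketch and not one of the repository's examples: the helper names `write_test_tone` and `transcribe_wav` are hypothetical, the generated tone is only a stand-in for recorded speech, and `transcribe_wav` assumes that SpeechRecognition and PocketSphinx are installed.

```python
import math
import struct
import wave


def write_test_tone(path, seconds=1.0, rate=16000, freq=440.0):
    """Write a mono 16-bit PCM WAV file containing a sine tone (a stand-in for speech)."""
    n_frames = int(seconds * rate)
    samples = (int(12000 * math.sin(2 * math.pi * freq * i / rate))
               for i in range(n_frames))
    data = b"".join(struct.pack("<h", s) for s in samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(data)
    return n_frames


def transcribe_wav(path):
    """Transcribe a WAV file offline with CMU Sphinx (hypothetical helper)."""
    # Imported lazily so write_test_tone works even without the library installed.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = r.record(source)  # read the entire audio file
    return r.recognize_sphinx(audio)
```

The generated tone contains no speech, so it will not yield meaningful text; point `transcribe_wav` at a real recording to try it out.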

root@edison:~# nano main.py
#!/usr/bin/python

# NOTE: this example requires PyAudio because it uses the Microphone class,
# and PocketSphinx because it calls recognize_sphinx

import speech_recognition as sr

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

# recognize speech using Sphinx
try:
    print("Sphinx thinks you said " + r.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx could not understand audio")
except sr.RequestError as e:
    print("Sphinx error; {0}".format(e))

# recognize speech using Google Speech Recognition
try:
    # for testing purposes, we're just using the default API key
    # to use another API key, use `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
    # instead of `r.recognize_google(audio)`
    print("Google Speech Recognition thinks you said " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))
root@edison:~# python main.py
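The script above always queries both engines. If the goal is to prefer the offline engine and contact an online service only when it fails, a small fallback helper is one option. This is a sketch under assumptions: `first_successful` is a hypothetical helper and not part of SpeechRecognition. It takes the recognizer methods as plain callables, so it can be exercised without a microphone.

```python
def first_successful(audio, recognizers, errors=(Exception,)):
    """Return (label, text) from the first recognizer that succeeds.

    recognizers: list of (label, callable) pairs, tried in order.
    errors: exception types that mean "try the next engine".
    """
    failures = []
    for label, recognize in recognizers:
        try:
            return label, recognize(audio)
        except errors as e:
            failures.append("{0}: {1!r}".format(label, e))
    raise RuntimeError("all recognizers failed: " + "; ".join(failures))


# Usage with the objects from main.py (assumes `sr`, `r` and `audio` already exist):
# label, text = first_successful(
#     audio,
#     [("Sphinx", r.recognize_sphinx), ("Google", r.recognize_google)],
#     errors=(sr.UnknownValueError, sr.RequestError),
# )
# print("{0} thinks you said {1}".format(label, text))
```

Listing Sphinx first keeps recognition on the device whenever possible; reversing the order would favor the online service instead.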

Another Example

root@edison:~# nano main.py
import speech_recognition as sr

r = sr.Recognizer()
m = sr.Microphone()

try:
    print("A moment of silence, please...")
    with m as source:
        r.adjust_for_ambient_noise(source)
    print("Set minimum energy threshold to {}".format(r.energy_threshold))
    while True:
        print("Say something!")
        with m as source:
            audio = r.listen(source)
        print("Got it! Now to recognize it...")
        try:
            # recognize speech using Google Speech Recognition
            value = r.recognize_google(audio)

            # we need some special handling here to correctly print unicode characters to standard output
            if str is bytes:  # this version of Python uses bytes for strings (Python 2)
                print(u"You said {}".format(value).encode("utf-8"))
            else:  # this version of Python uses unicode for strings (Python 3+)
                print("You said {}".format(value))
        except sr.UnknownValueError:
            print("Oops! Didn't catch that")
        except sr.RequestError as e:
            print("Uh oh! Couldn't request results from Google Speech Recognition service; {0}".format(e))
except KeyboardInterrupt:
    pass
root@edison:~# python main.py
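The example above prints the energy threshold chosen after listening to ambient noise. To give a feel for what such a threshold means, here is a simplified illustration using only the standard library. It is not the library's implementation: the helpers `rms` and `is_loud_enough` are hypothetical, and the sketch assumes 16-bit little-endian mono PCM audio.

```python
import math
import struct


def rms(frame_bytes):
    """Root-mean-square amplitude of a chunk of 16-bit little-endian mono PCM audio."""
    count = len(frame_bytes) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<{0}h".format(count), frame_bytes[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / float(count))


def is_loud_enough(frame_bytes, energy_threshold):
    """True if the chunk's energy exceeds the threshold, so it might be speech."""
    return rms(frame_bytes) > energy_threshold


# Two tiny hand-made chunks: pure silence and a loud alternating signal.
silence = struct.pack("<4h", 0, 0, 0, 0)
loud = struct.pack("<4h", 1000, -1000, 1000, -1000)
print(is_loud_enough(silence, 300), is_loud_enough(loud, 300))
```

A higher threshold ignores more background noise but risks missing quiet speech, which is why calibrating against the actual room before listening is useful.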