jarvis python code - NFHN Reader

Python hosting: Host, run, and code Python in the cloud!

I thought it would be cool to create a personal assistant in Python. If you are into movies you may have heard of Jarvis, an A.I. based character in the Iron Man films. In this tutorial we will create a robot.

The features I want to have are:

Recognize spoken voice (Speech recognition)
Answer in spoken voice (Text to speech)
Answer simple commands

For this tutorial you will need (Ubuntu) Linux, Python and a working microphone.

Related course:

Master Computer Vision with OpenCV

Video

This is what you’ll create (watch the whole video, demo at the end):

Recognize spoken voice

Speech recognition can by done using the Python SpeechRecognition module. We make use of the Google Speech API because of it’s great quality.

Answer in spoken voice (Text To Speech)

Various APIs and programs are available for text to speech applications. Espeak and pyttsx work out of the box but sound very robotic. We decided to go with the Google Text To Speech API, gTTS.


sudo pip install gTTS

Using it is as simple as:


from gtts import gTTS
import os
tts = gTTS(text='Hello World', lang='en')
tts.save("hello.mp3")
os.system("mpg321 hello.mp3")

Complete program

The program below will answer spoken questions.





import speech_recognition as sr
from time import ctime
import time
import os
from gtts import gTTS

def speak(audioString):
    print(audioString)
    tts = gTTS(text=audioString, lang='en')
    tts.save("audio.mp3")
    os.system("mpg321 audio.mp3")

def recordAudio():
    
    r = sr.Recognizer()
    with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

    
    data = ""
    try:
        
        
        data = r.recognize_google(audio)
        print("You said: " + data)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))

    return data

def jarvis(data):
    if "how are you" in data:
        speak("I am fine")

    if "what time is it" in data:
        speak(ctime())

    if "where is" in data:
        data = data.split(" ")
        location = data[2]
        speak("Hold on Frank, I will show you where " + location + " is.")
        os.system("chromium-browser https://www.google.nl/maps/place/" + location + "/&amp;")


time.sleep(2)
speak("Hi Frank, what can I do for you?")
while 1:
    data = recordAudio()
    jarvis(data)

Robotics, computer vision and awesome stuff

Speech Engines (TTS)

Speech Recognition

Related course:

Video

Recognize spoken voice

Answer in spoken voice (Text To Speech)

Complete program

Related posts: