Watson Speech: Creating Voice Interfaces Using Speech APIs (W7L168G)

Overview

Voice technologies are rapidly reshaping how businesses interact with customers, automate operations, and analyze conversations. Today’s AI systems can deliver fast, accurate, and natural-sounding speech capabilities across multiple languages, supporting use cases such as customer self-service, live agent assistance, real-time analytics, and more.

In this course, learners will explore how to apply IBM Watson Speech to Text and Watson Text to Speech to build voice-enabled solutions tailored to their unique business needs.

By the end of the course, participants will have the skills to design, customize, and integrate AI-powered voice solutions that deliver seamless, human-like interactions across multiple platforms.

Audience

This course is intended for:

  • Anyone looking to automate speech transcription and synthesis by using Watson Speech to Text and Text to Speech technology
  • Practicing AI specialists looking to add speech capabilities to their existing AI-powered services, such as AI assistants and AI agents
  • Practicing data scientists looking to get insights from speech and text analysis
  • Business leaders looking to understand the capabilities of Watson Speech to Text and Text to Speech, and apply this technology to solve related domain problems
  • Anyone looking to understand the process of integrating speech-to-text or text-to-speech with an AI assistant or AI agent 

Prerequisites

Before taking this course, you should have:

  • Basic Python programming skills
  • Basic knowledge of RESTful APIs
  • General familiarity with IBM Cloud, and an IBM Cloud account

Objectives

After completing this course, you should be able to:

  • Explain the value of speech recognition and common Watson Speech business use cases
  • Leverage the Watson Speech to Text API to build a simple working prototype that transcribes speech to text for a business problem
  • Leverage the Watson Text to Speech API to build a simple working prototype that synthesizes text to speech for a business problem
  • Leverage the watsonx.ai API to integrate IBM Granite LLMs for language translation
  • Infuse voice capabilities into an AI assistant built with watsonx Assistant by integrating it with Watson Speech 

Course Outline

  • Unit 1. Introduction to Speech Transcription, Synthesis and the Watson Speech services
  • Unit 2. In-Depth Exploration of Watson Speech to Text
      • Exercise A: Prepare the Lab Environment for Speech to Text
      • Exercise B: Transcribe with the English US Telephony Model
      • Exercise C: Train a Language Model
      • Exercise D: Using Grammars
      • Exercise E: Language Translation with a Granite LLM
      • Exercise F: Additional Speech to Text Features
  • Unit 3. In-Depth Exploration of Watson Text to Speech
      • Exercise A: Prepare the Lab Environment for Text to Speech
      • Exercise B: Synthesis Using a Standard English Model
      • Exercise C: Customizing Models
      • Exercise D: Additional Text to Speech Features
  • Unit 4. Adding a Voice Interface with Watson Speech Services
      • Exercise: Integrating watsonx Assistant with Watson Speech