Phelma Formation 2022

Speech Processing (SICOM-SIGMA S9) - 5PMSPAR0

  • Number of hours

    • Lectures 8.0
    • Projects 0
    • Tutorials 8.0
    • Internship 0
    • Laboratory works 4.0

  • ECTS 2.0

Goal(s)

This series of lectures covers the fundamentals of automatic speech processing, including speech production and perception, acoustic phonetics, speech signal analysis and transformation, automatic speech recognition, text-to-speech synthesis, and statistical voice conversion.

Contact: Thomas HUEBER

Content(s)

This series of lectures covers the fundamentals of automatic speech processing:

  • Introduction to speech science (speech production/perception, acoustic phonetics)
  • Speech signal analysis (STFT, cepstral analysis, pitch detection, voice transformation)
  • Automatic speech recognition (template matching, HMM-based approach, neural approaches including LSTM, CTC, and seq2seq models with attention)
  • Voice conversion (using Gaussian mixture regression and neural approaches)
  • Text-to-speech synthesis (from unit selection to neural TTS)


Prerequisites

Basics of digital signal processing and machine learning

Test

Exam + lab work

1st session: in-person written exam
2nd session: report on a Python mini-project

Additional Information

Course list
Curriculum->Engineering degree->Semester 9
Curriculum->Double-Diploma Engineer/Master->Semester 9