A hybrid text-to-speech based on sub-band approach

Takuma Inoue, Sunao Hara, Masanobu Abe

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

This paper proposes a sub-band speech synthesis approach for high-quality Text-to-Speech (TTS). Hidden Markov Model (HMM)-based speech synthesis is used for the low-frequency band, and waveform-based speech synthesis is used for the high-frequency band. Both methods are widely known to perform well, and each has benefits and shortcomings from different points of view. One motivation is to apply the right speech synthesis method in the right frequency band. Experimental results show that the proposed approach performs better than waveform-based speech synthesis in terms of smoothness and better than HMM-based speech synthesis in terms of clarity. Consequently, the proposed approach combines the inherent benefits of both waveform-based and HMM-based speech synthesis.
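
The band-splitting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the crossover frequency, the filter design, or how the two synthesizers are aligned, so the function names (`lowpass_taps`, `fir_filter`, `combine_subbands`), the windowed-sinc FIR filter, and the `cutoff_hz` parameter are all assumptions chosen for illustration. The sketch takes two already-synthesized, time-aligned waveforms, keeps the low band of one and the high band of the other, and sums them.

```python
import math


def lowpass_taps(cutoff_hz, sample_rate_hz, num_taps=101):
    """Hamming-windowed sinc low-pass FIR taps, normalised to unit DC gain."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd so the filter delay is an integer")
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(signal, taps):
    """Causal FIR filtering; the output has the same length as the input."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            if i - j >= 0:
                acc += t * signal[i - j]
        out.append(acc)
    return out


def combine_subbands(low_source, high_source, cutoff_hz, sample_rate_hz,
                     num_taps=101):
    """Sum the low band of low_source and the high band of high_source.

    low_source stands in for the HMM-based output and high_source for the
    waveform-based output (hypothetical inputs). The high-pass filter is the
    complement of the low-pass one: delayed input minus its low-passed copy.
    The result is delayed by (num_taps - 1) // 2 samples.
    """
    if len(low_source) != len(high_source):
        raise ValueError("both inputs must have the same length")
    taps = lowpass_taps(cutoff_hz, sample_rate_hz, num_taps)
    delay = (num_taps - 1) // 2
    low_band = fir_filter(low_source, taps)
    high_lowpassed = fir_filter(high_source, taps)
    out = []
    for i in range(len(low_band)):
        delayed = high_source[i - delay] if i >= delay else 0.0
        out.append(low_band[i] + (delayed - high_lowpassed[i]))
    return out
```

Because the two filters are complementary, feeding the same signal to both inputs returns that signal delayed by `(num_taps - 1) // 2` samples, which is a convenient sanity check before mixing two different synthesizers.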

Original language: English
Title of host publication: 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9786163618238
Publication status: Published - Feb 12 2014
Event: 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014 - Chiang Mai, Thailand
Duration: Dec 9 2014 to Dec 12 2014

Other

Other: 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014
Country: Thailand
City: Chiang Mai
Period: 12/9/14 to 12/12/14

ASJC Scopus subject areas

  • Signal Processing
  • Information Systems
