Abstract
This article presents a new algorithm for converting the speech of one speaker so that it sounds like that of another speaker. The algorithm converts voice quality flexibly through two main technical developments. First, formant frequencies and spectral intensity are modified using piecewise linear voice conversion rules, which allows detailed control of the spectrum parameters. The conversion rules are generated automatically for any pair of speakers, and they are reliable because they are generated statistically from training data. Second, the algorithm can produce speech with a desired formant structure by controlling formant frequencies, formant bandwidths and spectral intensity; the speech is modified iteratively until the specified formant structure is achieved. Listening tests show that the proposed algorithm converts speaker individuality while maintaining high speech quality.
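As a rough illustration of what a piecewise linear conversion rule for formant frequencies could look like, the sketch below maps a source-speaker formant frequency to a target-speaker value by interpolating between breakpoints. The breakpoint values and the names `SOURCE_HZ`, `TARGET_HZ` and `convert_formant` are assumptions made for this example; the abstract does not give the actual rules, which the article derives statistically from training data.

```python
from bisect import bisect_right

# Hypothetical breakpoints in Hz, invented for illustration only.
# Each source frequency in SOURCE_HZ maps to the value at the same
# index in TARGET_HZ; frequencies in between are interpolated linearly.
SOURCE_HZ = [0.0, 500.0, 1500.0, 2500.0, 4000.0]
TARGET_HZ = [0.0, 600.0, 1700.0, 2600.0, 4000.0]


def convert_formant(freq_hz):
    """Map one formant frequency through the piecewise linear rule."""
    # Clamp the input to the range covered by the breakpoints.
    freq_hz = min(max(freq_hz, SOURCE_HZ[0]), SOURCE_HZ[-1])
    # Find the segment [x0, x1] that contains the frequency.
    i = min(bisect_right(SOURCE_HZ, freq_hz) - 1, len(SOURCE_HZ) - 2)
    x0, x1 = SOURCE_HZ[i], SOURCE_HZ[i + 1]
    y0, y1 = TARGET_HZ[i], TARGET_HZ[i + 1]
    # Linear interpolation within the segment.
    return y0 + (y1 - y0) * (freq_hz - x0) / (x1 - x0)


# Example: convert three formant frequencies of one speech frame.
converted = [convert_formant(f) for f in (500.0, 1000.0, 2500.0)]
```

A rule of this shape has one slope per segment, so different frequency regions can be shifted by different amounts, which is one way to read the abstract's claim of detailed control over spectrum parameters.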
Original language | English |
---|---|
Pages (from-to) | 153-164 |
Number of pages | 12 |
Journal | Speech Communication |
Volume | 16 |
Issue number | 2 |
Publication status | Published - Feb 1995 |
Externally published | Yes |
Keywords
- Formant frequency
- Listening test
- Piecewise linear
- Spectral intensity
- Spectrum tilt
- Voice conversion
ASJC Scopus subject areas
- Software
- Modelling and Simulation
- Communication
- Language and Linguistics
- Linguistics and Language
- Computer Vision and Pattern Recognition
- Computer Science Applications