Abstract
In this paper a novel speech feature generationbased acoustic model training method is proposed. For decades, speaker adaptation methods have been widely used. All existing adaptation methods need adaptation data. However, our proposed method creates speaker-independent acoustic models that cover not only known but also unknown speakers. We do this by adopting inverse maximum likelihood linear regression (MLLR) transformation-based feature generation, and then train our models using these features. First we obtain MLLR transformation matrices from a limited number of existing speakers. Then we extract the bases of the MLLR transformation matrices using PCA. The distribution of the weight parameters to express the MLLR transformation matrices for the existing speakers are estimated. Next we generate pseudo-speaker MLLR transformations by sampling the weight parameters from the distribution, and apply the inverse of the transformation to the normalized existing speaker features to generate the pseudospeakers' features. Finally, using these features, we train the acoustic models. Evaluation results show that the acoustic models which are created are robust for unknown speakers.
Original language | English |
---|---|
Pages | 726-730 |
Number of pages | 5 |
Publication status | Published - Dec 1 2011 |
Externally published | Yes |
Event | Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 - Xi'an, China Duration: Oct 18 2011 → Oct 21 2011 |
Other
Other | Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 |
---|---|
Country/Territory | China |
City | Xi'an |
Period | 10/18/11 → 10/21/11 |
ASJC Scopus subject areas
- Information Systems
- Signal Processing