A segment-based approach to voice conversion

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

26 Citations (Scopus)

Abstract

A voice conversion algorithm that uses speech segments as conversion units is proposed. Input speech is decomposed into speech segments by a speech recognition module, and the segments are replaced by speech segments uttered by another speaker. This algorithm makes it possible to convert not only the static characteristics but also the dynamic characteristics of speaker individuality. The proposed voice conversion algorithm was used with two male speakers. Spectrum distortion between target speech and the converted speech was reduced to one-third the natural spectrum distortion between the two speakers. A listening experiment showed that, in terms of speaker identification accuracy, the speech converted by segment-sized units gave a score 20% higher than the speech converted frame-by-frame.
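
The abstract describes the method only at a high level, so the following is a minimal sketch of the segment-replacement idea rather than the author's implementation. It assumes the recognition module has already split the input into labeled segments; the `Segment` class, the `convert` function, and the rule of picking the target-speaker segment whose mean spectral frame is closest to the source segment are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

Frame = List[float]  # one spectral feature vector (assumed representation)


@dataclass
class Segment:
    label: str           # unit label from the recognizer (hypothetical, e.g. a phoneme)
    frames: List[Frame]  # spectral frames that make up the segment


def mean_frame(frames: List[Frame]) -> Frame:
    """Average the frames of a segment, dimension by dimension."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]


def sq_dist(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def convert(source_segments: List[Segment],
            target_inventory: Dict[str, List[Segment]]) -> List[Segment]:
    """Replace each source segment with a target-speaker segment of the same label.

    Assumption: among candidates with a matching label, the one whose mean
    frame is closest to the source segment's mean frame is selected.
    """
    converted = []
    for seg in source_segments:
        candidates = target_inventory.get(seg.label)
        if not candidates:
            # No target-speaker unit for this label: keep the source segment.
            converted.append(seg)
            continue
        src_mean = mean_frame(seg.frames)
        best = min(candidates,
                   key=lambda c: sq_dist(mean_frame(c.frames), src_mean))
        converted.append(best)
    return converted


# Toy data: two recognized source segments and a small target-speaker inventory.
source = [
    Segment("a", [[1.0, 0.0], [1.0, 0.2]]),
    Segment("k", [[0.0, 1.0]]),
]
inventory = {
    "a": [
        Segment("a", [[5.0, 5.0]]),
        Segment("a", [[1.2, 0.1], [1.0, 0.1]]),
    ],
}
out = convert(source, inventory)
```

Because whole segments are swapped in, the frame-to-frame trajectory inside each unit comes from the target speaker, which is how a segment-sized unit can carry dynamic as well as static characteristics. A frame-by-frame mapping would transform each frame independently instead.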

Original language: English
Title of host publication: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Editors: Anon
Publisher: Publ by IEEE
Pages: 765-768
Number of pages: 4
Volume: 2
ISBN (Print): 078030033
Publication status: Published - 1991
Externally published: Yes
Event: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing - ICASSP 91 - Toronto, Ont, Can
Duration: May 14 1991 to May 17 1991

Other

Other: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing - ICASSP 91
City: Toronto, Ont, Can
Period: 5/14/91 to 5/17/91

Fingerprint

static characteristics
speech recognition
dynamic characteristics
modules
Experiments

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

Abe, M. (1991). A segment-based approach to voice conversion. In Anon (Ed.), Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing (Vol. 2, pp. 765-768). Publ by IEEE.

A segment-based approach to voice conversion. / Abe, Masanobu.

Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing. ed. / Anon. Vol. 2 Publ by IEEE, 1991. p. 765-768.

Abe, M 1991, A segment-based approach to voice conversion. in Anon (ed.), Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing. vol. 2, Publ by IEEE, pp. 765-768, Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing - ICASSP 91, Toronto, Ont, Can, 5/14/91.
Abe M. A segment-based approach to voice conversion. In Anon, editor, Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing. Vol. 2. Publ by IEEE. 1991. p. 765-768
Abe, Masanobu. / A segment-based approach to voice conversion. Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing. editor / Anon. Vol. 2 Publ by IEEE, 1991. pp. 765-768
@inproceedings{90af02462f424e3da4fdf89393485cd3,
title = "A segment-based approach to voice conversion",
abstract = "A voice conversion algorithm that uses speech segments as conversion units is proposed. Input speech is decomposed into speech segments by a speech recognition module, and the segments are replaced by speech segments uttered by another speaker. This algorithm makes it possible to convert not only the static characteristics but also the dynamic characteristics of speaker individuality. The proposed voice conversion algorithm was used with two male speakers. Spectrum distortion between target speech and the converted speech was reduced to one-third the natural spectrum distortion between the two speakers. A listening experiment showed that, in terms of speaker identification accuracy, the speech converted by segment-sized units gave a score 20{\%} higher than the speech converted frame-by-frame.",
author = "Masanobu Abe",
year = "1991",
language = "English",
isbn = "078030033",
volume = "2",
pages = "765--768",
editor = "Anon",
booktitle = "Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing",
publisher = "Publ by IEEE",

}

TY - GEN

T1 - A segment-based approach to voice conversion

AU - Abe, Masanobu

PY - 1991

Y1 - 1991

AB - A voice conversion algorithm that uses speech segments as conversion units is proposed. Input speech is decomposed into speech segments by a speech recognition module, and the segments are replaced by speech segments uttered by another speaker. This algorithm makes it possible to convert not only the static characteristics but also the dynamic characteristics of speaker individuality. The proposed voice conversion algorithm was used with two male speakers. Spectrum distortion between target speech and the converted speech was reduced to one-third the natural spectrum distortion between the two speakers. A listening experiment showed that, in terms of speaker identification accuracy, the speech converted by segment-sized units gave a score 20% higher than the speech converted frame-by-frame.

UR - http://www.scopus.com/inward/record.url?scp=0026369941&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0026369941&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0026369941

SN - 078030033

VL - 2

SP - 765

EP - 768

BT - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing

A2 - Anon

PB - Publ by IEEE

ER -