FiCa speech dataset


The FiCa speech dataset is a private dataset consisting of 92 minutes of audio from a single female speaker. It was originally created to train a TTS system capable of synthesizing short feedback responses such as "mhm", "oh", and "wow". This work was published at SigDial 2024.

SigDial paper: Mhm... Yeah? Okay! Evaluating the Naturalness and Communicative Function of Synthesized Feedback Responses in Spoken Dialogue
Authors: Carol Figueroa, Marcel de Korte, Magalie Ochs, Gabriel Skantze
Voice Talent: Carol Figueroa


This speech dataset consists of different speech recordings:

Feedback examples from the dataset

Access to the feedback imitations and conversational feedback responses is available on request. Please contact the first author, Carol Figueroa.


Example categories: feedback imitations; conversational feedback responses.

To cite this dataset, please use the following:

@inproceedings{figueroa2024mhm,
  title={Mhm... Yeah? Okay! Evaluating the Naturalness and Communicative Function of Synthesized Feedback Responses in Spoken Dialogue},
  author={Figueroa, Carol and de Korte, Marcel and Ochs, Magalie and Skantze, Gabriel},
  booktitle={Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue},
  pages={544--553},
  year={2024}
}