Challenges and Perspectives on Real-time Singing Voice Synthesis

Leonardo Araujo Zoehler Brum, Edward David Moreno


This paper describes the state of the art of real-time singing voice synthesis and presents its concept, applications, and technical aspects. A technological mapping and a literature review are carried out to identify the latest developments in this area. We present a brief comparative analysis of the selected works. Finally, we discuss challenges and future research problems.


Real-time singing voice synthesis; Sound Synthesis; TTS; MIDI; Computer Music






Copyright (c) 2020 Leonardo Araujo Zoehler Brum, Edward David Moreno

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
