Here is my first shot at a speech formant synthesizer. It only does vowels for now, because I haven't yet figured out how to extract the envelopes needed for consonants from my recordings. I also don't know whether a simple envelope will be enough for them. Any pointers to online resources on speech synthesis would be greatly appreciated.
Formants are fully customizable through text files, so it should be straightforward to adapt it to another language.
All comments (even flame :P) welcome.
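For anyone who wants to try the idea outside Pd, here is a rough Python sketch of the same approach: a buzzy source run through one resonant band-pass filter per formant, with the formant table read from plain text. It is not a port of the attached patch. The text format ("name f1 f2 f3"), the function names, the 80 Hz bandwidth and the formant values are all assumptions; the numbers are rough textbook-style figures for a male voice.

```python
import math

# Hypothetical text format: one vowel per line, "name f1 f2 f3" in Hz.
# Values are rough textbook-style figures, not the ones from the patch.
FORMANT_TEXT = """
a 730 1090 2440
i 270 2290 3010
u 300 870 2240
"""

def parse_formants(text):
    """Turn the text table into {vowel: [f1, f2, f3]}."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        table[parts[0]] = [float(p) for p in parts[1:]]
    return table

def resonator(signal, freq, bandwidth, sr):
    """Two-pole resonant filter centred on freq (Hz)."""
    r = math.exp(-math.pi * bandwidth / sr)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq / sr)
    a2 = -r * r
    gain = 1.0 - r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def vowel(name, table, pitch=110.0, dur=0.5, sr=22050):
    """Return normalised samples for one vowel."""
    n = int(round(dur * sr))
    # Naive (aliasing) sawtooth as the buzzy source.
    source, phase = [], 0.0
    for _ in range(n):
        source.append(2.0 * phase - 1.0)
        phase = (phase + pitch / sr) % 1.0
    # One resonator per formant, summed in parallel (bandwidth is a guess).
    mix = [0.0] * n
    for f in table[name]:
        band = resonator(source, f, 80.0, sr)
        mix = [m + b for m, b in zip(mix, band)]
    peak = max(abs(s) for s in mix) or 1.0
    return [s / peak for s in mix]
```

Calling `vowel("a", parse_formants(FORMANT_TEXT))` gives a list of floats in the range -1..1 that can be written out with the standard `wave` module. Adapting it to another language would just mean editing the table.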
Thanks, I'm quite amazed how clear the vowels are - excellent! Sorry I can't help you with further links about speech synthesis beyond the ones you would find with google yourself.
Also, formants for consonants? I was living with the impression that formants exclusively describe vowels. I also thought so far that in speech synthesis they use filtered noise for the various consonants.
Thank you! I'm not too happy with the "u" vowel but the others sound quite clear to me too.
I am also trying to get the consonants by filtering a noise source, with the filters driven by vline~ envelopes. I was able to analyse the formant frequencies using a free program called Sonic Visualiser, but I couldn't quite grasp how to analyse the consonants with it. There are some quite good tutorials on the web for formant synthesis, but I couldn't find anything about consonant synthesis. So I'm playing with prosody right now until I find a good resource.
I think you're probably on the right track by filtering noise for consonants. They're unpitched and contain a lot of frequencies. I'm not familiar with Sonic Visualiser, but in my experience the displays on spectral analyzers are pretty slow and won't give you a good reading on consonants. You might just want to use your ears and get as close as possible. Channel vocoders emulate consonants by listening for sounds with a lot of high-frequency content and switching over to filtered noise when they hear them. It's not perfect, as the vocoder doesn't discriminate between the consonants that trigger the noise, but because consonants are so short the result is often passable. So just doing one better by tuning the noise for each consonant might make a world of difference.
I've attached a little formant synth I made a year or so ago. It's actually meant to be sort of a "talking" synth, but it only says the vowels "a," "e," "i," "o," and "u." You press the key on your qwerty keyboard and then play some notes to hear them. It also has a silly little tune built in. I think it's hilarious ;-p.
Last edited by Maelstorm (2009-11-12 04:41:58)
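To make the vocoder trick mentioned above a bit more concrete, here is a minimal Python sketch of the "switch to noise when the input sounds hissy" decision. It uses the zero-crossing rate as a cheap stand-in for measuring high-frequency content, which is a swapped-in technique and not necessarily what any particular vocoder does. The function names and the 0.25 threshold are assumptions to be tuned by ear.

```python
import math
import random

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0.0) != (b < 0.0)
    )
    return crossings / max(len(frame) - 1, 1)

def choose_source(frame, threshold=0.25):
    """Pick the carrier for this frame: hissy input -> noise, otherwise buzz.

    The threshold is a guess; noisy frames sit near 0.5, while
    low-pitched voiced frames sit far below it.
    """
    return "noise" if zero_crossing_rate(frame) > threshold else "buzz"

# Two example frames: a 110 Hz sine (vowel-like) and white noise (consonant-like).
SR = 22050
tone_frame = [math.sin(2.0 * math.pi * 110.0 * i / SR) for i in range(512)]
rng = random.Random(0)
noise_frame = [rng.uniform(-1.0, 1.0) for _ in range(512)]
```

Like the vocoder, this cannot tell one consonant from another. It only decides when to use noise, so the per-consonant tuning still has to come from the filter settings.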
I think the hard part is making the consonant-vowel and consonant-consonant transitions sound intelligible. But if Homer Dudley could do it in 1939, it certainly can be done in Pd (and I want to know if anyone's done it).
I like the buzzy robotic sound that this implementation has.
You can usually treat consonants like this as filtered noise, with a very quick transition into whatever vowel comes next. I think the centre frequency of the noise and the shape of the transition are what distinguish individual consonants from one another.
I found this page quite helpful:
https://ccrma.stanford.edu/CCRMA/Course … ition.html
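Here is a small Python sketch of the "noise burst, then quick transition into the vowel" idea, using breakpoint envelopes in the spirit of vline~. The timings (5 ms attack, 20 ms crossfade), the function names and the plain sawtooth standing in for the vowel are all assumptions. The noise is left unfiltered to keep the example short, so a real version would still need a band-pass on the noise and formant filters on the buzz.

```python
import random

def breakpoint_env(points, sr):
    """Piecewise-linear envelope from (time_seconds, value) breakpoints.

    Times must be strictly increasing; the result runs up to the last time.
    """
    n = int(round(points[-1][0] * sr))
    out, seg = [], 0
    for i in range(n):
        t = i / sr
        # Advance to the segment that contains t.
        while seg < len(points) - 2 and t >= points[seg + 1][0]:
            seg += 1
        (t0, v0), (t1, v1) = points[seg], points[seg + 1]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out

def syllable(sr=22050, pitch=110.0, seed=0):
    """Noise burst crossfading into a buzzy placeholder vowel."""
    # Noise: fast attack, hold, then fade out between 40 and 60 ms.
    noise_env = breakpoint_env(
        [(0.0, 0.0), (0.005, 1.0), (0.04, 1.0), (0.06, 0.0), (0.3, 0.0)], sr)
    # Vowel: silent during the burst, fades in over the same 20 ms.
    vowel_env = breakpoint_env(
        [(0.0, 0.0), (0.04, 0.0), (0.06, 1.0), (0.25, 1.0), (0.3, 0.0)], sr)
    rng = random.Random(seed)
    out, phase = [], 0.0
    for noise_gain, vowel_gain in zip(noise_env, vowel_env):
        noise = rng.uniform(-1.0, 1.0)
        buzz = 2.0 * phase - 1.0  # naive sawtooth, stands in for the vowel
        phase = (phase + pitch / sr) % 1.0
        out.append(0.5 * (noise_gain * noise + vowel_gain * buzz))
    return out
```

Shortening or lengthening the crossfade segment is probably the first thing to play with, since the transition is where most of the consonant's identity seems to live.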
yea! 8 ]
The thread has been quiet for some time, but I thought I'd also like to thank you for the formant sample Pd files. They're really useful, and hopefully once I get back into the Pd interface I can make proper use of them. One question I have about jimqode's example: I seem to get some clipping artifacts when changing vowels, and I was wondering if there is a way to remove them? I added a little lop~, which seems to help a little, but only with lower frequency settings. I'm running on ASIO drivers, so I hope I can at least rule out my soundcard (LYNX).
I'm working on a kind of random vowel generator, so what I'm heading for is getting the vowels to flow in a speechy way without being understandable.
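On the clicks when changing vowels: I haven't traced where they come from in the patch, but assuming they are caused by formant frequencies or gains jumping instantly to their new values, the usual cure is to ramp each parameter over a few milliseconds instead (in Pd that would be something like sending "270 20" to a line~). A rough Python sketch of the idea follows; the function names and the 20 ms ramp time are assumptions.

```python
def ramp(start, target, ramp_ms, sr=44100):
    """Per-sample values moving linearly from start to target over ramp_ms."""
    n = max(int(round(ramp_ms * sr / 1000.0)), 1)
    return [start + (target - start) * (i + 1) / n for i in range(n)]

def glide_formants(old, new, ramp_ms=20.0, sr=44100):
    """One per-sample frequency track for each formant, old -> new."""
    return [ramp(a, b, ramp_ms, sr) for a, b in zip(old, new)]
```

For example, `glide_formants([730.0, 1090.0], [270.0, 2290.0])` gives two frequency tracks that end exactly on the new vowel's formants, with no single step larger than the total change divided by the number of ramp samples. Longer ramps should also help a random vowel generator sound more "speechy", since real vowels glide into one another.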