Double ampersand spoken as XML entity when punctuation level set to all


06/08/2014 03:17:47 PM UTC, Joanmarie: post #44270

Try the following:

$ python3
>>> import speechd
>>> c = speechd.SSIPClient('foo')
>>> c.set_punctuation(speechd.PunctuationMode.ALL)
>>> c.speak("hello && goodbye")

Results: "Hello ampersand ampersand amp semicolon goodbye"

04/06/2015 06:07:46 AM UTC, Luke Yelavich: post #52113

 * Assigned to:  -> Luke Yelavich
 * State: New -> Open

From further testing, and from reading the code and logs, this varies from synthesizer to synthesizer.

As you may be aware, Speech Dispatcher clients can set the data mode to either SSML or text. In SSML mode, the client sends pure SSML to Speech Dispatcher, which at a glance appears to pass it straight through to the chosen speech synthesizer, in this case espeak. In text mode, which is the default, Speech Dispatcher constructs an SSML string for the chosen synthesizer to process, and it's up to the synthesizer driver or the synthesizer itself to strip away the SSML if it is not capable of handling it. If the data mode is plain text, Speech Dispatcher replaces the <, > and & characters with &lt;, &gt; and &amp; respectively. This is also done if the data mode is SSML and the character occurs outside an SSML tag.
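
To make the text-mode escaping concrete, here is a rough Python sketch of the kind of transformation described above. It is not Speech Dispatcher's actual code: the function name wrap_text_as_ssml and the bare <speak> wrapper are assumptions for illustration only, and the real server may build its SSML differently.

```python
from xml.sax.saxutils import escape


def wrap_text_as_ssml(text):
    """Hypothetical helper: mimic what happens to a plain-text message.

    escape() replaces & with &amp;, < with &lt; and > with &gt;, so the
    characters cannot be mistaken for SSML markup by the synthesizer.
    """
    # Assumption: a bare <speak> element stands in for whatever SSML
    # envelope Speech Dispatcher really builds.
    return "<speak>" + escape(text) + "</speak>"


if __name__ == "__main__":
    # The message from the original report.
    print(wrap_text_as_ssml("hello && goodbye"))
```

With the message from the report, each & reaches the synthesizer as the five characters &amp; and so a driver that reads punctuation aloud without decoding entities first could end up speaking parts of the entity itself.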

I suspect this is done to make sure a synthesizer doesn't attempt to treat these characters as part of SSML tags when they are not supposed to be. Ultimately the user should still be hearing the correct characters spoken when punctuation mode is set to all. I have a solution in mind, which shouldn't be too difficult to implement.

Thanks for the bug report. I will be getting to the rest of your queries shortly (finally got proper access to the tracker. :))