Gerasimos Xydas
Last updated: 3 May 2004

Features of DEMOSTHeNES version 2

  1. Multi Characters
  2. DEMOSTHeNES is a multi-voice text-to-speech system. It offers a selection of different voices (male and female), as well as "character" definitions based on these voices. Each character defines its own TtS process (e.g. letter-to-sound conversion, prosody model), allowing the produced speech to be personalized.

  3. Performance
  4. The novel architecture of DEMOSTHeNES allows efficient implementations to be developed. In server mode, DEMOSTHeNES can serve each session at more than 200 times real time, which makes many simultaneous channels available in telecom applications.
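To put the "200 times real time" figure in perspective, the following minimal sketch works through the arithmetic. The function names are hypothetical, and the channel estimate is only an idealized upper bound that assumes one worker, real-time playback on every channel, and no scheduling or I/O overhead; it is not a measured property of DEMOSTHeNES.

```python
import math


def synthesis_time(audio_seconds: float, speedup: float) -> float:
    """Wall-clock seconds needed to synthesize `audio_seconds` of speech
    when the engine runs at `speedup` times real time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


def max_realtime_channels(speedup: float) -> int:
    """Idealized upper bound on channels a single worker could feed,
    assuming each channel consumes audio at real-time rate (assumption)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return math.floor(speedup)


if __name__ == "__main__":
    # Ten seconds of speech at 200x real time
    print(synthesis_time(10.0, 200.0))
    print(max_realtime_channels(200.0))
```

Under these assumptions, a ten-second utterance would be ready in about a twentieth of a second.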

  5. Text Analysis
  6. The Text Analyzer is based on a finite state automata (FSA) engine and can identify:

      More than 800 acronyms in all declensions, with configurable pronunciation: for example, 'το Ι.Κ.Α.' or 'το ΙΚΑ' can be pronounced either in full as 'το Ίδρυμα Κοινωνικών Ασφαλίσεων' ("the Social Insurance Institute") or as the word 'το Ίκα'.

      Several forms of dates and times, e.g. '21/2/2001' -> 'Εικοσιμία δευτέρου του δύο χιλιάδες ένα' ("twenty-one of the second, two thousand one") and '18:45' -> 'Δεκαοκτώ και σαράνταπέντε' ("eighteen and forty-five").

      Numbers, Latin numerals and Greek numerals

      Abbreviations (e.g. 'κλπ' or 'κ.λ.π.' -> 'και λοιπά', "et cetera")

      Other marks (e.g. '(' -> 'παρένθεση', "parenthesis", and ')' -> 'κλείνει παρένθεση', "close parenthesis")
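The kind of expansion listed above can be illustrated with a minimal sketch. It uses a regular-expression tokenizer and small lookup tables, not the FSA engine of DEMOSTHeNES, and it does not handle declensions, dates or numbers. The table names, the `normalize` function and its `expand_acronyms` flag are all hypothetical; the Greek entries are taken from the examples above.

```python
import re

# Hypothetical lookup tables, filled only with the examples from the text.
ACRONYMS = {
    "ΙΚΑ": {"full": "Ίδρυμα Κοινωνικών Ασφαλίσεων", "short": "Ίκα"},
}
ABBREVIATIONS = {"κλπ": "και λοιπά"}
MARKS = {"(": "παρένθεση", ")": "κλείνει παρένθεση"}

# A token is a dotted letter sequence (e.g. 'Ι.Κ.Α.'), a word,
# or a single punctuation character.
TOKEN_RE = re.compile(r"(?:\w\.){2,}|\w+|[^\w\s]")


def normalize(text: str, expand_acronyms: bool = True) -> str:
    """Replace known acronyms, abbreviations and marks with spoken forms."""
    out = []
    for token in TOKEN_RE.findall(text):
        key = token.replace(".", "")  # 'Ι.Κ.Α.' and 'ΙΚΑ' share one key
        if key in ACRONYMS:
            form = "full" if expand_acronyms else "short"
            out.append(ACRONYMS[key][form])
        elif key in ABBREVIATIONS:
            out.append(ABBREVIATIONS[key])
        elif token in MARKS:
            out.append(MARKS[token])
        else:
            out.append(token)  # unknown tokens pass through unchanged
    return " ".join(out)


if __name__ == "__main__":
    print(normalize("το Ι.Κ.Α."))
    print(normalize("το ΙΚΑ", expand_acronyms=False))
```

The `expand_acronyms` switch mirrors the configurable pronunciation described above: the same input can be rendered either in full or as a word.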

  7. Natural Language Processing
  8. In DEMOSTHeNES, text is analyzed in order to extract grammatical and syntactic information. This information is exploited during prosody generation to give a more realistic tonal balance to words of different parts of speech.

  9. Pronunciation Generator
  10. The Pronunciation Generator deals with the coarticulation effects of each language in order to best convey the pronunciation of words. Currently, it supports Greek and English.

  11. Polyglot
  12. DEMOSTHeNES is a polyglot system, which means it can handle text written in more than one language at the same time (e.g. a Greek document that contains an English paragraph). It currently supports Greek and English (for now, both with Greek pronunciation).
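One simple way to approach mixed-language input is to split the text into runs by writing system before routing each run to a language-specific module. The sketch below does this for Greek and Latin script using Unicode character names. It is an illustrative assumption, not the method DEMOSTHeNES uses, and the function names are hypothetical; script alone also cannot distinguish languages that share an alphabet.

```python
import unicodedata
from typing import List, Optional, Tuple


def script_of(ch: str) -> str:
    """Classify a character as 'greek', 'latin' or 'other' by its Unicode name."""
    name = unicodedata.name(ch, "")
    if name.startswith("GREEK"):
        return "greek"
    if name.startswith("LATIN"):
        return "latin"
    return "other"


def segment_by_script(text: str) -> List[Tuple[str, str]]:
    """Split text into (script, run) pairs.

    Neutral characters (spaces, digits, punctuation) stay attached to the
    run being built when they are encountered.
    """
    runs: List[Tuple[str, str]] = []
    current: Optional[str] = None  # script of the run being built
    buf: List[str] = []
    for ch in text:
        script = script_of(ch)
        # A letter in a different script closes the current run.
        if script != "other" and current is not None and script != current:
            runs.append((current, "".join(buf)))
            buf = []
        if script != "other":
            current = script
        buf.append(ch)
    if buf:
        runs.append((current or "other", "".join(buf)))
    return runs


if __name__ == "__main__":
    for script, run in segment_by_script("και λοιπά and so on"):
        print(script, repr(run))
```

Each run could then be passed to the letter-to-sound rules of the matching language.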

  13. Prosody Generator
  14. DEMOSTHeNES introduces features for increasing the naturalness and reducing the predictability of the produced speech. In version 2, prosody generation is based on machine learning approaches applied to large speech corpora.

  15. Voices
  16. DEMOSTHeNES comes bundled with 3 new di-cluster based natural Greek voices (2 male and 1 female) and the MBROLA synthesizer. These voices consist of 1081 di-clusters that capture most of the co-articulation events in Greek, and they have been carefully recorded so that each cluster accentuates its acoustic features.

  17. Expandability
  18. DEMOSTHeNES’s component architecture allows it to be expanded in several ways. New languages, new hues and voices, signal processing modules, language processing modules and much more can be ported to this platform.

  19. Customization
  20. The modules of DEMOSTHeNES are fully customizable. Furthermore, they can form independent applications (e.g. the Greek-to-IPA converter can be used in dictionary applications).

  21. Other features

The operation of DEMOSTHeNES can be configured per module (the extent of configuration depends on the version). For example, the end user can select whether acronyms will be expanded or not. Furthermore, DEMOSTHeNES can be bundled with other synthesizers, such as the formant-based synthesizer module VMOD_FORMANT, which currently delivers a lower quality voice than MBROLA does.