ESpeakEngine - Objective-C speech synthesizer

23 Jan 2012 | License: BSD | 2 min read
/***************************************************************************
 *   Copyright (C) 2005 to 2010 by Jonathan Duddington                     *
 *   email: jonsd@users.sourceforge.net                                    *
 *                                                                         *
 *   This program is free software; you can redistribute it and/or modify  *
 *   it under the terms of the GNU General Public License as published by  *
 *   the Free Software Foundation; either version 3 of the License, or     *
 *   (at your option) any later version.                                   *
 *                                                                         *
 *   This program is distributed in the hope that it will be useful,       *
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         *
 *   GNU General Public License for more details.                          *
 *                                                                         *
 *   You should have received a copy of the GNU General Public License     *
 *   along with this program; if not, see:                                 *
 *               <http://www.gnu.org/licenses/>.                           *
 ***************************************************************************/

#include "StdAfx.h"

#include <stdio.h>
#include <ctype.h>
#include <wctype.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

#include "speak_lib.h"
#include "speech.h"
#include "phoneme.h"
#include "synthesize.h"
#include "voice.h"
#include "translate.h"


extern FILE *f_log;
static void SmoothSpect(void);


// list of phonemes in a clause
int n_phoneme_list=0;
PHONEME_LIST phoneme_list[N_PHONEME_LIST];

int mbrola_delay;
char mbrola_name[20];

SPEED_FACTORS speed;

static int  last_pitch_cmd;
static int  last_amp_cmd;
static frame_t  *last_frame;
static int  last_wcmdq;
static int  pitch_length;
static int  amp_length;
static int  modn_flags;
static int  fmt_amplitude=0;

static int  syllable_start;
static int  syllable_end;
static int  syllable_centre;

static voice_t *new_voice=NULL;

int n_soundicon_tab=N_SOUNDICON_SLOTS;
SOUND_ICON soundicon_tab[N_SOUNDICON_TAB];

#define RMS_GLOTTAL1 35   // vowel before glottal stop
#define RMS_START 28  // 28

#define VOWEL_FRONT_LENGTH  50



// a dummy phoneme_list entry which looks like a pause
static PHONEME_LIST next_pause;


const char *WordToString(unsigned int word)
{//========================================
// Convert a phoneme mnemonic word into a string
	int  ix;
	static char buf[5];

	// copy all four packed characters; buf[4] is the terminator
	for(ix=0; ix<4; ix++)
		buf[ix] = word >> (ix*8);
	buf[4] = 0;
	return(buf);
}
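
// Illustrative sketch, not part of eSpeak: WordToString() above unpacks a phoneme
// mnemonic whose characters are packed into one integer, first character in the
// low byte. The helpers below (PackMnemonic and UnpackMnemonic are hypothetical
// names) show that layout in both directions, assuming a mnemonic holds at most
// four characters. They return a std::string rather than a shared static buffer.

```cpp
#include <cstdint>
#include <string>

// Pack up to four characters into one word, first character in the low byte.
std::uint32_t PackMnemonic(const std::string &s)
{
	std::uint32_t word = 0;
	for (std::size_t ix = 0; ix < s.size() && ix < 4; ix++)
		word |= static_cast<std::uint32_t>(static_cast<unsigned char>(s[ix])) << (ix * 8);
	return word;
}

// Unpack the word again, stopping at the first zero byte.
std::string UnpackMnemonic(std::uint32_t word)
{
	std::string out;
	for (int ix = 0; ix < 4; ix++)
	{
		char c = static_cast<char>((word >> (ix * 8)) & 0xff);
		if (c == 0)
			break;
		out.push_back(c);
	}
	return out;
}
```

// With this layout, PackMnemonic("a:") is 0x3a61 and UnpackMnemonic(0x3a61) is "a:".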



void SynthesizeInit()
{//==================
	last_pitch_cmd = 0;
	last_amp_cmd = 0;
	last_frame = NULL;
	syllable_centre = -1;

	// initialise next_pause, a dummy phoneme_list entry
//	next_pause.ph = phoneme_tab[phonPAUSE];   // this must be done after voice selection
	next_pause.type = phPAUSE;
	next_pause.newword = 0;
}



static void EndAmplitude(void)
{//===========================
	if(amp_length > 0)
	{
		if(wcmdq[last_amp_cmd][1] == 0)
			wcmdq[last_amp_cmd][1] = amp_length;
		amp_length = 0;
	}
}



static void EndPitch(int voice_break)
{//==================================
	// possible end of pitch envelope, fill in the length
	if((pitch_length > 0) && (last_pitch_cmd >= 0))
	{
		if(wcmdq[last_pitch_cmd][1] == 0)
			wcmdq[last_pitch_cmd][1] = pitch_length;
		pitch_length = 0;
	}

	if(voice_break)
	{
		last_wcmdq = -1;
		last_frame = NULL;
		syllable_end = wcmdq_tail;
		SmoothSpect();
		syllable_centre = -1;
		memset(vowel_transition,0,sizeof(vowel_transition));
	}
}  // end of EndPitch



static void DoAmplitude(int amp, unsigned char *amp_env)
{//=====================================================
	long *q;

	last_amp_cmd = wcmdq_tail;
	amp_length = 0;       // total length of vowel with this amplitude envelope

	q = wcmdq[wcmdq_tail];
	q[0] = WCMD_AMPLITUDE;
	q[1] = 0;        // fill in later from amp_length
	q[2] = (long)amp_env;
	q[3] = amp;
	WcmdqInc();
}  // end of DoAmplitude



static void DoPitch(unsigned char *env, int pitch1, int pitch2)
{//============================================================
	long *q;

	EndPitch(0);

	if(pitch1 == 255)
	{
		// pitch was not set
		pitch1 = 55;
		pitch2 = 76;
		env = envelope_data[PITCHfall];
	}
	last_pitch_cmd = wcmdq_tail;
	pitch_length = 0;       // total length of spect with this pitch envelope

	if(pitch2 < 0)
		pitch2 = 0;

	q = wcmdq[wcmdq_tail];
	q[0] = WCMD_PITCH;
	q[1] = 0;   // length, fill in later from pitch_length
	q[2] = (long)env;
	q[3] = (pitch1 << 16) + pitch2;
	WcmdqInc();
}  //  end of DoPitch
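
// Illustrative sketch, not part of eSpeak: DoPitch() stores both pitch values in
// one queue slot as (pitch1 << 16) + pitch2. The helpers below (PackPitch and
// UnpackPitch are hypothetical names) show that packing and one way to read it
// back. The reading side is an assumption, since the code that consumes the
// queue is not in this file, and it only holds if pitch2 fits in 16 bits.

```cpp
#include <utility>

// Combine two pitch values the same way DoPitch() fills q[3].
long PackPitch(int pitch1, int pitch2)
{
	if (pitch2 < 0)
		pitch2 = 0;   // DoPitch() clamps a negative pitch2 to zero
	return (static_cast<long>(pitch1) << 16) + pitch2;
}

// Assumed inverse: high part is pitch1, low 16 bits are pitch2.
std::pair<int, int> UnpackPitch(long packed)
{
	return { static_cast<int>(packed >> 16), static_cast<int>(packed & 0xffff) };
}
```

// The default pair used when pitch was not set (55, 76) round-trips unchanged.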



int PauseLength(int pause, int control)
{//====================================
	int len;

	if(control == 0)
	{
		if(pause >= 200)
			len = (pause * speed.clause_pause_factor)/256;
		else
			len = (pause * speed.pause_factor)/256;
	}
	else
		len = (pause * speed.wav_factor)/256;

	if(len < 5) len = 5;      // ms, limit the amount to which pauses can be shortened
	return(len);
}


static void DoPause(int length, int control)
{//=========================================
// control = 1, less shortening at fast speeds
	int len;

	if(length == 0)
		len = 0;
	else
	{
		len = PauseLength(length, control);

		len = (len * samplerate) / 1000;  // convert from ms to number of samples
	}

	EndPitch(1);
	wcmdq[wcmdq_tail][0] = WCMD_PAUSE;
	wcmdq[wcmdq_tail][1] = len;
	WcmdqInc();
	last_frame = NULL;

	if(fmt_amplitude != 0)
	{
		wcmdq[wcmdq_tail][0] = WCMD_FMT_AMPLITUDE;
		wcmdq[wcmdq_tail][1] = fmt_amplitude = 0;
		WcmdqInc();
	}
}  // end of DoPause
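
// Illustrative sketch, not part of eSpeak: PauseLength() and DoPause() scale a
// pause by a speed factor expressed in 256ths, keep it at 5 ms or more, and then
// convert milliseconds to samples. The struct and function names below are
// hypothetical stand-ins for the SPEED_FACTORS fields and the samplerate global.

```cpp
// Hypothetical stand-in for the three speed factors used by PauseLength().
struct PauseFactors
{
	int pause_factor;         // short pauses
	int clause_pause_factor;  // pauses of 200 ms or more
	int wav_factor;           // used when control is non-zero
};

// Scale a pause in ms by the matching factor (256 means unchanged), minimum 5 ms.
int ScalePause(int pause_ms, int control, const PauseFactors &f)
{
	int factor;
	if (control == 0)
		factor = (pause_ms >= 200) ? f.clause_pause_factor : f.pause_factor;
	else
		factor = f.wav_factor;

	int len = (pause_ms * factor) / 256;
	return (len < 5) ? 5 : len;
}

// Convert a duration in ms to a number of samples at the given sample rate.
int MsToSamples(int ms, int sample_rate)
{
	return (ms * sample_rate) / 1000;
}
```

// With factors {128, 256, 192}, a 100 ms pause becomes 50 ms (control 0) or
// 75 ms (control 1). DoPause() itself skips this scaling when length is 0.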


extern int seq_len_adjust;   // temporary fix to advance the start point for playing the wav sample


static int DoSample2(int index, int which, int std_length, int control, int length_mod, int amp)
{//=============================================================================================
	int length;
	int wav_length;
	int wav_scale;
	int min_length;
	int x;
	int len4;
	long *q;
	unsigned char *p;

	index = index & 0x7fffff;
	p = &wavefile_data[index];
	wav_scale = p[2];
	wav_length = (p[1] * 256);
	wav_length += p[0];    //  length in bytes

	if(wav_length == 0)
		return(0);

	min_length = speed.min_sample_len;

	if(wav_scale==0)
		min_length *= 2;  // 16 bit samples
	else
	{
		// increase consonant amplitude at high speeds, depending on the peak consonant amplitude
//		x = ((35 - wav_scale) * speed.loud_consonants);
//		if(x < 0) x = 0;
//		wav_scale = (wav_scale * (x+256))/256;
	}

	if(std_length > 0)
	{
		std_length = (std_length * samplerate)/1000;
		if(wav_scale == 0)
			std_length *= 2;

		x = (min_length * std_length)/wav_length;
		if(x > min_length)
			min_length = x;
	}
	else
	{
		// no length specified, use the length of the stored sound
		std_length = wav_length;
	}

	if(length_mod > 0)
	{
		std_length = (std_length * length_mod)/256;
	}

	length = (std_length * speed.wav_factor)/256;

	if(control & pd_DONTLENGTHEN)
	{
		// this option is used for Stops, with short noise bursts.
		// Don't change their length much.
		if(length > std_length)
		{
			// don't let length exceed std_length
			length = std_length;
		}
		else
		{
			// reduce the reduction in length
//			length = (length + std_length)/2;
		}
	}

	if(length < min_length)
		length = min_length;


	if(wav_scale == 0)
	{
		// 16 bit samples
		length /= 2;
		wav_length /= 2;
	}

	if(amp < 0)
		return(length);

	len4 = wav_length / 4;

	index += 4;

	if(which & 0x100)
	{
		// mix this with synthesised wave
		last_wcmdq = wcmdq_tail;
		q = wcmdq[wcmdq_tail];
		q[0] = WCMD_WAVE2;
		q[1] = length | (wav_length << 16);   // length in samples
		q[2] = long(&wavefile_data[index]);
		q[3] = wav_scale + (amp << 8);
		WcmdqInc();
		return(length);
	}

	if(length > wav_length)
	{
		x = len4*3;
		length -= x;
	}
	else
	{
		x = length;
		length = 0;
	}

	last_wcmdq = wcmdq_tail;
	q = wcmdq[wcmdq_tail];
	q[0] = WCMD_WAVE;
	q[1] = x;   // length in samples
	q[2] = long(&wavefile_data[index]);
	q[3] = wav_scale + (amp << 8);
	WcmdqInc();


	while(length > len4*3)
	{
		x = len4;
		if(wav_scale == 0)
			x *= 2;

		last_wcmdq = wcmdq_tail;
		q = wcmdq[wcmdq_tail];
		q[0] = WCMD_WAVE;
		q[1] = len4*2;   // length in samples
		q[2] = long(&wavefile_data[index+x]);
		q[3] = wav_scale + (amp << 8);
		WcmdqInc();

		length -= len4*2;
	}

	if(length > 0)
	{
		x = wav_length - length;
		if(wav_scale == 0)
			x *= 2;
		last_wcmdq = wcmdq_tail;
		q = wcmdq[wcmdq_tail];
		q[0] = WCMD_WAVE;
		q[1] = length;   // length in samples
		q[2] = long(&wavefile_data[index+x]);
		q[3] = wav_scale + (amp << 8);
		WcmdqInc();
	}

	return(length);
}  // end of DoSample2
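
// Illustrative sketch, not part of eSpeak: when the requested length exceeds the
// stored sound, DoSample2() plays the first three quarters, repeats the middle
// half as often as needed, and finishes with the tail of the sample. The planner
// below (Segment and PlanSegments are hypothetical names) reproduces only that
// bookkeeping, in samples, and ignores the byte doubling for 16 bit data. The
// guard for very short sounds is an addition for this sketch.

```cpp
#include <vector>

// One queued WCMD_WAVE command: start offset and number of samples.
struct Segment
{
	int offset;
	int count;
};

std::vector<Segment> PlanSegments(int wav_length, int length)
{
	std::vector<Segment> plan;
	if (wav_length < 4 || length <= 0)
		return plan;   // sketch-only guard: avoids a zero-sized quarter

	const int len4 = wav_length / 4;

	if (length > wav_length)
	{
		plan.push_back({0, len4 * 3});   // first three quarters
		length -= len4 * 3;
	}
	else
	{
		plan.push_back({0, length});     // the sound is long enough on its own
		length = 0;
	}

	while (length > len4 * 3)
	{
		plan.push_back({len4, len4 * 2});   // repeat the middle half
		length -= len4 * 2;
	}

	if (length > 0)
		plan.push_back({wav_length - length, length});   // end with the tail

	return plan;
}
```

// Stretching a 400-sample sound to 1000 samples gives four segments:
// {0,300}, {100,200}, {100,200}, {100,300}.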



int DoSample3(PHONEME_DATA *phdata, int length_mod, int amp)
{//=========================================================
	int amp2;
	int len;
	EndPitch(1);

	if(amp == -1)
	{
		// just get the length, don't produce sound
		amp2 = amp;
	}
	else
	{
		amp2 = phdata->sound_param[pd_WAV];
		if(amp2 == 0)
			amp2 = 100;
		amp2 = (amp2 * 32)/100;
	}

	seq_len_adjust=0;

	if(phdata->sound_addr[pd_WAV] == 0)
	{
		len = 0;
	}
	else
	{
		len = DoSample2(phdata->sound_addr[pd_WAV], 2, phdata->pd_param[pd_LENGTHMOD]*2, phdata->pd_control, length_mod, amp2);
	}
	last_frame = NULL;
	return(len);
}  // end of DoSample3




static frame_t *AllocFrame()
{//=========================
	// Allocate a temporary spectrum frame for the wavegen queue. Use a pool which is big
	// enough to use a round-robin without checks.
	// Only needed for modifying spectra for blending to consonants

#define N_FRAME_POOL  N_WCMDQ
	static int ix=0;
	static frame_t frame_pool[N_FRAME_POOL];

	ix++;
	if(ix >= N_FRAME_POOL)
		ix = 0;
	return(&frame_pool[ix]);
}
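
// Illustrative sketch, not part of eSpeak: AllocFrame() hands out slots from a
// fixed pool in round-robin order and never frees them, relying on the pool
// being at least as large as the command queue. RoundRobinPool is a hypothetical
// generic version of the same idea.

```cpp
#include <array>
#include <cstddef>

template <typename T, std::size_t N>
class RoundRobinPool
{
public:
	// Advance first, then return the slot, as AllocFrame() does.
	// A slot is silently reused after N further calls.
	T *Alloc()
	{
		next_ = (next_ + 1) % N;
		return &slots_[next_];
	}

private:
	std::array<T, N> slots_{};
	std::size_t next_ = 0;
};
```

// With a pool of three slots, the fourth call to Alloc() returns the same
// address as the first, so callers must be finished with a slot by then.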


static void set_frame_rms(frame_t *fr, int new_rms)
{//=================================================
// Each frame includes its RMS amplitude value, so to set a new
// RMS just adjust the formant amplitudes by the appropriate ratio

	int x;
	int h;
	int ix;

	static const short sqrt_tab[200] = {
	  0, 64, 90,110,128,143,156,169,181,192,202,212,221,230,239,247,
	256,263,271,278,286,293,300,306,313,320,326,332,338,344,350,356,
	362,367,373,378,384,389,394,399,404,409,414,419,424,429,434,438,
	443,448,452,457,461,465,470,474,478,483,487,491,495,499,503,507,
	512,515,519,523,527,531,535,539,543,546,550,554,557,561,565,568,
	572,576,579,583,586,590,593,596,600,603,607,610,613,617,620,623,
	627,630,633,636,640,643,646,649,652,655,658,662,665,668,671,674,
	677,680,683,686,689,692,695,698,701,704,706,709,712,715,718,721,
	724,726,729,732,735,738,740,743,746,749,751,754,757,759,762,765,
	768,770,773,775,778,781,783,786,789,791,794,796,799,801,804,807,
	809,812,814,817,819,822,824,827,829,832,834,836,839,841,844,846,
	849,851,853,856,858,861,863,865,868,870,872,875,877,879,882,884,
	886,889,891,893,896,898,900,902};

	if(voice->klattv[0])
	{
		if(new_rms == -1)
		{
			fr->klattp[KLATT_AV] = 50;
		}
		return;
	}
 
	if(fr->rms == 0) return;    // check for divide by zero
	x = (new_rms * 64)/fr->rms;
	if(x >= 200) x = 199;

	x = sqrt_tab[x];   // sqrt(new_rms/fr->rms)*0x200;

	for(ix=0; ix < 8; ix++)
	{
		h = fr->fheight[ix] * x;
		fr->fheight[ix] = h/0x200;
	}
}   /* end of set_frame_rms */
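
// Illustrative sketch, not part of eSpeak: set_frame_rms() looks up
// sqrt(new_rms/old_rms) as a fixed-point gain where 0x200 means 1.0, then
// multiplies each formant height by it. The helpers below (RmsGain512 and
// ScaleHeight are hypothetical names) compute the same gain with std::sqrt
// instead of the table. Returning unity gain for invalid input is an assumption
// made for this sketch.

```cpp
#include <cmath>

// Gain in units of 1/0x200 for changing a frame's RMS from old_rms to new_rms.
int RmsGain512(int new_rms, int old_rms)
{
	if (old_rms <= 0 || new_rms < 0)
		return 0x200;   // sketch-only choice: leave the frame unchanged

	int x = (new_rms * 64) / old_rms;   // ratio in 64ths
	if (x >= 200)
		x = 199;        // same upper bound as the 200-entry table

	// 64 * sqrt(x) equals sqrt(x / 64) * 0x200, truncated like the table entries
	return static_cast<int>(std::sqrt(static_cast<double>(x)) * 64.0);
}

// Apply the gain to one formant height.
int ScaleHeight(int height, int gain512)
{
	return (height * gain512) / 0x200;
}
```

// Reducing the RMS to a quarter (15 from 60) gives a gain of 256, so a height
// of 100 becomes 50.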



static void formants_reduce_hf(frame_t *fr, int level)
{//====================================================
//  scale the height of peaks 2 to 7 to 'level' percent
	int  ix;
	int  x;

	if(voice->klattv[0])
		return;
 
	for(ix=2; ix < 8; ix++)
	{
		x = fr->fheight[ix] * level;
		fr->fheight[ix] = x/100;
	}
}


static frame_t *CopyFrame(frame_t *frame1, int copy)
{//=================================================
//  create a copy of the specified frame in temporary buffer
	frame_t *frame2;

	if((copy==0) && (frame1->frflags & FRFLAG_COPIED))
	{
		// this frame has already been copied in temporary rw memory
		return(frame1);
	}

	frame2 = AllocFrame();
	if(frame2 != NULL)
	{
		memcpy(frame2,frame1,sizeof(frame_t));
		frame2->length = 0;
		frame2->frflags |= FRFLAG_COPIED;
	}
	return(frame2);
}


static frame_t *DuplicateLastFrame(frameref_t *seq, int n_frames, int length)
{//==========================================================================
	frame_t *fr;

	seq[n_frames-1].length = length;
	fr = CopyFrame(seq[n_frames-1].frame,1);
	seq[n_frames].frame = fr;
	seq[n_frames].length = 0;
	return fr;
}


static void AdjustFormants(frame_t *fr, int target, int min, int max, int f1_adj, int f3_adj, int hf_reduce, int flags)
{//====================================================================================================================
	int x;

//hf_reduce = 70;      // ?? using fixed amount rather than the parameter??

	target = (target * voice->formant_factor)/256;

	x = (target - fr->ffreq[2]) / 2;
	if(x > max) x = max;
	if(x < min) x = min;
	fr->ffreq[2] += x;
	fr->ffreq[3] += f3_adj;

	if(flags & 0x20)
	{
		f3_adj = -f3_adj;   // reverse direction for f4,f5 change
	}
	fr->ffreq[4] += f3_adj;
	fr->ffreq[5] += f3_adj;

	if(f1_adj==1)
	{
		x = (235 - fr->ffreq[1]);
		if(x < -100) x = -100;
		if(x > -60) x = -60;
		fr->ffreq[1] += x;
	}
	if(f1_adj==2)
	{
		x = (235 - fr->ffreq[1]);
		if(x < -300) x = -300;
		if(x > -150) x = -150;
		fr->ffreq[1] += x;
		fr->ffreq[0] += x;
	}
	if(f1_adj==3)
	{
		x = (100 - fr->ffreq[1]);
		if(x < -400) x = -400;
		if(x > -300) x = -400;
		fr->ffreq[1] += x;
		fr->ffreq[0] += x;
	}
	formants_reduce_hf(fr,hf_reduce); 
}


static int VowelCloseness(frame_t *fr)
{//===================================
// return a value 0-3 depending on the vowel's f1
	int f1;

	if((f1 = fr->ffreq[1]) < 300)
		return(3);
	if(f1 < 400)
		return(2);
	if(f1 < 500)
		return(1);
	return(0);
}


int FormantTransition2(frameref_t *seq, int &n_frames, unsigned int data1, unsigned int data2, PHONEME_TAB *other_ph, int which)
{//==============================================================================================================================
	int ix;
	int formant;
	int next_rms;

	int len;
	int rms;
	int f1;
	int f2;
	int f2_min;
	int f2_max;
	int f3_adj;
	int f3_amp;
	int flags;
	int vcolour;

#define N_VCOLOUR  2
// percentage change for each formant in 256ths
static short vcolouring[N_VCOLOUR][5] = {
	{243,272,256,256,256},         // palatal consonant follows
	{256,256,240,240,240},         // retroflex
};

	frame_t *fr = NULL;

	if(n_frames < 2)
		return(0);

	len = (data1 & 0x3f) * 2;
	rms = (data1 >> 6) & 0x3f;
	flags = (data1 >> 12);

	f2 = (data2 & 0x3f) * 50;
	f2_min = (((data2 >> 6) & 0x1f) - 15) * 50;
	f2_max = (((data2 >> 11) & 0x1f) - 15) * 50;
	f3_adj = (((data2 >> 16) & 0x1f) - 15) * 50;
	f3_amp = ((data2 >> 21) & 0x1f) * 8;
	f1 = ((data2 >> 26) & 0x7);
	vcolour = (data2 >> 29);

//	fprintf(stderr,"FMT%d %3s  %3d-%3d f1=%d  f2=%4d %4d %4d  f3=%4d %3d\n",
//		which,WordToString(other_ph->mnemonic),len,rms,f1,f2,f2_min,f2_max,f3_adj,f3_amp);

	if((other_ph != NULL) && (other_ph->mnemonic == '?'))
		flags |= 8;

	if(which == 1)
	{
		/* entry to vowel */
		fr = CopyFrame(seq[0].frame,0);
		seq[0].frame = fr;
		seq[0].length = VOWEL_FRONT_LENGTH;
		if(len > 0)
			seq[0].length = len;
		seq[0].frflags |= FRFLAG_LEN_MOD2;              // reduce length modification
		fr->frflags |= FRFLAG_LEN_MOD2;

		next_rms = seq[1].frame->rms;

		if(voice->klattv[0])
		{
//			fr->klattp[KLATT_AV] = 53;   // reduce the amplitude of the start of a vowel
			fr->klattp[KLATT_AV] = seq[1].frame->klattp[KLATT_AV] - 4;
		}
		if(f2 != 0)
		{
			if(rms & 0x20)
			{
				set_frame_rms(fr,(next_rms * (rms & 0x1f))/30);
			}
			AdjustFormants(fr, f2, f2_min, f2_max, f1, f3_adj, f3_amp, flags);

			if((rms & 0x20) == 0)
			{
				set_frame_rms(fr,rms*2);
			}
		}
		else
		{
			if(flags & 8)
				set_frame_rms(fr,(next_rms*24)/32);
			else
				set_frame_rms(fr,RMS_START);
		}

		if(flags & 8)
		{
//			set_frame_rms(fr,next_rms - 5);
			modn_flags = 0x800 + (VowelCloseness(fr) << 8);
		}
	}
	else
	{
		// exit from vowel
		rms = rms*2;
		if((f2 != 0) || (flags != 0))
		{

			if(flags & 8)
			{
				fr = CopyFrame(seq[n_frames-1].frame,0);
				seq[n_frames-1].frame = fr;
				rms = RMS_GLOTTAL1;
	
				// degree of glottal-stop effect depends on closeness of vowel (indicated by f1 freq)
				modn_flags = 0x400 + (VowelCloseness(fr) << 8);
			}
			else
			{
				fr = DuplicateLastFrame(seq,n_frames++,len);
				if(len > 36)
					seq_len_adjust += (len - 36);
	
				if(f2 != 0)
				{
					AdjustFormants(fr, f2, f2_min, f2_max, f1, f3_adj, f3_amp, flags);
				}
			}

			set_frame_rms(fr,rms);

			if((vcolour > 0) && (vcolour <= N_VCOLOUR))
			{
				for(ix=0; ix<n_frames; ix++)
				{
					fr = CopyFrame(seq[ix].frame,0);
					seq[ix].frame = fr;
					
					for(formant=1; formant<=5; formant++)
					{
						int x;
						x = fr->ffreq[formant] * vcolouring[vcolour-1][formant-1];
						fr->ffreq[formant] = x / 256;
					}
				}
			}
		}
	}

	if(fr != NULL)
	{
		if(flags & 4)
			fr->frflags |= FRFLAG_FORMANT_RATE;
		if(flags & 2)
			fr->frflags |= FRFLAG_BREAK;       // don't merge with next frame
	}

	if(flags & 0x40)
		DoPause(12,0);  // add a short pause after the consonant

	if(flags & 16)
		return(len);
	return(0);
} //  end of FormantTransition2



static void SmoothSpect(void)
{//==========================
	// Limit the rate of frequency change of formants, to reduce chirping

	long *q;
	frame_t *frame;
	frame_t *frame2;
	frame_t *frame1;
	frame_t *frame_centre;
	int ix;
	int len;
	int pk;
	int modified;
	int allowed;
	int diff;

	if(syllable_start == syllable_end)
		return;

	if((syllable_centre < 0) || (syllable_centre == syllable_start))
	{
		syllable_start = syllable_end;
		return;
	}

	q = wcmdq[syllable_centre];
	frame_centre = (frame_t *)q[2];

	// backwards
	ix = syllable_centre -1;
	frame = frame2 = frame_centre;
	for(;;)
	{
		if(ix < 0) ix = N_WCMDQ-1;
		q = wcmdq[ix];

		if(q[0] == WCMD_PAUSE || q[0] == WCMD_WAVE)
			break;

		if(q[0] <= WCMD_SPECT2)
		{
			len = q[1] & 0xffff;

			frame1 = (frame_t *)q[3];
			if(frame1 == frame)
			{
				q[3] = (long)frame2;
				frame1 = frame2;
			}
			else
				break;  // doesn't follow on from previous frame

			frame = frame2 = (frame_t *)q[2];
			modified = 0;

			if(frame->frflags & FRFLAG_BREAK)
				break;

			if(frame->frflags & FRFLAG_FORMANT_RATE)
				len = (len * 12)/10;      // allow slightly greater rate of change for this frame (was 12/10)

			for(pk=0; pk<6; pk++)
			{
				int f1, f2;

				if((frame->frflags & FRFLAG_BREAK_LF) && (pk < 3))
					continue;

				f1 = frame1->ffreq[pk];
				f2 = frame->ffreq[pk];

				// backwards
				if((diff = f2 - f1) > 0)
				{
					allowed = f1*2 + f2;
				}
				else
				{
					allowed = f1 + f2*2;
				}

				// the allowed change is specified as percentage (%*10) of the frequency
				// take "frequency" as 1/3 from the lower freq
				allowed = (allowed * formant_rate[pk])/3000;
				allowed = (allowed * len)/256;

				if(diff > allowed)
				{
					if(modified == 0)
					{
						frame2 = CopyFrame(frame,0);
						modified = 1;
					}
					frame2->ffreq[pk] = frame1->ffreq[pk] + allowed;
					q[2] = (long)frame2;
				}
				else
				if(diff < -allowed)
				{
					if(modified == 0)
					{
						frame2 = CopyFrame(frame,0);
						modified = 1;
					}
					frame2->ffreq[pk] = frame1->ffreq[pk] - allowed;
					q[2] = (long)frame2;
				}
			}
		}

		if(ix == syllable_start)
			break;
		ix--;
	}

	// forwards
	ix = syllable_centre;

	frame = NULL;
	for(;;)
	{
		q = wcmdq[ix];

		if(q[0] == WCMD_PAUSE || q[0] == WCMD_WAVE)
			break;

		if(q[0] <= WCMD_SPECT2)
		{

			len = q[1] & 0xffff;

			frame1 = (frame_t *)q[2];
			if(frame != NULL)
			{
				if(frame1 == frame)
				{
					q[2] = (long)frame2;
					frame1 = frame2;
				}
				else
					break;  // doesn't follow on from previous frame
			}

			frame = frame2 = (frame_t *)q[3];
			modified = 0;

			if(frame1->frflags & FRFLAG_BREAK)
				break;

			if(frame1->frflags & FRFLAG_FORMANT_RATE)
				len = (len *6)/5;      // allow slightly greater rate of change for this frame

			for(pk=0; pk<6; pk++)
			{
				int f1, f2;
				f1 = frame1->ffreq[pk];
				f2 = frame->ffreq[pk];

				// forwards
				if((diff = f2 - f1) > 0)
				{
					allowed = f1*2 + f2;
				}
				else
				{
					allowed = f1 + f2*2;
				}
				allowed = (allowed * formant_rate[pk])/3000;
				allowed = (allowed * len)/256;

				if(diff > allowed)
				{
					if(modified == 0)
					{
						frame2 = CopyFrame(frame,0);
						modified = 1;
					}
					frame2->ffreq[pk] = frame1->ffreq[pk] + allowed;
					q[3] = (long)frame2;
				}
				else
				if(diff < -allowed)
				{
					if(modified == 0)
					{
						frame2 = CopyFrame(frame,0);
						modified = 1;
					}
					frame2->ffreq[pk] = frame1->ffreq[pk] - allowed;
					q[3] = (long)frame2;
				}
			}
		}

		ix++;
		if(ix >= N_WCMDQ) ix = 0;
		if(ix == syllable_end)
			break;
	}

	syllable_start = syllable_end;
}  //  end of SmoothSpect


static void StartSyllable(void)
{//============================
	// start of syllable, if not already started
	if(syllable_end == syllable_start)
		syllable_end = wcmdq_tail;
}



int DoSpect2(PHONEME_TAB *this_ph, int which, FMT_PARAMS *fmt_params,  PHONEME_LIST *plist, int modulation)
{//========================================================================================================
	// which:  0 not a vowel, 1  start of vowel,   2 body and end of vowel
	// length_mod: 256 = 100%
	// modulation: -1 = don't write to wcmdq

	int  n_frames;
	frameref_t *frames;
	int  frameix;
	frame_t *frame1;
	frame_t *frame2;
	frame_t *fr;
	int  ix;
	long *q;
	int  len;
	int  frame_length;
	int  length_factor;
	int  length_mod;
	int  length_sum;
	int  length_min;
	int  total_len = 0;
	static int wave_flag = 0;
	int wcmd_spect = WCMD_SPECT;
	int frame_lengths[N_SEQ_FRAMES];

	if(fmt_params->fmt_addr == 0)
		return(0);

	length_mod = plist->length;
	if(length_mod==0) length_mod=256;

	length_min = (samplerate/70);  // greater than one cycle at low pitch (Hz)
	if(which==2)
	{
		if((translator->langopts.param[LOPT_LONG_VOWEL_THRESHOLD] > 0) && ((this_ph->std_length >= translator->langopts.param[LOPT_LONG_VOWEL_THRESHOLD]) || (plist->synthflags & SFLAG_LENGTHEN) || (this_ph->phflags & phLONG)))
			length_min *= 2;    // ensure long vowels are longer
	}

	if(which==1)
	{
		// limit the shortening of sonorants before shortened (e.g. unstressed) vowels
		if((this_ph->type==phLIQUID) || (plist[-1].type==phLIQUID) || (plist[-1].type==phNASAL))
		{
			if(length_mod < (len = translator->langopts.param[LOPT_SONORANT_MIN]))
			{
				length_mod = len;
			}
		}
	}

	modn_flags = 0;
	frames = LookupSpect(this_ph, which, fmt_params, &n_frames, plist);
	if(frames == NULL)
		return(0);   // not found

	if(fmt_params->fmt_amp != fmt_amplitude)
	{
		// an amplitude adjustment is specified for this sequence
		q = wcmdq[wcmdq_tail];
		q[0] = WCMD_FMT_AMPLITUDE;
		q[1] = fmt_amplitude = fmt_params->fmt_amp;
		WcmdqInc();
	}

	frame1 = frames[0].frame;
	if(voice->klattv[0])
		wcmd_spect = WCMD_KLATT;

	wavefile_ix = fmt_params->wav_addr;

	if(fmt_params->wav_amp == 0)
		wavefile_amp = 32;
	else
		wavefile_amp = (fmt_params->wav_amp * 32)/100;

	if(wavefile_ix == 0)
	{
		if(wave_flag)
		{
			// cancel any wavefile that was playing previously
			wcmd_spect = WCMD_SPECT2;
			if(voice->klattv[0])
				wcmd_spect = WCMD_KLATT2;
			wave_flag = 0;
		}
		else
		{
			wcmd_spect = WCMD_SPECT;
			if(voice->klattv[0])
				wcmd_spect = WCMD_KLATT;
		}
	}

	if(last_frame != NULL)
	{
		if(((last_frame->length < 2) || (last_frame->frflags & FRFLAG_VOWEL_CENTRE))
			&& !(last_frame->frflags & FRFLAG_BREAK))
		{
			// last frame of previous sequence was zero-length, replace with first of this sequence
			wcmdq[last_wcmdq][3] = (long)frame1;

			if(last_frame->frflags & FRFLAG_BREAK_LF)
			{
				// but flag indicates keep HF peaks in last segment
				fr = CopyFrame(frame1,1);
				for(ix=3; ix < 8; ix++)
				{
					if(ix < 7)
						fr->ffreq[ix] = last_frame->ffreq[ix];
					fr->fheight[ix] = last_frame->fheight[ix];
				}
				wcmdq[last_wcmdq][3] = (long)fr;
			}
		}
	}

	if((this_ph->type == phVOWEL) && (which == 2))
	{
		SmoothSpect();    // process previous syllable

		// remember the point in the output queue of the centre of the vowel
		syllable_centre = wcmdq_tail;
	}

	length_sum = 0;
	for(frameix=1; frameix < n_frames; frameix++)
	{
		length_factor = length_mod;
		if(frames[frameix-1].frflags & FRFLAG_LEN_MOD)     // reduce effect of length mod
		{
			length_factor = (length_mod*(256-speed.lenmod_factor) + 256*speed.lenmod_factor)/256;
		}
		else
		if(frames[frameix-1].frflags & FRFLAG_LEN_MOD2)     // reduce effect of length mod, used for the start of a vowel
		{
			length_factor = (length_mod*(256-speed.lenmod2_factor) + 256*speed.lenmod2_factor)/256;
		}

		frame_length = frames[frameix-1].length;
		len = (frame_length * samplerate)/1000;
		len = (len * length_factor)/256;
		length_sum += len;
		frame_lengths[frameix] = len;
	}

	if((length_sum > 0) && (length_sum < length_min))
	{
		// lengthen, so that the sequence is greater than one cycle at low pitch
		for(frameix=1; frameix < n_frames; frameix++)
		{
			frame_lengths[frameix] = (frame_lengths[frameix] * length_min) / length_sum;
		}
	}

	for(frameix=1; frameix<n_frames; frameix++)
	{
		frame2 = frames[frameix].frame;

		if((fmt_params->wav_addr != 0) && ((frame1->frflags & FRFLAG_DEFER_WAV)==0))
		{
			// there is a wave file to play along with this synthesis
			seq_len_adjust = 0;
			DoSample2(fmt_params->wav_addr, which+0x100, 0, fmt_params->fmt_control, 0, wavefile_amp);
			wave_flag = 1;
			wavefile_ix = 0;
			fmt_params->wav_addr = 0;
		}

		if(modulation >= 0)
		{
			if(frame1->frflags & FRFLAG_MODULATE)
			{
				modulation = 6;
			}
			if((frameix == n_frames-1) && (modn_flags & 0xf00))
				modulation |= modn_flags;   // before or after a glottal stop
		}

		len = frame_lengths[frameix];
		pitch_length += len;
		amp_length += len;

		if(len == 0)
		{
			last_frame = NULL;
			frame1 = frame2;
		}
		else
		{
			last_wcmdq = wcmdq_tail;

			if(modulation >= 0)
			{
				q = wcmdq[wcmdq_tail];
				q[0] = wcmd_spect;
				q[1] = len + (modulation << 16);
				q[2] = (long)frame1;
				q[3] = (long)frame2;
	
				WcmdqInc();
			}
			last_frame = frame1 = frame2;
			total_len += len;
		}
	}

	if((which != 1) && (fmt_amplitude != 0))
	{
		q = wcmdq[wcmdq_tail];
		q[0] = WCMD_FMT_AMPLITUDE;
		q[1] = fmt_amplitude = 0;
		WcmdqInc();
	}


	return(total_len);
}  // end of DoSpect2




void DoMarker(int type, int char_posn, int length, int value)
{//==========================================================
// This could be used to return an index to the word currently being spoken
// Type 1=word, 2=sentence, 3=named marker, 4=play audio, 5=end
	wcmdq[wcmdq_tail][0] = WCMD_MARKER;
	wcmdq[wcmdq_tail][1] = type;
	wcmdq[wcmdq_tail][2] = (char_posn & 0xffffff) | (length << 24);
	wcmdq[wcmdq_tail][3] = value;
	WcmdqInc();

}  // end of DoMarker


void DoVoiceChange(voice_t *v)
{//===========================
// allocate memory for a copy of the voice data, and free it in wavegenfill()
	voice_t *v2;

	v2 = (voice_t *)malloc(sizeof(voice_t));
	if(v2 == NULL)
		return;   // out of memory, leave the current voice unchanged
	memcpy(v2,v,sizeof(voice_t));
	wcmdq[wcmdq_tail][0] = WCMD_VOICE;
	wcmdq[wcmdq_tail][1] = (long)(v2);
	WcmdqInc();
}


void DoEmbedded(int *embix, int sourceix)
{//======================================
	// There were embedded commands in the text at this point
	unsigned int word;  // bit 7=last command for this word, bits 5,6 sign, bits 0-4 command
	unsigned int value;
	int command;

	do {
		word = embedded_list[*embix];
		value = word >> 8;
		command = word & 0x7f;

		if(command == 0)
			return;  // error

		(*embix)++;

		switch(command & 0x1f)
		{
		case EMBED_S:   // speed
			SetEmbedded((command & 0x60) + EMBED_S2,value);   // adjusts embedded_value[EMBED_S2]
			SetSpeed(2);
			break;

		case EMBED_I:   // play dynamically loaded wav data (sound icon)
			if((int)value < n_soundicon_tab)
			{
				if(soundicon_tab[value].length != 0)
				{
					DoPause(10,0);   // ensure a break in the speech
					wcmdq[wcmdq_tail][0] = WCMD_WAVE;
					wcmdq[wcmdq_tail][1] = soundicon_tab[value].length;
					wcmdq[wcmdq_tail][2] = (long)soundicon_tab[value].data + 44;  // skip WAV header
					wcmdq[wcmdq_tail][3] = 0x1500;   // 16 bit data, amp=21
					WcmdqInc();
				}
			}
			break;

		case EMBED_M:   // named marker
			DoMarker(espeakEVENT_MARK, (sourceix & 0x7ff) + clause_start_char, 0, value);
			break;

		case EMBED_U:   // play sound
			DoMarker(espeakEVENT_PLAY, count_characters+1, 0, value);  // always occurs at end of clause
			break;

		default:
			DoPause(10,0);   // ensure a break in the speech
			wcmdq[wcmdq_tail][0] = WCMD_EMBEDDED;
			wcmdq[wcmdq_tail][1] = command;
			wcmdq[wcmdq_tail][2] = value;
			WcmdqInc();
			break;
		}
	} while ((word & 0x80) == 0);
}



int Generate(PHONEME_LIST *phoneme_list, int *n_ph, int resume)
{//============================================================
	static int  ix;
	static int  embedded_ix;
	static int  word_count;
	PHONEME_LIST *prev;
	PHONEME_LIST *next;
	PHONEME_LIST *next2;
	PHONEME_LIST *p;
	int  released;
	int  stress;
	int  modulation;
	int  pre_voiced;
	int  free_min;
	unsigned char *pitch_env=NULL;
	unsigned char *amp_env;
	PHONEME_TAB *ph;
	PHONEME_TAB *prev_ph;
	static int sourceix=0;

	PHONEME_DATA phdata;
	PHONEME_DATA phdata_prev;
	PHONEME_DATA phdata_next;
	PHONEME_DATA phdata_tone;
	FMT_PARAMS fmtp;

	if(option_quiet)
		return(0);

	if(mbrola_name[0] != 0)
		return(MbrolaGenerate(phoneme_list,n_ph,resume));

	if(resume == 0)
	{
		ix = 1;
		embedded_ix=0;
		word_count = 0;
		pitch_length = 0;
		amp_length = 0;
		last_frame = NULL;
		last_wcmdq = -1;
		syllable_start = wcmdq_tail;
		syllable_end = wcmdq_tail;
		syllable_centre = -1;
		last_pitch_cmd = -1;
		memset(vowel_transition,0,sizeof(vowel_transition));
		DoPause(0,0);    // isolate from the previous clause
	}

	while(ix < (*n_ph))
	{
		p = &phoneme_list[ix];

		if(p->type == phPAUSE)
			free_min = 5;
		else
		if(p->type != phVOWEL)
			free_min = 10;     // we need less Q space for non-vowels, and we need to generate phonemes after a vowel so that the pitch_length is filled in
		else
			free_min = MIN_WCMDQ;  // 22

		if(WcmdqFree() <= free_min)
			return(1);  // wait

		prev = &phoneme_list[ix-1];
		next = &phoneme_list[ix+1];
		next2 = &phoneme_list[ix+2];

		if(p->synthflags & SFLAG_EMBEDDED)
		{
			DoEmbedded(&embedded_ix, p->sourceix);
		}

		if(p->newword)
		{
			if(((p->type == phVOWEL) && (translator->langopts.param[LOPT_WORD_MERGE] & 1)) ||
				 (p->ph->phflags & phNOPAUSE))
			{
			}
			else
			{
				last_frame = NULL;
			}

			sourceix = (p->sourceix & 0x7ff) + clause_start_char;

			if(p->newword & 4)
				DoMarker(espeakEVENT_SENTENCE, sourceix, 0, count_sentences);  // start of sentence

//			if(p->newword & 2)
//				DoMarker(espeakEVENT_END, count_characters, 0, count_sentences);  // end of clause

			if(p->newword & 1)
				DoMarker(espeakEVENT_WORD, sourceix, p->sourceix >> 11, clause_start_word + word_count++);
		}

		EndAmplitude();

		if(p->prepause > 0)
			DoPause(p->prepause,1);

		if(option_phoneme_events && (p->type != phVOWEL))
		{
			// Note, for vowels, do the phoneme event after the vowel-start
			DoMarker(espeakEVENT_PHONEME, sourceix, 0, p->ph->mnemonic);
		}

		switch(p->type)
		{
		case phPAUSE:
			DoPause(p->length,0);
			break;

		case phSTOP:
			released = 0;
			if(next->type==phVOWEL)
			{
				 released = 1;
			}
			else
			if(!next->newword)
			{
				if(next->type==phLIQUID) released = 1;
//				if(((p->ph->phflags & phPLACE) == phPLACE_blb) && (next->ph->phflags & phSIBILANT)) released = 1;
			}
			if(released == 0)
				p->synthflags |= SFLAG_NEXT_PAUSE;

			InterpretPhoneme(NULL, 0, p, &phdata);
			phdata.pd_control |= pd_DONTLENGTHEN;
			DoSample3(&phdata, 0, 0);
			break;

		case phFRICATIVE:
			InterpretPhoneme(NULL, 0, p, &phdata);

			if(p->synthflags & SFLAG_LENGTHEN)
			{
				DoSample3(&phdata, p->length, 0);  // play it twice for [s:] etc.
			}
			DoSample3(&phdata, p->length, 0);
			break;

		case phVSTOP:
			ph = p->ph;
			memset(&fmtp, 0, sizeof(fmtp));
			fmtp.fmt_control = pd_DONTLENGTHEN;

			pre_voiced = 0;
			if(next->type==phVOWEL)
			{
				DoAmplitude(p->amp,NULL);
				DoPitch(envelope_data[p->env],p->pitch1,p->pitch2);
				pre_voiced = 1;
			}
			else
			if((next->type==phLIQUID) && !next->newword)
			{
				DoAmplitude(next->amp,NULL);
				DoPitch(envelope_data[next->env],next->pitch1,next->pitch2);
				pre_voiced = 1;
			}
			else
			{
				if(last_pitch_cmd < 0)
				{
					DoAmplitude(next->amp,NULL);
					DoPitch(envelope_data[p->env],p->pitch1,p->pitch2);
				}
			}

			if((prev->type==phVOWEL) || (prev->ph->phflags & phVOWEL2))
			{
				// a period of voicing before the release
				InterpretPhoneme(NULL, 0x01, p, &phdata);
				fmtp.fmt_addr = phdata.sound_addr[pd_FMT];
				fmtp.fmt_amp = phdata.sound_param[pd_FMT];

				DoSpect2(ph, 0, &fmtp, p, 0);
				if(p->synthflags & SFLAG_LENGTHEN)
				{
					DoPause(25,1);
					DoSpect2(ph, 0, &fmtp, p, 0);
				}
			}
			else
			{
				if(p->synthflags & SFLAG_LENGTHEN)
				{
					DoPause(50,0);
				}
			}

			if(pre_voiced)
			{
				// followed by a vowel, or liquid + vowel
				StartSyllable();
			}
			else
			{
				p->synthflags |= SFLAG_NEXT_PAUSE;
			}
			InterpretPhoneme(NULL,0, p, &phdata);
			fmtp.fmt_addr = phdata.sound_addr[pd_FMT];
			fmtp.fmt_amp = phdata.sound_param[pd_FMT];
			fmtp.wav_addr = phdata.sound_addr[pd_ADDWAV];
			fmtp.wav_amp = phdata.sound_param[pd_ADDWAV];
			DoSpect2(ph, 0, &fmtp, p, 0);

			if((p->newword == 0) && (next2->newword == 0))
			{
				if(next->type == phVFRICATIVE)
					DoPause(20,0);
				if(next->type == phFRICATIVE)
					DoPause(12,0);
			}
			break;

		case phVFRICATIVE:
			if(next->type==phVOWEL)
			{
				DoAmplitude(p->amp,NULL);
				DoPitch(envelope_data[p->env],p->pitch1,p->pitch2);
			}
			else
			if(next->type==phLIQUID)
			{
				DoAmplitude(next->amp,NULL);
				DoPitch(envelope_data[next->env],next->pitch1,next->pitch2);
			}
			else
			{
				if(last_pitch_cmd < 0)
				{
					DoAmplitude(p->amp,NULL);
					DoPitch(envelope_data[p->env],p->pitch1,p->pitch2);
				}
			}

			if((next->type==phVOWEL) || ((next->type==phLIQUID) && (next->newword==0)))  // ?? test 14.Aug.2007
			{
				StartSyllable();
			}
			else
			{
				p->synthflags |= SFLAG_NEXT_PAUSE;
			}
			InterpretPhoneme(NULL,0, p, &phdata);
			memset(&fmtp, 0, sizeof(fmtp));
			fmtp.std_length = phdata.pd_param[i_SET_LENGTH]*2;
			fmtp.fmt_addr = phdata.sound_addr[pd_FMT];
			fmtp.fmt_amp = phdata.sound_param[pd_FMT];
			fmtp.wav_addr = phdata.sound_addr[pd_ADDWAV];
			fmtp.wav_amp = phdata.sound_param[pd_ADDWAV];

			if(p->synthflags & SFLAG_LENGTHEN)
				DoSpect2(p->ph, 0, &fmtp, p, 0);
			DoSpect2(p->ph, 0, &fmtp, p, 0);
			break;

		case phNASAL:
			memset(&fmtp, 0, sizeof(fmtp));
			if(!(p->synthflags & SFLAG_SEQCONTINUE))
			{
				DoAmplitude(p->amp,NULL);
				DoPitch(envelope_data[p->env],p->pitch1,p->pitch2);
			}

			if(prev->type==phNASAL)
			{
				last_frame = NULL;
			}

			InterpretPhoneme(NULL,0, p, &phdata);
			fmtp.std_length = phdata.pd_param[i_SET_LENGTH]*2;
			fmtp.fmt_addr = phdata.sound_addr[pd_FMT];
			fmtp.fmt_amp = phdata.sound_param[pd_FMT];

			if(next->type==phVOWEL)
			{
				StartSyllable();
				DoSpect2(p->ph, 0, &fmtp, p, 0);
			}
			else
			if(prev->type==phVOWEL && (p->synthflags & SFLAG_SEQCONTINUE))
			{
				DoSpect2(p->ph, 0, &fmtp, p, 0);
			}
			else
			{
				last_frame = NULL;  // only for nasal ?
				DoSpect2(p->ph, 0, &fmtp, p, 0);
				last_frame = NULL;
			}

			break;

		case phLIQUID:
			memset(&fmtp, 0, sizeof(fmtp));
			modulation = 0;
			if(p->ph->phflags & phTRILL)
				modulation = 5;

			prev_ph = prev->ph;
//			if(p->newword)
//				prev_ph = phoneme_tab[phonPAUSE];    // pronounce fully at the start of a word

			if(!(p->synthflags & SFLAG_SEQCONTINUE))
			{
				DoAmplitude(p->amp,NULL);
				DoPitch(envelope_data[p->env],p->pitch1,p->pitch2);
			}

			if(prev->type==phNASAL)
			{
				last_frame = NULL;
			}

			if(next->type==phVOWEL)
			{
				StartSyllable();
			}
			InterpretPhoneme(NULL, 0, p, &phdata);
			fmtp.std_length = phdata.pd_param[i_SET_LENGTH]*2;
			fmtp.fmt_addr = phdata.sound_addr[pd_FMT];
			fmtp.fmt_amp = phdata.sound_param[pd_FMT];
			fmtp.wav_addr = phdata.sound_addr[pd_ADDWAV];
			fmtp.wav_amp = phdata.sound_param[pd_ADDWAV];
			DoSpect2(p->ph, 0, &fmtp, p, modulation);

			break;

		case phVOWEL:
			ph = p->ph;
			stress = p->stresslevel & 0xf;

			memset(&fmtp, 0, sizeof(fmtp));

			InterpretPhoneme(NULL, 0, p, &phdata);
			fmtp.std_length = phdata.pd_param[i_SET_LENGTH] * 2;

			if(((fmtp.fmt_addr = phdata.sound_addr[pd_VWLSTART]) != 0) && ((phdata.pd_control & pd_FORNEXTPH) == 0))
			{
				// a vowel start has been specified by the Vowel program
				fmtp.fmt_length = phdata.sound_param[pd_VWLSTART];
			}
			else
			if(prev->type != phPAUSE)
			{
				// check the previous phoneme
				InterpretPhoneme(NULL, 0, prev, &phdata_prev);
				if((fmtp.fmt_addr = phdata_prev.sound_addr[pd_VWLSTART]) != 0)
				{
					// a vowel start has been specified by the Vowel program
					fmtp.fmt2_lenadj = phdata_prev.sound_param[pd_VWLSTART];
				}
				fmtp.transition0 = phdata_prev.vowel_transition[0];
				fmtp.transition1 = phdata_prev.vowel_transition[1];
			}

			if(fmtp.fmt_addr == 0)
			{
				// use the default start for this vowel
				fmtp.use_vowelin = 1;
				fmtp.fmt_control = 1;
				fmtp.fmt_addr = phdata.sound_addr[pd_FMT];
			}

			fmtp.fmt_amp = phdata.sound_param[pd_FMT];

			pitch_env = envelope_data[p->env];
			amp_env = NULL;
			if(p->tone_ph != 0)
			{
				InterpretPhoneme2(p->tone_ph, &phdata_tone);
				pitch_env = GetEnvelope(phdata_tone.pitch_env);
				if(phdata_tone.amp_env > 0)
					amp_env = GetEnvelope(phdata_tone.amp_env);
			}

			StartSyllable();

			modulation = 2;
			if(stress <= 1)
				modulation = 1;  // 16ths
			else
			if(stress >= 7)
				modulation = 3;

			if(prev->type == phVSTOP || prev->type == phVFRICATIVE)
			{
				DoAmplitude(p->amp,amp_env);
				DoPitch(pitch_env,p->pitch1,p->pitch2);  // don't use prevocalic rising tone
				DoSpect2(ph, 1, &fmtp, p, modulation);
			}
			else
			if(prev->type==phLIQUID || prev->type==phNASAL)
			{
				DoAmplitude(p->amp,amp_env);
				DoSpect2(ph, 1, &fmtp, p, modulation);  // continue with pre-vocalic rising tone
				DoPitch(pitch_env,p->pitch1,p->pitch2);
			}
			else
			{
				if(!(p->synthflags & SFLAG_SEQCONTINUE))
				{
					DoAmplitude(p->amp,amp_env);
					DoPitch(pitch_env,p->pitch1,p->pitch2);
				}

				DoSpect2(ph, 1, &fmtp, p, modulation);
			}

			if(option_phoneme_events)
			{
				DoMarker(espeakEVENT_PHONEME, sourceix, 0, p->ph->mnemonic);
			}

			fmtp.fmt_addr = phdata.sound_addr[pd_FMT];
			fmtp.fmt_amp = phdata.sound_param[pd_FMT];
			fmtp.transition0 = 0;
			fmtp.transition1 = 0;

			if((fmtp.fmt2_addr = phdata.sound_addr[pd_VWLEND]) != 0)
			{
				fmtp.fmt2_lenadj = phdata.sound_param[pd_VWLEND];
			}
			else
			if(next->type != phPAUSE)
			{
				fmtp.fmt2_lenadj = 0;
				InterpretPhoneme(NULL, 0, next, &phdata_next);

				fmtp.use_vowelin = 1;
				fmtp.transition0 = phdata_next.vowel_transition[2];  // always do vowel_transition, even if ph_VWLEND ??  consider [N]
				fmtp.transition1 = phdata_next.vowel_transition[3];

				if((fmtp.fmt2_addr = phdata_next.sound_addr[pd_VWLEND]) != 0)
				{
					fmtp.fmt2_lenadj = phdata_next.sound_param[pd_VWLEND];
				}
			}

			DoSpect2(ph, 2, &fmtp, p, modulation);

			break;
		}
		ix++;
	}
	EndPitch(1);
	if(*n_ph > 0)
	{
		DoMarker(espeakEVENT_END, count_characters, 0, count_sentences);  // end of clause
		*n_ph = 0;
	}

	return(0);  // finished the phoneme list
}  //  end of Generate




static int timer_on = 0;
static int paused = 0;

int SynthOnTimer()
{//===============
	if(!timer_on)
	{
		return(WavegenCloseSound());
	}

	do {
		if(WcmdqUsed() > 0)
			WavegenOpenSound();

		if(Generate(phoneme_list,&n_phoneme_list,1)==0)
		{
			SpeakNextClause(NULL,NULL,1);
		}
	} while(skipping_text);

	return(0);
}


int SynthStatus()
{//==============
	return(timer_on | paused);
}



int SpeakNextClause(FILE *f_in, const void *text_in, int control)
{//==============================================================
// Speak text from file (f_in) or memory (text_in)
// control 0: start
//    either f_in or text_in is set, the other must be NULL

// The other calls have f_in and text_in = NULL
// control 1: speak next text
//         2: stop
//         3: pause (toggle)
//         4: is file being read (0=no, 1=yes)
//         5: interrupt and flush current text.

	int clause_tone;
	char *voice_change;
	static FILE *f_text=NULL;
	static const void *p_text=NULL;

	if(control == 4)
	{
		if((f_text == NULL) && (p_text == NULL))
			return(0);
		else
			return(1);
	}

	if(control == 2)
	{
		// stop speaking
		timer_on = 0;
		p_text = NULL;
		if(f_text != NULL)
		{
			fclose(f_text);
			f_text=NULL;
		}
		n_phoneme_list = 0;
		WcmdqStop();

		embedded_value[EMBED_T] = 0;
		return(0);
	}

	if(control == 3)
	{
		// toggle pause
		if(paused == 0)
		{
			timer_on = 0;
			paused = 2;
		}
		else
		{
			WavegenOpenSound();
			timer_on = 1;
			paused = 0;
			Generate(phoneme_list,&n_phoneme_list,0);   // re-start from beginning of clause
		}
		return(0);
	}

	if(control == 5)
	{
		// stop speaking, but continue looking for text
		n_phoneme_list = 0;
		WcmdqStop();
		return(0);
	}

	if((f_in != NULL) || (text_in != NULL))
	{
		f_text = f_in;
		p_text = text_in;
		timer_on = 1;
		paused = 0;
	}

	if((f_text==NULL) && (p_text==NULL))
	{
		skipping_text = 0;
		timer_on = 0;
		return(0);
	}

	if((f_text != NULL) && feof(f_text))
	{
		timer_on = 0;
		fclose(f_text);
		f_text=NULL;
		return(0);
	}

	if(current_phoneme_table != voice->phoneme_tab_ix)
	{
		SelectPhonemeTable(voice->phoneme_tab_ix);
	}

	// read the next clause from the input text file, translate it, and generate
	// entries in the wavegen command queue
	p_text = TranslateClause(translator, f_text, p_text, &clause_tone, &voice_change);

	CalcPitches(translator, clause_tone);
	CalcLengths(translator);

	if((option_phonemes > 0) || (phoneme_callback != NULL))
	{
		GetTranslatedPhonemeString(translator->phon_out,sizeof(translator->phon_out));
		if(option_phonemes > 0)
		{
			fprintf(f_trans,"%s\n",translator->phon_out);
	
			if(!iswalpha(0x010d))
			{
				// check that c-caron is recognized as an alphabetic character
				fprintf(stderr,"Warning: Accented letters are not recognized, eg: U+010D\nSet LC_CTYPE to a UTF-8 locale\n");
			}
		}
		if(phoneme_callback != NULL)
		{
			phoneme_callback(translator->phon_out);
		}
	}


	if(skipping_text)
	{
		n_phoneme_list = 0;
		return(1);
	}

	Generate(phoneme_list,&n_phoneme_list,0);
	WavegenOpenSound();

	if(voice_change != NULL)
	{
		// voice change at the end of the clause (i.e. clause was terminated by a voice change)
		new_voice = LoadVoiceVariant(voice_change,0); // add a Voice instruction to wavegen at the end of the clause
	}

	if(new_voice)
	{
		// finished the current clause, now change the voice if there was an embedded
		// change voice command at the end of it (i.e. clause was broken at the change voice command)
		DoVoiceChange(voice);
		new_voice = NULL;
	}

	return(1);
}  //  end of SpeakNextClause
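
The rate limit that SmoothSpect applies to each formant can be looked at in isolation. The following is a minimal sketch, not part of eSpeak: `LimitFormantStep` is a hypothetical helper that repeats the integer arithmetic of the listing for a single formant, and the `rate` and `len` arguments stand in for `formant_rate[pk]` and the length taken from `q[1] & 0xffff`. The real values of those inputs are not shown in this file, so any numbers passed in are illustrative assumptions.

```cpp
// Hypothetical helper (not in eSpeak): returns the frequency that the
// neighbouring frame would be given after rate limiting.
//   f1   : formant frequency in the frame that is kept as-is
//   f2   : formant frequency in the adjacent frame that may be adjusted
//   rate : allowed change in units of 0.1% (assumed role of formant_rate[pk])
//   len  : length value of the queue entry (assumed to be q[1] & 0xffff)
int LimitFormantStep(int f1, int f2, int rate, int len)
{
	int diff = f2 - f1;
	int allowed;

	// Reference frequency lies 1/3 of the way up from the lower of the two
	// values, so the lower frequency gets twice the weight.
	if(diff > 0)
		allowed = f1*2 + f2;
	else
		allowed = f1 + f2*2;

	// The division by 3000 combines the /3 of the weighted sum
	// with the /1000 of the per-mille rate.
	allowed = (allowed * rate)/3000;
	allowed = (allowed * len)/256;

	if(diff > allowed)
		return f1 + allowed;   // rising too fast, clamp upwards step
	if(diff < -allowed)
		return f1 - allowed;   // falling too fast, clamp downwards step
	return f2;                 // within the limit, unchanged
}
```

With the illustrative inputs `rate = 100` and `len = 128`, a jump from 1000 to 1600 is clamped to 1060, while a step from 1000 to 1020 is left alone because it is inside the allowed range. In the listing the same calculation runs once backwards and once forwards from the syllable centre, and a frame is copied with CopyFrame before its first modification so that shared frame data is not altered.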

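
The bit layout that DoEmbedded reads from each entry of `embedded_list` (bit 7 = last command for this word, bits 5 and 6 = sign, bits 0 to 4 = command, remaining upper bits = value) can be sketched as a small decoder. `EmbeddedCmd` and `DecodeEmbedded` are hypothetical names introduced only for this illustration; the listing itself performs these masks inline.

```cpp
// Hypothetical struct and helper (not in eSpeak) that split one
// embedded-command word into the fields described in DoEmbedded.
struct EmbeddedCmd
{
	unsigned int command;   // bits 0-4: command number (EMBED_S, EMBED_I, ...)
	unsigned int sign;      // bits 5,6: sign field, kept in place (0x20/0x40)
	bool         last;      // bit 7: set on the last command for this word
	unsigned int value;     // bits 8 and above: command argument
};

EmbeddedCmd DecodeEmbedded(unsigned int word)
{
	EmbeddedCmd c;
	c.command = word & 0x1f;
	c.sign    = word & 0x60;
	c.last    = (word & 0x80) != 0;
	c.value   = word >> 8;
	return c;
}
```

For example, the word `(50 << 8) | 0x80 | 0x40 | 3` decodes to command 3 with value 50, sign bits 0x40, and the last-command flag set, which is the condition that ends the `do ... while` loop in DoEmbedded.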

License

This article, along with any associated source code and files, is licensed under The BSD License


Written By
CEO bring-it-together s.r.o.
Slovakia
Jozef Božek is currently a software engineer at bring-it-together s.r.o., working in the area of large-scale information systems and mobile application development.
He has been developing in C++ nearly full time since 2000, in Java since 2004, and in Objective-C since 2009. He programs using the Java EE SDK, iOS SDK, COM/DCOM, MFC, ATL, STL and so on.
