                                            Mons, March 9th 2000

This is the second release of the package. It fixes a cosmetic bug in
input file parsing (extra whitespace caused wrong alignment) and
modifies the C++ code to meet ANSI requirements (it compiles without
warnings with g++ 2.95.2).

--------------------------------------------------------------------
                                            Mons, May 2nd 1999

Yo, here is the history of the development of the letter/phoneme
alignment package. The first strategy came from Kevin Lenzo (Perl package):

Compute the probability of each letter/phoneme association from an
aligned dictionary, then realign the dictionary with those estimated
probabilities, and so on until convergence.
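
The loop described above can be sketched as follows. This is a minimal
Python sketch, not the code of this package nor of the Perl original:
the names `align` and `train`, the uniform start (instead of an already
aligned dictionary), the smoothing floor and the use of best-path counts
are all assumptions made for illustration. It only covers the case where
each letter emits 0 or 1 phoneme.

```python
import math
from collections import Counter

NULL = "_"  # the letter emits no phoneme


def align(letters, phones, logp):
    """Best path where each letter emits 0 or 1 phoneme (dynamic programming)."""
    n, m = len(letters), len(phones)
    neg = float("-inf")
    # best[i][j]: best score once i letters have emitted j phonemes
    best = [[neg] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(0, min(i, m) + 1):
            skip = best[i - 1][j] + logp(letters[i - 1], NULL)
            if skip > best[i][j]:
                best[i][j], back[i][j] = skip, NULL
            if j > 0:
                emit = best[i - 1][j - 1] + logp(letters[i - 1], phones[j - 1])
                if emit > best[i][j]:
                    best[i][j], back[i][j] = emit, phones[j - 1]
    if best[n][m] == neg:
        raise ValueError("more phonemes than letters: no 0-or-1 alignment")
    path, i, j = [], n, m
    while i > 0:  # walk the back-pointers from the end of the word
        phone = back[i][j]
        path.append(phone)
        if phone != NULL:
            j -= 1
        i -= 1
    return path[::-1]


def train(dictionary, max_iter=20, floor=1e-3):
    """Align, count, re-estimate, realign, until the alignments stop changing."""
    counts, totals = Counter(), Counter()

    def logp(letter, phone):
        # rough smoothed score for p(letter | phone); unseen pairs stay possible
        return math.log((counts[(letter, phone)] + floor) / (totals[phone] + 1.0))

    previous, aligned = None, []
    for _ in range(max_iter):
        aligned = [align(letters, phones, logp) for letters, phones in dictionary]
        if aligned == previous:
            break  # converged
        previous = aligned
        counts.clear()
        totals.clear()
        for (letters, _), path in zip(dictionary, aligned):
            for letter, phone in zip(letters, path):
                counts[(letter, phone)] += 1
                totals[phone] += 1
    return aligned
```

`train([("lad", ["l", "{", "d"]), ("that", ["D", "{", "t"])])` returns one
list per entry, as long as the word, with "_" wherever a letter emits
nothing. Which letters get the "_" depends on the data and on how ties
are broken.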

For this method to work, each letter must emit 0 or 1 phoneme.
First problem: some letters can generate more than one sound:

1)  lax        -> l { k s                 ( /x/   -> k s     )
2)  example    -> E g z { m p @ l         ( /ex/  -> E g z   )
3)  cute       -> k j u t                 ( /cu/  -> k j u   )
4)  accumulate -> @ k j u m j u l EI t    ( /mu/  -> m j u   )
5)  aluminium  -> { l j @ m I n i @ m     ( /lu/  -> l j @   )
6)  symbolism  -> s I1 m b @ l I z @ m    ( /ism/ -> I z @ m )

Pseudo-phonemes such as k+s, g+z and so on solve this first problem.
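
As an illustration, here is a small Python sketch of how a transcription
could be rewritten with pseudo-phonemes before alignment. The function
name `merge_clusters` and the cluster list are hypothetical; the package
may build its pseudo-phonemes differently.

```python
def merge_clusters(phones, clusters):
    """Join listed phoneme pairs into one pseudo-phoneme, e.g. k s -> k+s."""
    out, i = [], 0
    while i < len(phones):
        pair = tuple(phones[i:i + 2])
        if pair in clusters:
            out.append("+".join(pair))  # one symbol a single letter can emit
            i += 2
        else:
            out.append(phones[i])
            i += 1
    return out


# hypothetical cluster list, taken from examples 1) and 2) above
CLUSTERS = {("k", "s"), ("g", "z")}

print(merge_clusters("l { k s".split(), CLUSTERS))  # ['l', '{', 'k+s']
```

Note that this naive version merges every listed pair, even where the
two sounds come from two separate letters, so a real cluster list would
need more care.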

Second problem: beware of numerical instabilities :-/ when 2 paths are
equivalent... A related problem is: emit ASAP (as soon as possible) or
ASLAP (as late as possible)?

In French there are 2 classes of words:

q u e l q u ' u n  <- should emit ASLAP
k   E l k     9~

v i e   i l l i  <- arrrg
v j E+j i _ _ _

a n e s s e s    <- should emit ASAP
a n E s _ _ _

For French the best choice is ASAP, because of mute endings:

a n e s s e s 
a n E s _ _ _
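
To make the two policies concrete, here is a tiny Python sketch of the
extreme case where every path has the same score, so only the
tie-breaking rule decides. The function names are hypothetical and the
real aligner of course also weighs probabilities.

```python
NULL = "_"  # the letter emits no phoneme


def emit_asap(letters, phones):
    """Phonemes go on the first letters, the silent ones end the word."""
    if len(phones) > len(letters):
        raise ValueError("more phonemes than letters")
    return list(phones) + [NULL] * (len(letters) - len(phones))


def emit_aslap(letters, phones):
    """Phonemes go on the last letters, the silent ones start the word."""
    if len(phones) > len(letters):
        raise ValueError("more phonemes than letters")
    return [NULL] * (len(letters) - len(phones)) + list(phones)


print(" ".join(emit_asap("anesses", "a n E s".split())))   # a n E s _ _ _
print(" ".join(emit_aslap("anesses", "a n E s".split())))  # _ _ _ a n E s
```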

The third and major problem is the intrinsic limitation of
P(letter | phone):

< alignaient VERB a l i n j E _ _ _ _ 
> alignaient VERB a l i n _ _ j E _ _ 

< appareilliez VERB a p _ a R E+j _ _ _ i E _ 
> appareilliez VERB a p _ a R E+j i _ _ _ E _ 

> vigueur NVERB v i g _ _ 9 R
< vigueur NVERB v i g 9 _ _ R

that is to say, it cannot figure out which letters work together (in the
last example above, "gu" produces /g/).

So I replaced p(letter | phone) with p(letters | phone)
... well, this did not work as expected, because in words like:

poche NVERB p O _ S _ 

the system learns the co-occurrences p("oc" | O), p("ch" | S) and so
on. Only the second one is a linguistic fact, yet with open syllables
the first one becomes significant and generates ugly alignments like:

occise VERB  o _  s i  z _  with the following grouping:
            [o _] s i [z _] 

Conclusion: the system needs more information about context...

p( letters , next_letter | phone ) will do :-)
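
A small Python sketch of what such events could look like, built from an
aligned entry. The grouping rule used here (an emitting letter plus the
silent letters that follow it) and the name `context_events` are
assumptions for illustration, not necessarily what the package does; '$'
stands for end of word.

```python
NULL = "_"  # the letter emits no phoneme
END = "$"   # end-of-word marker


def context_events(letters, phones):
    """List ((letter_group, next_letter), phone) events of an aligned word."""
    events = []
    i, n = 0, len(letters)
    while i < n:
        j = i + 1
        # extend the group over the following letters that emit nothing
        while j < n and phones[j] == NULL:
            j += 1
        next_letter = letters[j] if j < n else END
        events.append((("".join(letters[i:j]), next_letter), phones[i]))
        i = j
    return events


print(context_events("poche", "p O _ S _".split()))
# [(('p', 'o'), 'p'), (('oc', 'h'), 'O'), (('he', '$'), 'S')]
print(context_events("occise", "o _ s i z _".split())[0])
# (('oc', 'c'), 'o')
```

The group "oc" is counted with next letter "h" in "poche" and with next
letter "c" in "occise", so the two cases no longer share statistics.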

p( "oc"+"h" | O ) will be evaluated and won't be applied to
"occise". There is a special marker, '$', for end_of_word.

Works correctly for French... please let me know how it does for other languages.

		Vincent PAGEL
------------------------------------------------------------------------------
Vincent PAGEL               Labo. Traitement du Signal et Theorie des Circuits
email: pagel@tcts.fpms.ac.be                     Faculte Polytechnique de Mons
tel: /32/65/374133  fax:374129             31, bvd Dolez, B-7000 Mons, Belgium
