GIZA++
============

Supplement to the GIZA++ Readme
==============================================================================

Compiling GIZA++ with GNU compiler 2.96
------------------------------------------------------------------------------

To compile GIZA++:

1. The GNU compiler 2.96 is needed.

2. Two changes need to be made to the Makefile:

   A. Set the path where GIZA++ will be installed. For example:

          INSTALLDIR = /home/bthomson/GIZA++

   B. Set the location of the GNU compiler. For example:

          CC = g++

3. Type 'make' to build the executable program.

Compiling mkcls with GNU compiler 2.96
------------------------------------------------------------------------------

To compile mkcls:

1. Three changes need to be made to the Makefile:

   A. Set the path where mkcls will be installed. For example:

          INSTALLDIR = /home/bthomson/mkcls

   B. Set the shell that you are using. For example:

          SHELL = bash

   C. Set the location of the GNU compiler. For example:

          CC = g++

2. Type 'make' to build the executable program.

Compiling plain2snt.cc with GNU compiler 2.96
------------------------------------------------------------------------------

To compile plain2snt.cc:

    g++ -o plain2snt.out plain2snt.cc

Training
------------------------------------------------------------------------------

1. Assemble a sentence-aligned bilingual corpus. For example:

       cat *.e > english
       cat *.f > french

2. Run plain2snt.out, which is located in the GIZA++ package. The first
   argument is the source language and the second argument is the target
   language.

       ./plain2snt.out french english

   Three output files will be created:

   1. english.vcb
   2. french.vcb
   3. frenchenglish.snt

3. Run mkcls to create word classes. mkcls is a separate package. This step
   is optional.

       ./_mkcls -penglish -Venglish.vcb.classes
       ./_mkcls -pfrench -Vfrench.vcb.classes

   Four output files will be created:

   1. english.vcb.classes
   2. english.vcb.classes.cats
   3. french.vcb.classes
   4. french.vcb.classes.cats

4. The library paths may need to be set:

       LD_LIBRARY_PATH /opt/sfw/lib

5. Run GIZA++:

       ./GIZA++ -T french.vcb -S english.vcb -C frenchenglish.snt

==============================================================================
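
The training steps above can be chained in a small shell script. The following is only a sketch, not part of GIZA++ or mkcls: the function names (same_line_count, run, train) and the DRY_RUN switch are hypothetical, and it assumes the corpus files hold one sentence per line and that plain2snt.out, _mkcls and GIZA++ sit in the current directory, as in the commands above. It first checks that the two corpus files have the same number of lines, which is a cheap sanity check for sentence alignment, then runs plain2snt.out, mkcls and GIZA++ with the same arguments shown in the Training section.

```shell
#!/bin/sh
# Sketch only: the function names and the DRY_RUN switch are hypothetical
# helpers for illustration, not part of GIZA++ or mkcls.

# Succeeds when both files have the same number of lines
# (assumes one sentence per line in each file).
same_line_count() {
    [ "$(wc -l < "$1")" -eq "$(wc -l < "$2")" ]
}

# Run a command, or only print it when DRY_RUN is set to 1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Training steps 2, 3 and 5 for a source/target corpus file pair.
# Step 1 (building the corpus files) and step 4 (library paths) are
# left to the caller.
train() {
    src=$1
    tgt=$2
    same_line_count "$src" "$tgt" || {
        echo "error: $src and $tgt differ in line count" >&2
        return 1
    }
    run ./plain2snt.out "$src" "$tgt" &&
    run ./_mkcls "-p$tgt" "-V$tgt.vcb.classes" &&
    run ./_mkcls "-p$src" "-V$src.vcb.classes" &&
    run ./GIZA++ -T "$src.vcb" -S "$tgt.vcb" -C "$src$tgt.snt"
}
```

After loading these definitions, setting DRY_RUN to 1 and calling `train french english` prints the four commands instead of running them, which makes it easy to compare them with the commands listed in the Training section before starting a long run. Because mkcls is optional, the two _mkcls lines can be removed if word classes are not wanted.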