

Relaxation Labelling constraint grammar file

The syntax of the file is based on that of Constraint Grammars [KVHA95], but simplified in many respects and extended to support weighted constraints.

An initial file based on statistical constraints may be generated from a tagged corpus using the src/utilities/train-relax.perl script provided with FreeLing. Hand-written constraints can later be added to the file to improve the tagger's behaviour.

The file consists of a series of context constraints, each of the form: weight label context;

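Where:
weight is a real number stating how compatible (positive) or incompatible (negative) the label is with the context.
label is the analysis (a PoS tag, possibly ending in the wildcard *) affected by the constraint.
context is a list of parenthesised conditions that must hold for the constraint to apply; the constraint ends with a semicolon.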

Examples:
The next constraint states a high incompatibility for a word being a definite determiner (DA*) if the next word is a personal form of a verb (VMI*):
-8.143 DA* (1 VMI*);
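
As an illustration of how such a line might be read, the following Python sketch checks the two patterns of the constraint above against a toy sentence. It assumes that a trailing * acts as a prefix wildcard on the tag, and that a condition such as (1 VMI*) holds when any candidate tag of the word at that offset matches. The function names and the sample tags for la come are hypothetical and are not part of FreeLing.

```python
from typing import List

def tag_matches(pattern: str, tag: str) -> bool:
    """Assumed semantics: a trailing '*' is a prefix wildcard, otherwise exact match."""
    if pattern.endswith("*"):
        return tag.startswith(pattern[:-1])
    return tag == pattern

def condition_holds(sentence: List[List[str]], i: int, offset: int, pattern: str) -> bool:
    """True if the word at position i + offset has a candidate tag matching pattern."""
    j = i + offset
    if j < 0 or j >= len(sentence):
        return False  # no word at that position
    return any(tag_matches(pattern, t) for t in sentence[j])

# Illustrative candidate tags for the two words of "la come".
sentence = [["DA0FS0", "PP3FSA00"], ["VMIP3S0"]]

# The constraint targets DA* readings of word 0 when word 1 can be VMI*.
penalised = [t for t in sentence[0]
             if tag_matches("DA*", t) and condition_holds(sentence, 0, 1, "VMI*")]
print(penalised)
```

Under these assumptions only the determiner reading of la would receive the weight -8.143; the pronoun reading is left untouched by this constraint.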

The next constraint states a very high compatibility for the word mucho (much) being an indefinite determiner (DI*), and thus not a pronoun, an adverb, or any other analysis it may have, if the following word is a noun (NC*):
60.0 DI* (mucho) (1 NC*);

The next constraint states a positive compatibility value for a word being a noun (NC*) if somewhere to its left there is a determiner or an adjective (DA* or AQ*), and there is no other noun between them:
5.0 NC* (-1* DA* or AQ* barrier NC*);
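
One plausible reading of the barrier condition is a leftward scan that succeeds at the first determiner or adjective and gives up at the first intervening noun. The Python sketch below follows that reading. It assumes that -1* means "position -1 or any position further left", that a trailing * on a tag is a prefix wildcard, and, for simplicity, that every word already carries a single tag. The function names and the sample tag sequences are hypothetical.

```python
from typing import List, Sequence

def tag_matches(pattern: str, tag: str) -> bool:
    """Assumed semantics: a trailing '*' is a prefix wildcard, otherwise exact match."""
    if pattern.endswith("*"):
        return tag.startswith(pattern[:-1])
    return tag == pattern

def left_scan_with_barrier(tags: List[str], i: int,
                           targets: Sequence[str],
                           barriers: Sequence[str]) -> bool:
    """Scan leftwards from position i - 1.

    Returns True as soon as a tag matching one of `targets` is found,
    and False if a tag matching one of `barriers` is met first or the
    start of the sentence is reached.
    """
    for j in range(i - 1, -1, -1):
        if any(tag_matches(p, tags[j]) for p in targets):
            return True
        if any(tag_matches(p, tags[j]) for p in barriers):
            return False
    return False

# Illustrative single-tag sequences (determiner, adjective, noun, preposition).
det_adj_noun = ["DA0FS0", "AQ0FS0", "NCFS000"]
det_noun_prep_noun = ["DA0MS0", "NCMS000", "SPS00", "NCFS000"]

print(left_scan_with_barrier(det_adj_noun, 2, ["DA*", "AQ*"], ["NC*"]))
print(left_scan_with_barrier(det_noun_prep_noun, 3, ["DA*", "AQ*"], ["NC*"]))
```

In the first sequence the adjective immediately to the left satisfies the condition. In the second, the scan from the last noun meets another noun before reaching the determiner, so the condition fails.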

The next constraint adds some positive compatibility to a 3rd person personal pronoun being of undefined gender and number (PP3CNA00) if it has the possibility of being masculine singular (PP3MSA00), the next word may have lemma estar (to be), and the second word to the right is not a gerund (VMG). This rule is intended to solve the different behaviour of the Spanish word lo in sentences such as si, lo estoy or lo estoy viendo.
0.5 PP3CNA00 (0 PP3MSA00) (1 <estar>) (not 2 VMG*);
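
The example lines above all share the shape weight label context; so a first pass over the file could split each line into those parts. The following Python sketch does only that. It is not FreeLing's own parser: the Constraint type and parse_constraint function are hypothetical, the parenthesised groups are returned as raw strings without interpreting positions, or, not, or barrier, and it assumes one constraint per line with no nested parentheses.

```python
import re
from typing import List, NamedTuple

class Constraint(NamedTuple):
    weight: float       # compatibility value; negative means incompatibility
    label: str          # tag pattern the weight applies to, e.g. "DA*"
    groups: List[str]   # raw text of each parenthesised group, in order

def parse_constraint(line: str) -> Constraint:
    """Split one 'weight label context;' line into its parts (sketch only)."""
    text = line.strip()
    if not text.endswith(";"):
        raise ValueError("constraint must end with ';'")
    text = text[:-1].strip()

    # Everything before the first '(' should be exactly: weight label
    head, paren, rest = text.partition("(")
    fields = head.split()
    if len(fields) != 2 or not paren:
        raise ValueError("expected: weight label (condition) ... ;")

    weight = float(fields[0])
    label = fields[1]
    # Collect the contents of each non-nested parenthesised group.
    groups = [g.strip() for g in re.findall(r"\(([^()]*)\)", paren + rest)]
    return Constraint(weight, label, groups)

examples = [
    "-8.143 DA* (1 VMI*);",
    "60.0 DI* (mucho) (1 NC*);",
    "5.0 NC* (-1* DA* or AQ* barrier NC*);",
    "0.5 PP3CNA00 (0 PP3MSA00) (1 <estar>) (not 2 VMG*);",
]
parsed = [parse_constraint(e) for e in examples]
for c in parsed:
    print(c.weight, c.label, c.groups)
```

Keeping the groups as uninterpreted strings is deliberate: it avoids guessing at details of the condition syntax that the examples alone do not settle.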


2006-04-26