and before the response generator selects sentences for the answer. It modifies any MT documents retrieved by the embedded information retrieval system that are missing a main verb. All MT results are provided by a phrase-based SMT system.

Post-editing includes three steps: detect a clause with a missing main verb, determine which Chinese verb should have been translated, and find an example sentence in the related documents with an appropriate sentence which can be used to modify the sentence in question.

To detect clauses, we first tag the corpus using a Conditional Random Fields (CRF) POS tagger and then use manually designed regular expressions to identify main clauses of the sentence, subordinate clauses (i.e., clauses which are arguments to a verb) and conjunct clauses in a sentence with conjunction. We do not handle adjunct clauses. Hereafter, we simply refer to all of these as “clause”. If a clause does not have any POS tag that can serve as a main verb (VB, VBD, VBP, VBZ), it is marked as missing a main verb. MT alignment information is used to further

4. Finding the Main Verb Position

Chinese ordering differs from English mainly in clause ordering (Wang et al., 2007) and within the noun phrase. But within a clause centered by a verb, Chinese mostly uses an SVO or SV structure, like English (Yamada and Knight, 2001), and we can assume the local alignment centered by a verb between Chinese and English is a linear mapping relation. Under this assumption, the translation of “被捕” in the above example should be placed in the position between “Saddam” and “.”. Thus, once we find a VTG, its translation can be inserted into the corresponding position of the target sentence using the alignment.

This assumes, however, that there is only one VTG found within a clause. In practice, more than one VTG may be found in a clause. If we choose one of them, we risk making the wrong choice. Instead, we insert the translations of both VTGs simultaneously. This strategy could result in more than one main verb in a clause, but it is more helpful than having no verb at all.
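The two mechanical steps described above, flagging a clause that has no main-verb POS tag and placing VTG translations by alignment, can be sketched in Python. This is a minimal illustration under stated assumptions, not the system's actual implementation: the function names, the dictionary format for word alignments, and the toy three-token sentence are all hypothetical, and the linear mapping is approximated by inserting right after the target word aligned to the nearest aligned source token on the left.

```python
# POS tags that can serve as a main verb, as listed in the text.
MAIN_VERB_TAGS = {"VB", "VBD", "VBP", "VBZ"}


def lacks_main_verb(tagged_clause):
    """Return True if no (token, tag) pair carries a main-verb tag."""
    return not any(tag in MAIN_VERB_TAGS for _, tag in tagged_clause)


def insertion_index(src_pos, alignment):
    """Target index at which to insert the translation of source token src_pos.

    alignment maps source index -> target index (hypothetical format).
    Assuming a linear (order-preserving) local mapping, the translation goes
    right after the target word aligned to the nearest aligned source token
    on the left, or at the clause start if there is none.
    """
    left = [i for i in alignment if i < src_pos]
    return alignment[max(left)] + 1 if left else 0


def insert_verb_translations(target, alignment, vtgs):
    """Insert the translations of all VTGs found in the clause.

    vtgs is a list of (source_index, translation) pairs. Positions are
    computed against the original target, then applied right to left so
    that earlier insertions do not shift later ones.
    """
    placed = [(insertion_index(s, alignment), s, t) for s, t in vtgs]
    out = list(target)
    for pos, _, text in sorted(placed, reverse=True):
        out.insert(pos, text)
    return out


# Toy clause (illustrative only): source tokens 萨达姆 / 被捕 / 。
# where the MT output dropped the verb at source index 1.
target = ["Saddam", "."]
alignment = {0: 0, 2: 1}
fixed = insert_verb_translations(target, alignment, [(1, "was arrested")])
```

Applying the insertions from right to left keeps the precomputed positions valid when more than one VTG is inserted into the same clause; when two VTGs map to the same position, their source order is preserved.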