1, to construct a QA system from question-answer pairs based on the QBTE Approach. When a user gives a question, the framework finds answers to the question in the following two steps.

Document Retrieval retrieves the top N articles or paragraphs from a large-scale corpus.

QBTE creates input data by combining the question features and document features, evaluates the input data, and outputs the top M answers.

of an input symbol and label and represent them as $\langle \tilde{x}_i, \tilde{y}_i \rangle$ using index $i$ ($1 \le i \le m$).

In this paper, feature function $f_i$ is defined as follows.

$$f_i(x, y) = \begin{cases} 1 & \text{if } \tilde{x}_i \in x \text{ and } y = \tilde{y}_i \\ 0 & \text{otherwise} \end{cases}$$

We use all combinations of input symbols in $x$ and class labels for features (or the feature function) of MEMs.
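
The binary feature functions above, one per combination of input symbol and class label, can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: the helper names (`make_feature`, `build_features`, `conditional_prob`), the toy input symbols, and the weights are all hypothetical. The last function shows the usual way such indicator features are combined in a log-linear (maximum entropy) model, assuming one real-valued weight per feature.

```python
import math
from itertools import product


def make_feature(x_sym, y_label):
    """Return f_i for the pair <x_sym, y_label>.

    f_i(x, y) is 1 if x_sym occurs in the input x and y equals y_label,
    and 0 otherwise.
    """
    def f(x, y):
        return 1 if x_sym in x and y == y_label else 0
    return f


def build_features(symbols, labels):
    """Create one feature function per (input symbol, class label) pair."""
    pairs = list(product(sorted(symbols), labels))
    return pairs, [make_feature(s, l) for s, l in pairs]


def conditional_prob(x, labels, features, weights):
    """Normalize exp(sum_i weights[i] * f_i(x, y)) over all labels y."""
    scores = {
        y: math.exp(sum(w * f(x, y) for w, f in zip(weights, features)))
        for y in labels
    }
    z = sum(scores.values())  # normalization term for this input x
    return {y: s / z for y, s in scores.items()}


# Hypothetical toy data: an input x is a set of symbols, y is a class label.
symbols = {"word=Tokyo", "pos=NNP", "qword=where"}
labels = ["I", "O", "B"]
pairs, features = build_features(symbols, labels)

x = {"word=Tokyo", "qword=where"}
active = [pairs[i] for i, f in enumerate(features) if f(x, "B") == 1]

# Hypothetical weights: only the feature <word=Tokyo, B> is non-zero.
weights = [0.0] * len(features)
weights[pairs.index(("word=Tokyo", "B"))] = 1.0
probs = conditional_prob(x, labels, features, weights)
```

With three symbols and three labels the sketch builds nine feature functions, of which only those whose symbol occurs in `x` and whose label matches the candidate label fire for a given pair.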

With Lagrange multipliers $\lambda = \lambda_1, \ldots, \lambda_m$, the dual function of $H$ is:

$$\Psi(\lambda) = -\sum_x \tilde{p}(x) \log Z_\lambda(x) + \sum_i \lambda_i \tilde{p}(f_i),$$

where $Z_\lambda(x) = \sum_y \exp\left(\sum_i \lambda_i f_i(x, y)\right)$, and $\tilde{p}(x)$ and $\tilde{p}(f_i)$ indicate the empirical distributions of $x$ and $f_i$ in the training data.

The dual optimization problem $\lambda^* = \operatorname{argmax}_\lambda \Psi(\lambda)$ can be efficiently solved as an optimization problem without constraints. As a result, probabilistic model $p^* = p_{\lambda^*}$ is obtained as:

$$p_{\lambda^*}(y|x) = \frac{1}{Z_\lambda(x)} \exp\left(\sum_i \lambda_i f_i(x, y)\right).$$

Since this paper focuses on QBTE, it uses a simple idf method in document retrieval.

Let $w_i$ be words and $w_1, w_2, \ldots, w_m$ be a document. Question Answering in the QBTE Model 1 involves directly classifying words $w_i$ in the document into answer words or non-answer words. That is, given input $x^{(i)}$ for $w_i$, its class label is selected from among {I, O, B} as follows:

I: if the word is in the middle of the answer word sequence;
O: if the word is not in the answer word sequence;
B: if the word is the start word of the answer word sequence.

The class labeling system in our experiment is IOB2 (Sang, 2000), which is a variation of IOB (Ramshaw and Marcus, 1995).
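
To make the I/O/B labeling concrete, the sketch below assigns IOB2 labels to the words of a document given a known answer, and then recovers answer strings from a label sequence. It is an illustration under stated assumptions rather than the procedure used in the experiments: the function names are hypothetical, the answer is assumed to be located by exact token match, and the example sentence is invented.

```python
def iob2_labels(words, answer):
    """Label each word B (answer start), I (inside answer), or O (outside).

    Assumption: the answer is found by exact match of its token sequence.
    """
    labels = ["O"] * len(words)
    n = len(answer)
    if n == 0:
        return labels
    i = 0
    while i + n <= len(words):
        if words[i:i + n] == answer:
            labels[i] = "B"                # first word of the answer
            for j in range(i + 1, i + n):
                labels[j] = "I"            # remaining words of the answer
            i += n
        else:
            i += 1
    return labels


def extract_answers(words, labels):
    """Collect the word sequences marked by a B followed by zero or more I."""
    answers, current = [], []
    for word, label in zip(words, labels):
        if label == "B":
            if current:
                answers.append(" ".join(current))
            current = [word]
        elif label == "I" and current:
            current.append(word)
        else:
            # An O, or an I with no preceding B, closes any open answer.
            if current:
                answers.append(" ".join(current))
            current = []
    if current:
        answers.append(" ".join(current))
    return answers


# Invented example document and answer.
words = "Mount Fuji is the highest mountain in Japan".split()
labels = iob2_labels(words, ["Mount", "Fuji"])
answers = extract_answers(words, labels)
```

In this reading, a classifier that predicts one of {I, O, B} for every word yields answer candidates directly, since each B opens a candidate and the following I labels extend it.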

Input $x^{(i)}$ of each word is defined as described be-