of an input symbol and label and represent them as $\langle \tilde{x}_i, \tilde{y}_i \rangle$ using index $i$ ($1 \le i \le m$).

In this paper, feature function $f_i$ is defined as follows.

$$f_i(x, y) = \begin{cases} 1 & \text{if } \tilde{x}_i \in x \text{ and } y = \tilde{y}_i \\ 0 & \text{otherwise} \end{cases}$$

We use all combinations of input symbols in $x$ and class labels for features (or the feature function) of MEMs.

With Lagrange multipliers $\lambda = \lambda_1, ..., \lambda_m$, the dual function of $H$ is:

$$\Psi(\lambda) = -\sum_x \tilde{p}(x) \log Z_\lambda(x) + \sum_i \lambda_i \tilde{p}(f_i),$$

where $Z_\lambda(x) = \sum_y \exp\left(\sum_i \lambda_i f_i(x, y)\right)$, and $\tilde{p}(x)$ and $\tilde{p}(f_i)$ indicate the empirical distributions of $x$ and $f_i$ in the training data.

The dual optimization problem $\lambda^* = \operatorname{argmax}_\lambda \Psi(\lambda)$ can be efficiently solved as an optimization problem without constraints. As a result, probabilistic model $p^* = p_{\lambda^*}$ is obtained as:

$$p_{\lambda^*}(y|x) = \frac{1}{Z_\lambda(x)} \exp\left(\sum_i \lambda_i f_i(x, y)\right).$$

1, to construct a QA system from question-answer pairs based on the QBTE approach. When a user gives a question, the framework finds answers to the question in the following two steps.

Document Retrieval retrieves the top N articles or paragraphs from a large-scale corpus.

QBTE creates input data by combining the question features and document features, evaluates the input data, and outputs the top M answers.³

Since this paper focuses on QBTE, it uses a simple idf method in document retrieval.

Let $w_i$ be words and $w_1, w_2, \ldots, w_m$ be a document. Question Answering in the QBTE Model 1 involves directly classifying words $w_i$ in the document into answer words or non-answer words. That is, given input $x^{(i)}$ for $w_i$, its class label is selected from among {I, O, B} as follows:

I: if the word is in the middle of the answer word sequence;
O: if the word is not in the answer word sequence;
B: if the word is the start word of the answer word sequence.

The class labeling system in our experiment is IOB2 (Sang, 2000), which is a variation of IOB (Ramshaw and Marcus, 1995).
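As an illustration only, the following minimal Python sketch shows the two ingredients described here: IOB2 labeling of the words in a document, and the conditional maximum entropy distribution over the labels {I, O, B}. It is a sketch under stated assumptions, not the implementation used in the experiments. The function names `iob2_labels` and `conditional_prob`, the half-open answer span, the string encoding of input symbols, and the toy weights are all hypothetical.

```python
import math

LABELS = ("I", "O", "B")


def iob2_labels(num_words, answer_start, answer_end):
    """Assign IOB2 labels to words 0..num_words-1.

    The answer word sequence is assumed to occupy the half-open
    span [answer_start, answer_end); this convention is an assumption.
    """
    labels = []
    for i in range(num_words):
        if answer_start <= i < answer_end:
            # B marks the start word, I marks the remaining answer words.
            labels.append("B" if i == answer_start else "I")
        else:
            labels.append("O")
    return labels


def conditional_prob(x, weights):
    """Return p_lambda(y|x) for every label y.

    x       : set of input symbols observed for one word (hypothetical encoding).
    weights : dict mapping (symbol, label) to its weight lambda_i, so each key
              plays the role of one binary feature f_i that fires when the
              symbol is in x and the label equals y.
    """
    scores = {
        y: sum(lam for (sym, lab), lam in weights.items() if sym in x and lab == y)
        for y in LABELS
    }
    # Z_lambda(x) normalizes over all labels.
    z = sum(math.exp(s) for s in scores.values())
    return {y: math.exp(s) / z for y, s in scores.items()}


if __name__ == "__main__":
    # Toy document of 7 words whose answer covers words 3, 4, and 5.
    print(iob2_labels(7, 3, 6))
    # Toy weights: a symbol that favors B and disfavors O.
    toy_weights = {("word=Tokyo", "B"): 2.0, ("word=Tokyo", "O"): -1.0}
    print(conditional_prob({"word=Tokyo"}, toy_weights))
```

Keying the weights by (symbol, label) pairs mirrors the use of all combinations of input symbols and class labels as features. In practice the weights would come from maximizing the dual function rather than being set by hand.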
Input $x^{(i)}$