A Question Feature Set (QF) is a set of features extracted only from a question sentence. This feature set is defined as belonging to a question sentence.

The following are elements of a Question Feature Set:

qw: an enumeration of the word n-grams (1 ≤ n ≤ N), e.g., given question “What is CNN?”, the features are {qw:What, qw:is, qw:CNN, qw:What-is, qw:is-CNN} if N = 2,

qq: interrogative words (e.g., who, where, what, how many),

qm1: POS1 of words in the question, e.g., given “What is CNN?”, {qm1:wh-adv, qm1:verb, qm1:noun} are features,

qm2: POS2 of words in the question,

qm3: POS3 of words in the question,

qm4: POS4 of words in the question.

POS1-POS4 indicate part-of-speech (POS) of the IPA POS tag set generated by the Japanese morphological analyzer ChaSen. For example, “Tokyo” is analyzed as POS1 = noun, POS2 = propernoun, POS3 = location, and POS4 = general. This paper used up to 4-grams for qw.

3.1.3 Combined Feature Set (CF)

Combined Feature Set (CF) contains features created by combining question features and document features. QBTE Model 1 employs CF. For each word w_i, the following features are created.

cw–k,...,cw+0,...,cw+k: matching results (true/false) between each of dw–k,...,dw+k features and any qw feature, e.g., cw–1:true if dw–1:President and qw:President,

cm1–k,...,cm1+0,...,cm1+k: matching results (true/false) between each of dm1–k,...,dm1+k features and any POS1 in qm1 features,

cm2–k,...,cm2+0,...,cm2+k: matching results (true/false) between each of dm2–k,...,dm2+k features and any POS2 in qm2 features,

cm3–k,...,cm3+0,...,cm3+k: matching results (true/false) between each of dm3–k,...,dm3+k features and any POS3 in qm3 features,

cm4–k,...,cm4+0,...,cm4+k: matching results (true/false) between each of dm4–k,...,dm4+k features and any POS4 in qm4 features,

cq–k,...,cq+0,...,cq+k: combinations of each of dw–k,...,dw+k features and qw features, e.g., cq–1:President&Who is a combination of dw–