5.2 Evaluation of Feature Learning

For the group L1 regularization term, we set ε = 10⁻⁴ in Equation 6. To see how much the different textual and non-textual features contribute to community answer summarization, the accumulated weight of each group of sentence-level features⁵ is presented in Figure 2. It shows that textual features such as 1 (Sentence Length), 2 (Position), 3 (Answer Length), and 6 (Has Link), as well as non-textual features such as 8 (Best Answer Star), 12 (Total Answer Number), and 13 (Total Points), have larger weights and play a significant role in the summarization task, as we intuitively expected; features 4 (Stopwords Rate), 5 (Uppercase Rate), and 9 (Thumbs Up) have relatively medium weights; and the remaining features, such as 7 (Similarity to Question), 10 (Author Level), and 11 (Best Answer Rate), have the smallest accumulated weights. The main reason that feature 7 (Similarity to Question) contributes little is that we have already utilized the similarity to the question in the contextual factors, so this similarity feature taken on its own becomes redundant. Similarly, the features Author Level and Best Answer Rate are likely to be redundant when the other non-textual features (Total Answer Number and Total Points) are present together. The experimental results demonstrate that, with group L1 regularization, we have learnt a better combination of these features.

Question
Why do teeth bleed at night and how do you prevent/stop it? This morning I woke up with blood caked between my two front teeth.[...]

Best Answer - Chosen by Asker
Periodontal disease is a possibility, gingivitis, or some gum infection. Teeth don’t bleed; gums bleed.

Summarized Answer Generated by Our Method
Periodontal disease is a possibility, gingivitis, or some gum infection. Teeth don’t bleed; gums bleed. Gums that bleed could be a sign of a more serious issue like leukemia, an infection, gum disease, a blood disorder, or a vitamin deficiency. wash your mouth with warm water and salt, it will help to strengthen your gum and teeth, also salt avoid infection.

Table 4: Summarized answer by our general CRF-based model for the question in Table 1.

6 Conclusions

We proposed a general CRF-based community answer summarization method to deal with the incomplete answer problem for deep understanding of complex multi-sentence questions. Our main contributions are that we proposed a systematic way of modeling the semantic contextual interactions between answer sentences based on question segmentation, and that we explored both textual and non-textual answer features learned via group L1 regularization. We showed that our method achieves significant improvements in answer summarization performance compared with other baselines and previous methods on the Yahoo! Answers dataset. We plan to extend our proposed model with more advanced feature learning as well as enriching our
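Equation 6 itself is not reproduced in this excerpt. As a rough illustration of the mechanism described in Section 5.2, the sketch below uses a common smoothed form of the group L1 (group lasso) penalty, sum over groups of sqrt(||w_g||² + ε), with the stated ε = 10⁻⁴, together with the accumulated absolute weight per group as reported in Figure 2. The exact form of the paper's Equation 6 and the feature grouping shown here are assumptions, not the authors' actual formulation or code.

```python
import math

# Hypothetical sketch: smoothed group L1 penalty and accumulated per-group
# weights. The penalty form and the toy feature groups are assumptions.

def group_l1_penalty(groups, eps=1e-4):
    """Smoothed group L1: sum_g sqrt(||w_g||^2 + eps).

    `groups` maps a feature-group name to its list of weights; eps keeps
    the term differentiable at w_g = 0 (the paper sets eps = 1e-4)."""
    return sum(math.sqrt(sum(w * w for w in ws) + eps)
               for ws in groups.values())

def accumulated_weights(groups):
    """Accumulated absolute weight per feature group (as in Figure 2)."""
    return {name: sum(abs(w) for w in ws) for name, ws in groups.items()}

# Toy example with two groups of sentence-level feature weights.
weights = {
    "sentence_length": [0.8, -0.3],   # a high-weight "textual" group
    "author_level": [0.01, -0.02],    # a nearly pruned group
}
penalty = group_l1_penalty(weights)
acc = accumulated_weights(weights)
```

Because the penalty sums the (smoothed) Euclidean norms of whole groups rather than individual coefficients, the optimizer tends to drive entire low-value groups toward zero, which is why redundant groups such as Author Level end up with small accumulated weights.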
⁵ Note that we have already evaluated the contribution of the