5.2 Evaluation of Feature Learning

For the group L1 regularization term, we set ε = 10^-4 in Equation 6. To see how much the different textual and non-textual features contribute to community answer summarization, the accumulated weight of each group of sentence-level features^5 is presented in Figure 2. It shows that the textual features such as 1 (Sentence Length), 2 (Position), 3 (Answer Length), and 6 (Has Link), and the non-textual features such as 8 (Best Answer Star), 12 (Total Answer Number), and 13 (Total Points) have larger weights, and thus play a significant role in the summarization task, as we intuitively expected; features 4 (Stopwords Rate), 5 (Uppercase Rate), and 9 (Thumbs Up) have relatively medium weights; and the remaining features, such as 7 (Similarity to Question), 10 (Author Level), and 11 (Best Answer Rate), have the smallest accumulated weights. The main reason that feature 7 (Similarity to Question) contributes little is that we have already utilized the similarity to the question in the contextual factors, so this similarity feature at a single site becomes redundant. Similarly, the features Author Level and Best Answer Rate are likely to be redundant when the other non-textual features (Total Answer Number and Total Points) are present together. The experimental results demonstrate that with group L1 regularization we have learned a better combination of these features.

^5 Note that we have already evaluated the contribution of the contextual factors in Section 5.1.

Question
Why do teeth bleed at night and how do you prevent/stop it? This morning I woke up with blood caked between my two front teeth.[...]

Best Answer - Chosen by Asker
Periodontal disease is a possibility, gingivitis, or some gum infection. Teeth don't bleed; gums bleed.

Summarized Answer Generated by Our Method
Periodontal disease is a possibility, gingivitis, or some gum infection. Teeth don't bleed; gums bleed. Gums that bleed could be a sign of a more serious issue like leukemia, an infection, gum disease, a blood disorder, or a vitamin deficiency. wash your mouth with warm water and salt, it will help to strengthen your gum and teeth, also salt avoid infection.

Table 4: Summarized answer by our general CRF based model for the question in Table 1.

6 Conclusions

We proposed a general CRF based community answer summarization method to deal with the incomplete answer problem for deep understanding of complex multi-sentence questions. Our main contributions are that we proposed a systematic way of modeling semantic contextual interactions between the answer sentences based on question segmentation, and that we explored both textual and non-textual answer features learned via group L1 regularization. We showed that our method achieves significant improvements in answer summarization performance compared to other baselines and previous methods on the Yahoo! Answers dataset. We plan to extend our proposed model with more advanced feature learning, as well as to enrich the summarized answers with more available Web resources.
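The group-level analysis described in Section 5.2 can be sketched as follows. The sketch assumes a group lasso style penalty (sum of per-group L2 norms) and a pruning threshold ε = 10^-4 as in Equation 6; the group names follow the paper's feature numbering, but the weight values are illustrative, not the learned values from Figure 2.

```python
import numpy as np

EPS = 1e-4  # threshold epsilon, as set for Equation 6

# Hypothetical per-group CRF weights for three of the thirteen
# sentence-level feature groups (names from Section 5.2; the
# numbers are made up for demonstration only).
groups = {
    "1 Sentence Length": np.array([0.82, -0.41]),
    "7 Similarity to Question": np.array([1e-5, -2e-5]),
    "13 Total Points": np.array([0.55, 0.30, -0.12]),
}

def group_l1_penalty(groups, lam=0.1):
    """Group L1 term: lambda times the sum of per-group L2 norms."""
    return lam * sum(np.linalg.norm(w) for w in groups.values())

def accumulated_weights(groups):
    """Accumulated absolute weight of each group (cf. Figure 2)."""
    return {name: float(np.abs(w).sum()) for name, w in groups.items()}

def active_groups(groups, eps=EPS):
    """Groups whose norm survives the epsilon threshold are kept."""
    return [name for name, w in groups.items()
            if np.linalg.norm(w) > eps]
```

Under this sketch, a nearly redundant group such as Similarity to Question collapses below ε and is pruned as a whole, which is the behavior group L1 regularization is chosen for: it zeroes out entire feature groups rather than individual weights.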

Acknowledgements

This work was supported by the NSFC under Grant No. 61073002 and No. 60773077.
