
sub-questions along with their contexts, then sequentially retrieve the sub-questions one by one, and return similar questions and their best answers (Wang et al., 2010). This strategy works well in general. However, because automatic question segmentation is imperfect and the matched similar questions are likely to have been generated in different contextual situations, it often cannot combine multiple independent best answers of sub-questions seamlessly and may introduce redundancy into the final answer.

On the general problem of cQA answer summarization, Liu et al. (2008) manually classified both questions and answers into different taxonomies and applied clustering algorithms for answer summarization. They utilized textual features for open and opinion type questions. By exploiting metadata, Tomasoni and Huang (2010) introduced four characteristics (constraints) of a summarized answer and combined them in an additional model as well as a multiplicative model. In order to leverage context, Yang et al. (2011) employed a dual wing factor graph to mutually enhance the performance of social document summarization with user-generated content like tweets. Wang et al. (2011) learned online discussion structures such as the replying relationship by using the general CRFs and presented a

3.1 Conditional Random Fields

We utilize the probabilistic graphical model to solve the answer summarization task. Figure 1 gives some illustrations, in which the sites correspond to the sentences and the edges model the interactions between sentences. Specifically, let $x$ be the sentence sequence of all answers within a question thread, and let $y$ be the corresponding label sequence. Every component $y_i$ of $y$ has a binary value, with +1 for a summary sentence and -1 otherwise. Then under CRF (Lafferty et al., 2001), the conditional probability of $y$ given $x$ obeys the following distribution:

$$p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{v \in V,\, l} \mu_l\, g_l(v, y|_v, x) + \sum_{e \in E,\, k} \lambda_k\, f_k(e, y|_e, x) \Big), \qquad (1)$$

where $Z(x)$ is the normalization constant called the partition function, $g_l$ denotes the cQA feature function of site $l$, $f_k$ denotes the function of edge $k$ (modeling the interactions between sentences), $\mu$ and $\lambda$ are respectively the weights of the site and edge functions, and $y|_t$ denotes the components of $y$ related to site (edge) $t$.
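To make Eq. (1) concrete, the following is a minimal sketch, not the authors' implementation, that evaluates p(y | x) by brute-force enumeration on a toy three-sentence chain. The function name `crf_distribution`, the toy feature functions `relevance` and `agreement`, the scores in `x`, and the weight values are all assumptions made for illustration; the paper's actual cQA features are not reproduced here.

```python
import itertools
import math


def crf_distribution(x, edges, site_feats, edge_feats, mu, lam):
    """Return {labeling: p(y | x)} following the form of Eq. (1).

    Every labeling y in {-1, +1}^n is enumerated, so this is only
    feasible for very short sequences (illustrative sketch).
    """
    n = len(x)

    def score(y):
        s = 0.0
        # Site terms: sum over sites v and site features l of mu_l * g_l(v, y|_v, x)
        for v in range(n):
            for l, g in enumerate(site_feats):
                s += mu[l] * g(v, y[v], x)
        # Edge terms: sum over edges e and edge features k of lambda_k * f_k(e, y|_e, x)
        for e in edges:
            i, j = e
            for k, f in enumerate(edge_feats):
                s += lam[k] * f(e, (y[i], y[j]), x)
        return s

    labelings = list(itertools.product((-1, 1), repeat=n))
    scores = [score(y) for y in labelings]
    z = sum(math.exp(s) for s in scores)  # partition function Z(x)
    return {y: math.exp(s) / z for y, s in zip(labelings, scores)}


# Hypothetical toy features (assumptions, not taken from the paper).
def relevance(v, y_v, x):
    # Rewards labeling a sentence +1 when its toy score x[v] is positive.
    return y_v * x[v]


def agreement(e, y_e, x):
    # Rewards neighboring sentences that share the same label.
    return 1.0 if y_e[0] == y_e[1] else 0.0


x = [1.0, 1.0, -1.0]            # toy per-sentence scores (assumed)
edges = [(0, 1), (1, 2)]        # linear chain over three sentences
dist = crf_distribution(x, edges, [relevance], [agreement], mu=[1.0], lam=[0.5])
best = max(dist, key=dist.get)  # most probable label sequence
```

Enumeration costs 2^n score evaluations, which is why it is used here only to illustrate the definition; on chain-structured graphs the same quantities are normally computed with dynamic programming.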

