sub-questions along with their contexts, then sequentially retrieve the sub-questions one by one, and return similar questions and their best answers (Wang et al., 2010). This strategy works well in general. However, because automatic question segmentation is imperfect and the matched similar questions are likely to have been generated in different contextual situations, it often cannot combine multiple independent best answers of sub-questions seamlessly and may introduce redundancy into the final answer.

On the general problem of cQA answer summarization, Liu et al. (2008) manually classified both questions and answers into different taxonomies and applied clustering algorithms for answer summarization. They utilized textual features for open and opinion type questions. By exploiting metadata, Tomasoni and Huang (2010) introduced four characteristics (constraints) of a summarized answer and combined them in an additive model as well as a multiplicative model. To leverage context, Yang et al. (2011) employed a dual wing factor graph to mutually enhance the performance of social document summarization with user-generated content such as tweets. Wang et al. (2011) learned online discussion structures such as the replying relationship by using general CRFs and presented a

3.1 Conditional Random Fields

We use a probabilistic graphical model to solve the answer summarization task. Figure 1 gives some illustrations, in which the sites correspond to sentences and the edges model the interactions between sentences. Specifically, let $x$ be the sequence of sentences of all answers within a question thread, and let $y$ be the corresponding label sequence. Every component $y_i$ of $y$ takes a binary value, with $+1$ for a summary sentence and $-1$ otherwise. Then under the CRF (Lafferty et al., 2001), the conditional probability of $y$ given $x$ obeys the following distribution:

$$p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{v \in V,\, l} \mu_l \, g_l(v, y|_v, x) + \sum_{e \in E,\, k} \lambda_k \, f_k(e, y|_e, x) \Big), \tag{1}$$

where $Z(x)$ is the normalization constant called the partition function, $V$ and $E$ are the sets of sites and edges, $g_l$ denotes the $l$-th cQA feature function defined on sites, $f_k$ denotes the $k$-th feature function defined on edges (modeling the interactions between sentences), $\mu$ and $\lambda$ are the weights of the site and edge feature functions respectively, and $y|_t$ denotes the components of $y$ related to site (edge) $t$.
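
To make the CRF distribution in Equation (1) concrete, the following is a minimal sketch that evaluates p(y | x) by enumerating every label sequence on a tiny linear chain. It is only an illustration under stated assumptions: the feature functions `g_length` and `f_agree`, the weights, and the example sentences are hypothetical and are not the cQA features or parameters of this paper, and brute-force enumeration is used purely for clarity (it is exponential in the number of sentences).

```python
import itertools
import math

LABELS = (+1, -1)  # +1: summary sentence, -1: non-summary sentence


def score(y, x, edges, site_feats, edge_feats, mu, lam):
    """Unnormalized log-score: the argument of exp(.) in Equation (1)."""
    total = 0.0
    for v in range(len(x)):  # sum over sites v and site features l
        for mu_l, g_l in zip(mu, site_feats):
            total += mu_l * g_l(v, y[v], x)
    for (i, j) in edges:  # sum over edges e and edge features k
        for lam_k, f_k in zip(lam, edge_feats):
            total += lam_k * f_k((i, j), (y[i], y[j]), x)
    return total


def conditional_distribution(x, edges, site_feats, edge_feats, mu, lam):
    """Return {y: p(y | x)} by enumerating all label sequences (tiny x only)."""
    scores = {
        y: score(y, x, edges, site_feats, edge_feats, mu, lam)
        for y in itertools.product(LABELS, repeat=len(x))
    }
    log_z = math.log(sum(math.exp(s) for s in scores.values()))  # log Z(x)
    return {y: math.exp(s - log_z) for y, s in scores.items()}


# Hypothetical toy features (assumptions for illustration only).
def g_length(v, y_v, x):
    # Site feature: favors labeling sentences of 5+ tokens as summary sentences.
    return y_v * (1.0 if len(x[v].split()) >= 5 else -1.0)


def f_agree(e, y_e, x):
    # Edge feature: rewards neighboring sentences that share the same label.
    return 1.0 if y_e[0] == y_e[1] else 0.0


x = [
    "Use a surge protector for the router .",
    "Thanks !",
    "Also update the firmware regularly .",
]
edges = [(0, 1), (1, 2)]  # linear chain over adjacent sentences
dist = conditional_distribution(x, edges, [g_length], [f_agree], mu=[1.0], lam=[0.5])
best = max(dist, key=dist.get)  # most probable label sequence under the toy model
```

Here `dist` maps each of the 2^3 label sequences to its probability, and `best` is the most probable labeling. In practice the partition function would be computed with dynamic programming or approximate inference rather than enumeration.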