
topic, e.g. Hamburg, Berlin), which makes them unsuitable as the results of question search.

We also propose to use the MDL-based (Minimum Description Length) tree cut model for automatically identifying question topic and question focus. Given a question as query, a structure called a question tree is constructed over the question collection, which includes the queried question and all the related questions, and then the MDL principle is applied to find a cut of the question tree specifying the question topic and the question focus of each question.

In summary, we represent questions in a data structure consisting of question topic and question focus. On this basis, we then propose to model question topic and question focus in a language modeling framework for search. To the best of our knowledge, none of the existing studies has addressed question search by modeling both question topic and question focus.

We empirically evaluate question search with questions about ‘travel’ and ‘computers & internet’, both taken from Yahoo! Answers. Experimental results show that our approach significantly improves on traditional methods (e.g. VSM, LMIR) in retrieving relevant questions.

2.1 The MDL-based tree cut model

Formally, a tree cut model $M$ (Li and Abe, 1998) can be represented by a pair consisting of a tree cut $\Gamma$ and a probability parameter vector $\theta$ of the same length, that is,

$$M = (\Gamma, \theta) \qquad (1)$$

where $\Gamma$ and $\theta$ are

$$\Gamma = [C_1, C_2, \ldots, C_k], \quad \theta = [p(C_1), p(C_2), \ldots, p(C_k)] \qquad (2)$$

where $C_1, C_2, \ldots, C_k$ are classes determined by a cut in the tree and $\sum_{i=1}^{k} p(C_i) = 1$. A ‘cut’ in a tree is any set of nodes in the tree that defines a partition of all the nodes, viewing each node as representing the set of child nodes as well as itself. For example, the cut indicated by the dashed line in Figure 1 corresponds to three classes.

Figure 1. An Example of the Tree Cut Model
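To make the tree cut model concrete, the following Python sketch pairs a cut with a probability vector of the same length and checks that the probabilities sum to 1. It is an illustration under stated assumptions, not the procedure used in this paper: the toy tree, the node names, the counts, and the helpers `leaves_under`, `find`, `is_valid_cut` and `estimate_theta` are all hypothetical. The sketch further assumes that a class is the set of leaves under a cut node, and that class probabilities are estimated by relative frequency.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str                      # assumed unique within the tree
    children: list = field(default_factory=list)


def leaves_under(node):
    """Names of all leaves dominated by `node` (the node itself if it is a leaf)."""
    if not node.children:
        return [node.name]
    return [leaf for child in node.children for leaf in leaves_under(child)]


def find(node, name):
    """Return the node called `name` in the subtree rooted at `node`, or None."""
    if node.name == name:
        return node
    for child in node.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


def is_valid_cut(root, cut):
    """A cut is valid here if its nodes cover every leaf exactly once (assumption)."""
    nodes = [find(root, name) for name in cut]
    if any(n is None for n in nodes):
        return False
    covered = [leaf for n in nodes for leaf in leaves_under(n)]
    return sorted(covered) == sorted(leaves_under(root))


def estimate_theta(root, cut, leaf_counts):
    """Relative-frequency estimate of one probability per class in the cut (assumption)."""
    if not is_valid_cut(root, cut):
        raise ValueError("not a valid cut of this tree")
    total = sum(leaf_counts.get(leaf, 0) for leaf in leaves_under(root))
    if total == 0:
        raise ValueError("no observations to estimate from")
    theta = []
    for name in cut:
        class_leaves = leaves_under(find(root, name))
        theta.append(sum(leaf_counts.get(leaf, 0) for leaf in class_leaves) / total)
    return theta


# Hypothetical toy tree: root -> {a -> {a1, a2}, b}
root = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
counts = {"a1": 3, "a2": 1, "b": 4}    # made-up observation counts per leaf

cut = ["a", "b"]
theta = estimate_theta(root, cut, counts)
model = (cut, theta)                   # the pair (cut, probability vector)
```

Moving the cut lower in the tree, for example to `["a1", "a2", "b"]`, yields more classes and a longer parameter vector; a criterion such as MDL is then needed to choose among the valid cuts.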

The rest of the paper is organized as follows. In

A straightforward way for determining a cut of a