
2010). To avoid zero probability, we use Jelinek-Mercer smoothing (Zhai and Lafferty, 2001) due to its good performance and cheap computational cost. So the ranking function for the query likelihood language model with Jelinek-Mercer smoothing can be written as:

\mathrm{Score}(q, D) = \prod_{w \in q} \left[ (1 - \lambda) P_{ml}(w|D) + \lambda P_{ml}(w|C) \right] \quad (1)

P_{ml}(w|D) = \frac{\#(w, D)}{|D|}, \quad P_{ml}(w|C) = \frac{\#(w, C)}{|C|} \quad (2)

where q is the queried question, D is a document, C is the background collection, and λ is the smoothing parameter. #(w, D) is the frequency of term w in D; |D| and |C| denote the lengths of D and C, respectively.
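To make Equations (1) and (2) concrete, here is a minimal Python sketch of the Jelinek-Mercer-smoothed ranking function; the tokenized toy data and the choice λ = 0.2 are illustrative assumptions, not values from this paper.

```python
from collections import Counter

def score(query, doc, collection, lam=0.2):
    """Query likelihood with Jelinek-Mercer smoothing, Eq. (1)-(2).

    query, doc, collection are lists of terms; lam is the smoothing
    parameter lambda (0.2 is an arbitrary illustrative value).
    """
    doc_tf = Counter(doc)          # #(w, D)
    coll_tf = Counter(collection)  # #(w, C)
    s = 1.0
    for w in query:
        p_ml_D = doc_tf[w] / len(doc)           # P_ml(w|D), Eq. (2)
        p_ml_C = coll_tf[w] / len(collection)   # P_ml(w|C), Eq. (2)
        s *= (1 - lam) * p_ml_D + lam * p_ml_C  # one factor of Eq. (1)
    return s

# Hypothetical toy data: D1 shares query terms with q, D2 does not.
C  = "best home remedy for stuffy nose cold remedies for kids".split()
D1 = "home remedies for a cold".split()
D2 = "best pizza recipe".split()
q  = "cold home remedies".split()
assert score(q, D1, C) > score(q, D2, C)  # D1 ranks higher, as expected
```

In practice the product in Eq. (1) is usually computed as a sum of log probabilities to avoid numerical underflow on longer queries.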

ties as ranking features is likely to improve the question retrieval performance, as we will show in our experiments.

D (document):          … for good cold home remedies …
  segmentation →       E: [for, good, cold, home remedies]
  translation →        F: [for₁, best₂, stuffy nose₃, home remedy₄]
  permutation →        M: (1→3, 2→1, 3→4, 4→2)
q (queried question):  best home remedy for stuffy nose

Figure 1: Example describing the generative procedure of the phrase-based translation model.
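The three steps in Figure 1 can be traced with a small sketch; the segmentation, phrase table, and permutation below are hard-coded from the figure's example, whereas the generative model assigns probabilities over such choices rather than fixing them.

```python
# Step 1: segmentation of the document excerpt D into phrases E.
E = ["for", "good", "cold", "home remedies"]

# Step 2: phrase-by-phrase translation of E into F
# (a hypothetical phrase table covering only this example).
translation = {
    "for": "for",
    "good": "best",
    "cold": "stuffy nose",
    "home remedies": "home remedy",
}
F = [translation[phrase] for phrase in E]

# Step 3: permutation M, mapping source position i to target position M[i].
M = {1: 3, 2: 1, 3: 4, 4: 2}
q = [None] * len(F)
for i, phrase in enumerate(F, start=1):
    q[M[i] - 1] = phrase

print(" ".join(q))  # -> "best home remedy for stuffy nose"
```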

3 Our Approach: Phrase-Based Translation Model for Question Retrieval

Unlike the general natural language translation, the parallel sentences between questions and answers …