5.4 Similarity Measure

The first experiment conducted question recommendation based on the questions' information need parts. The different text similarity methods described in Section 3 were used to measure the similarity between the information need texts. In the TFIDF similarity measure (TFIDF), the idf value for each word was computed from frequency counts over the entire Aquaint corpus [8]. For the word-to-word knowledge-based similarity, a WordNet::Similarity Java implementation [9] of the similarity measures lin (Knowledge2) and jcn (Knowledge1) was used in this paper. For the topic model based similarity, we estimated two LDA models from 'Train t' and 'Train c' using GibbsLDA++ [10]. We treated each question, including the question title and the information need part, as a single document consisting of a sequence of words. These documents were preprocessed before being fed into the LDA models, and each model was estimated with 1800 Gibbs sampling iterations and 200 topics.

[8] https://traloihay.net
[9] https://traloihay.net
[10] https://traloihay.net
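As a concrete illustration of the TFIDF measure, the following is a minimal sketch, not the paper's implementation: idf values come from document frequency counts (standing in for the Aquaint counts), and two information need texts are compared by the cosine of their tf-idf vectors. All counts and texts below are toy placeholders.

import math
from collections import Counter

def tfidf_vector(tokens, doc_freq, num_docs):
    # tf from the text itself; idf from corpus-wide document frequency
    # counts (the paper derives these from the Aquaint corpus).
    tf = Counter(tokens)
    return {w: tf[w] * math.log(num_docs / (1.0 + doc_freq.get(w, 0)))
            for w in tf}

def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(val * v.get(w, 0.0) for w, val in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy stand-ins for Aquaint document frequencies and two InfoN texts.
doc_freq = {'spring': 120, 'break': 95, 'travel': 300, 'family': 400}
num_docs = 10000
a = tfidf_vector('cheap travel destination for spring break'.split(), doc_freq, num_docs)
b = tfidf_vector('family trip somewhere warm spring break'.split(), doc_freq, num_docs)
print(cosine(a, b))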

The results in Table 2 show that the TFIDF and LDA1 methods perform better at recommending questions than the others. Further analysis of the questions recommended by the two methods revealed that their orderings of recommended questions are quite different: the TFIDF method prefers texts that share more words, while the LDA1 method can relate the non-common words of two short texts through a set of third-party topics. The LDA1 method outperforms the TFIDF method in two ways: (1) the information needs of its top recommended questions share fewer words with that of the query question; (2) its top recommended questions span wider topics. The questions highly recommended by LDA1 can therefore suggest more useful topics to the user.
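To show how third-party topics can link non-overlapping vocabularies, here is a minimal sketch of an LDA1-style similarity. The paper estimated its models with GibbsLDA++ (200 topics, 1800 Gibbs iterations); this sketch substitutes gensim's LdaModel and a toy corpus, so all names and numbers are illustrative only.

import math
from gensim import corpora, models

# Toy corpus standing in for the preprocessed 'Train t' questions
# (title + InfoN concatenated into one document each).
train_docs = ['cheap travel destination spring break family',
              'laptop college student price dell mac',
              'warm resort family spring break mexico']
texts = [d.split() for d in train_docs]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=5)  # 200 in the paper

def topic_distribution(text):
    bow = dictionary.doc2bow(text.split())
    return dict(lda.get_document_topics(bow, minimum_probability=0.0))

def lda_similarity(a, b):
    # Cosine similarity between inferred topic mixtures: two texts with
    # no words in common can still match through shared topics.
    ta, tb = topic_distribution(a), topic_distribution(b)
    dot = sum(ta[k] * tb.get(k, 0.0) for k in ta)
    na = math.sqrt(sum(v * v for v in ta.values()))
    nb = math.sqrt(sum(v * v for v in tb.values()))
    return dot / (na * nb) if na and nb else 0.0

print(lda_similarity('warm family resort', 'cheap spring break trip'))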

The knowledge-based methods are shown to perform worse than TFIDF and LDA1. We found that some words were mis-tagged, so they were not included in the word-to-word similarity calculation. Another reason for the worse performance is that words outside the WordNet dictionary were likewise excluded from the similarity calculation.
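For the knowledge-based measures, the paper used a WordNet::Similarity Java implementation; the sketch below approximates lin and jcn with NLTK's WordNet interface (assuming the nltk wordnet and wordnet_ic data are installed). It also exhibits the failure mode just described: a word that is mis-tagged or absent from WordNet yields no synsets and simply drops out of the calculation.

from nltk.corpus import wordnet as wn, wordnet_ic

# Both lin and jcn require an information-content file.
brown_ic = wordnet_ic.ic('ic-brown.dat')

def word_sim(w1, w2, pos=wn.NOUN, measure='lin'):
    # Maximum lin/jcn similarity over synset pairs; returns 0.0 when a
    # word is out of WordNet or carries the wrong POS tag, the two
    # failure modes noted above.
    best = 0.0
    for a in wn.synsets(w1, pos=pos):
        for b in wn.synsets(w2, pos=pos):
            try:
                sim = (a.lin_similarity(b, brown_ic) if measure == 'lin'
                       else a.jcn_similarity(b, brown_ic))
            except Exception:  # incompatible synset pairs
                continue
            best = max(best, sim)
    return best

print(word_sim('trip', 'vacation'))    # in-vocabulary pair
print(word_sim('gforce', 'computer'))  # OOV word -> 0.0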

The Mean Reciprocal Rank scores for TFIDF and LDA1 are above 80%. That is to say, we are able to recommend questions to users by measuring their information needs. The first two questions recommended for Q1 and Q2 by the LDA1 method are shown in Table 4, where InfoN denotes the information need part associated with each question.

Table 4: Question recommendation results by LDA, measuring the similarity between information needs

Q2     Where should my family go for spring break?
InfoN  ... family wants to go somewhere for a couple days during spring break ... prefers a warmer climate and we live in IL, so it shouldn't be SUPER far away. ... a family road trip. ...
RQ1    Whats a cheap travel destination for spring break?
InfoN  I live in houston texas and i'm trying to find i inexpensive place to go for spring break with my family. My parents don't want to spend a lot of money due to the economy crisis, ... a fun road trip ...
RQ2    Alright you creative deal-seekers, I need some help in planning a spring break trip for my family
InfoN  Spring break starts March 13th and goes until the 21st ... Someplace WARM!!! Family-oriented hotel/resort ... North American Continent (Mexico, America, Jamaica, Bahamas, etc.) Cost= Around $5,000 ...
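For reference, the Mean Reciprocal Rank averages, over the query questions, the reciprocal rank of the first relevant recommendation, so a score above 0.8 means the first relevant question typically appears at rank one. A minimal computation with hypothetical ranks:

def mean_reciprocal_rank(first_relevant_ranks):
    # MRR = (1/|Q|) * sum(1 / rank_i) over the query questions.
    return sum(1.0 / r for r in first_relevant_ranks) / len(first_relevant_ranks)

# Hypothetical rank of the first relevant recommendation per query.
print(mean_reciprocal_rank([1, 1, 2, 1, 1]))  # 0.9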

In the preprocessing step, some misspelled words were successfully corrected, as in "What should I do this saturday? ... and staying in a hotell ..." and "my faimly is traveling to florda ...". However, a small number of texts, such as "How come my Gforce visualization doesn't work?" and "Do i need an Id to travel from new york to maimi?", still failed to be corrected. In the future, a better method is expected to handle these failure cases.
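The paper does not specify how the preprocessing corrects spellings; below is a hedged sketch of one standard approach, a dictionary lookup over single-edit candidates. It also shows why strings like "gforce" survive: they have no in-vocabulary neighbor within one edit. The vocabulary here is a toy placeholder.

def edits1(word):
    # All strings one edit away: deletes, transposes, replaces, inserts.
    letters = 'abcdefghijklmnopqrstuvwxyz'
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def correct(word, vocab):
    # Keep known words; otherwise pick an in-vocabulary one-edit variant.
    # A real system would rank candidates by corpus frequency.
    if word in vocab:
        return word
    candidates = edits1(word) & vocab
    return min(candidates) if candidates else word

vocab = {'hotel', 'family', 'florida', 'travel', 'spring'}
print(correct('hotell', vocab))  # -> 'hotel'
print(correct('faimly', vocab))  # -> 'family'
print(correct('gforce', vocab))  # -> 'gforce' (no close known word)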

The question and information need pairs in both 'Train t' and 'Train c' training sets were used to train two IBM-4 translation models with the GIZA++ toolkit. These pairs were also preprocessed before training, and the pairs whose information need part became empty after preprocessing were disregarded.

During the experiment, we found that some of the generated words in the information need parts are the question words themselves. This is caused by the self-translation problem in translation models: the highest translation score for a word is usually given to the word itself when the target and source languages are the same (Xue et al., 2008). This has always been a difficult trade-off: not using self-translated words can reduce retrieval performance, since the information need parts need those terms to represent the semantic meaning, while using self-translated words does not take advantage of the translation approach. To tackle this problem, we control the number of words predicted by the translation model to be exactly twice the number of words in the corresponding preprocessed question.
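A minimal sketch of that capped prediction step, under stated assumptions: trans_prob is a hypothetical word-translation table of the kind a GIZA++-trained IBM model yields, candidate information need words are scored by their summed translation probability from the question words, and the output is cut off at exactly twice the question length. Because p(w | w) dominates, the top of the ranking also illustrates the self-translation effect.

def predict_info_need_words(question_words, trans_prob):
    # Score each candidate target word by its total translation
    # probability from the question words, then keep exactly
    # 2 * len(question_words) candidates, as described above.
    scores = {}
    for s in question_words:
        for t, p in trans_prob.get(s, {}).items():
            scores[t] = scores.get(t, 0.0) + p
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:2 * len(question_words)]

# Toy table; the large p(w | w) entries mimic the self-translation problem.
trans_prob = {
    'laptop': {'laptop': 0.50, 'dell': 0.20, 'mac': 0.15, 'price': 0.10},
    'college': {'college': 0.60, 'student': 0.25, 'price': 0.05},
}
print(predict_info_need_words(['laptop', 'college'], trans_prob))
# -> ['college', 'laptop', 'student', 'dell'] (4 = 2 * question length)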

The predicted information need words for the retrieved questions are shown in Table 5. For Q1, the information need behind the question "recommend website for custom built computer parts" may imply that the user needs information about computer parts such as "ram" and "motherboard" for a particular purpose such as "gaming". For Q2, the user may want to compare computers of different brands, such as "dell" and "mac", or consider the "price" factor when "purchasing a laptop for a college student".

We also did a small scale comparison between the generated information needs and the real questions whose information need parts are not empty.