5.4 Similarity Measure

The first experiment performed question recommendation based on the information need parts. The different text similarity methods described in Section 3 were used to measure the similarity between the information need texts. For the TFIDF similarity measure (TFIDF), the idf value of each word was computed from frequency counts over the entire AQUAINT corpus [8]. For the word-to-word knowledge-based similarity, a WordNet::Similarity Java implementation [9] of the lin (Knowledge2) and jcn (Knowledge1) similarity measures was used. For the topic-model-based similarity, we estimated two LDA models from 'Train t' and 'Train c' using GibbsLDA++ [10]. We treated each question, including the question title and the information need part, as a single document consisting of a sequence of words. These documents were preprocessed before being fed into the LDA models. The parameters for each LDA model estimation were set to 1,800 Gibbs sampling iterations and 200 topics.

Q2: Where should my family go for spring break?
InfoN: ... family wants to go somewhere for a couple days during spring break ... prefers a warmer climate and we live in IL, so it shouldn't be SUPER far away. ... a family road trip. ...
RQ1: Whats a cheap travel destination for spring break?
InfoN: I live in houston texas and i'm trying to find i inexpensive place to go for spring break with my family. My parents don't want to spend a lot of money due to the economy crisis, ... a fun road trip ...
RQ2: Alright you creative deal-seekers, I need some help in planning a spring break trip for my family
InfoN: Spring break starts March 13th and goes until the 21st ... Someplace WARM!!! Family-oriented hotel/resort ... North American Continent (Mexico, America, Jamaica, Bahamas, etc.) Cost = Around $5,000 ...
Table 4: Question recommendation results by LDA, measuring the similarity between information needs

[8] https://traloihay.net
[9] https://traloihay.net
[10] https://traloihay.net

The results in Table 2 show that the TFIDF and LDA1 methods recommend questions better than the others. After further analysis of the questions recommended by both methods, we discovered that the orderings of the questions recommended by TFIDF and LDA1 are quite different. The TFIDF similarity method prefers texts that share more common words, while the LDA1 method can relate the non-common words of short texts through a series of intermediate topics. The LDA1 method outperforms the TFIDF method in two ways: (1) the information needs of its top recommended questions share fewer common words with the query question's; (2) its top recommended questions span wider topics. The questions highly recommended by LDA1 can therefore suggest more useful topics to the user. The knowledge-based methods are also shown to perform worse than TFIDF and LDA1. We found that some words were mis-tagged and were therefore not included in the word-to-word similarity calculation. Another reason for the worse performance is that words outside the WordNet dictionary were also excluded from the similarity calculation.

The Mean Reciprocal Rank scores for TFIDF and LDA1 are above 80%. That is to say, we are able to recommend questions to users by measuring their information needs. The first two questions recommended for Q1 and Q2 by the LDA1 method are shown in Table 4, where InfoN is the information need part associated with each question.

In the preprocessing step, some words were successfully corrected, such as "What should I do this saturday? ... and staying in a hotell ..." and "my faimly is traveling to florda ...". However, a small number of texts, such as "How come my Gforce visualization doesn't work?" and "Do i need an Id to travel from new york to maimi?", still failed to be corrected. In the future, a better method is expected to handle these failure cases.

The question and information need pairs in both the 'Train t' and 'Train c' training sets were used to train two IBM-4 translation models with the GIZA++ toolkit. These pairs were also preprocessed before training, and pairs whose information need part became empty after preprocessing were discarded.

During the experiment, we found that some of the words generated for the information need parts are the question words themselves. This is caused by the self-translation problem of translation models: the highest translation score for a word is usually given to the word itself when the target and source languages are the same (Xue et al., 2008). This has always been a tough question: not using self-translated words can reduce retrieval performance, since the information need parts need those terms to represent their semantic meaning, while using self-translated words does not take advantage of the translation approach. To tackle this problem, we control the number of words predicted by the translation model to be exactly twice the number of words in the corresponding preprocessed question.

The predicted information need words for the retrieved questions are shown in Table 5. For Q1, the information need behind the question "recommend website for custom built computer parts" may imply that the user needs some information about computer parts for building, such as "ram" and "motherboard", for a particular purpose such as "gaming". For Q2, the user may want to compare computers of different brands, such as "dell" and "mac", or consider the "price" factor when "purchasing a laptop for a college student".

We also did a small-scale comparison of the generated information needs against the real questions whose information need parts are not empty.
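As a concrete illustration of the TFIDF measure above, the following is a minimal Python sketch: idf values come from document frequencies over a background corpus (AQUAINT in the paper), and two information need texts are compared by the cosine of their tf-idf vectors. The exact idf formula and preprocessing used in the paper are not specified here, so this is an assumed standard formulation, not the authors' implementation.

```python
import math
from collections import Counter

def idf_table(background_docs):
    """idf(w) = log(N / df(w)), with df counted over a background corpus
    (AQUAINT in the paper); background_docs is a list of token lists."""
    n = len(background_docs)
    df = Counter()
    for doc in background_docs:
        df.update(set(doc))  # count each word once per document
    return {w: math.log(n / df[w]) for w in df}

def tfidf_cosine(tokens_a, tokens_b, idf):
    """Cosine similarity between the tf-idf vectors of two token lists.
    Words absent from the background corpus get idf 0 and are ignored."""
    va = {w: tf * idf.get(w, 0.0) for w, tf in Counter(tokens_a).items()}
    vb = {w: tf * idf.get(w, 0.0) for w, tf in Counter(tokens_b).items()}
    dot = sum(va[w] * vb.get(w, 0.0) for w in va)
    norm_a = math.sqrt(sum(x * x for x in va.values()))
    norm_b = math.sqrt(sum(x * x for x in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```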
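The Mean Reciprocal Rank used to evaluate the recommendations above can be sketched as follows; the function name and input format are illustrative, not from the paper.

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """MRR: average over queries of 1/rank of the first relevant item.
    ranked_lists: one ranked list of candidate ids per query.
    relevant:     one set of relevant ids per query."""
    total = 0.0
    for ranking, rel in zip(ranked_lists, relevant):
        for rank, item in enumerate(ranking, start=1):
            if item in rel:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)
```

A query whose first relevant question is ranked first contributes 1.0, one ranked second contributes 0.5, and a query with no relevant question retrieved contributes 0.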
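The cap on predicted information need words (exactly twice the length of the preprocessed question) can be sketched as below. The paper does not detail how candidate words are scored, so this sketch assumes a simple sum of translation probabilities p(w | q) over the question words; `trans_prob` and `predict_infon_words` are hypothetical names.

```python
from collections import defaultdict

def predict_infon_words(question_words, trans_prob):
    """Rank candidate information-need words by their total translation
    probability from the question words, then keep exactly twice as many
    words as the preprocessed question contains (the cap from the paper).
    trans_prob: dict mapping a source word to {target word: p(target | source)}."""
    scores = defaultdict(float)
    for q in question_words:
        for target, p in trans_prob.get(q, {}).items():
            scores[target] += p
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[: 2 * len(question_words)]
```

Note that self-translations (e.g. "laptop" -> "laptop") are kept rather than filtered out, matching the trade-off discussed above; the fixed 2x budget simply limits how many of the predicted words survive.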