ure 1b) is somewhat wider than (but still quite similar to) the distribution of differences for whole reviews. The distribution of differences for second sentences is the widest of the three (Figure 1c).

Pearson correlation coefficient calculations (Table 1) show that the correlation between authors’ ratings and readers’ ratings for whole reviews and the correlation between authors’ ratings and readers’ ratings upon reading the last sentence are similar, while the correlation between authors’ ratings and readers’ ratings when presented with the second sentence of each review is significantly lower. Moreover, when correlating readers’ ratings of whole reviews with readers’ ratings of single sentences, the correlation coefficient for last sentences is significantly higher than for second sentences.

As for the biometric measurements performed in the second experiment, since all subjects were computer-skilled, hesitation revealed through mouse movements was assumed to be attributable to difficulty of decision-making rather than to problems in operating the mouse. As previously stated, we recorded mouse latency times from the end of reading each text until the mouse click. Mouse latency times were not normalized for each subject due to the limited number of results. However, the average latency time is shorter for last sentences (19.61±12.23 s) than for second sentences (22.06±14.39 s). The difference between latency times is not significant, as a paired t-test could not reject the null hypothesis that the two distributions have equal means, but it might indicate a tendency.

We also used the WizWhy software (Meidan, 2005) to perform combined analyses of readers’ ratings and response times. The analyses showed that when the difference between authors’ and readers’ ratings was ≤1 and the response time was much shorter than average (<14.1 sec), then 96% of the sentences were last sentences. Due to the small sample size, we cautiously infer that last sentences express polarity better than second sentences, bearing in mind that the second sentence in our experiment represents any other sentence in the text except for the first one.

We also predicted that hesitation in making a decision would affect not only latency times but also mouse trajectories. Namely, hesitation would be accompanied by moving the mouse here and there, while decisiveness would show a firm movement. However, no such difference between the responses to last sentences and to second sentences appeared in our analysis; most subjects kept their hand still while reading the texts and while reflecting upon their answers. They moved the mouse only to rate the texts.

6 Conclusions and Future Work

In two psycholinguistic and psychophysical experiments, we showed that rating whole customer reviews as compared to rating final sentences of these reviews showed an (expected) insignificant difference. In contrast, rating whole customer reviews as compared to rating second sentences of these reviews showed a considerable difference. Thus, instead of focusing on whole texts, computational linguists should focus on the last sentences for efficient and accurate automatic polarity classification. Indeed, last but definitely not least!
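The two statistics reported in the results (the Pearson correlation coefficients of Table 1 and the paired t-test on latency times) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names `pearson_r` and `paired_t_statistic` and all the numbers in the example are hypothetical, chosen only to show the calculations. The sketch stops at the t statistic and its degrees of freedom; turning these into a p-value requires a t-distribution table or a statistics library.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for a paired t-test on a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample variance of the pairwise differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("t is undefined when all differences are equal")
    return mean_d / math.sqrt(var_d / n), n - 1


if __name__ == "__main__":
    # Hypothetical star ratings (1-5); NOT data from the experiments.
    author_stars = [5, 1, 4, 2, 3, 5]
    reader_stars_last_sentence = [5, 2, 4, 1, 3, 4]
    print("r =", pearson_r(author_stars, reader_stars_last_sentence))

    # Hypothetical per-subject latency times in seconds; NOT measured data.
    latency_last = [18.0, 25.5, 12.1, 30.2, 16.4]
    latency_second = [21.3, 24.0, 15.8, 33.9, 15.0]
    t, df = paired_t_statistic(latency_last, latency_second)
    print("t =", t, "with", df, "degrees of freedom")
```

A coefficient near 1 means readers' ratings track authors' ratings closely; a t statistic whose magnitude is small relative to the critical value for the given degrees of freedom means the null hypothesis of equal means cannot be rejected.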
[Figure 1: three histogram panels (a), (b), (c). x-axis: Rating Difference (Authors' rating - Readers' rating), from -5 to 5; y-axis: Counts, from 0 to 350.]
Figure 1. Histograms of the rating differences between the authors of reviews and their readers: for whole reviews (a), for last sentence only (b), and for second sentence only (c).

Readers’ star rating of:   Correlated with:                        Pearson Correlation Coefficient (P<0.0001)
Whole reviews              Authors’ star rating of whole reviews   0.7891
Last sentences             Authors’ star rating of whole reviews   0.7616
Second sentences           Authors’ star rating of whole reviews   0.4705
Last sentences             Readers’ star rating of whole reviews   0.8463
Second sentences           Readers’ star rating of whole reviews   0.6563
Table 1. Pearson Correlation Coefficients

We are currently running experiments that
include hundreds of subjects in order to draw a profile of polarity evolvement throughout customer reviews. Specifically, we present our subjects with sentences in various locations in customer reviews, asking them to rate them. As the expanded experiment is not psychophysical, we added an additional remote radio button named “irrelevant” where subjects can judge a given text as lacking any evident polarity. Based on the rating results we will draw polarity profiles in order to see where, within customer reviews, polarity is best manifested and whether there are other “candidate” sentences that would serve as useful polarity indicators. The profiles will be used as a feature in our computational analysis.

Acknowledgments

We thank Prof. Rachel Giora and Prof. Ido Da-

ence and Reference Accessibility, ed. J. Gundel and T. Fretheim, 113-140. Amsterdam: Benjamins.

Kieras, David E. 1978. Good and Bad Structure in Simple Paragraphs: Effects on Apparent Theme, Reading Time, and Recall. Journal of Verbal Learning and Verbal Behavior 17:13-28.

Kieras, David E. 1980. Initial Mention as a Cue to the Main Idea and the Main Item of a Technical Passage. Memory and Cognition 8:345-353.

Lin, Chin-Yew, and Hovy, Edward. 1997. Identifying Topic by Position. Paper presented at Proceedings of the Fifth Conference on Applied Natural Language Processing, San Francisco.

Meidan, Abraham. 2005. Wizsoft's WizWhy. In The Data Mining and Knowledge Discovery Handbook, eds. Oded Maimon and Lior Rokach,