Section 2.5). Cross-validation was conducted. For the S_Π system, which required no training, all of the 300 summaries were used as the test set. S_Σ outperformed the baseline in Recall (R) but not in Precision (P); nevertheless, the combined F1 score (F) was appreciably higher (around 5 percentage points). On the other hand, our S_Π system showed very consistent improvements of the order of 10 to 15 percentage points over the baseline on all measures; we would like to draw attention to the fact that even if Precision scores are higher, it is on Recall scores that the greater improvements were achieved. This, together with the results obtained by S_Σ, suggests performance could benefit

10 Available at https://traloihay.net. aspx

ROUGE-2 score, which seems to be particularly sensitive to Novelty: no matter what combination of measures is used (R alone, RQ, RQC), changes in ROUGE-2 score remain under one percentage point. Once Novelty is added, performance rises abruptly to the system’s highest. A summary example, along with the question and the best answer, is presented in Table 2.

4 Discussion and Future Directions

We conclude by discussing a few alternatives to the approaches we presented. The length constraint M for the final summary (Section 2.6) could have been determined by making use of external knowledge such as TK_q: since TK_q represents
the total knowledge available about q, a coverage estimate of the final answers against it would have been ideal. Unfortunately, the lack of metadata about those answers prevented us from proceeding in that direction. This consideration suggests the idea of building TK_q using similar answers in the dataset itself, for which metadata is indeed available. Furthermore, similar questions in the dataset could have been used to augment the set of answers used to generate the final summary with answers coming from similar questions. Wang et al. (2009a) present a method to retrieve similar questions that could be worth taking into consideration for the task. We suggest that the retrieval method could be made Quality-aware. A Quality feature space for questions is presented by Agichtein et al. (2008) and could be used to rank the quality of questions in a way similar to how we ranked the quality of answers.

The Quality assessing component itself could be built as a module that can be adjusted to the kind of Social Media in use; the creation of customized Quality feature spaces would make it possible to handle different sources of UGC (forums, collaborative authoring websites such as Wikipedia, blogs etc.). A great obstacle is the lack of systematically available high quality training examples: a tentative solution could be to make use of clustering algorithms in the feature space; high and low quality clusters could then be labeled by comparison with examples of virtuous behavior (such as Wikipedia’s Featured Articles). The quality of a document could then be estimated as a function of its distance from the centroid of the cluster it belongs to. More careful estimates could take the position of other clusters and the concentration of nearby documents into consideration.

Finally, in addition to the chosen best answer, a DUC-styled query-focused multi-document summary could be used as a baseline against which the performance of the system can be checked.

HOW TO PROTECT YOURSELF FROM A BEAR?
https://traloihay.net
20060818062414AA7VldB
***BEST ANSWER***
Great question. I have done alot of trekking through California, Montana and Wyoming and have met Black bears (which are quite dinky and placid but can go nuts if they have babies), and have been half an hour away from (allegedly) the mother of all grizzley s whilst on a trail through Glacier National park - so some other trekkerers told me... What the park wardens say is SING, SHOUT, MAKE NOISE...do it loudly, let them know you are there..they will get out of the way, it is a surprised bear wot will go mental and rip your little legs off..No fun permission: anything that will confuse them and stop them in their tracks...I have been told be an native american buddy that to keep a bottle of perfume in your pocket...throw it at the ground near your feet and make the place stink: they have good noses, them bears, and a mega concentrated dose of Britney Spears Obsessive Compulsive is gonna give em something to think about...Have you got a rape alarm? Def take that...you only need to distract them for a second then they will lose interest..Stick to the trails is the most important thing, and talk to everyone you see when trekking: make sure others know where you are.
***SUMMARIZED ANSWER***
[...]In addition if the bear actually approaches you or charges you.. still stand your ground. Many times they will not actually come in contact with you, they will charge, almost touch you than run away.
[...]The actions you should take are different based on the type of bear. for example adult Grizzlies can t climb trees, but Black bears can even when adults. They can not climb in general as thier claws are longer and not semi-retractable like a Black bears claws.
[...]I truly disagree with the whole play dead approach because both Grizzlies and Black bears are oppurtunistic animals and will feed on carrion as well as kill and eat animals. Although Black bears are much more scavenger like and tend not to kill to eat as much as they just look around for scraps. Grizzlies on the other hand are very accomplished hunters and will take down large prey animals when they want.
[...]I have lived in the wilderness of Northern Canada for many years and I can honestly say that Black bears are not at all likely to attack you in most cases they run away as soon as they see or smell a human, the only places where Black bears are agressive is in parks with visitors that feed them, everywhere else the bears know that usually humans shoot them and so fear us.
[...]

Table 2: A summarized answer composed of five different portions of text generated with the S_Π scoring function; the chosen best answer is presented for comparison. The richness of the content and the good level of readability make it a successful instance of metadata-aware summarization of information in cQA systems. Less satisfying examples include summaries to questions that require a specific order of sentences or a compromise between strongly discordant opinions; in those cases, the summarized answer might lack logical consistency.

5 Related Work

A work with a similar objective to our own is that of Liu et al. (2008), where standard multi-document summarization techniques are employed along with taxonomic information about questions. Our approach differs in two fundamental aspects: it takes into consideration the peculiarities of the input data by exploiting the nature of UGC and available metadata; additionally, along with relevance, we addressed challenges that are specific to Question Answering, such as Coverage and Novelty. For an investigation of Coverage in the context of Search Engines, refer to Swaminathan et al. (2009).

At the core of our work lay information trustworthiness, summarization techniques and alternative concept representation. A general approach to the broad problem of evaluating information credibility on the Internet is presented by Akamine et al. (2009) with a system that makes use of semantic-aware Natural Language Preprocessing techniques. With analogous goals, but a focus on UGC, are the papers of Stvilia et al. (2005), Mcguinness et al. (2006), Hu et al. (2007) and
Zeng et al. (2006), which present a thorough investigation of Quality and trust in Wikipedia. In the cQA domain, Jeon et al. (2006) present a framework to use Maximum Entropy for answer quality estimation through non-textual features; with the same purpose, more recent methods based on the expertise of answerers are proposed by Suryanto et al. (2009), while Wang et al. (2009b) introduce the idea of ranking answers taking their relation to questions into consideration. The paper that we regard as most authoritative on the matter is the work by Agichtein et al. (2008), which inspired us in the design of the Quality feature space presented in

state-of-the-art summarization systems is ongoing.

Acknowledgments

This work was partly supported by the Chinese Natural Science Foundation under grant No. 60803075, and was carried out with the aid of a grant from the International Development Research Center, Ottawa, Canada. We would like to thank Prof. Xiaoyan Zhu, Mr. Yang Tang and Mr. Guillermo Rodriguez for the valuable discussions and comments and for their support. We would also like to thank Dr. Chin-yew Lin and Dr. Eugene Agichtein from Emory University for sharing