Section 2.5). Cross-validation was conducted. For the S_Π system, which required no training, all of the 300 summaries were used as the test set. S_Σ outperformed the baseline in Recall (R) but not in Precision (P); nevertheless, the combined F-1 score (F) was appreciably higher (around 5 percentage points). On the other hand, our S_Π system showed very consistent improvements on the order of 10 to 15 percentage points over the baseline on all measures; we would like to draw attention to the fact that, even if Precision scores are higher, it is on Recall scores that greater improvements were achieved. This, together with the results obtained by S_Σ, suggests performance could benefit

10 Available at https://traloihay.net.

ROUGE-2 score, which seems to be particularly sensitive to Novelty: no matter what combination of measures is used (R alone, RQ, RQC), changes in ROUGE-2 score remain under one percentage point. Once Novelty is added, performance rises abruptly to the system’s highest. A summary example, along with the question and the best answer, is presented in Table 2.

4 Discussion and Future Directions

We conclude by discussing a few alternatives to the approaches we presented. The length constraint M for the final summary (Section 2.6) could have been determined by making use of external knowledge such as TK_q: since TK_q represents
the total knowledge available about q, a coverage estimate of the final answers against it would have been ideal. Unfortunately, the lack of metadata about those answers prevented us from proceeding in that direction. This consideration suggests the idea of building TK_q using similar answers in the dataset itself, for which metadata is indeed available. Furthermore, similar questions in the dataset could have been used to augment the set of answers used to generate the final summary with answers coming from similar questions. Wang et al. (2009a) present a method to retrieve similar questions that could be worth taking into consideration for the task. We suggest that the retrieval method could be made Quality-aware. A Quality feature space for questions is presented by Agichtein et al. (2008) and could be used to rank the quality of questions in a way similar to how we ranked the quality of answers.

The Quality assessing component itself could be built as a module that can be adjusted to the kind of Social Media in use; the creation of customized Quality feature spaces would make it possible to handle different sources of UGC (forums, collaborative authoring websites such as Wikipedia, blogs etc.). A great obstacle is the lack of systematically available high quality training examples: a tentative solution could be to make use of clustering algorithms in the feature space; high and low quality clusters could then be labeled by comparison with examples of virtuous behavior (such as Wikipedia’s Featured Articles). The quality of a document could then be estimated as a function of distance from the centroid of the cluster it belongs to. More careful estimates could take the position of other clusters and the concentration of nearby documents into consideration.

Finally, in addition to the chosen best answer, a DUC-styled query-focused multi-document summary could be used as a baseline against which the performance of the system can be checked.

HOW TO PROTECT YOURSELF FROM A BEAR?
https://traloihay.net
20060818062414AA7VldB

***BEST ANSWER***

Great question. I have done alot of trekking through California, Montana and Wyoming and have met Black bears (which are quite dinky and placid but can go nuts if they have babies), and have been half an hour away from (allegedly) the mother of all grizzley's whilst on a trail through Glacier National park - so some other trekkerers told me... What the park wardens say is SING, SHOUT, MAKE NOISE...do it loudly, let them know you are there..they will get out of the way, it is a surprised bear wot will go mental and rip your little legs off..No fun permission: anything that will confuse them and stop them in their tracks...I have been told be an native american buddy that to keep a bottle of perfume in your pocket...throw it at the ground near your feet and make the place stink: they have good noses, them bears, and a mega concentrated dose of Britney Spears Obsessive Compulsive is gonna give em something to think about...Have you got a rape alarm? Def take that...you only need to distract them for a second then they will lose interest..Stick to the trails is the most important thing, and talk to everyone you see when trekking: make sure others know where you are.

***SUMMARIZED ANSWER***

[...] In addition if the bear actually approaches you or charges you.. still stand your ground. Many times they will not actually come in contact with you, they will charge, almost touch you than run away. [...] The actions you should take are different based on the type of bear. for example adult Grizzlies can't climb trees, but Black bears can even when adults. They can not climb in general as thier claws are longer and not semi-retractable like a Black bears claws. [...] I truly disagree with the whole play dead approach because both Grizzlies and Black bears are oppurtunistic animals and will feed on carrion as well as kill and eat animals. Although Black bears are much more scavenger like and tend not to kill to eat as much as they just look around for scraps. Grizzlies on the other hand are very accomplished hunters and will take down large prey animals when they want. [...] I have lived in the wilderness of Northern Canada for many years and I can honestly say that Black bears are not at all likely to attack you in most cases they run away as soon as they see or smell a human, the only places where Black bears are agressive is in parks with visitors that feed them, everywhere else the bears know that usually humans shoot them and so fear us. [...]

Table 2: A summarized answer composed of five different portions of text generated with the S_Π scoring function; the chosen best answer is presented for comparison. The richness of the content and the good level of readability make it a successful instance of metadata-aware summarization of information in cQA systems. Less satisfying examples include summaries to questions that require a specific order of sentences or a compromise between strongly discordant opinions; in those cases, the summarized answer might lack logical consistency.

5 Related Work

A work with a similar objective to our own is that of Liu et al. (2008), where standard multi-document summarization techniques are employed along with taxonomic information about questions. Our approach differs in two fundamental aspects: it took into consideration the peculiarities of the input data by exploiting the nature of UGC and available metadata; additionally, along with relevance, we addressed challenges that are specific to Question Answering, such as Coverage and Novelty. For an investigation of Coverage in the context of Search Engines, refer to Swaminathan et al. (2009).

At the core of our work lay information trustfulness, summarization techniques and alternative concept representation. A general approach to the broad problem of evaluating information credibility on the Internet is presented by Akamine et al. (2009) with a system that makes use of semantic-aware Natural Language Preprocessing techniques. With analogous goals, but a focus on UGC, are the papers of Stvilia et al. (2005), McGuinness et al. (2006), Hu et al. (2007) and Zeng et al. (2006), which present a thorough investigation of Quality and trust in Wikipedia. In the cQA domain, Jeon et al. (2006) present a framework to use Maximum Entropy for answer quality estimation through non-textual features; with the same purpose, more recent methods based on the expertise of answerers are proposed by Suryanto et al. (2009), while Wang et al. (2009b) introduce the idea of ranking answers taking their relation to questions into consideration. The paper that we regard as most authoritative on the matter is the work by Agichtein et al. (2008), which inspired us in the design of the Quality feature space presented in

state-of-the-art summarization systems is ongoing.

Acknowledgments

This work was partly supported by the Chinese Natural Science Foundation under grant No. 60803075, and was carried out with the aid of a grant from the International Development Research Center, Ottawa, Canada. We would like to thank Prof. Xiaoyan Zhu, Mr. Yang Tang and Mr. Guillermo Rodriguez for the valuable discussions and comments and for their support. We would also like to thank Dr. Chin-yew Lin and Dr. Eugene Agichtein from Emory University for sharing