
7.4 Effect of Training Data Size

We now assess the effect of training data size on performance. Tables 5 and 6 presented earlier show that an average of 32.2% of the questions have no matching patterns. This is because the data used for training contained no examples for a significant subset of question classes. It can be expected that, if more training data were available, this percentage would decrease and performance would increase. In order to test this assumption, we repeated the evaluation procedure detailed in this section several times, initially using data from only one TREC test set for training and then gradually adding more sets until all available training data had been used. The results for evaluation set 2 are presented in Figure 2. As can be seen, every time more data is added, performance increases. This strongly suggests that the point of diminishing returns, when adding additional training data no longer improves performance, has not yet been reached.

[...] in the question itself by 300%. It is also shown that the algorithm improves the performance of a state-of-the-art QA system significantly.

As always, there are many ways in which our algorithm could be improved. Combining it with fuzzy matching techniques as in (Cui et al., 2004) or (Cui et al., 2005) is an obvious direction for future work. We are also aware that in order to apply our algorithm on a larger scale and in a real-world setting with real users, we would need a much larger set of training data. Such data could be acquired semi-manually, for example by using crowd-sourcing techniques. We are also thinking about fully automated approaches, or about using indirect human evidence, e.g. user clicks in search engine logs. Typically users only see the title and a short abstract of the document when clicking on a result, so it is possible to imagine a scenario where a subset of these abstracts, paired with user queries, could serve as training data.
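The incremental procedure described in Section 7.4 (train on one set, then add further sets one at a time and re-evaluate after each addition) can be sketched as follows. This is a minimal illustration only, not the implementation used in the experiments: the function names `learning_curve` and `still_improving`, the toy `train` and `evaluate` stand-ins, and the example data are all assumptions introduced here. The toy "model" simply records which question words were seen in training, and the toy score is the fraction of evaluation questions it covers, loosely mirroring the observation that questions from unseen classes have no matching patterns.

```python
from typing import Callable, List, Sequence, Set, Tuple

# Hypothetical representation: one training set is a list of
# (question, answer sentence) pairs.
QAPair = Tuple[str, str]


def learning_curve(
    training_sets: Sequence[List[QAPair]],
    train: Callable[[List[QAPair]], object],
    evaluate: Callable[[object], float],
) -> List[Tuple[int, float]]:
    """Train on the first 1, 2, ..., n sets and score after each addition."""
    pool: List[QAPair] = []
    curve: List[Tuple[int, float]] = []
    for k, new_set in enumerate(training_sets, start=1):
        pool.extend(new_set)          # gradually add more training data
        model = train(list(pool))     # retrain from scratch on the pool
        curve.append((k, evaluate(model)))
    return curve


def still_improving(curve: List[Tuple[int, float]], min_gain: float = 0.0) -> bool:
    """True if every added set raised the score by more than min_gain."""
    scores = [score for _, score in curve]
    return all(b - a > min_gain for a, b in zip(scores, scores[1:]))


# --- Toy stand-ins (assumptions, not the paper's method or data) ---------
def train(pairs: List[QAPair]) -> Set[str]:
    # "Model" = set of question words observed in the training questions.
    return {question.split()[0].lower() for question, _ in pairs}


EVAL_QUESTIONS = [
    "Who painted Guernica?",
    "When did Rome fall?",
    "Where is Oslo?",
    "What is a quark?",
]


def evaluate(model: Set[str]) -> float:
    # Fraction of evaluation questions whose question word was seen in training.
    covered = sum(q.split()[0].lower() in model for q in EVAL_QUESTIONS)
    return covered / len(EVAL_QUESTIONS)


TRAINING_SETS = [
    [("Who wrote Hamlet?", "Hamlet was written by William Shakespeare.")],
    [("When was Mozart born?", "Mozart was born in 1756.")],
    [("Where is the Louvre?", "The Louvre is in Paris.")],
]

curve = learning_curve(TRAINING_SETS, train, evaluate)
print(curve)                   # [(1, 0.25), (2, 0.5), (3, 0.75)]
print(still_improving(curve))  # True
```

A curve for which `still_improving` holds at every step corresponds to the situation reported for evaluation set 2, where each additional training set improved performance and no plateau was observed.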

References

Giuseppe Attardi, Antonio Cisternino, Francesco Formica, Maria Simi, and Alessandro Tommasi.

Dekang Lin. 1998. Dependency-based Evaluation of

Dekang Lin and Patrick Pantel. 2001. Discovery of Inference Rules for Question-Answering. Natural Language Engineering, 7(4):343–360.