
2.3.2 Answer Extraction

The parser enables the recognition of answer candidates in the paragraphs. Once an answer candidate has been identified, a set of heuristics is applied to extract only the relevant word or phrase that answers the question. Researchers have presented various heuristic measures to extract the correct answer from the answer candidates. Extraction can be based on measures of distance between keywords, the number of keywords matched, and other similar heuristic metrics. Commonly, if no match is found, QA systems would fall back to delivering the best-ranked paragraph. Unfortunately, given the tightening requirements of the TREC QA track, such behavior is no longer useful. In the original TREC QA tracks, systems could present a list of several answers and were ranked based on where the correct answer appeared in the list. From 1999 to 2001, the length of this list was 5. Since 2002, systems have been required to present only a single answer [10].

Therefore, each of these three components attracted the attention of QA researchers.

• Question Classification:

Questions generally conform to predictable language patterns and are therefore classified based on taxonomies. Taxonomies are of two main types: flat and hierarchical. Flat taxonomies have only one level of classes with no sub-classes, whereas hierarchical taxonomies have multi-level classes. Lehnert [14] proposed “QUALM”, a computer program that uses a conceptual taxonomy of thirteen conceptual classes. Radev et al. [15] proposed a QA system called NSIR, pronounced “answer”, which used a flat taxonomy with seventeen classes, shown in Table 1.

Table 1: Flat Taxonomy (Radev et al. – “NSIR”)

PERSON        PLACE         DATE
NUMBER        DEFINITION    ORGANIZATION
DESCRIPTION
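
The difference between a flat and a hierarchical taxonomy can be illustrated with a minimal Python sketch. It is not the NSIR or QUALM implementation. The flat label set contains only the class names that appear in Table 1. The surface patterns in RULES and the coarse groupings in HIERARCHICAL_CLASSES are hypothetical and were chosen only to show the two structures.

```python
import re
from typing import Optional, Tuple

# Flat taxonomy: one level of labels (only the classes named in Table 1).
FLAT_CLASSES = {"PERSON", "PLACE", "DATE", "NUMBER",
                "DEFINITION", "ORGANIZATION", "DESCRIPTION"}

# Hierarchical taxonomy: coarse classes that contain sub-classes.
# The coarse labels and groupings are hypothetical, for illustration only.
HIERARCHICAL_CLASSES = {
    "ENTITY": ["PERSON", "ORGANIZATION"],
    "LOCATION": ["PLACE"],
    "NUMERIC": ["DATE", "NUMBER"],
    "TEXT": ["DEFINITION", "DESCRIPTION"],
}

# Hypothetical surface patterns; the first matching rule wins.
RULES = [
    (re.compile(r"^who\b", re.I), "PERSON"),
    (re.compile(r"^where\b", re.I), "PLACE"),
    (re.compile(r"^when\b", re.I), "DATE"),
    (re.compile(r"^how (many|much)\b", re.I), "NUMBER"),
    (re.compile(r"^what (is|are)\b", re.I), "DEFINITION"),
]


def classify_flat(question: str) -> Optional[str]:
    """Return a single flat class label, or None if no pattern matches."""
    q = question.strip()
    for pattern, label in RULES:
        if pattern.search(q):
            return label
    return None


def classify_hierarchical(question: str) -> Optional[Tuple[str, str]]:
    """Return a (coarse, fine) pair by locating the flat label in the hierarchy."""
    fine = classify_flat(question)
    if fine is None:
        return None
    for coarse, sub_classes in HIERARCHICAL_CLASSES.items():
        if fine in sub_classes:
            return coarse, fine
    return None
```

For example, under these assumed rules, classify_flat("Who wrote Hamlet?") yields the single label PERSON, while classify_hierarchical returns a pair such as (ENTITY, PERSON). A question that matches no pattern returns None in both cases.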