
5.2 Head Noun Identification
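Head noun identification in this subsection is scored by recall and precision over head nouns. As a minimal sketch of that scoring, the code below assumes that gold-standard and predicted head nouns are each given as a set of (sentence index, token index) positions. This representation, the function name `head_noun_scores`, and the toy data are hypothetical choices made for illustration; they are not taken from the paper.

```python
def head_noun_scores(gold, predicted):
    """Return (recall, precision) for head noun identification.

    gold:      positions of the true head nouns
    predicted: positions of the tokens the chunker marked as head nouns
    Both are iterables of (sentence_index, token_index) pairs; this
    representation is an assumption made for the sketch.
    """
    gold = set(gold)
    predicted = set(predicted)
    # Head nouns correctly identified: positions present in both sets.
    correct = len(gold & predicted)
    # Recall: correct / number of head nouns.
    recall = correct / len(gold) if gold else 0.0
    # Precision: correct / number of tokens identified as head noun.
    precision = correct / len(predicted) if predicted else 0.0
    return recall, precision


# Toy data (hypothetical): 4 gold head nouns, 5 predicted, 3 in common.
gold = {(0, 1), (0, 4), (1, 2), (1, 5)}
predicted = {(0, 1), (0, 4), (1, 3), (1, 5), (1, 7)}

recall, precision = head_noun_scores(gold, predicted)
print(f"recall={recall:.3f} precision={precision:.3f}")
# prints: recall=0.750 precision=0.600
```

Comparing positions rather than word strings keeps repeated nouns in the same essay from being conflated.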

In the evaluation of chunking, we focus on head noun identification. Head noun identification often plays an important role in error detection/correction. For example, it is crucial to identify head nouns to detect errors in article and number.

We again used the shallow-parsed corpus as a test corpus. The essays contained 3,589 head nouns. We implemented an HMM-based chunker using 5-grams whose input is a sequence of POS tags obtained by the HMM-based POS tagger described in the previous subsection. The chunker was trained on the same corpus as the HMM-based POS tagger. The performance was evaluated by recall and precision, defined by

\[
\text{Recall} = \frac{\text{number of head nouns correctly identified}}{\text{number of head nouns}} \tag{2}
\]

and

\[
\text{Precision} = \frac{\text{number of head nouns correctly identified}}{\text{number of tokens identified as head noun}} \tag{3}
\]

Table 7 shows the results. The chunker performed better than we had expected. A possible reason for this is that sentences written by learners of English tend to be shorter and simpler in terms of their structure.

Recall    Precision
0.903     0.907

Table 7: Performance on head noun identification.

The results in Table 7 also enable us to quantitatively estimate the expected improvement in error detection/correction which is achieved by improving

In this paper, we discussed the difficulties inherent in learner corpus creation and a method for efficiently creating a learner corpus. We described the manually error-annotated and shallow-parsed learner corpus which was created using this method. We also showed its usefulness in developing and evaluating POS taggers and chunkers. We believe that publishing this corpus will give researchers a common development and test set for developing related NLP techniques, including error detection/correction and POS-tagging/chunking, which will facilitate further research in these areas.

A Error tag set

The following is our error tag set. It is based on the NICT JLE tag set (Izumi et al., 2005).

n: noun

num: number

lxc: lexis

o: other

v: verb

agr: agreement

tns: tense

mo: auxiliary verb

aj: adjective

av: adverb

Rachele De Felice and Stephen G. Pulman. 2008. A classifier-based approach to preposition and determiner error correction in L2 English. In Proc. of 22nd International Conference on Computational Linguistics, pages 169–176.

Sylviane Granger, Estelle Dagneaux, Fanny Meunier, and Magali Paquot. 2009. International Corpus of Learner English v2. Presses universitaires de Louvain.

Sylviane Granger. 1998. Prefabricated patterns in advanced EFL writing: collocations and formulae. In A. P. Cowie, editor, Phraseology: theory, analysis, and application, pages 145–160. Clarendon Press.

Na-Rae Han, Martin Chodorow, and Claudia Leacock.