5.2 Head Noun Identification

In the evaluation of chunking, we focus on head noun identification. Head noun identification often plays an important role in error detection/correction. For example, it is crucial to identify head nouns to detect errors in article and number.

We again used the shallow-parsed corpus as a test corpus. The essays contained 3,589 head nouns. We implemented an HMM-based chunker using 5-grams whose input is a sequence of POSs, which was obtained by the HMM-based POS tagger described in the previous subsection. The chunker was trained on the same corpus as the HMM-based POS tagger. The performance was evaluated by recall and precision defined by

$$\frac{\text{number of head nouns correctly identified}}{\text{number of head nouns}} \tag{2}$$

and

$$\frac{\text{number of head nouns correctly identified}}{\text{number of tokens identified as head noun}} \tag{3}$$

respectively.

Table 7 shows the results. To our surprise, the chunker performed better than we had expected. A possible reason for this is that sentences written by learners of English tend to be shorter and simpler in terms of their structure.

Recall    Precision
0.903     0.907

Table 7: Performance on head noun identification.

The results in Table 7 also enable us to quantitatively estimate expected improvement in error detection/correction which is achieved by improving

In this paper, we discussed the difficulties inherent in learner corpus creation and a method for efficiently creating a learner corpus. We described the manually error-annotated and shallow-parsed learner corpus which was created using this method. We also showed its usefulness in developing and evaluating POS taggers and chunkers. We believe that publishing this corpus will give researchers a common development and test set for developing related NLP techniques including error detection/correction and POS-tagging/chunking, which will facilitate further research in these areas.

A Error tag set

This is the list of our error tag set. It is based on the NICT JLE tag set (Izumi et al., 2005).

n: noun
– num: number
– lxc: lexis
– o: other
v: verb
– agr: agreement
– tns: tense
mo: auxiliary verb
aj: adjective
av: adverb