3.2. Neural recognition model

As illustrated in Fig. 2, our neural network-based model consists of three stages: word representation, sentence representation, and inference.

Word representation. In this stage, the model employs several neural network layers to learn a representation for each word in the input question. The final representation combines information learned automatically at the character and word levels with handcrafted features extracted from the word. We consider two variants of the model: one uses CNNs and the other uses BiLSTM networks to learn the word representation. The details of the two variants are described in the following sections.

Sentence representation. In this stage, BiLSTM networks are used to model the relations between words. Taking the word representations from the previous stage as input, the model learns a new representation for each word that incorporates information from the whole question. Previous studies [3] show that stacking several BiLSTM layers produces better representations; we therefore use two BiLSTM layers in this stage. The details of BiLSTM networks are presented in the following sections.

Inference. In this stage, the model receives the output of the previous stage and generates a tag (in IOB notation) at each position of the input question. We consider two variants of the model: one uses the softmax function and the other uses CRFs. While the softmax function computes a probability distribution over the set of all possible tags at each position of the question independently, CRFs can look at the whole question and exploit the correlation between the current tag and neighboring tags.

Fig. 2. General architecture of neural recognition models

We now describe our two methods for producing the representation of each word in the input question. The first method employs CNNs, and the other uses BiLSTM networks. For notation, we denote vectors with bold lower-case, matrices with bold upper-case, and scalars with italic lower-case.
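To make the difference between the two inference variants concrete, the following minimal sketch decodes the same per-position scores in two ways: independently at each position (as the softmax variant does) and jointly over the whole sequence with Viterbi decoding (as a linear-chain CRF does). It is an illustration only, not the model described in this section: the simplified tag set, the emission scores, and the transition scores are all hypothetical and set by hand, whereas in a trained model they would be learned.

```python
# Toy comparison of the two inference variants: independent per-position
# decoding (softmax) versus sequence-level decoding (linear-chain CRF, Viterbi).
# All tags, scores, and transition values below are illustrative assumptions.

TAGS = ["O", "B", "I"]  # simplified IOB tag set with a single entity type


def softmax_decode(emissions):
    """Pick the best tag at each position independently.

    The argmax over raw scores equals the argmax over softmax
    probabilities, because softmax preserves the ordering of scores.
    """
    return [TAGS[max(range(len(TAGS)), key=lambda k: scores[k])]
            for scores in emissions]


def viterbi_decode(emissions, transitions):
    """Return the tag sequence maximising emission plus transition scores."""
    n_tags = len(TAGS)
    best = list(emissions[0])  # best[k]: best score of a path ending in tag k
    backpointers = []
    for scores in emissions[1:]:
        new_best, pointers = [], []
        for k in range(n_tags):
            prev = max(range(n_tags), key=lambda j: best[j] + transitions[j][k])
            new_best.append(best[prev] + transitions[prev][k] + scores[k])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag to recover the path.
    k = max(range(n_tags), key=lambda j: best[j])
    path = [k]
    for pointers in reversed(backpointers):
        k = pointers[k]
        path.append(k)
    return [TAGS[k] for k in reversed(path)]


# Hypothetical per-position scores for a three-word question, ordered as TAGS.
emissions = [
    [2.0, 0.5, 0.1],  # word 1: clearly O
    [1.0, 0.9, 1.2],  # word 2: I is slightly ahead of O and B
    [0.2, 0.3, 2.0],  # word 3: clearly I
]
# transitions[j][k]: score of moving from tag j to tag k.
# O -> I is heavily penalised because it is not a valid IOB transition.
transitions = [
    [0.0, 0.0, -5.0],  # from O
    [0.0, 0.0, 0.5],   # from B
    [0.0, 0.0, 0.5],   # from I
]

softmax_tags = softmax_decode(emissions)
crf_tags = viterbi_decode(emissions, transitions)
print(softmax_tags)  # ['O', 'I', 'I']
print(crf_tags)      # ['O', 'B', 'I']
```

With these toy numbers, independent decoding produces the sequence O, I, I, which is not well-formed in IOB notation because an I tag follows an O tag. Sequence-level decoding instead returns O, B, I, since the transition scores let the choice at the second position depend on its neighbors.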

