3.2. Neural recognition model

As illustrated in Fig. 2, our neural network-based model consists of three stages: word

representation, sentence representation, and inference.

Word representation. In this stage, the model employs several neural

network layers to learn a representation for each word in the input question. The final

representation incorporates both automatically learned information at the character

and word levels and handcrafted features extracted from the word. We consider two

variants of the model: one uses CNNs and the other uses BiLSTM networks to

learn the word representation. The details of the two variants are described in the

following sections.
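
As a rough illustration of how the three sources of information could be combined, the sketch below simply concatenates a character-level vector, a word-level vector, and a few handcrafted features. Concatenation, the function names, and the two features shown (capitalization and presence of a digit) are assumptions made for this example only; the text above does not specify them.

```python
def handcrafted_features(word):
    """Return a small feature vector for a word.

    Both features are hypothetical examples, not the feature set of the paper.
    """
    return [
        1.0 if word[:1].isupper() else 0.0,              # starts with a capital letter
        1.0 if any(ch.isdigit() for ch in word) else 0.0,  # contains a digit
    ]


def word_representation(word, char_vector, word_vector):
    """Combine learned and handcrafted information by concatenation (assumed)."""
    return list(char_vector) + list(word_vector) + handcrafted_features(word)


# char_vector would come from the CNN or BiLSTM variant, word_vector from an
# embedding table; here they are placeholder values.
rep = word_representation("Hanoi", [0.1, 0.2], [0.3, 0.4, 0.5])
print(len(rep))  # 7 = 2 (character level) + 3 (word level) + 2 (handcrafted)
```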

Sentence representation. In this stage, BiLSTM networks are used to

model the relations between words. Receiving the word representations from the

previous stage, the model learns a new representation for each word that incorporates

the information of the whole question. Previous studies [3] show that stacking

several BiLSTM layers produces better representations. We therefore

use two stacked BiLSTM layers in this stage. The details of BiLSTM networks are

presented in the following sections.
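
One common way to write a two-layer stacked BiLSTM is sketched below. The symbols are assumptions introduced for this illustration: \mathbf{x}_t is the word representation of the t-th word from the previous stage, \mathbf{h}^{(1)}_t and \mathbf{h}^{(2)}_t are the outputs of the first and second layers, and [\cdot\,;\cdot] denotes concatenation of the forward and backward hidden states.

```latex
\begin{align}
\mathbf{h}^{(1)}_t &= \left[\,\overrightarrow{\mathrm{LSTM}}^{(1)}\!\left(\mathbf{x}_t,\ \overrightarrow{\mathbf{h}}^{(1)}_{t-1}\right);\ \overleftarrow{\mathrm{LSTM}}^{(1)}\!\left(\mathbf{x}_t,\ \overleftarrow{\mathbf{h}}^{(1)}_{t+1}\right)\right] \\
\mathbf{h}^{(2)}_t &= \left[\,\overrightarrow{\mathrm{LSTM}}^{(2)}\!\left(\mathbf{h}^{(1)}_t,\ \overrightarrow{\mathbf{h}}^{(2)}_{t-1}\right);\ \overleftarrow{\mathrm{LSTM}}^{(2)}\!\left(\mathbf{h}^{(1)}_t,\ \overleftarrow{\mathbf{h}}^{(2)}_{t+1}\right)\right]
\end{align}
```

Under this reading, \mathbf{h}^{(2)}_t is the representation of the t-th word that is passed to the inference stage.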

Inference. In this stage, the model receives the output of the previous stage

and generates a tag (in the IOB notation) at each position of the input question. We

consider two variants of the model: one uses the softmax function and the other

uses CRFs. While the softmax function computes a probability distribution over the

set of all possible tags at each position of the question independently, CRFs can look

at the whole question and utilize the correlation between the current tag and

neighboring tags.
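
The difference between the two inference variants can be illustrated with a small decoding sketch. It is not the implementation used in this work: the tag set, the emission scores, and the transition scores are made-up values, and only decoding is shown (a real CRF layer also learns the transition scores during training). The softmax decoder picks the best tag at each position independently, whereas the Viterbi decoder, as used with linear-chain CRFs, scores whole tag sequences and can therefore avoid an I tag directly after an O tag.

```python
import math

TAGS = ["O", "B-ENT", "I-ENT"]  # hypothetical IOB tag set


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_softmax(emissions):
    """Choose the most probable tag at each position independently."""
    tags = []
    for scores in emissions:
        probs = softmax(scores)
        tags.append(TAGS[probs.index(max(probs))])
    return tags


def decode_viterbi(emissions, transitions):
    """Return the tag sequence with the highest emission + transition score."""
    n = len(TAGS)
    best = list(emissions[0])  # best score of any path ending in each tag
    backptrs = []
    for scores in emissions[1:]:
        new_best, ptrs = [], []
        for j in range(n):
            cands = [best[i] + transitions[i][j] for i in range(n)]
            i_best = max(range(n), key=lambda i: cands[i])
            new_best.append(cands[i_best] + scores[j])
            ptrs.append(i_best)
        best = new_best
        backptrs.append(ptrs)
    # Follow the back-pointers from the best final tag.
    path = [max(range(n), key=lambda j: best[j])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return [TAGS[j] for j in path]


# Made-up scores for a three-word question; rows are positions, columns are TAGS.
emissions = [[2.0, 0.5, 0.1], [0.2, 1.0, 1.2], [0.1, 0.3, 2.0]]
# transitions[i][j] scores moving from TAGS[i] to TAGS[j]; O -> I-ENT is penalized.
transitions = [[0.0, 0.0, -10.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

print(decode_softmax(emissions))               # ['O', 'I-ENT', 'I-ENT']
print(decode_viterbi(emissions, transitions))  # ['O', 'B-ENT', 'I-ENT']
```

In this toy example the independent decoder produces a sequence that is invalid in the IOB notation, while the sequence-level decoder recovers a well-formed one.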

Fig. 2. General architecture of neural recognition models

We now describe our two methods to produce the word representation for each

word in the input question. The first method employs CNNs, and the other one uses

BiLSTM networks. For notation, we denote vectors with bold lower-case letters, matrices

with bold upper-case letters, and scalars with italic lower-case letters.