CHARACTER REPRESENTATIONS (THE OUTPUT OF THE CNNS); 2) THE WORD EMB...

Question

1) character representations (the output of the CNNs); 2) the word embedding; 3) the embeddings of handcrafted features. Word embeddings, character embeddings, and the embeddings of handcrafted features are initialized randomly and learned during the training process.

In the following, we give a brief introduction to CNNs and describe how to use them to produce our word representations.

Convolutional neural networks [14] are among the most popular deep neural network architectures and have been applied successfully in various fields of computer science, including computer vision [10], recommender systems [29], and natural language processing [12]. The main advantage of CNNs is their ability to extract local features or local patterns from data. In this work, we apply CNNs to extract local features from groups of characters or sub-words.

Suppose that we want to learn the representation of a Vietnamese word consisting of a sequence of characters $c_1 c_2 \dots c_m$, where each character $c_i$ is represented by its $d$-dimensional embedding vector $\mathbf{x}_i$ and $m$ denotes the length (in characters) of the word. Let $\mathbf{X} \in \mathbb{R}^{m \times d}$ denote the embedding matrix, which is formed from the embedding vectors of the $m$ characters. We first apply a convolution filter $\mathbf{H} \in \mathbb{R}^{w \times d}$ of height $w$ and width $d$ ($w \le m$) on $\mathbf{X}$, with a stride height of 1. We then apply a tanh operator to generate a feature map $\mathbf{q}$. Specifically, let $\mathbf{X}_i$ be the submatrix consisting of the $w$ rows of $\mathbf{X}$ starting at the $i$-th row; then

$$\mathbf{q}[i] = \tanh\left(\langle \mathbf{X}_i, \mathbf{H} \rangle + b\right),$$

where $\mathbf{q}[i]$ is the $i$-th element of $\mathbf{q}$, $\langle \cdot, \cdot \rangle$ denotes the Frobenius inner product, tanh is the hyperbolic tangent activation function, and $b$ is a bias.

Finally, we perform max-over-time pooling to generate a feature $f$ that corresponds to the filter $\mathbf{H}$:

$$f = \max_i \mathbf{q}[i].$$

By using $h$ filters $\mathbf{H}_1, \dots, \mathbf{H}_h$ with different heights $w$, we generate a feature vector $\mathbf{f} = [f_1, \dots, f_h]$, which serves as the character representation of our model.
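The convolution and pooling steps described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `conv_feature` and `char_representation`, the embedding size `d = 4`, the filter heights, the zero biases, and the random initialization range are all assumptions chosen for illustration, and no padding is applied (consistent with the condition that the filter height does not exceed the word length).

```python
import math
import random


def conv_feature(X, H, b):
    """Compute f = max_i tanh(<X_i, H> + b) for one filter H.

    X is an m x d list of character embeddings, H is a w x d filter,
    and b is a scalar bias. Stride is 1 and no padding is used.
    """
    m, w = len(X), len(H)
    if w > m:
        raise ValueError("filter height w must not exceed word length m")
    q = []
    for i in range(m - w + 1):
        # Frobenius inner product of the w-row window starting at row i with H.
        s = sum(X[i + r][k] * H[r][k]
                for r in range(w) for k in range(len(H[r])))
        q.append(math.tanh(s + b))
    return max(q)  # max-over-time pooling


def char_representation(X, filters, biases):
    """Stack one pooled feature per filter into the vector f."""
    return [conv_feature(X, H, b) for H, b in zip(filters, biases)]


# Illustrative usage with made-up sizes and randomly initialized parameters.
rng = random.Random(0)
d = 4                      # assumed character-embedding dimension
word = "trường"            # example Vietnamese word
char_emb = {c: [rng.uniform(-0.5, 0.5) for _ in range(d)]
            for c in sorted(set(word))}
X = [char_emb[c] for c in word]

heights = [2, 3, 3]        # assumed filter heights, one per filter
filters = [[[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(w)]
           for w in heights]
biases = [0.0] * len(heights)

f = char_representation(X, filters, biases)
print(len(f))  # one feature per filter
```

In a trained model the embeddings, filters, and biases would be learned rather than sampled, and a deep learning library would typically replace the explicit loops; the loops are kept here only to mirror the formulas term by term.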


You are viewing 1) - QUESTION ANALYSIS TOWARDS A VIETNAMESE QUESTION ANSWERING SYSTEM IN THE EDUCATION DOMAIN