tively until convergence. The TextRank score of a word $w_i$ in document $D$ at the $k$th iteration is defined as follows:

$$
R^{k}_{w_i,D} = (1-d) + d \cdot \sum_{\forall j:(i,j)\in G} \frac{e_{i,j}}{\sum_{\forall l:(j,l)\in G} e_{j,l}} \, R^{k-1}_{w_j,D} \tag{16}
$$

where $d$ is a damping factor usually set to 0.85, and $e_{i,j}$

3.3.1 Parallel Corpus Collection

In Q&A archives, question-answer pairs can be considered a type of parallel corpus, which is used for estimating the translation probabilities. Unlike bilingual machine translation, the questions and answers in a Q&A archive are written in the same language, so the translation probability can be calculated through setting either as the source and the other as