2-4, Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan
Abstract

We have been investigating an interactive approach for open-domain QA (ODQA) and have constructed a spoken interactive ODQA system, SPIQA. The system derives disambiguating queries (DQs) that draw out additional information. To test the efficiency of the additional information requested by the DQs, the system reconstructs the user's initial question by combining the additional information with the question. The combination is then used for answer extraction. Experimental results revealed the potential of the generated DQs.

1 Introduction

Open-domain QA (ODQA), which extracts answers from large text corpora such as newspaper texts, has been intensively investigated in the Text REtrieval Conference (TREC). ODQA systems return an actual answer in response to a question written in a natural language. However, the information in the first question input by a user is not usually sufficient to yield the desired answer. Interactions for collecting additional information to accomplish QA are needed. To construct more precise and user-friendly ODQA systems, a speech interface is used for the interaction between human beings and machines.

Our goal is to construct a spoken interactive ODQA system that includes an automatic speech recognition (ASR) system and an ODQA system. To clarify the problems presented in building such a system, the QA systems constructed so far have been classified into a number of groups, depending on their target domains, interfaces, and interactions to draw out additional information from users to accomplish set tasks, as shown in Table 1. In this table, "text" and "speech" denote text input and speech input, respectively. The term "addition" represents additional information queried by the QA systems; this additional information is separate from that derived from the user's initial questions.

Table 1: Domain and data structure for QA systems

  target domain               specific       open
  data structure              knowledge DB   unstructured text
  -----------------------------------------------------------
  text,   without addition    CHAT-80        SAIQA
  text,   with addition       MYCIN          (SPIQA*)
  speech, without addition    Harpy          VAQA
  speech, with addition       JUPITER        (SPIQA*)

  *SPIQA is our system.

To construct spoken interactive ODQA systems, the following problems must be overcome: 1. System queries for additional information to extract answers, and effective interaction strategies using such queries, cannot be prepared before the user inputs the question. 2. Recognition errors degrade the performance of QA systems: some information indispensable for extracting answers is deleted or substituted with other words.

Our spoken interactive ODQA system, SPIQA, copes with the first problem by disambiguating users' questions through system queries. In addition, a speech summarization technique is applied to handle recognition errors.
2 Spoken Interactive QA system: SPIQA
Figure 1 shows the components of our system and the data that flows through it. The system comprises an ASR system (SOLON), a screening filter that uses a summarization method, an ODQA engine (SAIQA) for a Japanese newspaper text corpus, a Deriving Disambiguating Queries (DDQ) module, and a Text-to-Speech Synthesis (TTS) engine (FinalFluet).

Figure 1: Components and data flow in SPIQA.

ASR system

Our ASR system is based on the Weighted Finite-State Transducer (WFST) approach, which is becoming a promising alternative to the traditional decoding approach. The WFST approach offers a unified framework for representing various knowledge sources, in addition to producing an optimized search network of HMM states. We combined cross-word triphones and trigrams into a single WFST and applied a one-pass search algorithm to it.

Screening filter

To alleviate the degradation of QA performance caused by recognition errors, fillers, word fragments, and other distractors in the transcribed question, a screening filter is required that removes this redundant and irrelevant information and extracts the meaningful information. The speech summarization approach (C. Hori et al., 2003) is applied to the screening process: a set of words maximizing a summarization score, which indicates the appropriateness of the summarization, is automatically extracted from the transcribed question, and these words are then concatenated together. The extraction process is performed using a dynamic programming (DP) technique.

ODQA engine

The ODQA engine, SAIQA, has four components: question analysis, text retrieval, answer hypothesis extraction, and answer selection.

DDQ module

When the ODQA engine cannot extract an appropriate answer to a user's question, the question is considered to be "ambiguous." To disambiguate the initial questions, the DDQ module automatically derives disambiguating queries (DQs) that request information indispensable for answer extraction. A question is considered ambiguous when it excludes indispensable information or when indispensable information is lost through ASR errors. These instances of missing information should be compensated for by the users.

To disambiguate a question, the ambiguous phrases within it should be identified. The ambiguity of each phrase can be measured by using the structural ambiguity and the generality score of the phrase. The structural ambiguity is based on the dependency structure of the sentence; a phrase that is not modified by other phrases is considered to be highly ambiguous. Figure 2 shows an example of a dependency structure, where the question is separated into phrases. Each arrow represents the dependency between two phrases.

Which country in Southeast Asia won the World Cup?
Figure 2: Example of dependency structure.

In this example, "the World Cup" has no modifiers and needs more information to be identified. "Southeast Asia" also has no modifiers. However, since "the World Cup" appears more frequently than "Southeast Asia" in the retrieved corpus, "the World Cup" is more difficult to identify. In other words, words that frequently occur in a corpus rarely help to extract answers in ODQA systems. Therefore, it is adequate for the DDQ module to generate questions relating to "the World Cup" in this example, such as "What kind of World Cup?" or "What year was the World Cup held?".

The structural ambiguity of the n-th phrase is defined as

  A_D(P_n) = log( 1 − Σ_{i=1, i≠n}^{N} D(P_i, P_n) ),

where the complete question is separated into N phrases, and D(P_i, P_n) is the probability that phrase P_n will be modified by phrase P_i.

system. The question transcriptions were processed with a screening filter and input into the ODQA
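The definition above can be computed directly from a matrix of modification probabilities. A minimal sketch, where the phrase segmentation and the probability values are invented for illustration (the system would estimate them with a dependency parser):

```python
import math

def structural_ambiguity(D, n):
    """Structural ambiguity A_D(P_n) of the n-th phrase.

    D[i][j] is the probability that phrase P_j is modified by
    phrase P_i.  A phrase that no other phrase is likely to modify
    scores log(1) = 0, the maximum, i.e. it is the most ambiguous.
    """
    N = len(D)
    return math.log(1.0 - sum(D[i][n] for i in range(N) if i != n))

# Toy matrix for "Which country / in Southeast Asia / won / the World Cup":
# only "in Southeast Asia" (index 1) is likely to modify "Which country" (index 0).
D = [
    [0.0, 0.0, 0.0, 0.0],
    [0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(structural_ambiguity(D, 0))  # log(0.1): heavily modified, less ambiguous
print(structural_ambiguity(D, 3))  # log(1) = 0.0: unmodified, most ambiguous
```

As in the Figure 2 discussion, the unmodified phrase ("the World Cup") receives the highest structural ambiguity, making it the natural target for a DQ.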
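The screening filter's DP extraction can be illustrated with a toy scorer: choose a fixed number of words, in transcription order, maximizing a total of per-word significance scores plus pairwise linkage scores. This is a simplified sketch under invented scores, not the actual summarization scoring of C. Hori et al. (2003):

```python
def screen(words, sig, bigram, m):
    """Select m words (order preserved) maximizing the sum of
    per-word significance (sig) and pairwise linkage (bigram)
    scores, by dynamic programming over word positions.
    """
    n = len(words)
    NEG = float("-inf")
    # best[k][j]: best score of a k-word selection ending at index j
    best = [[NEG] * n for _ in range(m + 1)]
    back = [[None] * n for _ in range(m + 1)]
    for j in range(n):
        best[1][j] = sig.get(words[j], 0.0)
    for k in range(2, m + 1):
        for j in range(k - 1, n):
            for p in range(k - 2, j):  # previous selected word
                if best[k - 1][p] == NEG:
                    continue
                s = (best[k - 1][p] + sig.get(words[j], 0.0)
                     + bigram.get((words[p], words[j]), 0.0))
                if s > best[k][j]:
                    best[k][j] = s
                    back[k][j] = p
    # trace back the best m-word selection
    j = max(range(n), key=lambda i: best[m][i])
    sel = []
    for k in range(m, 0, -1):
        sel.append(j)
        j = back[k][j]
    return [words[i] for i in reversed(sel)]

# Fillers get negative significance, so the filter drops them.
words = ["uh", "which", "country", "um", "won", "the", "cup"]
sig = {"uh": -1.0, "um": -1.0, "which": 0.5, "country": 1.0,
       "won": 1.0, "the": 0.2, "cup": 1.0}
print(screen(words, sig, {}, 5))  # ['which', 'country', 'won', 'the', 'cup']
```

With the filler words assigned low significance, the DP keeps only the content words, mirroring how the screening filter discards fillers and word fragments from a transcribed question.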