16. ... I knew how to modify my essay 0.791 0.478 0.327 4.1 > 3.9 3.7 < 3.8

Table 1. System questionnaire results

sponsible for the non-significant NM effect on the dimension captured by Q12.

Concentration. Users also think that the NM-enabled version of the system requires less effort in terms of concentration (Q7). We believe that having the discourse segment purpose as visual input allows the users to concentrate more easily on what the system is uttering. In many of the open-question interviews, users stated that it was easier for them to listen to the system when they had the discourse segment purpose displayed on the screen.

Results for Q14-16

Questions Q14-16 were included to probe users’ post-tutoring perceptions. We find a trend that in the NM problems it was easier for users to understand the system’s main point (Q14). However, in terms of identifying (Q15) and correcting (Q16) problems in their essay, the results are inconclusive. We believe that this is due to the fact that the essay interpretation component was disabled in this experiment. As a result, the instruction did not match the initial essay quality. Nonetheless, in the open-question interviews, many users indicated using the NM as a reference while updating their essay.

In addition to the 16 questions, in the system questionnaire after the second problem users were asked to choose which version of the system they preferred the most (i.e. the first or the second problem version). 24 out of 28 users (86%) preferred the NM-enabled version. In the open-question interview, the 4 users that preferred the noNM version (2 in each condition) indicated that it was harder for them to concurrently concentrate on the audio and the visual input (divided attention problem) and/or that the NM was changing too fast.

To further strengthen our conclusions from the system questionnaire analysis, we would like to note that users were not asked to directly compare the two versions but they were asked to individually rate two versions, which is a noisier process (e.g. users need to recall their previous ratings).

The NM survey

While the system questionnaires probed users’ NM usage indirectly, in the second to last step in the experiments, users had to fill in an NM survey which explicitly asked how the NM helped them, if at all. The answers were on the same 1 to 5 scale. We find that the majority of users (75%-86%) agreed or strongly agreed that the NM helped them follow the dialogue, learn more easily, concentrate and update the essay. These findings are on par with those from the system questionnaire analysis.

NM condition. The fact that in the second problem the differences are much smaller (e.g. 2% for AsrMis) and that the NM-AsrMis and NM-SemMis interactions are not significant anymore suggests that our observations cannot be attributed to a difference in population with respect to the system’s ability to recognize their speech. We hypothesize that these differences are due to the NM