Student HCI Online Research Experiments

The Effectiveness of Online Help Systems: Text Only, Animated Images Only, and Integrated Interactive

Results

While the subjects performed the tasks, the completion time of each task was recorded in seconds.  The total time each subject spent on PhotoFinder and the subjects' answers to thirteen subjective questions were also recorded.  The recorded data are presented in the Raw Data Table in the Appendices section.  The completion times of the five tasks were summed and recorded as the Total Tasks Time (TTT).  The relevant data and descriptive statistics are presented in tabular and graphical form below.

Task Completion Times

Total Time is the time that a subject spent on PhotoFinder during the experiment.  TTT is the sum of the times that the subject spent on the five tasks.  All times are in seconds.

Text, N = 10

                     Total Time   Task 1   Task 2   Task 3   Task 4   Task 5   TTT
Mean                 382.5        38.4     124.1    68.4     46.9     61.8     339.6
Standard Deviation   264.29       37.06    72.50    66.58    28.99    53.14    179.36

Animated, N = 10

                     Total Time   Task 1   Task 2   Task 3   Task 4   Task 5   TTT
Mean                 257          52.5     96.3     34.1     34.6     26.4     243.9
Standard Deviation   92.63        64.22    60.29    50.85    21.59    10.84    104.84

Integrated, N = 10

                     Total Time   Task 1   Task 2   Task 3   Task 4   Task 5   TTT
Mean                 262.8        56.2     81.7     39.5     33.6     26.6     237.6
Standard Deviation   100.23       49.82    54.48    25.42    23.78    9.78     94.40
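
Because TTT is defined as the sum of the five task times, each group's mean TTT should equal the sum of its five task means. The short Python sketch below checks this against the means reported in the tables above. The variable and function names are illustrative only and are not part of the original analysis, which was done in Excel.

```python
# Mean task completion times in seconds, copied from the three tables above
# (Task 1 to Task 5 for each treatment group).
task_means = {
    "Text": [38.4, 124.1, 68.4, 46.9, 61.8],
    "Animated": [52.5, 96.3, 34.1, 34.6, 26.4],
    "Integrated": [56.2, 81.7, 39.5, 33.6, 26.6],
}

# Mean TTT per group as reported in the same tables.
reported_ttt = {"Text": 339.6, "Animated": 243.9, "Integrated": 237.6}


def summed_task_means(means):
    """Sum the per-task means, rounded to one decimal place like the tables."""
    return round(sum(means), 1)


for group, means in task_means.items():
    total = summed_task_means(means)
    print(f"{group}: sum of task means = {total}, reported mean TTT = {reported_ttt[group]}")
```

The check works because the mean of a sum equals the sum of the means when every subject contributes to every task.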

 

The chart below (Figure 1) compares the mean completion time of each task across the three groups (Text only, Animated only, and Integrated interactive online help).

Figure 1

The chart below (Figure 2) shows the mean TTT (Total Tasks Time) for each online help system: Text only, 339.6 seconds; Animated only, 243.9 seconds; and Integrated, 237.6 seconds.

Figure 2

A single-factor ANOVA (table below) shows that the difference among the mean TTTs is not statistically significant at the α = 0.05 significance level, since F = 1.88 is less than F crit = 3.35 (p = 0.17).  Hence the null hypothesis cannot be rejected.

TTT, ANOVA: Single Factor

SUMMARY
Groups           Count   Sum    Average   Variance
Text TTT         10      3396   339.6     32169.82
Animated TTT     10      2439   243.9     10992.32
Integrated TTT   10      2376   237.6     8911.82

ANOVA
Source of Variation   SS         df   MS        F      P-value   F crit
Between Groups        65340.6    2    32670.3   1.88   0.17      3.35
Within Groups         468665.7   27   17358
Total                 534006.3   29
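
The ANOVA figures above can be recomputed from the group counts, averages, and variances alone. The following Python sketch does this for the TTT data. It assumes equal group sizes, and its p-value uses a closed form of the F distribution that holds only when the between-groups degrees of freedom equal 2, as they do here. The function names are hypothetical; the original analysis used Excel.

```python
def one_way_anova(means, variances, n):
    """One-way ANOVA for k groups of equal size n, from group means and sample variances."""
    k = len(means)
    grand_mean = sum(means) / k  # valid because every group has the same size
    ss_between = n * sum((m - grand_mean) ** 2 for m in means)
    ss_within = (n - 1) * sum(variances)
    df_between = k - 1
    df_within = k * (n - 1)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


def p_value_two_numerator_df(f_stat, df_within):
    """Upper-tail probability of F(2, df_within).

    Closed form that is valid only when the between-groups df is exactly 2.
    """
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# Averages and variances copied from the SUMMARY table above.
ttt_means = [339.6, 243.9, 237.6]
ttt_variances = [32169.82, 10992.32, 8911.82]

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(ttt_means, ttt_variances, n=10)
p_value = p_value_two_numerator_df(f_stat, df_w)

print(f"SS between = {ss_b:.1f}, SS within = {ss_w:.1f}")
print(f"F({df_b}, {df_w}) = {f_stat:.2f}, p = {p_value:.2f}")
```

With these inputs the sketch gives F ≈ 1.88 and p ≈ 0.17, in agreement with the table.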

   

To confirm the ANOVA result, t-tests were also performed on the three pairs of treatments: Text only and Animated only, Animated only and Integrated interactive, and Text only and Integrated interactive online help.  The null hypothesis (Ho) is rejected when p < alpha, but p exceeded alpha for every pair:

Text only and Animated only online help:    p one-tail = 0.07 > alpha = 0.05

Animated only and Integrated interactive online help:   p one-tail = 0.32 > alpha = 0.05

Text only and Integrated interactive online help:      p one-tail = 0.13 > alpha = 0.05

Therefore, the null hypothesis cannot be rejected for any pair.  (See the Sheet1 worksheet in Statistical data in the Appendices.)

Because Figure 1 suggests that the three treatments may differ on individual tasks, a single-factor ANOVA (analysis of variance) was also run on the completion times of each task. The results are:

  • Task 1, F = 0.33 < F crit = 3.35
  • Task 2, F = 1.17 < F crit = 3.35
  • Task 3, F = 1.33 < F crit = 3.35
  • Task 4, F = 0.88 < F crit = 3.35
  • Task 5, F = 4.10 > F crit = 3.35
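
The per-task F values listed above can be reproduced from the means and standard deviations in the Task Completion Times tables. The sketch below assumes ten subjects per group and takes the critical value 3.35 from the text rather than computing it. The names `tasks`, `f_statistic`, and `F_CRIT` are illustrative only.

```python
F_CRIT = 3.35  # critical value quoted in the text for alpha = 0.05
N = 10         # subjects per treatment group

# (mean, standard deviation) in seconds for Text, Animated, Integrated,
# copied from the Task Completion Times tables.
tasks = {
    "Task 1": [(38.4, 37.06), (52.5, 64.22), (56.2, 49.82)],
    "Task 2": [(124.1, 72.50), (96.3, 60.29), (81.7, 54.48)],
    "Task 3": [(68.4, 66.58), (34.1, 50.85), (39.5, 25.42)],
    "Task 4": [(46.9, 28.99), (34.6, 21.59), (33.6, 23.78)],
    "Task 5": [(61.8, 53.14), (26.4, 10.84), (26.6, 9.78)],
}


def f_statistic(groups, n):
    """F ratio of a one-way ANOVA with equal group sizes, from (mean, sd) pairs."""
    k = len(groups)
    means = [mean for mean, _ in groups]
    grand_mean = sum(means) / k
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    # With equal group sizes the within-groups mean square is the average variance.
    ms_within = sum(sd ** 2 for _, sd in groups) / k
    return ms_between / ms_within


for name, groups in tasks.items():
    f_value = f_statistic(groups, N)
    verdict = "significant" if f_value > F_CRIT else "not significant"
    print(f"{name}: F = {f_value:.2f} ({verdict} at alpha = 0.05)")
```

Under these inputs only Task 5 exceeds the critical value, in line with the list above.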

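The ANOVA on Task 5's completion time (table below) shows a statistically significant difference among the groups (F = 4.10 > F crit = 3.35, p = 0.03), so the null hypothesis can be rejected for Task 5 (Create a slide show of the photos in the Buildings collection). Our hypothesis is therefore supported for Task 5 to the extent that Integrated interactive online help (mean 26.6 seconds) was much faster than Text only help (mean 61.8 seconds). Its mean was, however, essentially the same as that of Animated only help (26.4 seconds), and the ANOVA by itself does not show which groups differ. The ANOVAs on Tasks 1, 2, 3 and 4 show no statistically significant differences. (See the Sheet1 worksheet in Statistical data.)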

Task 5 ANOVA: Single Factor

SUMMARY
Groups                   Count   Sum   Average   Variance
Text Only                10      618   61.80     2823.51
Animated Images Only     10      264   26.40     117.60
Integrated Interactive   10      266   26.60     95.60

ANOVA
Source of Variation   SS         df   MS        F      P-value   F crit
Between Groups        8307.47    2    4153.73   4.10   0.03      3.35
Within Groups         27330.40   27   1012.24
Total                 35637.87   29

         
             

Subjective Questionnaires

Questions 1 to 4 are background questions.  Only questions 5 to 13 were scored on a scale from 1 (least preferred) to 9 (most preferred).  Their means and standard deviations are shown in the tables below.

Text, N = 10

                     Q5     Q6     Q7     Q8     Q9     Q10    Q11    Q12    Q13
Mean                 5.3    5.3    6      5.8    6.7    6.8    6.2    7.1    5.6
Standard Deviation   2.45   2.8    2.45   2.55   1.67   2.01   2.89   1.79   2.05

Animated, N = 10

                     Q5     Q6     Q7     Q8     Q9     Q10    Q11    Q12    Q13
Mean                 7.3    7.9    7.4    7.3    7.3    6.6    6.1    7.1    6.9
Standard Deviation   1.8    1.4    1.65   1.77   1.34   1.84   1.56   0.88   1.79

Integrated, N = 10

                     Q5     Q6     Q7     Q8     Q9     Q10    Q11    Q12    Q13
Mean                 6.7    6.4    6.4    6.4    6.8    6.8    7      7      6.5
Standard Deviation   1.6    2.1    1.96   1.84   1.87   1.93   1.41   1.15   1.72

 

The chart below (Figure 3) compares the mean score given to each question by the subjects of the three treatment groups.

Figure 3

The chart below (Figure 4) shows the mean total subjective score for each online help system, with a 95% confidence interval (+ and – from the mean): Text only, 54.8; Animated only, 63.9; and Integrated, 60.

Figure 4

A single-factor ANOVA (table below) shows that the difference among the means is not statistically significant at the α = 0.05 significance level, since F = 0.90 is less than F crit = 3.35 (p = 0.42).  Hence the null hypothesis cannot be rejected.

Mean Total Subjective Scores, ANOVA: Single Factor

SUMMARY
Groups                   Count   Sum   Average   Variance
Text total score         10      548   54.8      379.29
Animated total score     10      639   63.9      128.99
Integrated total score   10      600   60        188.89

ANOVA
Source of Variation   SS        df   MS       F      P-value   F crit
Between Groups        416.87    2    208.43   0.90   0.42      3.35
Within Groups         6274.50   27   232.39
Total                 6691.37   29
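
The text does not state how the 95% confidence intervals plotted in Figure 4 were obtained. As a rough illustration, the Python sketch below derives a half-width for each group from the averages and variances in the SUMMARY table, assuming a Student t interval with 9 degrees of freedom (critical value 2.262). The original chart may have used a different method, such as a normal approximation, so these half-widths need not match Figure 4 exactly. All names in the sketch are hypothetical.

```python
import math

# Two-sided 95% critical value of Student's t with 9 degrees of freedom
# (10 subjects per group). This choice of interval is an assumption.
T_CRIT_DF9 = 2.262


def ci_half_width(variance, n, t_crit=T_CRIT_DF9):
    """Half-width of a confidence interval for a mean, from a sample variance."""
    standard_error = math.sqrt(variance / n)
    return t_crit * standard_error


# (average, variance) of the total subjective score, from the SUMMARY table.
groups = {
    "Text": (54.8, 379.29),
    "Animated": (63.9, 128.99),
    "Integrated": (60.0, 188.89),
}

for name, (mean, variance) in groups.items():
    half_width = ci_half_width(variance, n=10)
    print(f"{name}: {mean:.1f} +/- {half_width:.1f}")
```

Groups with larger variance get wider intervals, which is why the Text only interval is the widest of the three under this assumption.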

       

 

ANOVA shows no statistically significant difference in any of the subjective questionnaire scores given by the three groups of subjects.

The 30 subjects included 16 males and 14 females.  They were distributed evenly among the three treatments.  About 57% of the subjects (17 out of 30) considered themselves intermediate in computer experience, seven considered themselves advanced, four considered themselves beginners, and three did not answer the question.  Nineteen subjects had English as their first language, nine did not, and two did not answer the question.

There were also short-answer questions, e.g., “What were the problems during the use of PhotoFinder?” and a request for “Other comments and or suggestions.”  These are included under Subjective Questionnaires in the Appendices.

Microsoft Excel 2000 was used to analyze the raw data. The whole workbook is presented as Statistical data in the Appendices.

 
