The Effect of Direct Annotation on Speed and Satisfaction
Experiment
Introduction and Hypothesis
We conducted an experiment to determine which method of annotation
would be the most efficient for annotating a photo on a display. To run
it, we modified the source code of the existing Photofinder prototype. We
compared three annotation methods: direct annotation, click & type, and
text box.
In our 1 x 3 experiment, the independent variable was the
annotation method used for each picture. The three methods were the
treatments for the independent variable, and each subject was exposed to
all three (a within-subjects design). Briefly, the direct annotation
method is similar to what is commonly known as "drag & drop": the user
clicks on a name in the list of names and drags it to its proper location
on the photo. In the click & type method, the user clicks on a place on
the photo and types in a name. Lastly, in the text box method, the user
types all the names into a text box, in the order in which they appear on
the photo from left to right.
The dependent variables were time of completion and subjective
preference, measured for each annotation method. Our group's hypothesis
was that direct annotation would have the best overall result: better time
and better subjective satisfaction than the click & type and text box
methods. We expected click & type to have the second-highest subjective
satisfaction but the slowest time, because it combines keyboard and mouse
use. Lastly, we expected the text box to have the lowest subjective
satisfaction due to its dull nature; however, since the text box requires
only the keyboard, we expected its completion time to be comparable to
that of direct annotation.
Pilot Study Results
We used five subjects for our pilot study. The results of the pilot study
were mixed; we could not tell which of the three methods was fastest. The
subjective preferences for the three methods were also mixed, with no
definite pattern in the pilot results. We concluded that, since the number
of subjects was so small, the results were inconclusive; thus it was hard
to judge whether our hypothesis would be correct or not.
We made some design changes to the user interface to reduce errors by
subjects, and we shortened the instructions so that subjects would not
have to read a lot of text. Overall, it was a successful pilot study: the
experiment went as we had hoped, and we found some weaknesses that we
could work on.
Subjects
For our subjects, we asked for volunteers from computer science
classes here at the University of Maryland. We were fortunate to have such
a large group of subjects: after the experiment was done, we had a total
of 48. We had some concerns that they were predominantly male (~80% male)
and in about the same age group (18-21). We thought this might introduce
some bias into our experiment, since an 80% male sample does not reflect
the general population. The fact that the subjects were from computer
science classes might add to the bias, since most of them would be
experienced computer users. One way to vastly improve the validity of the
experiment would be to sample a random group of people from the entire
campus, or even better, from the general population.
Materials
The experiment was run on a modified version of the Photofinder
prototype. Everything was done on a computer, from administering the
survey to measuring time. Before starting the experiment, we provided a
background survey followed by training. The background survey contained
the following questions: "How often do you use a computer?" and "How
familiar are you with the windows interface?", along with whether the
subject was male or female and which age group the subject was in,
selected from a list of age ranges.
The first two questions were answered on a scale from 1 to 9. For the
question on frequency of computer use, we provided an ordinal scale with 1
being "less than once a month", 5 being "couple of times a week", and 9
being "several times a day". For the question on familiarity with the
windows interface, we provided a range with 1 being "completely new", 5
being "I know my way around" and 9 being "I've mastered it".
After the background survey, the subject proceeded to the training
part. Here the subject could spend time trying out the three methods of
annotating a photo: direct annotation, click & type, and text box. For
each method, we provided a sheet with some example photos to annotate. In
addition, we provided the instruction sheet and a sheet with photos of the
people in the demo photos, with their names captioned. On-line help was
also available by clicking on a help button. When the subject felt ready
to begin the real experiment, they were asked to enter an ID number that
determined the order in which they did the three methods.
For the actual timed portion of the experiment, we gave out a new set
of named photos of people. We made three sheets of named photos, with each
sheet titled "direct annotation", "click & type", or "text box". The order
in which each subject used the methods was preset: with 1 being direct
annotation, 2 being click & type and 3 being the text box method, each
subject was assigned one of the six permutations of the three numbers
(123, 132, 213, 231, 312, 321). Subjects were spread evenly across the
permutations. This was done to prevent the bias that might arise if we
used a fixed ordering. Also, to prevent name length from affecting the
results, the total number of characters that the subjects had to type was
exactly the same for each method.
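The counterbalancing described above can be illustrated with a minimal
Python sketch. The six orders come straight from the text; the rule that
maps an ID number to an order (consecutive IDs cycling through the
permutations) is an assumption made for illustration, not necessarily the
scheme used in the modified Photofinder.

```python
from collections import Counter
from itertools import permutations

# Method codes as numbered in the text.
METHODS = {1: "direct annotation", 2: "click & type", 3: "text box"}

# All six orderings of the three codes: 123, 132, 213, 231, 312, 321.
ORDERS = list(permutations(sorted(METHODS)))


def order_for_subject(subject_id: int) -> tuple:
    """Map a subject ID to a method order (hypothetical ID scheme).

    Assumes IDs are handed out consecutively starting at 1, so cycling
    through ORDERS spreads subjects evenly across the six permutations.
    """
    return ORDERS[(subject_id - 1) % len(ORDERS)]


def order_counts(n_subjects: int) -> Counter:
    """Count how many of the first n_subjects IDs fall on each order."""
    return Counter(order_for_subject(i) for i in range(1, n_subjects + 1))
```

Under this assumed scheme, 48 subjects would be split into six groups of
eight, one group per permutation, so each method appears in each position
equally often.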
During the experiment, the computer kept track of time and recorded any
mistakes made by the subject. Subjects had no way of knowing how fast
they were going, however, since the time was not displayed. The program
did not tell subjects when they annotated incorrectly, either. Every
subject did all three methods, albeit in a different order. After the
experiment was over, the subject was asked to fill out a questionnaire
rating each of the three annotation methods. Ratings were on a scale of 1
to 9, with 1 being the worst subjective preference and 9 being the best.
Procedures and Problems
The administration of the process is detailed in the above section. One
problem we encountered was subjects who signed up but did not show up:
over 70 subjects signed up, but only 48 came. Also, each trial was
supposed to last no more than 20 minutes, but some subjects took quite a
long time in the training environment, so that some of them ran over
their allotted time and we fell behind schedule.
Another problem was that subjects did not read all of the instructions,
perhaps because of their length. Fortunately, the demo/training period
took care of most of the resulting problems; the subjects, for the most
part, asked for help if they did not know how to proceed. Since many of
the instructions were necessary reading, this problem could not easily be
solved.