2. Experiment

This study sought to determine whether particular video browsing surrogate designs facilitate specific user tasks. The overall goal was to provide users with ways to rapidly obtain task-related data from previews of video clips without having to watch each one in real time.

The two surrogates tested represent different classes of image presentation techniques (online demonstration). The slide show (SS) design is dynamic: it plays key frames extracted from the video one at a time in the same order as in the original video. It is anticipated that the main advantage of such a presentation is that it maintains the temporal integrity of the original clip. On the other hand, the storyboard (SB) is a static design; the temporal order of the key frames is represented physically by location in a matrix. The disadvantage is that users will need to map the temporal dimension onto a planar arrangement. The advantage of a static display is that users can control where they want to look, how long they want to look at it, and how many times they wish to review previous key frames.

The two tasks developed for this experiment reflect different needs that users may have in searching video. The gist determination (GD) task requires the user to develop a mental model or schema of the events in the clip as represented by discrete snapshots in time. Essentially, users will need to "fill in the blanks" between the key frames to develop a coherent model of the gist or story line of the clip. The other task, object recognition (OR), requires that users search for known items. This requires short-term memory resources but also depends on the ability to form a schema: knowing what the clip is about generally will help rule out objects that obviously do not fit within the story line (e.g., a helicopter in a video about aquatic life). Therefore, GD is a goal-oriented task where the users are asked to come up with the overall story line, whereas OR is more task-oriented -- looking for specific objects.

The experimental design was a 2x2 randomized block factorial (RBF-22). Because each participant received all 4 interface-task treatments (i.e., slide show/gist determination; slide show/object recognition; storyboard/gist determination; storyboard/object recognition), individual differences among participants were controlled by the design and a separate control group was not required. That is, each person served as her/his own control. However, to control for learning effects -- the possibility that participants exposed to a particular interface or task would do better the next time they were asked to use it -- the order of the 4 interface-task treatments was randomized among participants. Another potential problem with repeated-measures designs is mental fatigue. This was not likely to be a problem here, as each of the experimental sessions lasted only several minutes. Nevertheless, randomizing the treatments also controls for possible fatigue effects: not all participants will become fatigued during the same treatment.
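The randomization described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the software used in the experiment; the function name `treatment_order` and the fixed seed are assumptions made for the example.

```python
import itertools
import random

# The two factors of the 2x2 design, as named in the text.
INTERFACES = ("slide show", "storyboard")
TASKS = ("gist determination", "object recognition")


def treatment_order(rng):
    """Return the four interface-task treatments in a random order.

    `rng` is a random.Random instance; the function name and signature
    are hypothetical and exist only for this sketch.
    """
    treatments = list(itertools.product(INTERFACES, TASKS))
    rng.shuffle(treatments)  # every participant still sees all 4 treatments
    return treatments


if __name__ == "__main__":
    rng = random.Random(1997)  # fixed seed only so the sketch is repeatable
    for participant in range(1, 4):
        print(participant, treatment_order(rng))
```

Because each participant receives a permutation of the same four treatments, the within-subject comparison is preserved while order-related effects are spread across treatments.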

2.1. Hypotheses/Experimental Questions

Hypothesis 1. There will be statistically significant differences in performance depending on the combination of display type and user task.

One of the questions Slaughter et al. (1997) posed, based on their study of multiple surrogate presentation, was whether different surrogate types would be better suited to different task types. This study hypothesizes that specific surrogates are better suited for accomplishing specific tasks. In particular, the objective of the GD task is to understand the basic idea presented in the video. Because video is a dynamic medium (i.e., moving images), the temporal dimension, such as the perceived motion itself and the mechanism of scene changes, is part of the visual narrative structure. The SS display maintains this mechanism and hence is predicted to be better suited for GD than the SB display. However, since the OR task requires more detailed examination of the images and does not depend on temporal flow, it is predicted that the SB display will be more effective for this task than the SS display.

Hypothesis 2. Subjective satisfaction will be higher for the storyboard (static) design than for the slide show (dynamic) interface design overall. However, satisfaction with the slide show interface will be higher for the gist determination task than for object recognition.

In several previous studies (Komlodi, 1997; Slaughter et al., 1997; Ding et al., 1997), users consistently rated the SS design as less satisfactory than the SB design. There is no reason to believe that such subjective evaluations of SS will change in the present study. However, because this study focuses on the fit between interfaces and specific tasks, it is likely that users will find SS more satisfying for the GD task, hypothesized to be a "better fit," than for the OR task.


2.2. Experimental Design

2.2.1. Independent Variables:

2.2.2. Dependent Variables:

2.2.3. Subjects:

Thirty-four subjects participated in the experiment: 16 females and 18 males. All were students at the University of Maryland in College Park. Twenty were undergraduates enrolled in introductory psychology, 11 were graduate students (master's or doctoral programs) in a variety of disciplines (e.g., library and information science, computer science, and engineering), and the remaining four were undergraduates not in the introductory psychology class. The 20 students from the psychology class were given a single credit toward their course grade in return for their participation. The remaining 14 students did not receive any compensation. All subjects participated on a voluntary basis. The mean age of the participants was 23.79 years, with a range from 18 to 50 years. All but one of the subjects reported that they had some Web experience, and all reported some experience with graphics or video hardware or software.

2.2.4. Materials:

Software: The software used in the experiment was developed in MS Visual Basic 3.0. Subjects were asked to read along with the instructions presented on the screen while a tape recording of one of the experimenters reading the instructions was played back. Subjects progressed through the screens by pressing buttons marked "Continue" on the bottom of each screen. The sample and experimental trials were self-paced--subjects were free to use the video browser interface for as long as they wanted and progressed to the next screen by pressing a "Continue" button. The storyboard interface displayed all 12 key frames for a clip on one screen. The arrangement was 3 rows of 4 columns (i.e., a 3x4 array). The first four key frames were placed in the top row, ordered from left to right. The next four were in the middle row and the last four in the bottom row. For the slide show interface, images were displayed at a rate of three key frames per second. After all of the images had been displayed, the browser automatically looped back to the first slide and began again. Answers to task-based questions and immediate feedback satisfaction surveys were completed online through selection of predefined answers. The only input device required was a mouse. Besides using the online system, subjects were required to complete the consent form, assessment of spatial visualization (VZ-2), and overall subjective satisfaction questionnaire (QUIS) on paper. The software also included a module to randomize the interface-task treatments so that different subjects would receive each of the four experimental treatments in a random order (to control for learning effects). A text file transaction log was generated at the end of the session. It recorded the image set used, the interface-task combination tested, time spent using the video browser in seconds, and answers to the task and satisfaction questions for each of the four experimental trials.
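The two display behaviors described above reduce to simple index arithmetic, sketched below in Python. This is not the original Visual Basic code; the function names are hypothetical, and key frames are assumed to be numbered from 0 for convenience.

```python
ROWS, COLS = 3, 4                # storyboard layout: 3 rows of 4 columns
FRAMES_PER_CLIP = ROWS * COLS    # 12 key frames per clip
SLIDE_RATE = 3                   # slide show rate, key frames per second


def storyboard_cell(frame_index):
    """Map a 0-based key frame index to its (row, column) in the storyboard.

    Frames fill the top row left to right, then the middle row, then the
    bottom row, as described in the text.
    """
    if not 0 <= frame_index < FRAMES_PER_CLIP:
        raise ValueError("frame_index must be between 0 and 11")
    return divmod(frame_index, COLS)


def slide_show_frame(elapsed_seconds):
    """Return the 0-based key frame on screen after `elapsed_seconds`.

    The slide show loops back to the first frame after the last one.
    """
    if elapsed_seconds < 0:
        raise ValueError("elapsed_seconds must be non-negative")
    return int(elapsed_seconds * SLIDE_RATE) % FRAMES_PER_CLIP


if __name__ == "__main__":
    print(storyboard_cell(4))      # first frame of the middle row
    print(slide_show_frame(4.0))   # start of the second loop
```

At three key frames per second, one pass through the 12 key frames of a clip takes 4 seconds, so a subject who stays on the slide show screen longer than that sees the sequence repeat.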

Video Materials: Video clips were obtained from three Discovery Channel documentary CD-ROMs: Aquatic Habitats, How the West Was Lost, and Wonders of Weather. Eight 1.5 sec - 3.0 sec video clips were selected for this experiment. Key frames (representative still images) were selected through a combination of two methods. First, key frames were selected algorithmically using the MERIT program, developed at the Center for Automation Research (CfAR) at the University of Maryland at College Park (Kobla et al., 1996). The algorithm selects images based on color histograms. One of the investigators then viewed the clips and manually extracted additional key frames to supplement those supplied by MERIT. Twelve key frames were selected for each clip. The image files were saved as bitmaps (BMP files) at a resolution of 120x120 pixels.
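To give a sense of what selecting key frames from color histograms can look like, the following Python sketch marks a frame as a key frame when its color histogram differs enough from that of the previous key frame. It is a generic illustration and not the MERIT algorithm, whose details are not given here; the frame representation (a list of RGB tuples), the bin count, the distance measure, and the threshold are all assumptions.

```python
def color_histogram(frame, bins=4):
    """Normalized RGB histogram of a frame.

    `frame` is assumed to be a non-empty list of (r, g, b) pixels with
    values in 0-255; each channel is quantized into `bins` levels.
    """
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in frame:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    total = len(frame)
    return [count / total for count in hist]


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 means identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def select_key_frames(frames, threshold=0.5):
    """Return indices of frames that differ from the last chosen key frame.

    The first frame is always kept; the threshold is an arbitrary
    illustrative value.
    """
    if not frames:
        return []
    keys = [0]
    last = color_histogram(frames[0])
    for i in range(1, len(frames)):
        hist = color_histogram(frames[i])
        if histogram_distance(last, hist) > threshold:
            keys.append(i)
            last = hist
    return keys


if __name__ == "__main__":
    red = [(255, 0, 0)] * 4
    blue = [(0, 0, 255)] * 4
    print(select_key_frames([red, red, blue, blue, red]))
```

A purely histogram-driven selection of this kind can miss frames that matter to the story but look similar in color, which is consistent with the investigators supplementing the algorithmic selection by hand.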

Experimental Setting and Hardware: Two sessions were arranged in University of Maryland teaching theaters so that multiple subjects could participate simultaneously. Twelve subjects participated during the first session in the AT&T teaching theater. Eighteen additional subjects were present at the second session in the aITs teaching theater. The remaining four subjects could not attend either session and were administered the experiment in individual sessions at different times in the Digital Library Research Group (DLRG) laboratory in Hornbake Building (HBK 4121B). The computers used in the group sessions were Gateway 2000s with Intel Pentium microprocessors, 15-in. monitors set at 800x600 resolution, and the Microsoft Windows 95 operating system. The computer used in the DLRG lab was also a Gateway Pentium with Windows 95. However, it had a 21-in. monitor set at 800x600 resolution.

Paper-Based Forms: Three paper-based forms were prepared for the experiment:

1. Two different sets of consent forms were used: one for introductory psychology students (approved by the Psychology Department human subjects board) and one for all other participants (approved by the College of Library and Information Services human subjects board).
2. VZ-2 by the Educational Testing Service (ETS), a standard instrument for measuring spatial visual abilities (SVA), was used. The test consists of two sets of 10 questions each. For each question, a sequence of images indicates how a square piece of paper is folded and ultimately punctured by a pencil. Five completely unfolded pieces of paper are shown with holes in them. Participants are instructed to place an "X" over the image that represents what the punctured paper would look like after it is unfolded. Three minutes are allowed for answering each set of 10 questions.
3. The subjective satisfaction questionnaire consisted of 4 parts and was adapted from the QUIS instrument developed by the Human-Computer Interaction Laboratory (HCIL), University of Maryland, College Park. After an initial section on demographics, the first part involved questions about past experience with computer systems and, in particular, visual, graphic, and video experiences. The second part consisted of two identical sets of five questions, one for each of the two interface designs (SB and SS). The third part had two general questions about screen and image clarity. The final part had two questions relating to each interface design and a section for general comments. All of the questions were either short answer, multiple choice, or based on a Likert scale (1-9).

2.3. Procedures


1. Subjects were briefed on the goals of the experiment.
2. Subjects were asked to read and sign consent forms.
3. Both interface designs were explained and demonstrated.
4. An assessment of spatial visual abilities (SVA) was administered.
5. Subjects were given 30 seconds to try each of the interface designs.
6. Two sample trials were administered to familiarize subjects with the experimental conditions. The two trials combined elements from the four treatments so that each of the two tasks and each of the two interface designs was tried.
7. Four experimental trials, one for each treatment, were administered.
8. Subjects were debriefed and given an opportunity to ask questions.
9. Subjects completed a brief subjective satisfaction questionnaire.

