Discussion

4.1  Overview of Analysis and Findings

Our results do not allow us to make conclusive remarks about optimal coding schemes in Spotfire. Several factors contributed to this, including subjects' learning of the query tasks (creating a possible bias toward faster performance at the end of the experiment) and large variances (standard deviations) in our data. In addition, our results are based on only 13 subjects, which may or may not reflect the results we would have obtained with a larger subject pool.

4.1.2  Analysis of user remarks and observations of subjects

Some subjects said that they preferred to look only at the query devices, ignoring the starfield display completely. Because the task was the same in each treatment (case), subjects most likely memorized or became familiar with the sequence of mouse movements, clicks, etc. needed to accomplish the task. We attribute the improvement in performance over the course of the experiment to this learning phenomenon.

4.2  Summaries of statistical results

The summaries below discuss the interactions between the rows and columns of the 2x3 Anova.

  • In 4.2.1, you will find three tables, one for each starfield display setup, each of which compares the two color schemes (color-coded and graded-shading), along with our findings.
  • In 4.2.2 and 4.2.3, you will find two tables, one for each color scheme, each of which compares the three starfield display setups, along with our findings.
  • Finally, in 4.2.4, we discuss the implications of the overall 2x3 Anova with replication.

    Additionally, our MS Powerpoint Discussion slides are available.

    4.2.1  Summary of t-test paired two sample for means

    Click here to see the [t-test] interaction between the two treatments of Popularity & Length on the 2x3 Anova.
    Cases w/ starfield display: Popularity & Length

    Case                       Mean     Variance   Standard Deviation
    Case 1 (color-coded)       70.23    609.86     24.70
    Case 4 (graded-shading)    53.31    321.90     17.94

    The fact that the data points are closely packed together on the starfield display may negate any benefits of either the color or the shading scheme. The large differences in both means and variances between the two cases make it difficult to draw firm conclusions. We believe that the faster time for case 4 can be attributed to subjects learning the interface and the task.
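The paired t statistic cannot be recovered from the means and variances alone, because it also depends on how strongly each subject's two times are correlated. The following is a minimal sketch of that dependence, not the analysis actually run: the function name `paired_t_from_summary` and the correlation values `r` are assumptions for illustration (no correlation is reported here), and n = 13 is taken from the subject count stated in the overview.

```python
import math

def paired_t_from_summary(mean_a, mean_b, var_a, var_b, r, n):
    """Paired t statistic from summary statistics and an ASSUMED correlation r."""
    # Variance of the per-subject differences:
    # Var(a - b) = var_a + var_b - 2 * r * sd_a * sd_b
    var_diff = var_a + var_b - 2.0 * r * math.sqrt(var_a * var_b)
    std_err = math.sqrt(var_diff / n)
    return (mean_a - mean_b) / std_err

# Case 1 vs. case 4 (means and variances from the table above), n = 13 subjects.
for r in (0.0, 0.5, 0.8):  # hypothetical correlations, not measured values
    t = paired_t_from_summary(70.23, 53.31, 609.86, 321.90, r, 13)
    print(f"assumed r = {r:.1f}: t = {t:.2f} (df = 12)")
```

The higher the assumed correlation between a subject's two times, the larger the t statistic for the same difference in means, which is why the per-subject data are needed to reproduce the actual test.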

     

    Click here to see the [t-test] interaction between the two treatments of Subject & Length on the 2x3 Anova.
    Cases w/ starfield display: Subject & Length

    Case                       Mean     Variance   Standard Deviation
    Case 2 (color-coded)       57.77    732.03     27.06
    Case 5 (graded-shading)    50.00    576.83     24.02

    The small differences in both means and variances suggest that the choice between color coding and graded shading has minimal effect on user performance.

    Click here to see the [t-test] interaction between the two treatments of Subject & Popularity on the 2x3 Anova.
    Cases w/ starfield display: Subject & Popularity

    Case                       Mean     Variance   Standard Deviation
    Case 3 (color-coded)       50.31    326.90     18.08
    Case 6 (graded-shading)    37.62    148.92     12.20

    Ideally, this setup and the previous setup (cases #2 & #5) should lead to similar results. However, the markedly faster performance of case #6 can be attributed largely to subjects learning the tasks through repetition.

     

    4.2.2  Summary of single factor Anova on the three color-coded cases

    Click here to see the visual representation of our 2x3 Anova.
    Click here to see the interaction between cases 1,2 and 3.
    Case      Mean     Variance   Standard Deviation
    Case 1    70.23    609.86     24.70
    Case 2    57.77    732.03     27.06
    Case 3    50.31    326.90     18.08

    Case 1 (Popularity & Length) yields a considerably higher mean value (slower performance) than case 2 (Subject & Length): 70.23 vs. 57.77. The variances for the two cases appear to be close (standard deviations 24.70 and 27.06). The large mean difference and the small variance difference lead us to attribute the difference between the two means to the difference in query setup (Popularity & Length vs. Subject & Length).

    Case 2 (Subject & Length) yields a slightly higher mean value (slower performance) than case 3 (Subject & Popularity): 57.77 vs. 50.31. The variances for the two cases, however, differ considerably (standard deviations 27.06 and 18.08). The small mean difference and the large variance difference make it plausible that the difference between the means is due to chance variation.
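For equal-sized groups, the F statistic of a single-factor Anova can be recomputed from the means and variances in the table above. The sketch below is illustrative only: the function name `one_way_f_from_summary` is hypothetical, n = 13 per case is assumed from the subject count given in the overview, and the three cases are treated as independent groups even though the same subjects performed all of them.

```python
def one_way_f_from_summary(means, variances, n):
    """F statistic for a single-factor Anova with k groups of equal size n."""
    k = len(means)
    grand_mean = sum(means) / k
    # Between-groups mean square: n * (sum of squared deviations) / (k - 1)
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    # Within-groups mean square: with equal n, the average of the group variances
    ms_within = sum(variances) / k
    degrees_of_freedom = (k - 1, k * (n - 1))
    return ms_between / ms_within, degrees_of_freedom

# Color-coded cases 1, 2 and 3; n = 13 per case is an assumption.
f_stat, dof = one_way_f_from_summary(
    means=[70.23, 57.77, 50.31],
    variances=[609.86, 732.03, 326.90],
    n=13,
)
print(f"F = {f_stat:.2f} with df = {dof}")
```

The resulting F value would then be compared against the critical value for those degrees of freedom in an F table. Passing the graded-shading values from 4.2.3 to the same function gives the corresponding statistic for cases 4, 5 and 6.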

     

    4.2.3  Summary of single factor Anova on the three graded-shading coded cases

    Click here to see the visual representation of our 2x3 Anova.
    Click here to see the interaction between cases 4,5 and 6.
    Case      Mean     Variance   Standard Deviation
    Case 4    53.31    321.90     17.94
    Case 5    50.00    576.83     24.02
    Case 6    37.62    148.92     12.20

    Case 4 (Popularity & Length) yields a slightly higher mean value (slower performance) than case 5 (Subject & Length): 53.31 vs. 50.00. The variances for the two cases differ considerably (standard deviations 17.94 and 24.02). The small mean difference and the large variance difference suggest that the difference between the two means could be due to chance variation.

    Case 5 (Subject & Length) yields a considerably higher mean value (slower performance) than case 6 (Subject & Popularity): 50.00 vs. 37.62. The variances for the two cases also differ considerably (standard deviations 24.02 and 12.20). The large mean difference and the large variance difference suggest that chance variation may strongly affect the means in this particular comparison.
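One way to put a number on how different two variances are is their ratio, which a variance-ratio (F) test would compare against an F table. The sketch below only computes the ratios for the graded-shading cases; the helper name `variance_ratio` is hypothetical, and a formal test would additionally assume independent, roughly normal samples, which may not hold here because the same subjects performed every case.

```python
def variance_ratio(var_a, var_b):
    """Ratio of the larger sample variance to the smaller one (always >= 1)."""
    return max(var_a, var_b) / min(var_a, var_b)

# Variances of the graded-shading cases, taken from the table above.
graded = {"case 4": 321.90, "case 5": 576.83, "case 6": 148.92}

ratio_5_vs_4 = variance_ratio(graded["case 5"], graded["case 4"])
ratio_5_vs_6 = variance_ratio(graded["case 5"], graded["case 6"])
print(f"case 5 vs. case 4: {ratio_5_vs_4:.2f}")
print(f"case 5 vs. case 6: {ratio_5_vs_6:.2f}")
```

A ratio close to 1 corresponds to what the text calls "close" variances; the further the ratio is from 1, the less comparable the spreads of the two cases are.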

     

    4.2.4   Summary of Two-factor Anova with replication (2x3)

    The basic 2 x 3 analysis up to this point makes two assumptions: the three query setups have the same magnitude of effect under each of the two color/shade schemes, and, conversely, the two color/shade schemes have the same additive effect under each of the three query setups.

    However, it is also possible that the two factors interact; that is, each individual query setup has a different magnitude of effect under each color/shade scheme, and vice versa. In that case, the cells of the 2 x 3 combination are not necessarily related to one another in a simple additive way.

    Click here to see the [row-wise] interaction analysis in our 2x3 Anova.

    The table below contains the means of our test results in 2 x 3 format.
                      Popularity & Length   Subject & Length   Subject & Popularity
    Color-coded       70.231                57.769             50.308
    Graded-shading    53.308                50.000             37.615

    To simplify the explanation, we label the cells of the 2 x 3 matrix above (1) through (6) as follows.

                      Popularity & Length   Subject & Length   Subject & Popularity
    Color-coded       (1)                   (2)                (3)
    Graded-shading    (4)                   (5)                (6)

    Ideally, the curves through [(1)(2)(3)] and [(4)(5)(6)] should follow similar trends, since the query setups are the same in both rows. However, our actual data show (1) much higher (slower performance) than this "hypothetical" trend and (6) much lower (faster performance) than it. The results of these two cases may be due to interaction specific to those cells of the 2 x 3 matrix above.
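The "hypothetical" trend can be made concrete with the additive (no-interaction) model, in which each cell is predicted as row mean + column mean - grand mean. The sketch below computes how far each observed cell mean lies from that prediction; the function name `additive_residuals` is hypothetical, and this is one possible reading of the hypothetical trend rather than the computation behind the Anova output.

```python
def additive_residuals(cell_means):
    """Observed cell mean minus the additive-model prediction, for every cell."""
    rows = len(cell_means)
    cols = len(cell_means[0])
    row_means = [sum(row) / cols for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(rows)) / rows for j in range(cols)]
    grand_mean = sum(row_means) / rows
    return [
        [cell_means[i][j] - (row_means[i] + col_means[j] - grand_mean) for j in range(cols)]
        for i in range(rows)
    ]

# Cell means from the 2 x 3 table above.
means = [
    [70.231, 57.769, 50.308],  # color-coded: cells (1), (2), (3)
    [53.308, 50.000, 37.615],  # graded-shading: cells (4), (5), (6)
]
for label, row in zip(("Color-coded", "Graded-shading"), additive_residuals(means)):
    print(label, [round(r, 3) for r in row])
```

Residuals near zero are consistent with purely additive effects of query setup and color scheme; residuals that are large relative to the spread of the data would point to interaction in those cells.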

    The closely packed data points on the starfield display in (1) may account for the slow performance time. Alternatively, it may be because this was the first task and the users were simply unfamiliar with it.

    Case (6) yielded markedly faster performance, possibly because the task requires an exact subject (drama) and popularity (most popular); to meet the requirement for length, users simply have to move the length slider into the 1 1/2 - 2 hour range. On the other hand, the results could also be attributed largely to subjects learning the tasks through repetition.

