Chapter 12


Future Research

In this chapter we list the issues concerning the InfoCrystal that we would like to address in the near future. The chapter that discusses some of the possible applications of the InfoCrystal also touches on the question of how the InfoCrystal might evolve in the future.

As we have already mentioned, we have implemented the InfoCrystal prototype on the Macintosh using the MacLISP programming language. We intend to reimplement the InfoCrystal using a more common programming language and a faster platform, so that it can interface more easily with a diverse set of retrieval methods and act as their common interface. We are planning to address the following issues in the near future:

o        We could provide users with a cone-shaped visual object, called the Input/Cone, that enables them to interact with a hierarchical classification system or a thesaurus to define the concepts of the InfoCrystal. Users navigate through a thesaurus by moving from more general terms to more specific ones, and they get immediate feedback from the way the data distribution changes. The cone shape is intended to reinforce visually that if users move in the direction where the cone gets narrower, then they are narrowing the query by choosing a more specific concept. We are in the process of implementing the Input/Cone visual representation (see Figures 12.1 and 12.2).

o        We could also use clustering techniques to help users identify concepts of interest or the dimensions along which to explore an information space. A user could retrieve an initial set and apply a clustering technique to it, such as the Scatter/Gather method developed by Cutting et al. (1991), to identify N principal components, where N can be specified by the user. These identified concepts can then be used to create an InfoCrystal with N inputs (see Figure 12.3). Users could progressively refine their concept selections by using this automatic identification of the reference dimensions.

If users click on any of the fields, then they select the concept associated with it and the Input/Cone is updated to reflect this change. Although not shown here, there will be buttons at the edges of each cone layer so that users can view and possibly select concepts that are currently not visible. Users can interact with the Input/Cone to change the specificity of the input concepts. They can easily zoom in and out and observe how the data distribution changes across the different relationships.


Figure 12.2: shows how the Input/Cone can be used to define the inputs to a five-concept InfoCrystal. The interior icons are displayed using numerical style to show how the retrieved documents are distributed across the 31 different relationships.


Figure 12.3: displays how users could use a three-concept InfoCrystal, for example, to generate an initial set by selecting only those relationships that satisfy at least two of the three concepts. Next, users could perform a clustering analysis on the output of the InfoCrystal. They can use the centroids of the four identified clusters to define the reference concepts for a new, four-concept InfoCrystal.
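The relationship counts mentioned in the captions above (31 for five concepts) and the "at least two of the three concepts" selection can be illustrated with a minimal Python sketch. It assumes that each interior icon corresponds to one non-empty subset of the input concepts and that a document is counted under the icon for exactly the set of concepts it satisfies. The function names and the toy collection are hypothetical and are not taken from the prototype.

```python
from itertools import combinations


def relationships(concepts):
    """Return every non-empty subset of the concepts (one per interior icon)."""
    subsets = []
    for size in range(1, len(concepts) + 1):
        subsets.extend(frozenset(c) for c in combinations(concepts, size))
    return subsets


def distribute(documents, concepts):
    """Count documents per relationship.

    `documents` maps a document id to the set of concepts it satisfies.
    Documents that satisfy none of the concepts are not counted.
    """
    counts = {rel: 0 for rel in relationships(concepts)}
    for satisfied in documents.values():
        key = frozenset(satisfied) & frozenset(concepts)
        if key:
            counts[key] += 1
    return counts


def at_least(counts, k):
    """Keep only the relationships that involve at least k concepts."""
    return {rel: n for rel, n in counts.items() if len(rel) >= k}


# Hypothetical toy collection, for illustration only.
docs = {
    "d1": {"A"},
    "d2": {"A", "B"},
    "d3": {"A", "B", "C"},
    "d4": {"B", "C"},
    "d5": {"C"},
    "d6": set(),
}

counts = distribute(docs, ["A", "B", "C"])
selected = at_least(counts, 2)
initial_set_size = sum(selected.values())
```

For N concepts the sketch yields 2^N - 1 relationships, which is the exponential growth that makes a structured overview necessary.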


         Icon Library of Retrieval Engines and Data Generators: The leaf or atomic nodes of the query structure represent the locations where we interface with external information sources. The items, which are retrieved based on the instructions specified in the leaf nodes, are propagated through the query structure in a bottom-up fashion. We want users to be able to interact with a library of icons, which represent the available data generators, that they could select-drag-drop in the corners of the InfoCrystal's border area to specify an atomic input. Once the input has been defined, the appropriate window would appear asking the user to specify the needed settings.

         3-D visualization: We want to explore how we could use 3D computer graphics techniques to enhance the InfoCrystal: 1) We could place the corners and the criterion icons of an InfoCrystal in the third dimension to reflect the input weights. 2) The interior icons could be placed in the third dimension to reflect their relevance scores. 3) There are already multiple layers of information associated with the interior of an InfoCrystal, and further information could be added in the future. For example, we could enable users to interact with these different levels of information by varying the degree of transparency between the layers or the depth of focus.

         How to Assist Users in the Programming of the InfoCrystal? A single InfoCrystal visualizes a large universe of feasible queries. How can we help users to explore this huge query space? How can we help users converge quickly on a query that satisfies their information need? In chapter 4 we have addressed how we can assist users in the translation of Boolean expressions into the InfoCrystal. We also want to develop methods that use other sources of information to determine the selection pattern of the interior icons. For example, we would like to integrate learning mechanisms to assist the user in the selection of the interior icons. Learning agents could use relevance feedback from the user to compute the selection status of the interior icons or to recompute the input weights and the threshold setting. Further, we would like to explore the possibility of enabling users to create macros that help them to explore the large query space visualized by an InfoCrystal.
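One way relevance feedback could drive the selection status of the interior icons is sketched below in Python. This is only an illustrative heuristic under stated assumptions, not a method described in this work: an icon is selected when the fraction of its judged documents that the user marked relevant reaches a threshold. The function name, the data layout, and the default threshold of 0.5 are all hypothetical.

```python
def select_icons(icon_docs, judgments, threshold=0.5):
    """Derive a selection status for each interior icon from user feedback.

    icon_docs: maps an icon (here a frozenset of concept names) to the set
               of document ids displayed under it.
    judgments: maps a document id to True (relevant) or False (not relevant);
               documents the user has not judged are simply absent.
    An icon with no judged documents stays unselected (no evidence yet).
    """
    status = {}
    for icon, docs in icon_docs.items():
        judged = [judgments[d] for d in docs if d in judgments]
        if not judged:
            status[icon] = False
            continue
        ratio = sum(judged) / len(judged)
        status[icon] = ratio >= threshold
    return status


# Hypothetical feedback on a two-concept InfoCrystal.
icon_docs = {
    frozenset({"A"}): {"d1", "d2"},
    frozenset({"A", "B"}): {"d3", "d4", "d5"},
    frozenset({"B"}): {"d6"},
}
judgments = {"d1": False, "d3": True, "d4": True, "d5": False}

status = select_icons(icon_docs, judgments)
```

A learning agent of the kind envisioned here would presumably use richer evidence; the sketch only shows where user judgments could enter the selection pattern.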

         Revealing the Complexity Gracefully: As we have pointed out, the number of possible relationships between N concepts grows exponentially. The InfoCrystal offers users a structured overview of all these relationships, but their sheer number can be overwhelming. Users can currently display the interior icons in different styles to emphasize the icons of interest. For example, users can choose from a list of common Boolean expressions to select a subset of the interior icons, and the unselected icons are displayed in the point style so as not to crowd the display.
We are interested in developing alternative ways of enabling users to juggle many different concepts without becoming too overwhelmed by the resulting complexity. 1) We are considering a fish-eye transformation that emphasizes interior icons satisfying certain requirements and de-emphasizes the others. 2) We are interested in developing methods for displaying the different interior icons in stages and as the need arises ("just-in-time-display"). These methods could consider the way the user interacts with the InfoCrystal to determine how to display and "roll out" the interior icons. We have already implemented a method along these lines in the context of applying Boolean operations (see section 4.2.2 and Figure 4.2).

         Support multiple data-visualization methods: We want to build a software environment, where we can view the information using different visualization techniques, and where the output of one visualization can be the input to another one. For example, we would like to be able to visualize the InfoCrystal's output using Parallel Coordinates (PCs), select a subset of the items displayed in PCs and pass them on to another visual analysis tool.

         We want to continue the development of our object-centered software environment to be able to use any data type as an input to the InfoCrystal and then have the appropriate retrieval method called automatically. For example, at any point in the search process we want users to be able to switch from a keyword-based to a full-text, Partial Matching approach without having to concern themselves with whether the appropriate retrieval engine is called. They could select a document from the ranked-list window of the InfoCrystal and drag-drop it onto the location of the criterion icon that they wish to replace with this particular document, because it better captures a particular aspect of their information need.

         One of our ultimate goals is to be able to test whether the InfoCrystal interface leads to more effective retrieval as measured in terms of precision and recall. This is one of the reasons why we want to reimplement the InfoCrystal on a more powerful platform than the Macintosh, so that we can more easily tap into a rich array of retrieval engines that enable us to search large databases more quickly. We believe that one of the InfoCrystal's key advantages is that it is flexible in terms of the retrieval methods that can be used and that it enables users to move seamlessly and quickly between them. Information retrieval is a highly interactive process, where users start out with one translation of their information need, which they modify as their search progresses and as they respond to the intermediate results. The InfoCrystal could be well suited to support users in the search process. We believe that the ability to explore large information spaces with a variety of powerful retrieval engines will show us how the InfoCrystal needs to evolve to become a truly effective retrieval interface.
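For reference, the two effectiveness measures named above can be computed as in the following minimal Python sketch, using the standard definitions: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. The function name and the example document ids are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: ids of the documents returned by the query.
    relevant:  ids of the documents judged relevant to the information need.
    Either measure is reported as 0.0 when its denominator would be zero.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents retrieved, three judged relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]
p, r = precision_recall(retrieved, relevant)
```

In an evaluation of the kind described here, such figures would be compared between queries formulated with the InfoCrystal and queries formulated with another interface over the same collection.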