

Chapter 11

 

Applications

In this chapter we present a collection of brief scenarios of how the InfoCrystal representation could be applied in different domains. This collection is not intended to be exhaustive, but instead to stimulate the reader's mind and to demonstrate the versatility of the InfoCrystal.

The InfoCrystal is a flexible tool that enables users to compare arbitrary sets of data items. These sets can be fuzzy sets, in which case the degree of membership can be used in the visualization. The InfoCrystal can structure and cluster the information into discrete or continuous groupings to visualize how the information is related to several criteria. It can serve as a visual Boolean Calculator in any domain where users need to coordinate several criteria, and it makes it easy for them to formulate and change their queries. Further, users can specify relevance weights and the InfoCrystal can visualize the resulting ranked output. The question at hand is: In which situations are the types of structure the InfoCrystal can visualize, and the types of operations it makes possible, useful to users? We will begin to answer this question by listing some of the data generators that could be used to define the inputs to the InfoCrystal. Next we will discuss how the InfoCrystal can be used as a general-purpose coordinator or generator of arbitrary data streams. Finally, we will provide brief scenarios of how the InfoCrystal can be applied in the following ways and domains: Internet Exploration, Document & Information Retrieval, Database Mining, Multimedia Editing, Electronic Mail Filters, Hypertext Browser and Link Builder, Statistical Visualization, Visualizing the Power Set, Boolean Networks, and Neural Networks.



Figure 11.1: The InfoCrystal can be used to coordinate diverse data generators to create a hybrid or heterogeneous data generator. The input data generators are: (a) Parallel Coordinates; (b) Slider; (c) Clustering Overview; (d) Map; (e) Input/Cone. The interior icons have been selected so that the output includes all the data generated by only one of the input generators as well as the data generated by at least four of the inputs. We have also selected some of the icons of rank three.


There are multiple ways to generate an InfoCrystal input, as long as the device used generates an ordinary or fuzzy set as its output: 1) Parallel Coordinates representation, which maps a point in a higher-dimensional space into a piecewise linear curve in a two-dimensional display using parallel coordinates so as not to lose any information [Inselberg 1985]. Users can select a subset of these lines to define the output. 2) Slider, where users can define range(s) of values along a discrete or continuous data dimension. 3) Clustering Overview, where users can select multiple subsets of items to define the output. 4) Map, which can be used to specify a subset or a two-dimensional area of data points. 5) Thesaurus or Classification Hierarchy, where users can select the appropriate concepts at the desired level of specificity by interacting with an Input/Cone object (see Figure 12.1 for explanation). Figure 11.1 shows the visual objects representing the data generators mentioned above. This list of data generators is not meant to be exhaustive, but rather to give an indication of the types of generators that can be used. Figure 11.1 further shows how the InfoCrystal can be employed as a general-purpose coordinator of arbitrary information generators, and how it can act as a generator of diverse data streams. For example, we can use the InfoCrystal to combine and coordinate data streams containing diverse data types. If the data sets generated by the inputs do not overlap a great deal, then the InfoCrystal will clearly visualize this by having the data cluster away from the center.
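The coordination of several data generators can be illustrated with a minimal sketch. It assumes that each generator outputs an ordinary (non-fuzzy) set of item identifiers; the generator names, the item ids, and the helper functions `interior_icons` and `select_by_rank` are hypothetical and are not part of the InfoCrystal implementation. Each item is assigned to the one interior icon that corresponds to exactly the inputs containing it, and the output is formed by selecting icons by rank, here rank one plus ranks four and five, similar in spirit to the selection described for Figure 11.1.

```python
from itertools import combinations


def interior_icons(inputs):
    """Map each non-empty combination of input names to the set of items
    that belong to exactly those inputs and to no others."""
    names = sorted(inputs)
    icons = {}
    for rank in range(1, len(names) + 1):
        for combo in combinations(names, rank):
            icons[combo] = set()
    universe = set().union(*inputs.values())
    for item in universe:
        # The key lists, in sorted order, every input that contains the item.
        key = tuple(n for n in names if item in inputs[n])
        icons[key].add(item)
    return icons


def select_by_rank(icons, ranks):
    """Union of the contents of all icons whose rank (the number of
    inputs satisfied) is one of the given ranks."""
    selected = set()
    for combo, items in icons.items():
        if len(combo) in ranks:
            selected |= items
    return selected


# Hypothetical outputs of five data generators (item ids are made up).
inputs = {
    "parallel_coords": {1, 2, 3, 4},
    "slider": {2, 3, 5},
    "clustering": {3, 4, 5, 6},
    "map": {3, 4, 7},
    "input_cone": {3, 8},
}

icons = interior_icons(inputs)
selected = select_by_rank(icons, {1, 4, 5})
print(len(icons))        # number of interior icons for five inputs
print(sorted(selected))  # items generated by exactly one or by at least four inputs
```

With five inputs there are 31 interior icons, one per non-empty combination of inputs. Fuzzy inputs would require carrying a degree of membership per item, which this sketch leaves out.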

We will now provide brief scenarios of how the InfoCrystal can be applied in the following ways and domains:

·         Document & Information Retrieval: It is important to distinguish between Document Retrieval and Data Retrieval. The retrieval of documents requires different tools than the search for data, because documents contain contextual and structural information that needs to be considered to be effective. Hence, we have developed additional visual tools to formulate and represent stemming, field and proximity specifications, which are of great value in text retrieval, employing a simple metaphor to visualize the resulting broadness of a query.
With respect to Boolean coordination, retrieval specialists often advise searchers to formulate queries in which the quasi-synonymous words for each conceptual factor are ORed and the resulting synonym lists are then ANDed [Cooper 1988, Marcus 1991 and 1994]. Our default selection of the interior icons generates a query that is equivalent to the one suggested by retrieval specialists. A key advantage of the InfoCrystal is that it shows not only this query, but also other related, potentially useful queries.
In the context of retrieving text documents, the NOT operator does not necessarily have a straightforward interpretation and is not as commonly used, except to exclude documents that have already been retrieved. In a certain sense the InfoCrystal makes it possible for searchers to formulate queries that they rarely use in document retrieval. Hence, the InfoCrystal could be further customized by explicitly offering the types of queries that are especially effective in document retrieval (i.e., ANDs of ORs), and "suppressing", at least initially, the remaining possible queries, which can overwhelm users.
Searchers often do little or no conceptual analysis of their query, especially of any formal kind. This is where the InfoCrystal could be a big help [Marcus (personal communication)]. The InfoCrystal could play a useful role as a tool for teaching the basic concepts of modern Boolean retrieval. In particular, it could also be used by search specialists in libraries to communicate with their clients and devise the appropriate search strategy.
One of the advantages of the InfoCrystal is that users can explore many different, but related ways of retrieving the information without having to modify the framework of a query. They can perform a "what-if" analysis by changing the inputs and observing how the retrieved information is propagated through the InfoCrystal query structure. Users can use a diverse set of retrieval methods to initialize a query structure. For example, at any point in the search process users can switch from a keyword-based to a full-text retrieval approach by replacing an input criterion with a particular document that better captures a specific interest. The InfoCrystal enables users to explore an information space without having to abandon their sense of overview. It provides users with a compact visual representation of how the retrieved documents relate to their specific interests. Such visual feedback helps users decide how to proceed in the search process.
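The "ANDs of ORs" query form recommended by retrieval specialists can be sketched as follows. This is an illustrative sketch only: the concept groups, the sample documents, and the functions `matches` and `query_string` are made up for the example, and matching is done on whole lowercase words without the stemming, field or proximity specifications mentioned above.

```python
def matches(document_words, concept_groups):
    """True if the document contains at least one term from every group,
    i.e. an AND across groups of ORed quasi-synonyms."""
    return all(
        any(term in document_words for term in group)
        for group in concept_groups
    )


def query_string(concept_groups):
    """Render the same query in conventional Boolean notation."""
    return " AND ".join(
        "(" + " OR ".join(group) + ")" for group in concept_groups
    )


# Hypothetical conceptual factors, each with its list of quasi-synonyms.
concept_groups = [["visualization", "graphics"], ["retrieval", "search"]]

# Hypothetical document collection.
docs = {
    "d1": "interactive graphics for document search",
    "d2": "statistical graphics and plotting",
    "d3": "retrieval of visualization research papers",
}

hits = [
    doc_id
    for doc_id, text in docs.items()
    if matches(set(text.split()), concept_groups)
]
print(query_string(concept_groups))
print(hits)
```

Document d2 is rejected because it mentions only one of the two conceptual factors. In InfoCrystal terms it would fall into a rank-one interior icon, where the user could still inspect it.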

·         Relational Databases: The InfoCrystal can be used as a Boolean Calculator, and is therefore ideally suited for specifying the required combinations of conditions that records in a relational database need to satisfy. In the context of relational databases, the use of the NOT operation is more appropriate and frequent than in document retrieval (see Document & Information Retrieval bullet).

o        Database Mining: The InfoCrystal could become a useful element in the toolbox that is needed to "excavate nuggets of value" possibly contained in large databases. The crystal could perform the function of a radar screen that provides users with a compact and structured overview of how the data is related to a multitude of criteria. The InfoCrystal can be easily integrated with other data analysis tools, as we have indicated in other parts of this thesis.

o        Financial Portfolio Management: A portfolio manager can formulate an array of criteria to be used to evaluate stocks. These criteria can then be grouped and arranged in a hierarchical structure. The manager will place the necessary criteria that a stock needs to satisfy at the bottom of the hierarchy. The created hierarchical structure acts like a filter that progressively refines the selection of stocks to be considered. The advantage of the InfoCrystal is that the portfolio manager can easily narrow or widen the selection by interacting with a direct manipulation interface. Furthermore, there is no limit on how the criteria are defined: it could be a simple property that needs to be computed or it could involve a complex computation.

o        Human Resource & Workteam Management: If a manager needs to assemble a new workteam, then the InfoCrystal allows the manager to see how the skills of his workforce distribute across the space defined by the needed skills. The interface allows the manager to see the workforce in a new light and to become aware of people with interdisciplinary skills. It could also enable workers within a company to identify and find needed resources within their own organization.
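For the relational database setting, the following sketch shows how a single interior icon could be translated into a selection condition that includes the NOT operation. It is written under stated assumptions: the table `staff`, its columns, the sample rows, and the helper `icon_where` are hypothetical, and the in-memory SQLite database merely stands in for whatever relational system is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (name TEXT, coding INTEGER, design INTEGER, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?)",
    [("Ana", 1, 1, 0), ("Ben", 1, 0, 0), ("Cho", 1, 1, 1), ("Dee", 0, 1, 1)],
)


def icon_where(satisfied, excluded):
    """Build the WHERE clause for one interior icon: the criteria that
    must hold, ANDed with the negation of those that must not hold.
    Column names are trusted constants here, not user input."""
    terms = [f"{c} = 1" for c in satisfied]
    terms += [f"NOT ({c} = 1)" for c in excluded]
    return " AND ".join(terms)


# Interior icon: has coding and design skills, but no sales skills.
where = icon_where(["coding", "design"], ["sales"])
rows = [
    row[0]
    for row in conn.execute(f"SELECT name FROM staff WHERE {where} ORDER BY name")
]
print(where)
print(rows)
```

Selecting several interior icons would correspond to ORing their individual conditions, since the icons represent disjoint groups of records.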

·         Multimedia Editing: When editing a film or video sequence, editors often face the problem of how to retrieve the appropriate segment that satisfies a certain combination of requirements at an edit point and also supports the desired overall mood. Editing involves the art of compromise and the juggling of multiple criteria to arrive at a solution that optimizes the different requirements and leads to an overall pleasing sequence. Editors often end up using a segment that does not satisfy all the criteria, but that satisfies one or two of the needed characteristics especially well and supports the desired overall mood. Editing can involve a lot of trial and error: editors need to try different possibilities, and this very fluid process can lead them to change their minds about what is needed to make a particular transition work. The InfoCrystal can provide editors with a versatile palette that shows them not only the next best possible shots, but a whole range of segments that they could use to "paint" the next segment, together with how these segments relate to their requirements. Editors can specify the degree of importance they assign to the different criteria by interacting with the weight sliders; the InfoCrystal then provides them with a ranked output of the possible shots.
In the multimedia context it also seems appropriate that users could interact with an iconic interface to specify their requirements. The MediaStreamer is an example of a rich visual interface that enables users to annotate video clips by interacting with a visual taxonomy of video events [Davis 1993]. We could easily build on this representation and use it to specify search criteria. In the current InfoCrystal implementation, all the atomic or leaf nodes are represented as circles and hence all look the same, although they represent different content. Hence, it could be beneficial if the criterion icons could reflect their content in a visual way and not just the structure or form of the query. Users could interact with a library of icons representing the available data generators, which they could select, drag and drop into the corners of the InfoCrystal's border area to specify an input. If the data generator has an iconic representation of the appropriate size, then it could even be displayed instead of the generic circular criterion icon. If the data generator operates on a hierarchical structure, such as the ACM classification system or another taxonomy, and its elements have iconic representations, as is the case in the MediaStreamer for example, then the criterion icons could be replaced by the icon representing the instance currently selected in the classification hierarchy.

·         Electronic Mail Filters: The InfoCrystal could be used to filter and organize electronic mail. The crystal shows users how the received mail messages relate to their stated interests, and they could use it to help them decide which mail messages to read first, namely the ones that have at least a certain relevance score. Once users have programmed the InfoCrystal by selecting the relationships of interest or by setting the weights and the threshold, they would be presented with a ranked list of the crystal's output. At any point users could view the InfoCrystal to see which types of messages they are ignoring and how well their rules, represented in terms of the chosen concepts and the selected relationships, are performing.
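The weight-and-threshold filtering described for mail messages can be sketched as follows. The scoring rule used here, the sum of the weights of the matched concepts divided by the sum of all weights, is an assumption made for illustration and is not necessarily the relevance formula used by the InfoCrystal; the concept names, the messages, and the function `rank_messages` are likewise hypothetical.

```python
def rank_messages(messages, weights, threshold):
    """Score each message by the weighted fraction of concepts it matches
    and return those at or above the threshold, best first."""
    total = sum(weights.values())
    scored = []
    for msg_id, concepts in messages.items():
        score = sum(w for c, w in weights.items() if c in concepts) / total
        if score >= threshold:
            scored.append((score, msg_id))
    # Highest score first; ties are broken by message id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(msg_id, round(score, 2)) for score, msg_id in scored]


# Hypothetical user interests with relevance weights.
weights = {"project": 3, "deadline": 2, "social": 1}

# Hypothetical messages, each reduced to the set of concepts it matches.
messages = {
    "m1": {"project", "deadline"},
    "m2": {"social"},
    "m3": {"project"},
    "m4": {"deadline", "social"},
}

ranked = rank_messages(messages, weights, threshold=0.5)
print(ranked)
```

Message m2 falls below the threshold and is left out of the ranked list, which is the kind of ignored message the crystal view would still make visible.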

·         Hypertext Browser and Link Builder: The InfoCrystal could be used in the following ways in the context of a hypertext system: 1) to select those existing links that satisfy certain combinations of criteria; 2) to generate new links by retrieving those documents or passages that satisfy criteria supplied by the user or that have been generated by performing a cluster analysis on a selection of text (fragments) that are of interest to a user. The InfoCrystal can offer users not just a single link to follow, but a whole structured space of links to explore.

·         Statistical Visualization: The InfoCrystal can be used to visualize the higher-level correlations or interactions among different input variables. Further, it can be used to visualize all the effects computed in a complete 2^N factorial design. The statistical model of such a design with N factors, each at two levels, includes N main effects, N(N-1)/2 two-factor interactions, ... , and one N-factor interaction. Hence, the complete model of the 2^N factorial design contains 2^N - 1 effects, which can all be represented in an InfoCrystal.
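The count of effects in a complete two-level factorial design can be checked with a short enumeration. The sketch below assumes four factors with the made-up names A to D and uses a hypothetical helper, `factorial_effects`; it lists every main effect and interaction as a non-empty combination of factors, so that there are "N choose k" effects of order k and 2^N - 1 effects in total.

```python
from itertools import combinations
from math import comb


def factorial_effects(factors):
    """List every main effect and interaction of a two-level factorial
    design as a tuple of the factors involved."""
    effects = []
    for order in range(1, len(factors) + 1):
        effects.extend(combinations(factors, order))
    return effects


factors = ["A", "B", "C", "D"]  # hypothetical factor names
effects = factorial_effects(factors)

# Number of effects per order: main effects, two-factor interactions, ...
counts = {
    order: sum(1 for e in effects if len(e) == order)
    for order in range(1, len(factors) + 1)
}
print(counts)
print(len(effects), 2 ** len(factors) - 1)
```

Each tuple corresponds to one interior icon of an InfoCrystal with the factors as its inputs.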

·         The InfoCrystal represents a general tool for visualizing the combinatorics of the possible relationships between several concepts, and users can assign weights to the concepts. This fact has implications in the following domains:

o        Visualizing the Power Set: The collection of all the possible disjoint subsets among N sets is commonly referred to as the power set 2^N. The InfoCrystal visualizes the power set, with the exception of the relationship that does not involve any of the sets, which is equivalent to the empty set in this context. Hence, the InfoCrystal could have applications anywhere where the power set plays an important role and needs to be visualized. For example, there is the Dempster-Shafer theory of evidence that addresses the problem of how to represent and manipulate the degrees of support provided by different sources of evidence to a set of N propositions. In contrast to a standard Bayesian design, in which degrees of belief are assigned to the N elements directly, the Dempster-Shafer model assigns degrees of belief to members of the power set 2^N [Schocken & Hummel 1992].

o        Boolean Networks: The interior icons of an InfoCrystal correspond to the elements used by Boolean Networks to model the learning and evolutionary processes in nature [Kauffman 1993]. This fact has further implications, because it suggests how relevance feedback could be used to select the interior icons. Learning mechanisms could be used to assist users in the selection of the interior icons. For example, learning agents could use the relevance feedback received from users to compute the selection status of an interior icon.

o        Neural Networks: The InfoCrystal in continuous mode can be used to visualize which combinations of the input values will trigger a cell to fire, based on the current settings of the threshold and the input weights.
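The neural network reading of the weights and the threshold can be made concrete with a small sketch. It simplifies the continuous mode to binary inputs, which is an assumption made for illustration; the weights, the threshold, and the function `firing_combinations` are hypothetical. The function enumerates every combination of inputs and reports those whose weighted sum reaches the threshold, that is, the combinations that would trigger the cell to fire.

```python
from itertools import product


def firing_combinations(weights, threshold):
    """Return the combinations of active inputs whose weighted sum
    reaches the threshold (binary inputs assumed)."""
    names = list(weights)
    fired = []
    for bits in product([0, 1], repeat=len(names)):
        total = sum(weights[n] * b for n, b in zip(names, bits))
        if total >= threshold:
            fired.append(tuple(n for n, b in zip(names, bits) if b))
    return fired


# Hypothetical integer weights and threshold, chosen to avoid rounding issues.
weights = {"x1": 5, "x2": 3, "x3": 2}
fired = firing_combinations(weights, threshold=6)
print(fired)
```

In this example every firing combination includes x1, since the other two inputs together do not reach the threshold. Raising or lowering the threshold changes which combinations fire, which corresponds to a different set of selected interior icons.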