HIPPO: Incorporating Hypertext using Fuzzy Components
Electronic Publishing Research Group
Department Of Computer Science
University Of Nottingham, NG7 2RD
In recent years, the hypertext community has seen a move towards more flexible, open forms of hypertext, with many researchers calling for support for heterogeneous applications and platforms [10,12]. Academics have rejected many of the traditional notions of hypertext which adopt monolithic, closed implementations, and have begun to consider hypertext as an abstract service which can be incorporated into other applications. This allows users to augment their conventional environment with hypertext functionality and linking concepts, while using their existing tools and legacy data. Recent years have seen great advances in open hypertext theory and hypertext integration, with the emergence of many popular open systems and models [2,7,9,14,15,16,21]. Many of these models are heavily influenced by early work on classical hypertext [1,10,18], and the development of abstract models such as Dexter, Trellis etc [8,11,17].
While many of these open approaches to hypertext differ in their implementation and precise abstractions, many of them share the idea of some form of hypertext link services layer. Hypertext abstractions are implemented as a substrate layer which sits between the user applications and the underlying operating system. Applications can then make use of these hypertext abstractions, and incorporate hypertext functionality to structure their data. This approach raises many issues surrounding the integration of hypertext into the existing environment - typically, designers have explored ways in which existing applications should communicate and integrate with the hypertext substrate. Furthermore, different models hold widely differing views as to the demands made on existing applications - while some models place most of the hypertext functionality in the hypertext services layer, others choose to make increased demands on applications, and require applications to maintain additional structures to support hypertext.
Problems with Open Hypertext Theory
The research into open hypertext theory has produced some excellent results, and shown the advantages of integrating hypertext services into existing environments. However, despite the advantages offered by hypertext and the flexibility of these open systems, users can still experience the paradox of a closed environment. While open hypertext implementations allow the integration of hypertext services into existing applications, users must still adopt the particular abstractions and hypertext model offered by a given open hypertext system. The increasing array of open models all offer different abstractions and views of hypertext, and each aims to convert users to its own model. If a user wishes to adopt the linking strategies or some of the concepts offered by a different model, then they are forced to relinquish their existing open system and all of its features, and adopt another system. Some of these problems are lessened by adopting a loose coupling between applications and hypertext services, in which the data itself is left untouched by the hypertext layer, and the hypertext structure is maintained separately. However, this rigid, static notion of hypertext falls short of many of the claims and potential advantages of open hypertext theory; some of the particular areas of concern are detailed below:
Many hypertext systems allow a limited form of integration and extensibility, using some approach based around scripting languages or plugin/API support. Indeed, some of the earliest hypertext systems such as NoteCards and KMS provided some form of scripting support, and many contemporary applications support an API (Application Programmer's Interface) framework (e.g. Netscape). However, these approaches can prove very limiting: they often force developers to adopt a particular language or platform, and the extensions still operate within the framework of the host system, so they must still adopt the underlying concepts and abstractions defined by the original hypertext system. Furthermore, any extensions can only employ services and API calls which the system designers make available to developers.
Some recent contributions to open theory have employed different approaches to hypertext integration and extensibility. For example, the HyperDISCO and Hyperform systems adopt an object-oriented framework as a means of extending the hypertext models. Developers can extend and modify a set of core classes to implement specific behaviours and augment the hypertext model. While these approaches have proved very useful and make substantial contributions to the flexibility of hypertext environments, they do not address how a system should be extended, which objects can be used and when. They do not provide adequate support for a changing system, and still exhibit some of the problems described previously.
The HIPPO Model
The HIPPO model takes the view that the user is completely in control of the hypertext environment, that the hypertext services are there to be used by a user, but that they should not restrict the user, or force them to change their working practices. The hypertext model and the behaviour of the hypertext services should be selected by the user, and the user should be free to change this whenever they wish. The hypertext user should choose what constitutes a node, what type of links are supported, what happens when they traverse a link etc - indeed all aspects of the hypertext model and semantics should be controlled by the user, not dictated by a system designer (this builds on some of the ideas in Bieber [4,5]). The HIPPO model takes a computational view of hypertext, by abstracting the functionality and semantics of a hypertext environment into many small, lightweight processes. These processes can implement any operation or set of operations (link retrieval, presentation of a node, processing of link sets, storage of objects etc), which can then be used together to define an arbitrary hypertext system. In this way, the user chooses the behaviour of the hypertext system and which processes they use to integrate with their existing applications and environment. This model builds on some existing research into computational/dynamic hypertext systems [3,5,16,17,22].
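The idea of assembling a hypertext system from small processes can be sketched as follows. This is only an illustration under stated assumptions: the three processes, the shared context dictionary and the LINKS table are all invented for the example, and HIPPO processes need not be Python functions at all.

```python
from functools import reduce
from typing import Any, Callable, Dict, List

# A process takes the current context and returns an updated copy of it.
Context = Dict[str, Any]
Process = Callable[[Context], Context]

# Hypothetical link data, kept separate from the documents themselves.
LINKS: Dict[str, List[str]] = {
    "intro.txt": ["model.txt", "fuzzy.txt", "intro.txt"],
}


def retrieve_links(ctx: Context) -> Context:
    """Link retrieval: look up the links anchored on the current node."""
    return {**ctx, "links": list(LINKS.get(ctx["node"], []))}


def drop_self_links(ctx: Context) -> Context:
    """Link-set processing: remove links that point back at the node."""
    return {**ctx, "links": [t for t in ctx["links"] if t != ctx["node"]]}


def present(ctx: Context) -> Context:
    """Presentation: render the node and its links as one line of text."""
    return {**ctx, "display": ctx["node"] + " -> " + ", ".join(ctx["links"])}


def compose(processes: List[Process]) -> Process:
    """Chain the chosen processes, in order, into one hypertext system."""
    def system(ctx: Context) -> Context:
        return reduce(lambda acc, process: process(acc), processes, ctx)
    return system
```

A user who does not want the filtering behaviour simply leaves drop_self_links out of the list passed to compose; the remaining processes are unchanged.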
HIPPO introduces a classification system to manage the administration of these processes - this uses a hierarchical system somewhat similar to the MIME-type taxonomy used to encode data types. This relates to some of the early work done on hypertext link classification [5,6,18], which examined the nature of hyperlinks and identified different link types; however, HIPPO extends this to include all aspects of hypertext semantics and functionality. For example, a classification system for a set of hypertext processes might read:
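As a purely illustrative sketch (the type names below are invented for this example and should not be read as the actual HIPPO taxonomy), such a hierarchy could be written as MIME-like paths, with a trailing wildcard selecting every type beneath a branch:

```python
from typing import List

# Hypothetical classification paths; every name here is an assumption made
# for illustration, not part of any published HIPPO hierarchy.
TAXONOMY: List[str] = [
    "link/retrieval",
    "link/processing/filter",
    "node/presentation/text",
    "node/presentation/image",
    "storage/object",
]


def matches(pattern: str, classification: str) -> bool:
    """Return True if the classification is selected by the pattern.

    A final "*" segment matches anything below the given branch, in the
    same spirit as "text/*" for MIME types; otherwise the match is exact.
    """
    pattern_parts = pattern.split("/")
    class_parts = classification.split("/")
    if pattern_parts[-1] == "*":
        prefix = pattern_parts[:-1]
        return (len(class_parts) > len(prefix)
                and class_parts[:len(prefix)] == prefix)
    return pattern_parts == class_parts


def select(pattern: str) -> List[str]:
    """List every classification in the taxonomy selected by the pattern."""
    return [c for c in TAXONOMY if matches(pattern, c)]
```

For instance, select("node/*") picks out both presentation types, while select("link/retrieval") names exactly one entry.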
Processes can then register themselves under any of these classification types, and make themselves available to hypertext users. These processes could reside on the user's local machine; however, the flexibility of this approach to hypertext becomes more interesting when applied to a distributed model. The HIPPO system allows processes to register with a trading service, which stores information about each process (its location, arguments etc). These traders can then service requests from clients, and return information about the processes they know about. This notion of traders is widely used in other areas of distributed programming such as CORBA, DNS services etc, and allows processes to reside anywhere in the network domain. In this way, processes can make use of remote hardware and software platforms, and utilise remote resources without being restricted to the target architecture of the user. In addition, the system can support an arbitrary number of users, who can all add processes to the system. Because the model is distributed, the hypertext system that the user assembles is itself distributed, which results in a more robust and scalable architecture that inherits the well-documented advantages of distributed systems.
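A minimal in-memory sketch of such a trader is given below. It assumes that an offer records a name, a classification, a location and an argument list; the class names, field names and the "host:port" location format are all hypothetical, and a real trader would sit behind a network protocol rather than a local object.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ProcessOffer:
    """What a process tells the trader about itself (fields are assumed)."""
    name: str
    classification: str             # e.g. "link/retrieval"
    location: str                   # hypothetical "host:port" address
    arguments: Tuple[str, ...] = ()


class Trader:
    """Stores offers and answers client queries by classification."""

    def __init__(self) -> None:
        self._offers: Dict[str, List[ProcessOffer]] = {}

    def register(self, offer: ProcessOffer) -> None:
        """Record an offer under the classification it was registered with."""
        self._offers.setdefault(offer.classification, []).append(offer)

    def lookup(self, classification: str) -> List[ProcessOffer]:
        """Return offers registered under this type or any type beneath it."""
        prefix = classification + "/"
        found: List[ProcessOffer] = []
        for registered, offers in self._offers.items():
            if registered == classification or registered.startswith(prefix):
                found.extend(offers)
        return found
```

A client asking for "link" would receive every process registered anywhere under that branch, together with the location needed to contact it.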
The classification hierarchy included above is largely arbitrary, and could be reconstructed in many different ways - indeed, the HIPPO system could adopt a number of policies for administering the development of the taxonomy. One approach would be to utilise a centralised authority to control and approve additions to the hierarchy, in much the same way as the existing MIME hierarchy. Applicants make requests for additions to the hierarchy, which are then examined before approval or rejection. This could work well, but a more interesting approach would be to allow the hierarchy to develop itself; processes could (and should) be made available by all users of the system, and could perhaps be given some temporary status. If a process becomes sufficiently useful and widely used, then it could be automatically added to the hierarchy. In this way, the hierarchy becomes largely autonomous, and develops to reflect the communal usage of the system. A useful analogy to this approach is seen in the development and administration of the USENET newsgroup hierarchy.
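The self-developing policy could be sketched as below. The promotion rule used here (a fixed usage count) is an assumption chosen for simplicity; the text does not specify how "sufficiently useful and widely used" would be measured, and the class and method names are hypothetical.

```python
from collections import Counter
from typing import Set


class SelfOrganisingHierarchy:
    """Tracks provisional classifications and promotes the popular ones."""

    def __init__(self, promotion_threshold: int) -> None:
        # Assumed policy: promote after this many recorded uses.
        self.promotion_threshold = promotion_threshold
        self.approved: Set[str] = set()
        self.provisional: Set[str] = set()
        self._usage: Counter = Counter()

    def propose(self, classification: str) -> None:
        """Any user may add a classification with temporary status."""
        if classification not in self.approved:
            self.provisional.add(classification)

    def record_use(self, classification: str) -> None:
        """Count one use; promote the entry once the threshold is reached."""
        if classification not in self.provisional:
            return
        self._usage[classification] += 1
        if self._usage[classification] >= self.promotion_threshold:
            self.provisional.discard(classification)
            self.approved.add(classification)
```

A centralised policy would differ only in replacing the threshold test with an explicit approval step.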
System Specifications Based Around Document Objects
While the system described so far offers a great deal of flexibility, where each user is free to adopt the hypertext model they choose, this newfound freedom could become just as restrictive as existing systems. Users can become overwhelmed by the (potentially unbounded) functionality and number of remote processes at their disposal. The system addresses this problem by moving the focus of the hypertext paradigm from the system to the information itself. Existing approaches to hypertext appear to offer hypertext models which encapsulate the document - a system provides the functionality, and the document is shoe-horned into the framework. However, the author believes it is important to give a higher priority to the information - different tasks and information fields demand widely different frameworks and functionality. This ranges from the trivial requirements of different file formats, to the more pertinent question of differing platforms and available resources, to the perhaps more wide-reaching question of different behaviours for different tasks and users.
It is clear that an expert working in a specific, highly-skilled medical domain makes very different demands on their hypertext model to a beginner looking for information about their favourite piece of music etc. The HIPPO model attempts to address this by treating the document as the primary object of interest - each document (or other "unit of information") has a system configuration associated with it. Typically, this would be a list of processes from the taxonomy above, which are deemed useful and important for the item of information. In this way, each different document or type of information includes a different specification of a system, so as a user moves through an information space, the system changes to match the task at hand. The user is free to accept or reject the proposed system specification at any time, and can always select from any of the processes available in their trading space.
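The relationship between a document, its proposed system specification and the user's final choice might be sketched as follows. The Document and Session names, and the representation of a specification as a list of classification strings, are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import AbstractSet, List, Set


@dataclass
class Document:
    """A unit of information carrying its own proposed system specification."""
    uri: str
    proposed: List[str] = field(default_factory=list)


class Session:
    """Holds the processes a user can reach and those currently in use."""

    def __init__(self, available: AbstractSet[str]) -> None:
        self.available: Set[str] = set(available)   # the user's trading space
        self.active: Set[str] = set()

    def visit(self, doc: Document,
              rejected: AbstractSet[str] = frozenset(),
              extra: AbstractSet[str] = frozenset()) -> Set[str]:
        """Reconfigure the system for a document.

        The document's proposal is the starting point; the user may reject
        parts of it and add other processes, but only processes available
        in the trading space can actually become active.
        """
        proposal = set(doc.proposed) - set(rejected)
        self.active = (proposal | set(extra)) & self.available
        return self.active
```

Moving from one document to another simply calls visit again, so the active system changes with the information being viewed.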
While the view of the document as the primary object proves useful for providing a flexible hypertext model, the idea of simply providing a "system specification" is not ideal. Firstly, this raises the obvious question of who actually decides on this system specification for each document object? It seems natural to allow the author to specify the components, link data etc that a user should use, yet this can never be ideal in all cases. Indeed, this seems to be a failing with many hypertext systems - while the author has a useful level of expertise, and should surely contribute to the hypertext structure and behaviour, they can never predict how any given user will use the information. They do not know the level of expertise of a user, why they are viewing the documents, or what they hope to gain from them.
Furthermore, it seems very demanding to ask an author or user to define a particular system specification, to say that the optimum system consists of a given set of processes and nothing else. The current notion of a system specification which is associated with each document defines the hypertext system as a fixed set of processes, and excludes any processes outside of this set. However, it seems intuitive to suggest that some processes may prove more useful than others - many users use simple linking models and navigation tools, but may wish to use some of the more complex components from time to time. Also, it seems natural to allow this "ideal" system specification to reflect the demands and usage patterns of the hypertext community, and to change over time.
HIPPO supports this less rigid method of specifying systems by employing the notion of fuzzy logic, where objects are no longer forced to belong to, or be excluded from, a set, but are allowed to have a fuzzy membership value. For example, a typical system specification may specify several processes which are considered vital to a document, yet may also list a number of hypertext behaviours which, while not as important, may prove useful in some cases. Attaching "fuzzy values" to hypertext components and processes opens up the possibilities of adaptive hypertext and user modelling. Fuzzy values can be altered to reflect the status of a user or some other criteria. For example, the fuzzy value associated with a process could be increased as more users find it useful. Adaptive methods are a relatively new development in the hypertext community, but have thus far only been applied to sets of links or interfaces. The use of fuzzy logic with hypertext processes opens up the possibilities for truly adaptive hypertext, where the actual behaviour and hypertext abstractions can adapt to meet the demands of the user.
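One way to picture a fuzzy system specification is sketched below. It assumes membership values in the range 0 to 1, a threshold (alpha-cut) for deciding which processes to activate, and a simple update rule that moves a value part of the way towards 1 each time a user finds the process useful. The update rule, the class name and the example values are illustrative assumptions, not the rule used by HIPPO.

```python
from typing import Dict, List


class FuzzySpecification:
    """A system specification whose processes have graded membership."""

    def __init__(self, memberships: Dict[str, float]) -> None:
        for name, value in memberships.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError("membership of %s must lie in [0, 1]" % name)
        self.memberships: Dict[str, float] = dict(memberships)

    def select(self, threshold: float) -> List[str]:
        """Return processes whose membership is at least the threshold,
        strongest first (ties broken alphabetically)."""
        ranked = sorted(self.memberships.items(),
                        key=lambda item: (-item[1], item[0]))
        return [name for name, value in ranked if value >= threshold]

    def reinforce(self, name: str, rate: float = 0.1) -> float:
        """Assumed adaptation rule: close a fraction `rate` of the gap
        between the current membership and 1.0, so values never exceed 1."""
        current = self.memberships.get(name, 0.0)
        updated = current + rate * (1.0 - current)
        self.memberships[name] = updated
        return updated
```

With a threshold of 0.5, a process starting at 0.3 is initially excluded; one reinforcement at rate 0.5 raises it to 0.65 and brings it into the selected system, without any change to the processes already considered vital.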
This position paper has discussed some of the techniques used in the HIPPO hypertext system. The system takes an alternative approach to integrating hypertext into the user's environment, by abstracting the hypertext model into arbitrary hypertext processes. In this way, the user is free to decide which processes and which methods they use to integrate hypertext with their existing environment. A taxonomy for this process model is introduced, along with a brief discussion of the distributed methods employed in its implementation. The paper then introduces the idea of using the document as the unit of hypertext integration, so that the hypertext model changes to reflect the current task of the user. Finally, the notion of fuzzy logic is incorporated into the hypertext model - the author views this area as particularly interesting for future work. Current hypertext systems offer a very static model to the user, yet it seems useful to consider ways in which users can be given a less rigid definition of a hypertext model, which changes to reflect the needs of the users.