iCommerce.com Corporation
eCommerce


Towards a Web Object Model


10 February 1998

Abstract

Today, the World Wide Web is a global information repository whose resources consist primarily of syntactically structured HTML documents and MIME-typed files. These relatively unstructured data models do not provide a foundation for command and control situation modeling or enterprise computing, or for a new generation of tools that operate on a more semantically structured, knowledge-based Web. Richer base data models are needed that combine the benefits of emerging Web structuring mechanisms with those of distributed object service architectures.

A number of ongoing activities are attempting to merge aspects of object models with those of the World Wide Web. This paper describes a number of these activities, with particular emphasis on those which focus on providing enhanced facilities for representing metadata for describing Web (and other) resources. The intent of this paper is to:

  • describe key examples of existing work from the Web, database, and OMG communities that contribute both ideas and technology toward providing the components of a Web object model
  • identify some key underlying principles behind this work
  • identify a framework which allows this work to be unified and extended to support the requirements of advanced Web applications for object technology

Contents

1. Introduction
1.1 Background
1.2 Capabilities Provided by an Object Service Architecture
1.3 Increasing the Structuring Power of the Web
2. Relevant Work
2.1 Structured Data Representations and "Lightweight Object Models"
2.1.1 Summary Object Interchange Format (SOIF)
2.1.2 Object Exchange Model (OEM)
2.1.3 Knowledge Interchange Format (KIF)
2.1.4 Extensible Markup Language (XML)
2.2 Higher-Level Models and Metadata
2.2.1 Dublin Core
2.2.2 Warwick Framework
2.2.3 PICS and PICS-NG
2.2.4 XML-Data
2.2.5 Meta Content Framework (MCF)
2.2.6 Resource Description Framework (RDF)
2.3 Adding Behavior to Web Pages
2.3.1 Document Object Model (DOM)
2.3.2 Embedded Objects
2.3.3 Web Interface Definition Language
2.4 Related OMG Technologies
2.4.1 OMG Property Service
2.4.2 Tagged Data Facility
3. Building a Web Object Model
3.1 Integration Approach
3.2 Discussion
3.3 Formal Principles
3.3.1 Logic Basis
3.3.2 Representation of Higher Level Semantics
3.3.3 Object Logics
4. Conclusions
References

1. Introduction

1.1 Background

Many business and governmental organizations are planning or developing enterprise-wide, open distributed computing architectures to support their operational information processing requirements. Such architectures generally employ distributed object middleware technology, such as the Object Management Group's (OMG's) Common Object Request Broker Architecture (CORBA) [OMG95], as a basic infrastructure.

The use of objects in such architectures reflects the fact that advanced software development increasingly involves the use of object technology. This includes the use of object-oriented programming languages, class libraries and application development frameworks, application integration technology such as Microsoft's OLE, as well as distributed object middleware such as CORBA. It also involves the use of object analysis and design methodologies and associated tools.

This use of object technology is driven by a number of factors, including:

  • the desire to build software from reusable components
  • the desire for software to more directly and more completely reflect enterprise concepts, rather than information technology concepts
  • the need to support enterprise processes that involve legacy information systems
  • the inclusion of object concepts and facilities in key software products by major software vendors

The first two factors reflect requirements for business systems to be rapidly and cheaply developed or adapted to reflect changes in the enterprise environment, such as new services, altered internal processes, or altered customer, supplier, or other partner relationships. Object technology provides mechanisms, such as encapsulation and inheritance, that have the potential to support more rapid and flexible software development, higher levels of reuse, and the definition of software artifacts that more directly model enterprise concepts.

The third factor reflects a situation faced by many large organizations, in which a key issue is not just the development of new software, but the coordination of existing software that supports key internal processes and human activities. Mechanisms provided by object technology can help encapsulate existing systems, and unify them into higher-level processes.

The fourth factor is particularly important. It reflects the fact that, as commercial software vendors incorporate object concepts in key products, it will become more and more difficult to avoid using object technology. This is illustrated by the rapid pace at which object technology is being included in software such as DBMSs (including relational DBMSs) and other middleware, and client/server development environments. Due to this factor, organizations may be influenced to begin adopting object technology before they would ordinarily consider doing so.

At the same time, the Internet is becoming an increasingly important factor in planning for enterprise distributed computing environments. For example, companies are providing information via World Wide Web pages, as well as customer access via the Internet to such enterprise computing services as on-line ordering or order/service tracking facilities. Companies are also using Internet technology to create private Intranets, providing access to enterprise data (and, potentially, services) from throughout the enterprise in a way that is convenient and avoids proprietary network technology. Following this trend, software vendors are developing software to allow Web browsers to act as user interfaces to enterprise computing systems, e.g., to act as clients in workflow or general client/server systems. Products have also been developed that link mainframes to Web pages (e.g., translating conventional terminal sessions into HTML pages).

Organizations perceive a number of advantages in using the Web in enterprise computing. For example, Web browser software is widely available for most client platforms, and is cheaper than most alternative client applications. Web pages generally work reasonably well with a variety of browsers, and maintenance is simpler since the browser and associated software can reduce the amount of distributed software to be managed. In addition, the Web provides a representation for information which

  • supports interlinking of all kinds of content (text, voice, video, etc.)
  • is easy for end-users to access
  • is easy to create content for, using widely available tools

However, as organizations have attempted to employ the Web in increasingly sophisticated applications, these applications have begun to overlap in complexity with the sorts of distributed applications for which architectures such as OMG's CORBA, and its surrounding Object Management Architecture (OMA) [OMG97], were originally intended. Since the Web was not originally designed to support such applications, Web application development efforts increasingly run into limitations of the basic Web infrastructure. As a result, numerous efforts are being made to enhance Web capabilities so that they can support these more complex applications. To understand which elements are missing, it is useful to look at the components of OMG's OMA.

1.2 Capabilities Provided by an Object Service Architecture

There is increasing agreement that modeling a distributed system as a distributed collection of interacting objects provides the appropriate framework for use in integrating heterogeneous, autonomous, and distributed (HAD) computing resources. Objects form a natural model for a distributed system because, like objects, distributed components can only communicate with each other using messages addressed to well-defined interfaces, and components are assumed to have their own locally-defined procedures enabling them to respond to messages sent to them. Objects accommodate the heterogeneous aspects of such systems because messages sent to distributed components depend only on the component interfaces, not on the internals of the components. Objects accommodate the autonomous aspects of such systems because components may change independently and transparently, provided their interfaces are maintained. These characteristics allow objects to be used both in the development of new components, and for encapsulating access to legacy components. In addition, because object-oriented implementations bundle data with related operations in modular units, the use of objects provides the possibility of fine-grained tuning in the computing architecture by moving or copying objects to appropriate nodes of the network (this is becoming increasingly feasible with the development of technology such as Sun's Java).
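The following is a minimal Python sketch of the point made above: callers depend only on an interface, so a newly written component and an encapsulated legacy component can be used interchangeably. All names here (OrderTracker, NewOrderTracker, LegacyOrderTracker, legacy_lookup, report) and the sample data are hypothetical and chosen only for illustration; the sketch runs in a single process and does not model distribution.

```python
from abc import ABC, abstractmethod


class OrderTracker(ABC):
    """Hypothetical interface: the only thing callers may depend on."""

    @abstractmethod
    def status(self, order_id: str) -> str:
        ...


class NewOrderTracker(OrderTracker):
    """A newly written component that keeps its state in a dictionary."""

    def __init__(self) -> None:
        self._orders = {"A-1": "shipped"}

    def status(self, order_id: str) -> str:
        return self._orders.get(order_id, "unknown")


def legacy_lookup(record_key: str) -> str:
    """Stand-in for an existing procedural system with its own conventions."""
    table = {"0001": "S"}
    return table.get(record_key, "?")


class LegacyOrderTracker(OrderTracker):
    """Encapsulates the legacy routine behind the same interface."""

    _CODES = {"S": "shipped", "?": "unknown"}

    def status(self, order_id: str) -> str:
        # Translate between the interface's conventions and the legacy ones.
        record_key = order_id.split("-")[1].zfill(4)
        return self._CODES[legacy_lookup(record_key)]


def report(tracker: OrderTracker, order_id: str) -> str:
    # The caller sends the same message whatever the implementation is.
    return f"{order_id}: {tracker.status(order_id)}"
```

Because report depends only on OrderTracker, either implementation can be replaced or changed internally without affecting the caller, provided the interface is maintained.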

OMG's Object Management Architecture (OMA) is an example of a distributed object architecture intended to support distributed enterprise computing applications. The OMA includes the following components:

  • A global object model to define how the heterogeneous resources that make up the system can be modeled as objects. In the OMA, this global object model is defined by the CORBA Interface Definition Language (IDL).
  • The Object Request Broker (ORB), an object messaging backplane that enables distributed objects to transparently send and receive requests and responses.
  • Object Services, which support basic functions for using and implementing objects, and are likely to be used in any object-based program. Examples include support for queries, transactions, and event notification.
  • Common Facilities, which provide end-user oriented capabilities useful across multiple application domains, such as compound document and workflow facilities.
  • Domain Objects, which are likely to be used only in specific vertical application domains, such as telecommunications or manufacturing.
  • Application Objects, which are built specifically for a particular application.

These components provide multiple levels of capabilities in support of developing complex distributed applications.

The ORB in the OMA is defined by the CORBA specifications. An ORB does not require that the objects it supports be implemented in an object-oriented programming language. The CORBA architecture defines interfaces for connecting code and data to form object implementations, with interfaces defined by IDL, that are managed by the ORB and its supporting object services. It is this flexibility that enables ORBs to be used in connecting legacy systems and data together as components in enterprise computing architectures.
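As a rough illustration of why a broker need not require object-oriented implementations, the Python sketch below routes requests, addressed by an object reference and an operation name, to plain functions operating on a shared data table. It is a toy, in-process stand-in and not the CORBA API: ToyBroker, its register and invoke methods, and the ledger example are all assumptions made for this sketch, and the operation table plays the role that an IDL-defined interface plays in CORBA.

```python
from typing import Any, Callable, Dict


class ToyBroker:
    """Routes (object reference, operation) requests to registered servants."""

    def __init__(self) -> None:
        self._servants: Dict[str, Dict[str, Callable[..., Any]]] = {}

    def register(self, ref: str, operations: Dict[str, Callable[..., Any]]) -> None:
        # A servant is just a table of callables; no class is required.
        self._servants[ref] = dict(operations)

    def invoke(self, ref: str, operation: str, *args: Any) -> Any:
        try:
            servant = self._servants[ref]
        except KeyError:
            raise LookupError(f"no object registered under {ref!r}") from None
        try:
            handler = servant[operation]
        except KeyError:
            raise LookupError(f"{ref!r} has no operation {operation!r}") from None
        return handler(*args)


# Legacy-style code: a module-level data table and plain functions.
_BALANCES = {"acct-7": 120}


def get_balance(account: str) -> int:
    return _BALANCES[account]


def deposit(account: str, amount: int) -> int:
    _BALANCES[account] += amount
    return _BALANCES[account]


broker = ToyBroker()
# The operation table connects existing code and data to form an "object".
broker.register("ledger", {"balance": get_balance, "deposit": deposit})
```

A client would then call, for example, broker.invoke("ledger", "deposit", "acct-7", 30) without knowing how the ledger is implemented.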

A distributed enterprise object system must provide functionality beyond that of simply delivering messages between objects. OMG's Object Services have been defined to address some of these requirements. Object Services provide the next level of structure above the basic object messaging support provided by CORBA. The services define specific types of objects (or interfaces) and relationships between them in order to support higher-level capabilities. Object Services currently defined by OMG include, among others:

  • Concurrency Control Service
  • Life Cycle Services
  • Event Notification Service
  • Query Service
  • Persistent Object Service
  • Relationship Service
  • Naming Service
  • Transaction Service

Taken together, OMG Object Services provide services for ORB-accessible objects similar to those that an Object DBMS (ODBMS) provides for objects in an object database (queries, transactions, etc.). The Object Services, together with the basic connectivity provided by the ORB, turn the collection of network-accessible objects into a unified shared object space, accessible by any ORB client application. Managing the collection of ORB-accessible objects thus becomes a generalized form of "object database management", with the ORB being part of the internal implementation of what is effectively an ODBMS. Viewed in this way, the OMA provides a powerful object-oriented infrastructure for the development of general-purpose applications, just as an enterprise database and its associated DBMS provide such an infrastructure for the development of general-purpose enterprise applications. Additional levels of organization are also needed. These additional levels are where OMG's Common Facilities, Application, and Domain Objects, as well as still higher level concepts, come into play [MGHH+97].
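The "unified shared object space" idea can be sketched, under strong simplifying assumptions, as a single registry that offers a naming service and a query service over the objects bound into it. The Python below is such a sketch; ObjectSpace, its bind, resolve, and query methods, the Employee type, and the sample data are hypothetical, and the other services listed above (transactions, persistence, events, and so on) are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Employee:
    name: str
    department: str
    salary: int


class ObjectSpace:
    """Toy shared object space: a naming service plus a query service."""

    def __init__(self) -> None:
        self._bindings: Dict[str, object] = {}

    def bind(self, name: str, obj: object) -> None:
        # Naming service: associate a name with an object.
        if name in self._bindings:
            raise KeyError(f"name already bound: {name}")
        self._bindings[name] = obj

    def resolve(self, name: str) -> object:
        # Naming service: look an object up by its name.
        return self._bindings[name]

    def query(self, predicate: Callable[[object], bool]) -> List[str]:
        # Query service: names of all bound objects matching the predicate.
        return sorted(n for n, o in self._bindings.items() if predicate(o))


space = ObjectSpace()
space.bind("staff/ada", Employee("Ada", "R&D", 90))
space.bind("staff/bo", Employee("Bo", "Sales", 70))
space.bind("staff/lin", Employee("Lin", "R&D", 80))

# Any client of the space can query it much as it would query a database.
in_research = space.query(lambda e: e.department == "R&D")
```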

If the Web is to be used as the basis of complex enterprise applications, it must provide generic capabilities similar to those provided by the OMA, although these may need to be adapted to the more open, flexible nature of the Web. Providing these capabilities involves addressing not only the provision of higher level services and facilities for the Web, but also the suitability of the basic data structuring capabilities provided by the Web (its "object model"). For example, in the case of services, search engines (a form of query service) are becoming indispensable tools, and agent technology can add additional intelligence to the searching process. Similarly, extended facilities to support transactions over the Web are being investigated. However, the ability to define and apply powerful generic services in the Web, and the ability to generally use the Web to support complex applications, depends crucially on the ability of the Web's underlying data structure to support these complex applications and services.

1.3 Increasing the Structuring Power of the Web

The basic data structure of the Web consists of hyperlinked HTML documents. It is generally recognized that HTML is too simple a data structure to support complex enterprise applications. For example, Jon Bosak's "XML, Java, and the Future of the Web" [Bos97] identifies a number of key limitations of HTML:

  • Extensibility: HTML does not allow users to specify their own tags or attributes in order to help identify the semantic significance of data (e.g., to identify that a particular text string represents the title of a document, or the customer placing an order).
  • Structure: HTML does not support the specification of deep structures needed to represent, e.g., database schemas or object-oriented hierarchies.
  • Validation: HTML does not support the kind of language specification that allows client applications to check data for structural validity on loading the data, e.g., data that represents fixed structured forms or database records.
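To make the three limitations concrete, the Python sketch below shows what an application gains when data arrives with user-defined tags, nested structure, and rules that can be checked on loading. It assumes an XML-style representation of the kind discussed in Section 2.1.4. The element names (order, customer, item, sku, quantity) and the REQUIRED_CHILDREN table are hypothetical, and the hand-written check is only a stand-in for validation against a real document type definition.

```python
import xml.etree.ElementTree as ET

# User-defined tags identify what each string means (extensibility),
# and elements nest to any depth (structure).
ORDER_XML = """
<order>
  <customer>Acme Corp</customer>
  <item>
    <sku>X-100</sku>
    <quantity>3</quantity>
  </item>
</order>
"""

# Hypothetical structural rules: each element and the children it must contain.
REQUIRED_CHILDREN = {
    "order": ["customer", "item"],
    "item": ["sku", "quantity"],
}


def structural_errors(element: ET.Element) -> list:
    """Return missing-child messages for element and all of its descendants."""
    errors = []
    for child_tag in REQUIRED_CHILDREN.get(element.tag, []):
        if element.find(child_tag) is None:
            errors.append(f"<{element.tag}> is missing <{child_tag}>")
    for child in element:
        errors.extend(structural_errors(child))
    return errors


order = ET.fromstring(ORDER_XML.strip())
# The client can pick out the customer by tag rather than by page layout.
customer = order.findtext("customer")
```

With plain HTML, the same data would typically appear as undifferentiated text inside presentation tags, so neither the lookup by meaning nor the structural check would be possible without application-specific conventions.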

These limitations severely affect the ability to develop advanced applications using HTML, including:

  • applications that require the Web client to function as the front-end to enterprise applications or mediate between multiple heterogeneous databases,
  • applications that require more flexibility in distributing processing load between Web servers and clients, and
  • applications that require the Web client to present different views of the same data to different users, or in which intelligent Web agents need to tailor information discovery to the needs of individual users.

Proprietary HTML extensions have been developed to address some of these problems, but none deals with all of them, and together they create barriers to interoperability. The same is true of the proprietary data formats used by particular applications. Their use requires specialized helper applications, plug-ins, or Java applets, creating interoperability problems, and difficulty in reusing that data in different applications for new purposes. While use of some specialized formats is necessary in particular applications (e.g., multimedia), in many cases these formats are just used to address the deficiencies of HTML for generalized document and data processing.

A more fundamental approach to addressing HTML's limitations has been to integrate aspects of object technology with the basic infrastructure of the Web. There are a number of reasons for the interest in integrating Web and object technologies:

  • The Web, even in its current form, can be viewed as a simple form of distributed object system, with a particularly simple object model. In this model, HTML pages are considered as objects (actually, object state), having identity provided by URLs, and methods defined by, or that are invoked via, HTTP servers. The methods supported by HTTP servers are extensible, and HTTP supports negotiation to find out what they are (even though GET, PUT, and POST are the only methods generally used). The basic resemblance of the Web to a simple object system has created a natural interest in seeing how far the resemblance can be further developed. The World Wide Web Consortium (W3C) HTTP-NG activity <http://www.w3.org/Protocols/HTTP-NG/> is attempting to do this at the protocol level by developing a new architecture for the HTTP protocol based on a simple, extensible, distributed object-oriented model.
  • Object technology is seen as a particularly convenient way of adding functionality (e.g., behavior) to the Web, both by adding the behavior provided by objects to the static content of HTML, and by allowing Web clients and servers, through distributed object technology, to access other computing resources. For example:
    • Web pages can be used as convenient carriers or containers for objects in various models, e.g., Java or ActiveX objects. In this approach, objects are added to the conventional static content of Web pages. The pages provide a vehicle for transmitting the objects between server and client. Once on the client, the objects can then execute. In some cases, the client objects then interact with server objects, possibly using a different protocol, e.g., OMG's IIOP or Java's RMI. While this was originally supported by proprietary extensions, HTML specifications now include support for the <APPLET> tag, and the recently-adopted HTML 4.0 specification includes a more general <OBJECT> tag (see Section 2.3).
    • Web pages can be treated as objects with methods that execute on HTTP clients. Dynamic HTML developments by Microsoft <http://www.microsoft.com/sitebuilder/workshop/author/dhtml/> and Netscape are examples of this approach. Current work by the W3C on a Document Object Model <http://www.w3.org/TR/WD-DOM/> is attempting to extend these ideas to include even more powerful facilities (see Section 2.3). What is being proposed is an object model that allows the HTML document, together with its contents (its collection of elements and attributes), to be treated as a collection of programmable objects. Client-side code (scripts or code contained in the document, or plug-ins or other code which accesses the document through the client) will be allowed to access these objects, and manipulate them dynamically (e.g., causing immediate changes in the document displayed to the user).
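The first reason given above, that the Web can already be viewed as a simple distributed object system, can be sketched in a few lines of Python. In this toy model a URL serves as object identity, the stored page is the object state, and HTTP methods are the operations, held in a table that can be extended. ToyWebServer, its add_method and supported_methods functions, and the LENGTH method are hypothetical names invented for this sketch; no real HTTP traffic is involved, and supported_methods is only a stand-in for protocol-level negotiation.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str, str], Tuple[int, str]]


class ToyWebServer:
    """Treats each URL as an object identity and each HTTP method as an operation."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}  # URL -> page content (object state)
        self._methods: Dict[str, Handler] = {"GET": self._get, "PUT": self._put}

    def _get(self, url: str, body: str) -> Tuple[int, str]:
        if url not in self._state:
            return 404, ""
        return 200, self._state[url]

    def _put(self, url: str, body: str) -> Tuple[int, str]:
        created = url not in self._state
        self._state[url] = body
        return (201 if created else 200), ""

    def add_method(self, name: str, handler: Handler) -> None:
        # The set of methods is extensible.
        self._methods[name] = handler

    def supported_methods(self) -> List[str]:
        # Stand-in for negotiation: a client asks which methods exist.
        return sorted(self._methods)

    def request(self, method: str, url: str, body: str = "") -> Tuple[int, str]:
        handler = self._methods.get(method)
        if handler is None:
            return 501, ""  # method not implemented
        return handler(url, body)


server = ToyWebServer()
server.request("PUT", "http://example.org/page", "<p>hello web</p>")


def length(url: str, body: str) -> Tuple[int, str]:
    # A hypothetical extension method built on top of GET.
    status, page = server.request("GET", url)
    return status, str(len(page))


server.add_method("LENGTH", length)
```

The sketch also shows how limited this object model is: every object has the same small set of operations, and the state is an opaque string, which is the structuring weakness the rest of this section describes.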

Such efforts all contribute toward giving the Web a richer structural base, capable of directly supporting a wider variety of activities, in more flexible and extensible ways. However, until recently these efforts have still been based on HTML, with its basic structuring limitations, and have generally been pursued as separate, non-integrated activities. There is much other ongoing work within both the Web and database communities on data structure developments to address Web-related enhancements. Work on similar issues is ongoing within the Object Management Group as well (see Section 2.4). This work has contributed valuable ideas, and the various proposals illustrate similar basic concepts, generally a movement toward some form of simple object model. However, these similarities are often obscured by detailed representational differences, and the work is fragmented and lacks a unifying framework. As a result, individual proposals often lack key capabilities that are in some cases contained in other proposals. Moreover, in many cases these proposals are not well-integrated with key areas of emerging industry consensus on Web data structuring technologies.

If the Internet is to develop to support advanced application requirements, there is a need for both richer individual data structuring mechanisms, and a unifying overall framework which supports heterogeneous representations and extensibility, and provides metalevel concepts for describing and integrating them.

