iCommerce.com Corporation
eCommerce


XML - Industry Take: XML Is Ex-Cell-Ent

JANUARY 1999

Neither Web servers nor the Java programming language nor application servers are enough to construct sites on the Web. What's still needed is a standard way of transferring data between these Net systems and companies' pre-Net computer systems - without constantly writing custom code to translate between them, experts say.

That's why the eXtensible Markup Language (XML), sometimes referred to as "HTML on steroids," is emerging as the Swiss Army knife of future World Wide Web applications.

"XML does for data what Java does for applications," says John Wolpert, IBM Corp.'s emerging technology development manager. It makes data more portable across the networked applications of the Web.

XML is becoming the common format for data to be drawn from different databases and computer systems in a Web application. XML amounts to a set of tags that look something like tags for the HyperText Markup Language (HTML), which deals with the display of items on Web pages. But XML tags carry more information on the specific data structures and content inside the document. In an XML document, a search engine would be able to go to the specific topic, title or word sought, rather than just to the lead page of the document as a whole.
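As a rough illustration of how tags let software go straight to a specific item, the sketch below searches a small document by topic using Python's standard-library parser. The catalog, its tag names and the find_by_topic helper are invented for this example; they are not drawn from any vendor's product.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog document; the tag names are made up for illustration.
CATALOG = """<catalog>
  <book id="b1">
    <title>Learning Java</title>
    <topic>programming</topic>
  </book>
  <book id="b2">
    <title>Managing Health Records</title>
    <topic>health care</topic>
  </book>
</catalog>"""


def find_by_topic(xml_text, topic):
    """Return the titles of all <book> elements whose <topic> matches."""
    root = ET.fromstring(xml_text)
    return [
        book.findtext("title")
        for book in root.iter("book")
        if book.findtext("topic") == topic
    ]


print(find_by_topic(CATALOG, "health care"))
```

Because the topic and title are tagged separately, the search can return the matching title itself rather than the document as a whole.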

Where HTML is concerned with presentation, XML is focused on the nature of the data in the file. Jon Bosak, Sun Microsystems Inc.'s representative to the XML Working Group of the World Wide Web Consortium (W3C), says XML may be the answer to the health-care system's inability to efficiently combine patients' medical history, treatment and billing information in one system.

Bosak and others predict that the health-care industry one day will offer its own subset of XML that defines the data it uses, so that health-care providers will be able to exchange data readily. Such subsets already exist for math equations and chemistry data under XML.

Each system typically stores data in its own file format, and converting data from one format into another is tedious programming work, according to an Object Design Inc. white paper on XML. XML instead encapsulates information about the data in tags that can be read by an XML software interpreter. A Web application with a built-in interpreter could recognize the data, regardless of its source.

"It's a language for accessing arbitrarily structured content," as is found on most Web sites, says Giga Information Inc. Group Vice President Michael Gilpin. It will do for Web applications what Structured Query Language (SQL) did for relational databases, providing a standard way of finding what's contained inside documents or files, he predicts.

IBM put an XML interpreter, also known as a parser, for Java on its Alphaworks site in March and was surprised at the rate of downloads for what it considered a somewhat obscure offering. The company has logged 25,341 downloads of the IBM XML Parser/Java edition, making it one of the site's most popular offerings. Developers may use the tool and incorporate it into their products after signing a commercial, no-charge license for it, says Wolpert at IBM. When added to a Java application, the parser allows the application to translate the XML tags of incoming data. It also can generate XML tags to be added to data being exported.
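The two directions a parser works in, reading tags on incoming data and generating tags for outgoing data, can be sketched briefly. This example uses Python's standard-library parser rather than IBM's Java tool, and the order format and function names are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Incoming data; the element and attribute names are made up for this sketch.
INCOMING = "<order><item sku='A100' qty='2'/><item sku='B200' qty='1'/></order>"


def read_order(xml_text):
    """Translate the XML tags of incoming data into plain Python tuples."""
    root = ET.fromstring(xml_text)
    return [(item.get("sku"), int(item.get("qty"))) for item in root.iter("item")]


def write_order(items):
    """Generate XML tags around data that is being exported."""
    root = ET.Element("order")
    for sku, qty in items:
        ET.SubElement(root, "item", sku=sku, qty=str(qty))
    return ET.tostring(root, encoding="unicode")


items = read_order(INCOMING)
exported = write_order(items)
print(items)
print(exported)
```

Reading the exported text back with read_order yields the same items, which is the kind of round trip that lets two systems trade data without custom translation code.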

Object Design, which offers its own XML Parser for Java, supplies the ObjectStore database, which recognizes XML data and is sold as an ingredient of electronic commerce sites. ObjectStore has been licensed by Open Market Inc., an e-commerce software supplier, which has embedded it in its LiveCommerce application for building large catalog sites.

Object Design's revenue was up 40 percent to $15.6 million, compared with the same period a year earlier.

Microsoft Corp. also offers an XML parser. In January, Microsoft and DataChannel Inc. submitted to the W3C a working proposal for using XML to exchange data between Web applications and relational databases.

XML is proving more viable than any competitor, such as electronic data interchange, as the intermediary between Web applications and the legacy databases and mainframe systems containing data, says Don DePalma, principal analyst at Forrester Research Inc.

The early acceptance of XML as a common denominator language appears to be uniting traditional enemies to back it as a standard.

IBM, Microsoft, Netscape Communications Corp. and Sun all are backing it as a standard for the next generation of Web applications. "I don't perceive any enemies of XML," says Gilpin at Giga. He predicts a wide variety of tools will be produced to work with XML, hastening its acceptance.

Object Design can be reached at www.odi.com

IBM can be reached at www.ibm.com

IBM's Alphaworks site can be reached at www.alphaWorks.ibm.com/Home/

DataChannel can be reached at www.datachannel.com

Microsoft can be reached at www.microsoft.com

Netscape can be reached at www.netscape.com

Sun can be reached at www.sun.com

Giga can be reached at www.gigaweb.com

Corporate use still low, but top vendors work to change that

It's not often that you encounter a technological movement with no detractors, but XML seems to be one. It's got everything everyone loves to praise: It's open, it's extensible, it's Web-related and verbal support for it is almost universal. In fact, it's hard to think of any downside to it, aside from the confusing parade of acronyms it has spawned.

Although corporate use of Extensible Markup Language is very limited now, major vendors such as IBM, Netscape Communications Corp., Oracle Corp. and, especially, Microsoft Corp. are pushing it hard. However, as with a multitude of technologies before it, XML faces substantial hurdles.

XML is a way of defining data description languages in a consistent syntax. It is a simplified subset of SGML (Standard Generalized Markup Language). HTML, by contrast, is a specific vocabulary of SGML, meaning it uses a single set of tags defined in an SGML DTD (Document Type Definition).

DTDs are the primary mechanism by which standards bodies and ISVs are implementing XML-based vocabularies. Unlike HTML, XML lets developers use different DTDs and define their own tags. Thus, developers can define arbitrary data structures.
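To make the idea of developer-defined tags concrete, here is a small sketch of a document that declares its own vocabulary in an internal DTD. The patient-record tags are hypothetical, not an actual health-care standard. Python's standard-library parser is used here; it is nonvalidating, so it reads the DTD without enforcing its rules.

```python
import xml.etree.ElementTree as ET

# A hypothetical health-care vocabulary, declared in an internal DTD subset.
RECORD = """<!DOCTYPE patient [
  <!ELEMENT patient (name, treatment+)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT treatment (#PCDATA)>
  <!ATTLIST treatment date CDATA #REQUIRED>
]>
<patient>
  <name>Jane Doe</name>
  <treatment date="1999-01-12">Physical therapy</treatment>
  <treatment date="1999-01-19">Follow-up exam</treatment>
</patient>"""

# The parser accepts tags it has never seen before; the names carry the meaning.
root = ET.fromstring(RECORD)
treatments = [(t.get("date"), t.text) for t in root.findall("treatment")]
print(root.findtext("name"), treatments)
```

Enforcing the DTD's rules, such as requiring at least one treatment element, would take a validating parser, which this sketch does not use.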

A movement is under way to establish XML as a standard way of defining all data interchange. There are also many industry-based efforts to define vertical vocabularies. RosettaNet, for example, is attempting to define the IT supply chain in XML. The initiative is a good example of how XML is being used as an enabling technology for EDI (electronic data interchange) based on standard vocabularies.

Vertical vocabularies in the supply and purchasing processes could open up EDI to a wider variety of software solutions. Similar efforts are possible in other vertical markets, such as financial services and health care.

XML applications don't necessarily need a DTD--as long as an application follows certain rules, the XML is considered "well-formed," and any XML parser should be able to handle it.

Not just for the Web anymore

XML is most famous for its connections to the Web, but DataChannel Inc., Microsoft and others provide programming tools for using XML in general-purpose programming situations. For instance, the ICE (Information and Content Exchange) protocol is designed for server-to-server data interchange.

There are many ways to present XML-based data, depending on the environment, but this is one area where XML shows its youth: Ways of presenting data are inconsistently implemented.

Some environments, such as Microsoft's Internet Explorer 5.0, support displays of raw XML data in a simple default presentation. IE 5.0 also supports "XML Islands," which are snippets of XML embedded in HTML. IE and Netscape's Communicator 5.0 support mixing XML and Cascading Style Sheets, which were originally designed for layout of HTML.

Consequently, it's difficult for vendors to know what standard to follow if their XML applications have a presentation aspect, such as a Web-based application driven by underlying XML structures.

To deliver XML applications that work at all in a browser today, corporations must commit to IE 4.0 or IE 5.0 or write server-based applications that convert XML on the server to HTML. Such applications lose the benefits of XML-based logic on the client, although they still take a load off the database server by allowing logic on the Web server or application server access to the structure of the data.
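The server-side approach can be sketched in a few lines: the server reads the XML, uses its structure and sends plain HTML to whatever browser asks. The product list and the render_html function below are assumptions for illustration, written against Python's standard library rather than any particular Web or application server.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical product data held on the server as XML.
PRODUCTS = """<products>
  <product><name>Widget</name><price>9.95</price></product>
  <product><name>Nuts &amp; Bolts</name><price>4.50</price></product>
</products>"""


def render_html(xml_text):
    """Convert the XML product list into an HTML table any browser can show."""
    root = ET.fromstring(xml_text)
    rows = []
    for product in root.findall("product"):
        # Escape the text so characters such as "&" stay valid in HTML.
        name = escape(product.findtext("name", default=""))
        price = escape(product.findtext("price", default=""))
        rows.append(f"<tr><td>{name}</td><td>{price}</td></tr>")
    return "<table>" + "".join(rows) + "</table>"


page = render_html(PRODUCTS)
print(page)
```

The browser receives only the finished table, so any further sorting or filtering means another request to the server.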

An XML-enabled browser can overcome most of what makes HTML hard to use in sophisticated applications. To HTML, everything on the page is either text or a graphic. XML allows for a standard way to express the increasingly complex structured data found on Web sites and intranets. Data is not just presented in a readable format, it is maintained in contextual structures that can be manipulated by software.

So XML allows software to use the structure in data, but all data in XML is still just text. To address this, a World Wide Web Consortium group has submitted a specification called XML Data that introduces data types, various date and time formats, and complex data structures.
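A short sketch shows what "just text" means in practice. Without type information in the document, the application has to know on its own that one value is a number and another is a date. The invoice format here is invented for the example, and the conversions are ordinary Python, not the syntax of the XML Data proposal.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical invoice; nothing in the markup says which values are numbers or dates.
INVOICE = "<invoice><total>129.50</total><issued>1999-01-15</issued></invoice>"

root = ET.fromstring(INVOICE)

# The parser hands back every value as a string.
total_text = root.findtext("total")

# The application must supply the types itself.
total = float(total_text)
issued = date.fromisoformat(root.findtext("issued"))

print(type(total_text).__name__, total, issued.year)
```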

All XML-based systems need an XML parser to read the actual code, and there are two kinds of parsers. A validating parser checks to see if the tag syntax conforms to the rules in the DTD. A nonvalidating parser just checks to see if the XML is well-formed.
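The nonvalidating case is easy to sketch with Python's standard-library parser, which checks only that a document is well-formed. The is_well_formed helper is a name made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<note><to>Ann</to></note>"))   # properly nested tags
print(is_well_formed("<note><to>Ann</note></to>"))   # tags closed in the wrong order
print(is_well_formed("<note><to>Ann</to>"))          # root element never closed
```

A validating parser would go one step further and compare the document's tags against the rules in its DTD, which this sketch does not attempt.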

Definition needed

Just what constitutes an "XML-enabled browser" is not standard yet, either. IE 4.0 supports XML through two ActiveX-based parsers. One of the parsers is written in C++ and is nonvalidating. The second is written in Java and validates. The validating parser also supports XSL.

IE 5.0's biggest advance is to expose an XML DOM (Document Object Model) to browser scripting and other ActiveX-enabled languages. This should allow for much more powerful client-based applications with fewer round trips to the server. Instead of requesting new queries from a server, the client will be able to manipulate the data locally.
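The kind of local manipulation a DOM makes possible can be sketched outside the browser. The example below uses Python's standard-library DOM implementation, not IE 5.0's scripting interface, and the price list is invented. It re-sorts data already held in memory instead of asking a server for a new query.

```python
from xml.dom.minidom import parseString

# Hypothetical price list, already delivered to the client.
DATA = (
    "<items>"
    "<item name='modem' price='89.00'/>"
    "<item name='cable' price='12.50'/>"
    "<item name='monitor' price='349.00'/>"
    "</items>"
)

doc = parseString(DATA)
root = doc.documentElement
items = root.getElementsByTagName("item")

# Sort by price locally; no round trip to the server is needed.
ordered = sorted(items, key=lambda node: float(node.getAttribute("price")))
for node in ordered:
    # appendChild moves a node that is already in the tree to the end.
    root.appendChild(node)

names = [node.getAttribute("name") for node in root.getElementsByTagName("item")]
print(names)
```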

Netscape's support for XML can be seen today in the publicly available Mozilla code (at www.mozilla.org/rdf/doc/xml.html). So far, Mozilla only transforms XML into HTML, but there are plans to provide full DOM support.

The most ambitious use of XML currently in the works is in Microsoft's forthcoming Office 2000. Users of Office 2000 will be able to save documents in conventional binary formats or as "Web" documents, which are HTML documents with lots of embedded XML. If adoption of Office 2000 is in step with previous versions, it should spur large amounts of XML development by corporate developers and ISVs.

Larry Seltzer, a free-lance writer specializing in Internet technologies and development tools, can be contacted at lseltzer@bigfoot.com.


An XML call to action

XML is growing fast, but it is still a young technology. Corporate planners and developers should track its progress to see where it can add value to their networks:

Follow the standards.

Use the World Wide Web Consortium's XML site as a main starting point: www.w3.org/XML/

Track ISV XML pages. Currently, Microsoft Corp.'s is the best: www.microsoft.com/standards/xml/default.asp

Investigate XML-based application development tools:

Investigate content management systems that support XML:
