Friday, August 02, 2002
A client computer can use the XMLHTTP object (MSXML2.XMLHTTP) to send an arbitrary HTTP request, receive the response, and have the Microsoft® XML Document Object Model (DOM) parse that response.

This object is integrated with the Microsoft XML Parser (MSXML) to support sending the request body directly from, and parsing the response directly into, the MSXML DOM objects. When combined with the support for Extensible Stylesheet Language (XSL), the XMLHTTP component provides an easy way to send structured queries to HTTP servers and efficiently display the results with a variety of presentations.

XMLHTTP: Super Glue for the Web
In the suite of objects that Microsoft packaged with their MSXML parser, perhaps none are more useful or more ignored than the XMLHTTPConnection object. In a very simple sense, it allows one to open an HTTP connection to a server, send some data, and get some data back, all in a few lines of script or code. The data exchanged through the XMLHTTP object is usually XML, but this is not a requirement.

Unfortunately it seems like there's no standard, cross-platform mechanism for invoking HTTP PUT and DELETE methods from client-side javascript. This sucks a lot.

Hmmm. #

Fell: BDG to Etags
Sure, the server response may contain an ETag header; you save this away, associated with the URL, then on subsequent requests for that URL you include an If-None-Match header with the ETag value. If the contents haven't changed, the server will reply with a 304 Not Modified.
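The bookkeeping half of that dance fits in a few lines; a rough sketch in Java (class and method names are mine, not from any library) of a cache that remembers each URL's ETag and interprets the response:

```java
import java.util.HashMap;
import java.util.Map;

// Tracks the last ETag seen per URL; callers add the returned headers to
// the next GET, then report back the response code and ETag header.
class EtagCache {
    private final Map etags = new HashMap(); // url -> last ETag value

    /** Extra headers to send on the next GET for this URL. */
    public Map requestHeaders(String url) {
        Map headers = new HashMap();
        String etag = (String) etags.get(url);
        if (etag != null) {
            headers.put("If-None-Match", etag);
        }
        return headers;
    }

    /**
     * Record a response.  A 200 carries a (possibly new) ETag to remember;
     * a 304 Not Modified means "serve your cached copy".
     */
    public boolean record(String url, int status, String etag) {
        if (status == 200 && etag != null) {
            etags.put(url, etag);
        }
        return status == 304; // true = contents unchanged, use local cache
    }
}
```

Wired up to java.net.HttpURLConnection, you'd copy requestHeaders() onto the connection via setRequestProperty before the GET, then call record() with getResponseCode() and the ETag response header.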

Java Look and Feel Design Guidelines

Part of the Java Series. #

J2EE Design Patterns
A design pattern describes a proven solution to a recurring design problem, placing particular emphasis on the context and forces surrounding the problem, and the consequences and impact of the solution. This website provides a palette of patterns you can use in the context of designing J2EE™ applications with the Java Enterprise BluePrints. Every now and then, new patterns will be added and new insights will be incorporated into the current patterns, so check back often for updates.

Part of Java Blueprints.

Feature Synopsis for OWL Lite and OWL
The OWL Web Ontology Language is being designed by the W3C Web Ontology Working Group in order to provide a language that can be used for applications that need to understand the content of information instead of just understanding the human-readable presentation of content. OWL facilitates greater machine readability of web content than XML, RDF, and RDF-S support by providing an additional vocabulary for term descriptions. This document provides an introduction to the OWL language. It first describes a simpler version of the full OWL language, which is referred to as OWL Lite, and then describes OWL by addition to OWL Lite.

Am I the only one who has severe doubts about the viability of all this? I look at these technologies being churned out by the W3C and none of them seem usable or able to completely solve the problem. #

Generating Code at Run Time With Reflection.Emit [via sells]
Java programmers have long enjoyed the benefits of reflection and full-fidelity type information, and .NET (finally) delivers that bit of nirvana to the Windows platform. But the classes in the Reflection.Emit namespace raise the bar even further, allowing us to generate new types and emit new code, dynamically.

Real World Style: css layouts, tips, tricks, and techniques
Blue Robot: CSS Guru
Practical CSS Layout

Cool. #

Hibernate [via morgan]
Hibernate is a powerful, high-performance object/relational persistence and query service for Java. Hibernate lets you develop persistent objects following common Java idiom, including association, inheritance, polymorphism, composition and the Java collections framework. To support a rapid build procedure, Hibernate rejects the use of code generation or bytecode processing. Instead, runtime reflection is used and SQL generation occurs at system startup time. Hibernate supports Oracle, DB2, MySQL, PostgreSQL, Sybase, SAP DB, HypersonicSQL, Microsoft SQL Server, Progress, Mckoi SQL and Interbase.

This looks very sexy. It even has a powerful, built-in object query language. #
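"Following common Java idiom" means persistent classes are just plain JavaBeans; a sketch of what that looks like (the class is invented for illustration, and the XML mapping file Hibernate would also need is omitted):

```java
import java.util.HashSet;
import java.util.Set;

// A plain JavaBean Hibernate could persist: no base class to extend,
// no interfaces to implement -- just fields, accessors, and an
// association modeled with the ordinary collections framework.
class Author {
    private Long id;                   // surrogate key, assigned by the store
    private String name;
    private Set posts = new HashSet(); // one-to-many association

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public Set getPosts() { return posts; }
    public void setPosts(Set posts) { this.posts = posts; }

    public void addPost(String title) { posts.add(title); }
}
```

Persisting an instance would then be a call against a Hibernate Session (roughly, session.save(author)); the class itself carries no persistence machinery at all, which is the whole point.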

Thursday, August 01, 2002
Apache's Xindice Organizes XML Data Without Schema
A native XML database can make a lot of sense for organizations that want to store and access XML without all the unsightly schema mapping required to store XML in a traditional relational database system. Several commercial native XML databases exist; now, we take a first look at Apache's open source offering, Xindice.


MyrC: C# IRC Client
MYrc is an IRC GUI client (similar to, and probably (more like definitely) based on, mIRC -- hence the name) written in C#, made possible by the Thresher IRC Library. It has a plugin architecture that makes it easy to write plugins in C#.

Thresher is a .Net IRC client library designed to serve as the basis for IRC bots and GUI clients. It is written in C# but can be used by any .Net supported language.

Mckoi SQL Database
Mckoi SQL Database is an SQL (Structured Query Language) database management system written for the Java™ platform. Mckoi SQL Database is optimized to run as a client/server database server for multiple clients; however, it can also be embedded in an application as a stand-alone database. It is highly multi-threaded and features an extendable object-oriented engine.

And of course, let's not forget everybody's favorite: hsqldb #

Using JAAS For Authorization and Authentication
"Using JAAS for Authorization & Authentication", by Dan Moore, explains how to use the Java Authentication and Authorization API (JAAS). It plugs JAAS into the Struts framework and shows how to use JAAS to secure resources in an MVC architecture; it first examines the JAAS infrastructure and then looks at integration with Struts.

Jabber away with instant messaging
Jabber is an open, XML-based data model and protocol for instant messaging. Coupled with the ever-increasing number of Jabber-based open source and commercial products, this protocol provides a way to break out of proprietary instant messaging services. Various open source Java APIs can help you build Jabber-based services and integrate instant messaging into your application. In this article, Jason Kitchen explains how.

XWT: Xml Windowing Toolkit [via morgan]
XWT is the XML Windowing Toolkit. It lets you write remote applications -- applications that run on a server, yet can "project" their user interface onto any computer, anywhere on the Internet.

Axion: Embedded Java DB [via morgan]
Axion. Spotted Axion today for the first time (thanks for mentioning it Jason!). Up to now I've always used Hypersonic or, these days, its new incarnation, HSqlDb. It turns out that Axion will be moving to Apache in a few weeks or so, so I think it's time I tried it out...
[james strachan's musings]

An in-process, 100% Java database that supports ORDER BY. Great for embedding work. Much easier not to have to ask a client to install MySQL or SAP DB to run a desktop app. Cool.

First, there is simplified management and control. "Many of us use CVS as a source code repository," he says. "Now we also have a documentation repository. We can do incremental population of the repository, and these multiple APIs can then be cross-referenced. So if an element in one API references an element in another API, we can hyperlink to it in our interface."

Interesting. Also cool how he solved the database abstraction problem. #

Demystifying Tomcat 4's server.xml File
The Tomcat server.xml file allows you to configure Tomcat using a simple XML descriptor. This XML file is at the heart of Tomcat. In this article, I will focus on the configuration of all of the major Tomcat components found in the server.xml file.

10 Reasons We Need Java 3.0
This article imagines a "Java 3" that jettisons the baggage of the last decade, and proposes numerous changes to the core language, virtual machine, and class libraries. The focus here is on those changes that many people (including the Java developers at Sun) would really like to make, but can't -- primarily for reasons of backwards compatibility.

Not My Type: Sizing Up W3C XML Schema Primitives
There are two fundamental problems with WXS datatyping. The first is its design: it's not a type system -- there is no system -- and not even a type collection. Rather, it's a collection of collections of types with no coherent or consistent set of interrelations. The second problem is a single sentence in the specification: "Primitive datatypes can only be added by revisions to this specification". This sentence exists because of the design problem; lacking a concept for what a primitive data type is, the only way to define new types is by appeal to authority. The data type library is wholly inextensible, internally inconsistent, bloated in and incomplete for most application domains.

The proposed solution doesn't solve the problem. The concept of datatype derivation is far too useful to simply throw out, and inventing a Turing-complete, XML-based programming language to describe an infinite array of string-validating algorithms is overkill. Many of the criticisms of XML Schema are valid, though, and will need to be addressed. Unfortunately, I think so much time and money have already been invested in schemas, and the schema datatype system has already infected so many other specs (most recently, and most unfortunately, the RDF schema spec), that it may be too late to make the necessary changes. #

Fielding on HTTP URIs
All identifiers are by their very nature
an indirect means to establishing identity.

This makes a lot more sense to me. #

Dave's Quick Search Toolbar
Dave's Quick Search Deskbar is a tiny textbox that Dave Bau designed for search hounds with weary mouse-fingers. Unlike the Google Toolbar, this little deskbar lets you launch searches without starting a web browser first, directly from your Windows Explorer Taskbar.

Wednesday, July 31, 2002
Best Practices for Using ADO.NET
This article provides you with the best solutions for implementing and achieving optimal performance, scalability, and functionality in your Microsoft ADO.NET applications; it also covers best practices when using objects available in ADO.NET and offers suggestions that can help you optimize the design of your ADO.NET application.

Restapo: Ignored Parameters (2)
See, I don't usually talk about interfaces in the .NET or programming-language sense, but rather about the concept of an interface itself: something that defines a contract... and WSDL seems to, at least partially, fit the bill here. Now, obviously, that contract can be either strict or loose, as has been pointed out. In my experience, strict works best long term, although I agree that loose agreement might work best in the short term. But even so, I don't think syntax is enough, and that seems to be the focus of Sam's article. An interface is pretty much worthless unless there are agreed semantics attached to it. So it becomes a question of what an interface consumer expects out of it.

Sam comments: Clearly, .NET (or is it SOAP itself?) has cemented the association between XML and interfaces too strongly.

Is Sam implying that there's a difference between WSDL and any other Interface Definition Language (IDL)? Now, this may be true. The WSDL spec does not define itself as an IDL. But I bet 90% of developers think of it as such, in which case it may be too late to backpedal. The essence of Sam's excellent Expect More essay seems to be that web services must be like browsers. Browsers have to be extremely fault-tolerant to handle all the bad HTML out there that contains missing parameters and even invalid markup. Indeed, the overwhelming success of HTML might be due entirely to the fact that most browsers are able to get it almost 100% right even when the markup sucks.

But will the document-centric, browser approach to HTML scale to web services? Not likely. First, it's unreasonable to expect developers to waste valuable time building highly fault-tolerant systems capable of handling all manner of defective client input. Building a browser is not a weekend exercise, but building a web service should be. Second, there are very real questions about the scalability of the HTML "tolerant" approach. Two major browsers dominate 98%+ of the market and just between them there are many, many incompatibilities. What will happen when there are hundreds of "tolerant" implementations of a common web service (think PKI) with subtle incompatibilities between all of them due to the values of default arguments? It'd be a complete nightmare. Third, in many web services, particularly financial ones, there will be very little room for error. I don't mind the fact that IE, Netscape, Galeon, Mozilla, and Konqueror all render the same web page differently, but I will be very disappointed if different web services providing the same service all affect my credit card account differently.

This third point leads into Tomas' notion of an interface. As I noted earlier, one of the hallmarks of the OO world is that an interface is behavior. They cannot be separated. If two web services expose identical interfaces then they must behave identically, in accordance with identical semantics. This applies to both REST-biased and SOAP-biased services.

Of course upwards (old client, new server) and downwards (new client, old server) compatibility is very important and there's middle ground to be found. But I still don't think Vasters was too far off when he wrote that any change to the WSDL implies a change in contract, which implies a change in behavior, which should merit a change in namespace.

There's another issue here too. I believe the architectural principles behind HTML no longer apply to the web services world. There is a fundamental difference between "document-centric" nature of HTML and the strongly-typed, "object-centric", strongly intertwined request & response nature of webservices.

An Introduction to XML Digital Signatures
The very features that make XML so powerful for business transactions (e.g., semantically rich and structured data, text-based, and Web-ready nature) provide both challenges and opportunities for the application of encryption and digital signature operations to XML-encoded data. For example, in many workflow scenarios where an XML document flows stepwise between participants, and where a digital signature implies some sort of commitment or assertion, each participant may wish to sign only that portion for which they are responsible and assume a concomitant level of liability. Older standards for digital signatures provide neither syntax for capturing this sort of high-granularity signature nor mechanisms for expressing which portion a principal wishes to sign.

Implementing XML Key Management Services Using ASP.NET
In this paper, we examine one such Web service standard, the XML Key Management Specification (XKMS). XKMS defines interfaces supporting trust and discovery services for use with Public Key Cryptographic security solutions. These include support for public key registration/revocation, key roaming, key recovery, locating an entity's public key, and validating an entity's public key.

Hierarchical Data Shaping in ASP.NET


Creating Your Own Collection Class
You can create your own collection classes by inheriting from one of the many .NET Framework collection classes and adding code to implement your own custom functionality. In this topic, you will use inheritance to create a simple, strongly typed collection inherited from CollectionBase.
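That topic is .NET-specific, but the same pattern shows up on the Java side as a wrapper that only admits one type (the class below is my own illustration, not from any framework); clients get strong typing without any casts:

```java
import java.util.ArrayList;

// Strongly typed collection by delegation: only Strings can get in,
// and callers get Strings back out without casting.  The custom class
// is also the natural place to hang extra validation or behavior.
class StringCollection {
    private final ArrayList items = new ArrayList();

    public void add(String item) { items.add(item); }

    public String get(int index) {
        return (String) items.get(index); // the one cast, hidden here
    }

    public boolean remove(String item) { return items.remove(item); }

    public int size() { return items.size(); }
}
```

The .NET CollectionBase approach uses inheritance where this uses delegation, but the payoff is the same: the compiler rejects wrong-typed elements at the collection's boundary instead of a cast failing deep inside client code.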

Tuesday, July 30, 2002
XML Schema derivation by extension superfluous? [via reinacker]
Derivation by restriction makes more sense.
D-B-R better captures the notion of subtyping and de-generalization in an XML world than D-B-E does.
The way D-B-R works is consistent with respect to both simple and complex types.
D-B-R also introduces fewer bad OO artifacts into the XML model.
In short, the more I look at it (D-B-R), the more I like it.

Pierre-Antoine Champin's Amazing RDF Tutorial
This document is a presentation of the Resource Description Framework (RDF) recommended by the World Wide Web Consortium (W3C) to model meta-data about the resources of the web. It is described in both documents [1] and [2]; the former focuses on syntactical aspects while the latter addresses the definition of vocabularies (often named schemas). Though this is a pragmatic approach, I will use a slightly different plan, more suited to my own goals (using RDF in knowledge representation systems). The first section will describe the RDF model, which is its fundamental syntax. The second section will present the semantic aspects of RDF, the concepts and the corresponding vocabulary. The last section will describe the XML syntax proposed in [1].

Jesse Liberty: When to use IDispose [via wrinkled paper]
I'd like to open a discussion of when to use IDispose (a quick search of the archives shows that this has not been discussed recently).

It is my theory that this idiom will be vastly overused.


Fowler's Patterns of Enterprise Architecture [via joe's jelly]
One of the recurring themes of my career is how to structure sophisticated enterprise applications, particularly those that operate across multiple processing tiers. It's a subject I discussed briefly in a chapter in Analysis Patterns, and one that I looked at in more depth in many consulting engagements. In the last few years I've been doing a lot of work with Enterprise Java: but there's no surprise that many of the ideas and patterns involved are exactly the same as those that I've run into earlier with CORBA, DCE, and Smalltalk based systems.

Commonality: Ignored Parameters

Here's an excellent response to Ruby's Expect More essay which clearly explains the opposite view. Good stuff. #

Monday, July 29, 2002
What do HTTP URIs identify?
Heh. It's ironic that TBL posts this the same day I reach the conclusion that XML documents are not really documents. Personally, I'm not really convinced. A couple of years ago it might've made sense to claim that all URIs label documents, because essentially all content was meant for humans and the only valid operations were the universal document operations READ (GET), WRITE (PUT), and DELETE (DELETE) and (the catch-all hack) POST. But this is no longer the case. If you restrict http URIs to only referencing documents you're implicitly restricting the set of safe operations to the universal document operations. Since the entire point of web services is to get away from this, it seems backwards to maintain this point of view. (That being said, I do think it's in poor taste to use http URIs to label real-world objects. Use another scheme or URNs.)

The fundamental issue underlying many of the current debates on the semweb is the Document/Object disconnect. The question that must be answered is: when is a document not a document? Personally, I think a document is no longer a document as soon as you try to perform any operation besides READ, WRITE, and DELETE and expect the operation to succeed. As soon as you begin using operations outside of this fundamental set you've entered object land. URIs that you can POST to are not documents. Even XML documents, which expose the operation PARSE, may not be documents. While many may pine for a document model, it's possible that, with the introduction of XML Schema and the way it's infiltrated all the other specs, it may be too late.

I think at the end of all this debate people are simply going to have to accept the fact that there is nothing universal about URIs. In some contexts (within some RDF vocabularies) a URI will mean the W3C organization, and in other contexts it will mean the homepage of the organization. Alas, everything will depend on context.

Sam Ruby: Expect More
That is to say that in the context of web services, one should view WSDL and XML schema as proscriptive (i.e., if you format a message to these specifications, it will be accepted) as opposed to restrictive (i.e, the only messages that will be accepted are those that conform to these specifications).

Note: I am not suggesting anything as crazy as that it is OK to substitute a Purchase Order for a Medical Record. Merely that it should not be a given that every extension to a SOAP message, however minor, requires existing clients and servers to be invalidated.

Very interesting. There's an implicit acknowledgement that the only truly safe operations that can be performed on an XML document are READ, WRITE and DELETE. Everything else is just bubblegum luck. The real question: will this scale? If web services are expected to take schemas as purely proscriptive will they ever truly be secure?

It all makes me uncomfortable. We want the flexibility of documents but the expressiveness of objects. Can they be reconciled? #

XML Schema vs RELAX NG

RELAX NG is much simpler than XML Schema, but they do not solve the same problems. XML Schema does much more. While RELAX NG provides a simple, document-centric way of placing basic constraints on a document's syntax, XML Schema enters the realm of defining the set of safe operations that can be performed on an XML document. #

Perry on Schemas and Semantics
Yes, one wants only syntax and data, but because "sending an XML document containing raw customer data without any extra semantic information is as useless as sending a fax without a cover sheet" you feel obliged to provide "enough identifying information that it is clear how the data should be processed" and you therefore see the 'problem' as how to "indicate to the final receiver of the message what to do". In my opinion, the problem is that the addressee of your message is not competent to process it, or at least you lack confidence that it will do so with the expertise you desire. Yet surely the point of a web of distributed services is that those services, or at least such of them as you would use, are specifically expert in executing particular processes. It is for that specific expertise that you make use of those services rather than, say, creating analogous processes yourself to execute against your data in private within your own narrowly defined department or organization. With trusting such processes to bring such specific expertise to the handling of your data unavoidably comes the need to trust those processes to perform as they perform, precisely because you concede their expertise. Your fussy attempts to direct or control those processes in the execution of their own expertise are therefore at best otiose and in any case betray a fundamental lack of understanding of why you are sending your data in a message to them.

I'm thinking more from the receiver's point of view. How can the receiver of an arbitrary XML document know which operations can be safely performed on/with/against that document?

More from Vasters on Schema and Semantics

Technically, Vasters is incorrect. Neither the XML Schema Rec nor the Namespaces Rec says that a schema or a namespace carries any sort of semantic meaning, and it may be irresponsible to assume that they do at this point in time. There is no way to know for certain that two elements with the same type name from two different namespaces have any sort of meaningful relationship--they might signify the same 'object' or they might not. Namespaces cannot be (safely) used to distinguish between 'my thing' and 'your thing'--all they can do is distinguish elements. They are a document-centric mechanism.

That being said, I think Vasters is essentially right, because namespaces will not survive much longer in their current document-centric form. There is a fundamental disconnect between namespaces and XML Schema. I also have serious doubts that namespaces will scale nicely; in particular, the requirement that every element within a namespace with a given name have the same content model is unreasonable. In the future namespaces will probably evolve beyond their role of distinguishing element and attribute names to distinguishing collections of elements (ie objects), and once this happens it'll be possible to make metadata assertions about the different objects within the namespace (ie RDF), at which point namespaces will have real semantic meaning.

Vasters also comments on the creation of semantic metaschemas. This too makes a lot of sense; in the future it'd make sense to annotate schema with metadata assertions so that if an application encounters elements from different schema/namespace combinations it can still operate safely knowing those elements 'mean' the same thing. #

Are XML documents really documents? And what's this got to do with SOAP and REST?

A few thoughts bouncing around my head. This is all highly speculative.

1. What is an 'object'?

An object is a representation of a problem-space entity which exposes operations that (should) correspond to actions that can be performed on or by the problem-space entity. The operations exposed by any given object are highly dependent upon the context of the object (see 'p. interface to context dependency'). The operations exposed by an object completely define the meaning of the object (see 'p. interface as behavior').

2. What is a 'document'?

A document is an entity which exposes three basic operations: READ, WRITE, and DELETE. The READ operation is an idempotent operation which returns the contents of the document as a sequence of characters. No meaning can be inferred from this sequence (see 'p. semantic transparency')--they are essentially one long string known as the 'document string'. The document string cannot be decomposed into substrings in a meaningful manner because the act of decomposition requires knowledge of the document (and its contents) which exceeds its nature as a document (see 'p. document granularity').

3. Are 'documents' also objects?

All documents in all contexts expose the basic API of READ, WRITE and DELETE, and regardless of the context the meaning of these operations is consistent. Documents therefore violate p. interface to context dependency. Further, no knowledge of the contents of a document can be gained or inferred by examining its interface, so documents violate p. interface as behavior. Documents are not objects.

4. Are XML documents 'documents' or 'objects'?

It's not completely clear.

All XML documents do expose the basic document API of READ, WRITE and DELETE immediately violating the definition of object.

XML documents do impose structure on the document string. It is possible to decompose the document string of an XML document in a meaningful manner without knowledge of the document contents by decomposing the document string into a sequence of nodes. XML documents therefore seem to violate document granularity.
But it should be clear that decomposition of an XML document into a sequence of nodes is not a meaningful decomposition precisely because it does not require knowledge of the document contents. The sequence of nodes is still 'meaningless' because no new information can be gained from the contents of each node or from the index of each node in the node sequence.

The decomposition of XML document strings into node sequences can then be interpreted as either an extension of the core document API with the addition of a new operation, PARSE, or as a pseudo-arbitrary operation that can be performed on the results of the READ operation. Either way, XML documents don't violate either document granularity or semantic transparency and so are documents.

5. If XML documents are documents how can they be processed?

The document API has tremendous power precisely because it is intuitive and context independent. But this context independence is precisely what limits its expressiveness and utility. If XML documents are treated as documents, and the only safe operations that can be performed upon them are READ (PARSE), WRITE, and DELETE, then the utility of XML is severely limited.

In truth, XML documents were, from the start, meant to be processed by machines engaging in an infinite array of context dependent activities. These machine agents are all designed to handle particular XML documents which conform to a specific, well-defined structure and, given a structurally compliant document, these machine agents extract context-dependent information from the XML document. This extraction process requires knowledge of the XML document which not only exceeds knowledge of the document but often exceeds the knowledge in the document. Eg, in order for a machine agent to extract a date from an XML document it must know beforehand that a specific node in the node sequence represents a date and in order to convert the character contents of this node into a date it must be aware of grammar and syntax rules which are most likely not stored in the document.

Thus machine agents consistently perform unsafe, invalid operations on XML documents in order to perform a given context-dependent activity. Every machine agent which performs operations on an XML document that aren't part of the universal document interface, or cannot be derived from the universal document interface (the ARCHIVE operation, for example, can be derived from READ and WRITE), is violating the document nature of an XML document.

6. But weren't XML documents meant to be consumed by machine agents?

While this has never been explicit, it's widely accepted that XML documents are meant to be consumed by machine agents and not human beings. It's also understood that these machine agents would all be engaging in context-dependent activities.

At this point, the document nature of XML documents should be questioned.

First, it should be clear that the valid operations for any given XML document will often exceed the universal document operations. It should also be clear that the valid operations for any given XML document are context dependent and depend largely on when, where, and by what agent the document is consumed, as well as on the contents of the document itself. XML documents therefore guarantee both p. interface to context dependency as well as p. interface as behavior.

7. What's an implicit schema?

Any XML document which is consumed by a machine agent in such a manner that the machine agent performs context-dependent operations on the document (ie operations not within the universal document interface) conforms to a schema. Regardless of whether this schema exists or is declared, its existence is implied by the machine agent's processing logic and it can be derived from this processing logic.

8. What's this got to do with strong typing?

Any XML document which conforms to a schema (and virtually all do, explicitly or implicitly) is considered to contain strongly typed data. Strongly typed data provides the bridge between 'data' and 'information'. Any data which is or can be strongly typed implicitly defines non-document operations which can be carried out against that data. For example, if a set of data within an XML document is defined to be a sequence of integers then it can be sorted. If the contents of an XML attribute is defined to be a 'flight number' then operations such as 'book flight', 'cancel flight', 'retrieve flight information' can be carried out against the contents of that XML attribute in addition to the basic operations, READ, WRITE and DELETE.
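The integer example can be made concrete; a small Java sketch using JAXP (the element names and document shape are invented for illustration). The point is that the "typing" that makes SORT a valid operation lives in the consuming code, not in the document itself:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Read as a plain document, <n> elements are just character data.  Read
// as typed data -- integers -- a new operation, SORT, suddenly becomes
// valid.  That schema knowledge is supplied here, not by the document.
class TypedNodes {
    public static List sortedInts(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("n");
        ArrayList ints = new ArrayList();
        for (int i = 0; i < nodes.getLength(); i++) {
            // Converting the text node to an int is exactly the extra
            // knowledge a pure 'document' reader would not have.
            ints.add(Integer.valueOf(nodes.item(i).getFirstChild().getNodeValue()));
        }
        Collections.sort(ints);
        return ints;
    }
}
```

Feed it `<s><n>3</n><n>1</n><n>2</n></s>` and you get the list [1, 2, 3]; feed the same string to an agent without this implicit schema and all it can safely do is READ, WRITE, or DELETE it.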

9. So XML documents are not documents but in fact objects?

Yes. Because XML documents are meant to be consumed by machine agents engaging in context dependent activities they guarantee both p. interface to context dependency as well as p. interface as behavior.

Further, machine agents perform operations on XML documents which exceed the universal document operations, and these operations can only be (safely) performed by violating the semantic transparency of XML documents and decomposing the document string in a truly meaningful manner. In English: XML documents may in fact be semantically transparent, but nobody treats them that way. For even trivially complex processing of an XML document, knowledge is required not only about the structure and contents of the document but also about information not contained in the document, eg how to parse a string into a date. Therefore XML documents violate p. document granularity.

10. So what does this all have to do with SOAP and REST?

Proponents of REST seek to build XML-based, distributed systems that fully exploit the universal document operations. Whenever possible, RESTians advise using the universal document operations READ (analogous to HTTP GET), WRITE (analogous to HTTP PUT), and DELETE (analogous to HTTP DELETE) to express context-dependent meaning and semantics. The use of HTTP POST is allowed, but because HTTP POST violates the document metaphor (and comes dangerously close to the object metaphor) its use is advised against whenever possible. The primary value proposition of REST architectures lies in abstracting complex systems behind a common API (the HTTP protocol) which is essentially identical to the universal document operations. This increases simplicity, scalability, and interoperability.
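The mapping of the universal operations onto HTTP verbs can be sketched with Python's standard library; the URL is a placeholder and no request is actually sent here.

```python
import urllib.request

url = "http://example.com/orders/42"  # placeholder resource URL

# READ, WRITE, and DELETE map directly onto HTTP methods; the same
# small interface covers every resource in the system.
read   = urllib.request.Request(url, method="GET")
write  = urllib.request.Request(url, data=b"<order/>", method="PUT")
delete = urllib.request.Request(url, method="DELETE")

for req in (read, write, delete):
    print(req.get_method(), req.full_url)
```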

Proponents of SOAP treat XML simply as a serialization format for well-defined, semantically understood objects. Because of this, documents within SOAP are essentially treated as either objects or as interactions between objects (request or response). Unlike REST, SOAP systems and objects within SOAP systems expose operations which far exceed the universal document operations--they are said to have 'many verbs'. While this makes such systems arbitrarily complex (i.e., a SOAP system is as complex as the problem domain it is modeling), it allows such systems to expose interfaces which are very expressive and therefore more intuitive, particularly to those already familiar with Object Oriented Analysis and Design. It should also come as no surprise that many SOAP systems are heavily dependent upon XML Schema to provide data binding (automatic conversion) between XML and objects.
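A SOAP message makes the extra verbs explicit in the body. Here is a minimal Python sketch of building one; the operation name and flight number are invented, not a real service.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
ENV = "http://schemas.xmlsoap.org/soap/envelope/"

# The body names an operation ('verb') directly -- this is the object
# metaphor: the message is a method call, not a document.
envelope = ET.Element("{%s}Envelope" % ENV)
body = ET.SubElement(envelope, "{%s}Body" % ENV)
call = ET.SubElement(body, "cancelFlight")
ET.SubElement(call, "flightNumber").text = "UA100"

xml_bytes = ET.tostring(envelope)
print(xml_bytes.decode())
```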

It is not clear that either SOAP or REST is the optimal solution for building XML-based, distributed systems. Both suffer from many weaknesses.

REST systems seem to be much harder to architect: because all entities must be constrained to the document metaphor, they lack expressiveness. REST systems also tend to treat XML documents as documents, which is rarely how they are actually used. Because only the basic document operations are allowed, REST systems tend to be both verbose and over-architected.

Meanwhile SOAP systems, because they lean towards basic RPC, may not scale well as the number of verbs and objects within the system increases without bound. Further, because SOAP systems can be decomposed into objects, they suffer from all of the reuse problems which plague object-oriented systems. One common complaint about such systems is that the names of identical operations may be completely different between contexts, e.g., the universal GET document operation may be 'GET' in one context, 'READ' in another, and 'RETRIEVE' in yet another.

11. Well, which one should I use? SOAP or REST?

The answer is not clear. Ultimately, it may simply be a question of whether your system is exchanging documents or objects. If, for example, you believe a purchase order is a document exchanged between businesses engaged in a negotiation process then a REST-style approach may be more amenable or elegant. On the other hand, if you believe a purchase order is an object which exposes operations such as cancelPO, completePO, and signPO, then a SOAP-style system may be easier to implement. It is of course also the case that most business processes have already been decomposed into object systems and so it will always be easier and cheaper to translate such processes into SOAP-style systems.
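The two readings of a purchase order can be caricatured in a few lines of Python; every name below is invented for illustration.

```python
# REST style: the PO is a document stored at a URL; state changes by
# replacing the representation with a generic WRITE (an HTTP PUT).
store = {"/po/1234": "<po status='open'/>"}
store["/po/1234"] = "<po status='cancelled'/>"   # the PUT

# SOAP style: the PO is an object exposing domain-specific verbs.
class PurchaseOrder:
    def __init__(self, number):
        self.number, self.status = number, "open"
    def cancelPO(self):
        self.status = "cancelled"

po = PurchaseOrder(1234)
po.cancelPO()
print(store["/po/1234"], po.status)
```

Same end state in both cases; the difference is whether the interface is one generic write or a vocabulary of verbs.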

12. It's worse than it appears, isn't it?

Yes, most likely it is much worse. The fundamental ambiguity concerning the document/object nature of XML documents may have serious repercussions across all applications of XML, including distributed systems. The most obvious example of this is the conflict that exists between the XML Schema Recommendation and the XML 1.0 Recommendation, XML Namespaces Recommendation, and RDF Recommendation. The Schema recommendation seems to impose an object and type system on XML that directly violates the document-centric model assumed by all the other recommendations. #

Tim Bray On Syntax and Semantics:
Good heavens, is someone arguing that markup or meta-markup have semantics? Seems to me that semantics lives in the human mind and executable computer programs; one of the virtues of descriptive markup a la XML is that you get to choose what semantics you want to apply.

This seems incorrect. If semantics are located entirely within the consumers of the document, then what can it mean for a document to be 'valid'? If any given consumer can apply any given semantics to a document, then the document must be inherently meaningless and no universal notion of validity can be applied to it. Consumer A's semantics, after all, are just as good as Consumer B's, even though they may be total opposites. But we speak of valid and invalid documents all the time; this is the crux of all schema mechanisms. If a document declares a schema, then a universal notion of validity exists for the document, the set of safe operations that can be performed on the document is restricted, and, most importantly, certain semantics are implied when processing the document, e.g., 'treat the contents of this element as a number'.
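The point can be reduced to a toy sketch in Python, where the 'schema' is just a dict mapping element names to declared types: the declaration is what licenses semantics beyond raw text.

```python
# A minimal sketch of the argument; element names are invented.
schema = {"price": int}   # declared: <price> holds an integer

def typed_value(tag, text):
    # Without a declaration, the only safe reading is the raw string.
    return schema.get(tag, str)(text)

# A declared type licenses numeric semantics...
print(typed_value("price", "30") + typed_value("price", "12"))  # 42

# ...an undeclared element stays opaque text.
print(typed_value("note", "30") + typed_value("note", "12"))    # 3012
```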

Sunday, July 28, 2002
Ogbuji on Markup and Datatyping

A clear statement that addresses the heart of the problem of datatyping (and thus Schema and thus SOAP). #

Can XML Be The Same After W3C XML Schema?

Great article. As I delve more and more into XML Schema it becomes apparent to me that XML Schema alters the definition of XML to such an extent that the two recommendations can't coexist. The XML 1.0 recommendation is essentially invalidated by the Schema rec. By itself this isn't interesting but it has big consequences for the SOAP/REST split. #

XML Data-Binding: Comparing Castor to .NET
I'd like to show you that, but first I've got to lay some groundwork. In this article, I will show how .NET XML data binding works, while investigating the equivalent Java functionality. Java and .NET both have excellent support for data binding, and although they work in slightly different ways, each is just as valid and useful as the other.
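What Castor and .NET do at scale can be shown in miniature in Python; the element and field names here are invented, and real binders generate this mapping from a schema rather than writing it by hand.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# The target of the binding: a plain typed object.
@dataclass
class Flight:
    number: str
    seats: int

def unmarshal(xml_text: str) -> Flight:
    # Data binding: map typed document content onto object fields.
    el = ET.fromstring(xml_text)
    return Flight(el.findtext("number"), int(el.findtext("seats")))

flight = unmarshal("<flight><number>UA100</number><seats>120</seats></flight>")
print(flight)  # Flight(number='UA100', seats=120)
```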

W3C XML Schema Made Simple

Everything in this article is right on except for the warning on complexTypes. complexTypes and extension of complexTypes by xs:extension are a major factor of the power of schemas and shouldn't be avoided. He's right about xs:restriction though; it's useless. #
