Collaborative Authoring on the Web: Introducing WebDAV
1998; Association for Information Science and Technology; Volume: 25; Issue: 1; Language: English
DOI: 10.1002/bult.107
ISSN: 2163-4289
Topic(s): Web Data Mining and Analysis
Abstract

The irony was intense. The frustrating, awkward nature of collaborative authoring over the Internet became increasingly evident as draft after draft of the WebDAV Distributed Authoring Protocol specification was edited, spelling out how collaboration could be much easier and much more fluid if it leveraged the Web's standard infrastructure. The collaboration scheme the authors used when writing the specification was typical: each draft required an author to make revisions, then e-mail the changes back to the other authors. If an author's e-mail system was bogged down, the draft might not resurface for hours, occasionally costing a work day. Once the draft was received, it went into a directory filled with other similarly named revisions of the document, making it tough to pick out a given revision even a few weeks later. When it was OK for another author to modify the document, the current holder e-mailed, "you have the torch," but sometimes forgot, leading to confusion and either lost time or lost changes.

Meanwhile, with each successive revision the WebDAV Distributed Authoring Protocol provided a clearer picture of a better way. Instead of passing documents back and forth via e-mail, edit them in-place at a URL. Instead of "passing the torch," a locking mechanism prevents overwrite conflicts and permits lock discovery. Since the document is always accessible via its URL, there is no lost time due to e-mail delays, and other collaborators can view progress as it is made. Alas, when writing the WebDAV specification, WebDAV technology was clearly needed.

WebDAV provides many benefits:

Seamless, interoperable publishing of all content to the Web. Now individuals and workgroups can use the Hypertext Transfer Protocol (HTTP) to directly publish their work to the Web.

Workgroups can collaboratively author documents in-place on the Web, using locking to prevent overwrite conflicts.
Due to the distributed nature of the Web, these workgroups can have members from within the same organization or from across organizational boundaries.

WebDAV has no restrictions on the type of documents which can be authored. WebDAV provides HTML and XML authoring support, but it just as easily supports authoring of existing word processing, spreadsheet, text, graphics, and all other formats.

WebDAV and HTTP provide a common interface to a wide range of repositories, such as document management, configuration management, file systems, databases, etc. In essence, WebDAV makes the Web look like a large-grain network-accessible file system.

WebDAV allows embedded devices to safely write to the Web, enabling a whole new class of Web-enabled device. For example, imagine a WebDAV-enabled digital camera with a built-in cellular modem. As soon as a picture is taken with the camera, it is transferred to a Web server using WebDAV and is immediately available on the Web.

More specifically, the WebDAV Distributed Authoring Protocol defines a set of extensions to the base Hypertext Transfer Protocol for the following capabilities:

Overwrite prevention. Keeping more than one person from working on a document at the same time. This prevents the "lost update problem," in which modifications are lost as first one author, then another, writes changes without merging the other author's work.

Properties. Creation, removal and querying of information about Web pages, such as their author, last modified date, etc. Also included is the ability to make hypertext links between pages of any resource type.

Name space management. Creation, removal and automatic consistency maintenance of collections containing sets of resources. Also, the ability to copy and move Web pages and to receive a listing of resources in a collection (similar to a directory listing in a file system).

Current work-in-progress within WebDAV focuses on these additional capabilities:

Version management. The ability to store important revisions of a document for later retrieval. Version management can also support collaboration by allowing two or more authors to work on the same document in parallel tracks. Automatic versioning records successive modifications to a resource made by versioning-unaware ("downlevel") clients.

Advanced Collections. Similar to a symbolic link in a file system, the ability to add a referential member to a collection which can point to any resource on the Web. Additionally, ordered collections allow a client to specify a persistent ordering of resources in a collection.

Access Control. The ability to limit the access rights of a given authenticated principal on a given resource. WebDAV assumes the existence of, but does not specify, strong authentication technology.

A strongly related effort to WebDAV is the DAV Searching and Locating (DASL) group, which is working to develop an interoperable means of searching a repository which is compliant with the WebDAV object model and which organizes its resources into URL hierarchies. The main capability of DASL is:

Searching. Client-specified, server-executed queries to locate resources based upon their property values and text content.

From its inception, the WebDAV working group of the Internet Engineering Task Force (IETF) has been working steadily to produce an interoperability specification which defines HTTP methods and their semantics for the above capabilities.
Work in this direction has focused on three documents: A scenarios document [Lass97], which gives a series of short descriptions of how distributed authoring and versioning functionality can be used, typically from an end-user perspective; A requirements document [SVWD97], which describes the high-level functional requirements for distributed authoring and versioning, including rationale; A protocol specification [GWF+98], which describes new HTTP methods, headers, request bodies and response bodies, to implement the distributed authoring and versioning requirements. Though it is an IETF working group, and hence has no official affiliation with the World Wide Web Consortium (W3C), WebDAV does work cooperatively with the W3C, which provides technical assistance and help in contacting interested people within the Web community. DASL is currently in process of becoming an IETF working group and is working to develop extensions to the WebDAV Distributed Authoring Protocol specification (and hence to HTTP) for searching WebDAV repositories. DASL has its own requirements document [RS98] and protocol document [RJR+98], which are still the subject of intense effort within the DASL group. The remainder of this article provides a detailed overview of the capabilities in the base WebDAV Distributed Authoring Protocol. Note that throughout this article the term resource is often used. Resource is the proper Web terminology for any piece of information, such as a Web page, a document, a bitmap image or a computational object which is stored on a Web server and whose location is described by a Uniform Resource Locator (URL) [BFM98]. The WebDAV Distributed Authoring Protocol contains a set of features which can be used in a wide variety of settings by applications which support collaborative work on remotely authored documents. These features can be partitioned into three groups: overwrite protection, properties (metadata) and namespace management. 
A detailed overview of these capabilities is presented in the sections below.

[Figure 1: HTTP Clients]

Bulletin of the American Society for Information Science

Once two or more people start collaborating on the same document, the issue of write control comes to the fore. If everyone can write to the same, unversioned document, then it is possible to lose changes made by one or more contributors as first one collaborator, then another, writes their changes without first merging in previous updates. There are many techniques which can be used to alleviate this "lost update" problem. Several of the more common ones are:

POTS (plain old telephone service) or "over-the-wall" control. This scheme is a social convention in which collaborators agree to communicate verbally when one author has finished working, and it is safe for another to begin. E-mail can also be used to implement this write control scheme, as can a physical object (e.g., a baton) which is passed from author to author in environments where authors are co-located.

Shared locks (also known as advisory locks or reservations). In this scheme, an author indicates to the computer controlling access to the document that he or she intends to modify it, and the computer records this author's intent to edit. If another author similarly tries to indicate an intent to edit, the computer will announce that the document is currently being edited. However, if the second author still wishes to edit, it can be done, presumably by contacting the other author to negotiate access or by taking advantage of extra-system knowledge that no conflict will result (e.g., the other author is in a meeting).

Exclusive locking. In this scheme, an author indicates to the computer controlling access to the document that he or she intends to modify it, and the computer responds by locking the document. Once the document is locked, it may not be modified by anyone other than the owner of the lock. Other authors who try to edit the same document are refused, because they do not own the lock.

These schemes vary from least protective and most flexible (POTS) to most protective and least flexible (exclusive locking). Currently, the WebDAV approach is to provide facilities for both shared and exclusive locking. This dual lock support provides sufficiently flexible locks to accommodate a wide range of collaborations. While shared locks best support collaborators who have a lot of awareness of each other's activities, exclusive locks provide a more stringent guarantee of conflict avoidance for less aware collaborators or during periods of high contention for a document.

Locks may have a scope of a single resource, including all non-live properties on the resource, or a hierarchy of resources (for example, a collection and all of its member resources). A lock discovery mechanism (a WebDAV property) allows authors to find out if any locks exist on a Web resource. Since the Web is designed so that no lock is required to read a Web page, there is no concept of a read lock. An implication of this is that the contents of a resource may change without warning if a write lock is not owned on the resource.

Locking usually comes paired with event notification capability, so that other collaborators can be automatically informed by the system when a lock has been released. Notifications are an important mechanism by which collaborators become aware of each other's activities and may occur at multiple levels of granularity. Events with a grain size of an entire resource, such as a lock being granted or released, provide document access awareness information, while sub-resource events, such as a word being inserted into a paragraph, can lead to authoring tools which support multiple authors simultaneously working in the same document.
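The shared and exclusive locking schemes described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the protocol itself: the class `LockableResource`, its method names and the author names are hypothetical, no HTTP is involved, and shared locks are simplified so that a second author proceeds by taking a shared lock of their own.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class LockableResource:
    """Toy stand-in for a Web resource with write locks (hypothetical model)."""
    url: str
    exclusive_owner: Optional[str] = None
    shared_holders: Set[str] = field(default_factory=set)

    def lock_exclusive(self, author: str) -> bool:
        """Grant an exclusive lock only if no other author holds any lock."""
        held_by_others = (
            self.exclusive_owner not in (None, author)
            or bool(self.shared_holders - {author})
        )
        if held_by_others:
            return False
        self.exclusive_owner = author
        return True

    def lock_shared(self, author: str) -> bool:
        """Grant a shared lock unless another author holds an exclusive lock."""
        if self.exclusive_owner not in (None, author):
            return False
        self.shared_holders.add(author)
        return True

    def unlock(self, author: str) -> None:
        """Release every lock the author holds on this resource."""
        if self.exclusive_owner == author:
            self.exclusive_owner = None
        self.shared_holders.discard(author)

    def discover_locks(self) -> Dict[str, object]:
        """Lock discovery: report who currently holds which kind of lock."""
        return {
            "exclusive": self.exclusive_owner,
            "shared": sorted(self.shared_holders),
        }

    def may_write(self, author: str) -> bool:
        """There is no read lock; only writes are checked against lock holders."""
        if self.exclusive_owner is not None:
            return author == self.exclusive_owner
        if self.shared_holders:
            return author in self.shared_holders
        return True  # no lock at all: anyone may write, so updates can be lost


doc = LockableResource("http://example.org/draft.html")  # placeholder URL
doc.lock_exclusive("alice")  # alice now owns the only write lock
doc.lock_exclusive("bob")    # refused, because alice still holds the lock
```

Keeping lock discovery as a plain query mirrors the text's point that an author can inspect existing locks before deciding whether to negotiate access.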
Although WebDAV has decided against developing an interoperability standard for Web-based notifications, the recent Workshop on Internet Scale Event Notifications (WISEN) held at the University of California at Irvine in July, 1998 (for details see: http://www.ics.uci.edu/IRUS/wisen/), and the Event Notification Service BOF meeting held at the Chicago IETF meeting in August, 1998, are strong indicators that standardization work may soon begin in this area.

Information on the Web has many pieces of associated information, such as the title, subject, creator, publisher, length and creation date. This information about information (called properties within WebDAV, but also known as metadata) can be used to search for Web resources, enforce copyrights or provide bibliographic information. Properties are particularly useful in searching for Web resources due to the inadequacies of existing index-based Web search engines, which often return a large number of undesired results to any query. By focusing a search on the value of a particular property (e.g., the author), properties can be used to reduce the number of undesired query results; the DASL effort is concentrating on providing solid support for queries on properties of resources. Development of a useful set of properties is extremely important; one schema, or set of metadata, which was developed to assist Web searching is known as the Dublin Core (for more information, see: http://purl.org/metadata/dublin_core/).

Since other groups have focused on developing metadata sets, the WebDAV group decided to focus on developing facilities for creating, modifying, deleting and retrieving metadata. These facilities allow for the manipulation of metadata from multiple schemas, allowing the schema itself to vary with domain of use. For example, even though the Dublin Core is appropriate for use in the general Web context, it may not be ideal for use in other settings, such as the legal community.
By being metadata schema neutral, the WebDAV approach allows the most appropriate schema to be used in any context. It allows WebDAV to focus on "how," as in how properties are stored and retrieved, rather than on "what," as in what do they mean?

[Figure 2: How An Authoring Client Uses WebDAV]

WebDAV properties are name-value pairs. The name is a Uniform Resource Identifier (URI), such as a URL, and the value is a well-formed sequence of Extensible Markup Language (XML) [BPS98] content. (For more information on URIs, see "An Introduction to the Resource Description Framework" in this issue of the Bulletin.) If, for instance, a property name is a URL, it can be given uniqueness without central registration by using URL property names chosen from within a domain whose name is controlled by the party defining the property. So, for example, a company that controls a given domain name, like "widgets.com," can choose a property name from within this domain, like "widgets.com/properties/color."

An example WebDAV property defined by the Distributed Authoring Protocol is the DAV:getcontentlength property, which gives the length, in bytes, of the response generated by a GET on the resource. The property name is a URI, with a URI scheme of "DAV," which is reserved for use by WebDAV. A sample value of this property is:

Name: DAV:getcontentlength
Value: 3422

In this case, the length is 3422 bytes, and the value is carried inside an enclosing XML element. By convention, the enclosing XML element for a WebDAV property takes the same name as the property itself.

Using XML to encode the value of properties provides three major benefits. First is extensibility. Since all content within XML is encoded between start and end tags, it is easy to add additional elements to a property by inserting new tagged content. Internationalization is the second major benefit.
Since XML mandates support for the UTF-8 and UTF-16 encodings of the ISO 10646 character encoding standard, as well as language tagging, properties can express content in the vast majority of human languages. Finally, by using XML, WebDAV properties can support other metadata activities which are also based on XML, such as the Resource Description Framework (RDF) under development at the W3C. In the current, publish/browse model of the Web, there is scarce need for a user to duplicate or rename Web resources. However, once the Web is used for distributed authoring, the need for these capabilities, plus the ability to get a listing of a directory, becomes extremely important. Being able to discover what resources currently populate a portion of the name space of a Web server and the ability to copy, move and delete these resources, together form the key elements for managing a Web name space. There are several justifications for adding copy and move capability. A resource may need to be copied due to changing ownership, prior to major modifications, or when making a backup. It is often necessary to move (i.e., change the name of) a resource, for example due to adoption of a new naming convention, or if a typing error was made originally entering the name. Copy and move have ramifications with respect to properties: how should properties behave after a copy or a move? It would seem that all properties on the duplicated or moved resource should be identical to the properties on the original. However, there are really two classes of properties: live and static. Static properties have the quality that their value, once set, remains the same until a client explicitly modifies it. Live properties, in contrast, have their syntax and semantics enforced by the server and may vary at any time. One example of a live property is the content length of a resource – every time the resource is updated, the value of the property will also be updated. 
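The difference between static and live properties under a copy can be sketched with a toy model. This is an illustration under assumptions rather than the protocol's actual behavior: `PropertyResource` and `copy_resource` are hypothetical names, the property value "blue" is invented for the example, and the only live property modeled is DAV:getcontentlength, which the sketch recomputes from the body instead of storing.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PropertyResource:
    """Toy resource: a body plus client-set (static) properties (hypothetical)."""
    body: bytes = b""
    static_props: Dict[str, str] = field(default_factory=dict)

    def live_props(self) -> Dict[str, str]:
        """Live properties are computed by the 'server', never stored by clients."""
        return {"DAV:getcontentlength": str(len(self.body))}

    def properties(self) -> Dict[str, str]:
        """All properties visible on the resource, static and live together."""
        return {**self.static_props, **self.live_props()}


def copy_resource(source: PropertyResource) -> PropertyResource:
    """Duplicate the body and the static properties; live ones follow the new body."""
    return PropertyResource(
        body=source.body,
        static_props=dict(source.static_props),  # independent copy of the mapping
    )


original = PropertyResource(b"hello", {"widgets.com/properties/color": "blue"})
duplicate = copy_resource(original)
duplicate.body = b"hello, world"  # updating the copy changes only its live length
```

In this sketch the static color property travels with the copy unchanged, while the content length of each resource tracks its own body.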
WebDAV also attempts to resolve conflicts between the existing properties of a resource being moved and those that may be enforced by the server or directory in which it is to be located. Listing the contents of a collection, an operation similar to listing a directory of a file system, is accomplished using the property retrieval mechanism. Most existing directory listing operations (such as "ls" or "dir") provide the name of a file and an option for retrieving limited sets of properties about the file, such as its size, owner and access permissions. However, since WebDAV has an existing property retrieval mechanism, it made little sense to define another property retrieval operation just for listing a collection. Instead, the existing property retrieval mechanism was used. Since WebDAV property retrieval allows, with a single operation, a hierarchical retrieval of properties on a collection, returning for each resource its name and requested properties, this mechanism has enough expressive power to do double-duty as the "list a collection's members" operation as well. Taken together, the WebDAV extensions to HTTP provide the standard needed to make the Web a writable, collaborative medium. What does this mean? Although the future is notoriously hard to predict, here are some likely outcomes of adoption of WebDAV. As WebDAV technology is deployed, it will initially have its largest impact on small to medium sized workgroups, which homogeneously support DAV, allowing their work practices to coalesce around a local intranet. Over time, as critical mass grows, WebDAV will also dramatically reduce the accidental costs of collaboration between workgroups and between organizations. WebDAV additionally shows significant promise as an infrastructure for development of distributed software engineering environments and other complex information products [FWA+98]. WebDAV in the home will make Web page creation significantly easier, since Web pages will be editable in-place. 
Furthermore, opportunities for collaboration abound in the home: WebDAV will allow school children to collaborate easily on projects and reports, and parents who do volunteer work will find it easier to work on proposals, budgets, schedules and more. By giving more voices access to the global distribution of the Web, and by making it easier to collaborate, WebDAV technologies will have broad social impact. Equally exciting is the unpredictable nature of information technology, such as the unpredicted advent of electronic commerce on the Web. Right now is the ground floor of WebDAV, when future applications are limited only by your imagination. Working groups of the Internet Engineering Task Force are completely open, and may be joined by subscribing to their e-mail discussion list. If you wish to participate in the discussions on WebDAV topics, you may join the mailing list by sending an e-mail with subject "subscribe" to w3c-dist-auth-request@w3.org. The home page for the WebDAV group is at URL http://www.ics.uci.edu/pub/ietf/webdav/ which contains links to current working drafts, e-mail list archives and background material. The related DAV Searching and Locating (DASL) working group has its Web page at URL http://www.ics.uci.edu/pub/ietf/dasl/ and a mailing list which may be joined by sending a message with subject "subscribe" to www-webdav-dasl-request. E. James Whitehead, Jr., is affiliated with the Department of Information and Computer Science, University of California at Irvine, Irvine, CA 92697-3425. He can be reached by e-mail at: ejw@ics.uci.edu; by phone at 949/824-4121; or by fax at 949/824-1715.
Reference(s)