RSS Technology (Part 2)
Q What's the easiest way to set up a Web log?

A The easiest way to set up your own Web log is to go to a site like blogger.com and register as a new user. It provides a Web interface for creating a customized Web log that you can use immediately. There are many other sites like blogging.com that provide support for Web log features.
If you'd like more control over the blogging infrastructure or would like to host your blog on your own server, you can also use one of many blogging applications available today, including Radio Userland, Manila, and Movable Type, some of the most popular commercial products. There are also free .NET blogging applications that are easy to use. The most popular are .Text and dasBlog. To set these up, simply download the bits and follow the instructions. You'll be up and running in minutes.
Functionally, both .NET-based applications are fairly equivalent. However, one major difference is that .Text requires a database, either SQL Server or MSDE, while dasBlog stores everything in XML files (it's based on the original BlogX framework created by some Microsoft developers). Another difference is that .Text is capable of hosting multiple blogs on a single installation (for example, it's what drives http://blogs.msdn.com today) while dasBlog requires multiple installations. dasBlog has one feature that really stands out called "Mail to Weblog", which allows you to post new entries via e-mail.
The new MSDN blogging site and PDC Bloggers are both good starting places for finding Web logs on any software development topic. Simply browse to one of these sites and read their aggregated feeds. Their feeds will expose you to many individual Web logs and over time you'll naturally find some that you like to read more than others. Then, you can subscribe directly to the individual feeds you enjoy most.
For blogs that specifically cover XML and Web services, check out the list on the MSDN Web Services Developer Center. I personally spend a lot of time on some of these Web logs.

Q What's a feed and how can I subscribe to it?

A A Web log can provide a feed to its content by producing an RSS document available via a well-known URL. An RSS document is an XML file that contains a number of discrete news items, such as entries in a Web log (see Figure 1 for a sample RSS feed). As an XML format, RSS is easily consumed by other programs.
An RSS aggregator is a program that reads RSS documents and displays new items. Most aggregators make it possible to subscribe to a feed by simply entering the URL to the RSS document.
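To make the idea of an aggregator concrete, here is a minimal Python sketch, assuming an RSS 2.0-style feed held in a string. The sample feed, the example.org URLs, and the function name new_items are hypothetical, and a real aggregator would also download the document and persist the set of seen links.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content; a real aggregator would download this from the feed URL.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Web log</title>
    <link>http://example.org/</link>
    <description>A hypothetical feed</description>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_links):
    """Return (title, link) pairs for items whose link is not in seen_links.

    Links of returned items are added to seen_links, so calling the
    function again with the same set reports nothing new.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        if link not in seen_links:
            fresh.append((item.findtext("title", default=""), link))
            seen_links.add(link)
    return fresh


seen = {"http://example.org/1"}  # pretend the first post was read earlier
print(new_items(SAMPLE_FEED, seen))
```

Keying on the item link is only one possible choice; feeds that reuse links would need a different identity.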
RSS makes reading Web logs easy. Most developers who frequently read Web logs use an aggregator of some sort to help them sift through their subscriptions efficiently. An aggregator makes reading Web logs feel a lot like reading e-mail, since it highlights new items and caches items for offline reading (see Figure 2).
There are also some online RSS aggregators that consolidate your RSS subscriptions on a separate Web site. This approach has the advantage of being easy to set up and you can access your subscriptions from any computer. The downside, of course, is that you have to be connected to do any reading.
RSS is ultimately what's made blogging such a powerful new form of communication. Before blogging, most developers spent a lot of time sifting through boring, irrelevant posts in order to discover the rare gems that would occasionally appear from people they respect. Blogging puts readers back in control by allowing them to choose which feeds to read, effectively building their own personalized content streams.
Other types of sites can also take advantage of RSS to syndicate content. For example, most of the major news sites including Wired, CNet, Yahoo!, and NPR News provide RSS feeds. Check out Blogdigger and Syndic8 to find sites that support RSS.
At Microsoft, MSDN provides RSS feeds to syndicate new technical content as it's added to the site. The MSDN Just Published feed is a great way to keep up with new MSDN articles and downloads. Even MSDN Magazine has its own RSS feed! Subscribe to http://msdn.microsoft.com/msdnmag/rss/recent.xml to receive a monthly update on what's in the current issue.
There are many RSS aggregators to choose from today. You can find a fairly complete list at http://blogs.law.harvard.edu/tech/directory/5/aggregators. Some of these are online aggregators while others are desktop applications. Some are free while others charge a fee.

Q Which RSS version is the most current?

A The answer depends on who you ask. There have been several versions of RSS including 0.90, 0.91, 0.92, 0.93, 0.94, 1.0, and 2.0. Making sense of these different versions has been one of the biggest challenges. Understanding them requires a bit of history.
Netscape created the original version of RSS, 0.90, which stood for "RDF Site Summary" or "Rich Site Summary" (the spec says the former was the official name). Netscape invented RSS 0.90 for use in their Web portal activities, but others latched onto the concept and saw more potential uses. Userland Software was one of the first to begin using RSS commercially in their Web log products.
Version 0.90 was heavily based on the W3C's Resource Description Framework (RDF). Many considered the RDF approach overly complex, so a simplified RDF-free version was proposed and labeled 0.91. It was around this time that control of 0.91 passed to Userland Software. Userland Software continued to evolve the simplified spec with several new versions including 0.92, 0.93, and 0.94. To emphasize its focus on simplicity, Userland wanted RSS to stand for "Really Simple Syndication."
As Userland Software continued with their focus on simplicity, another group of developers resurrected the original RDF version (0.90) because RDF promised them more flexibility. They eventually published RSS 1.0, which officially stands for "RDF Site Summary" again. This version is fundamentally different from those controlled by Userland Software because it uses RDF while the others don't. Userland Software didn't like the fact that RSS 1.0 seemed to displace RSS 0.94, so it shipped a new version and bumped the version number up to 2.0.
And that's where it stands today. The split that occurred left two major competing versions: one that's based on RDF (1.0) and one that isn't (2.0), but they both share the same name. This is terribly confusing since the version numbers lead you to believe that 2.0 is an improvement on 1.0 when in reality they're completely different specifications with different goals. Another group of developers has been working to resolve this confusion once and for all by defining a new syndication specification that breaks free from the RSS name. They're calling it Atom, a project I'll discuss in more detail later in this column.
It doesn't matter much which version you use. Most RSS aggregators support all RSS versions (and some even support Atom) without a glitch. The decision mostly comes down to whether you want to use RDF, which is typically fueled by one's belief in the concept of the Semantic Web.

Q What do RSS 1.0 and 2.0 look like?

A The RSS 1.0 and 2.0 formats contain the same core information, but they're structured differently. I've provided a sample RSS 1.0 document (see Figure 1) and the equivalent RSS 2.0 document (see Figure 2) for you to look over.
You'll notice the differences start right at the top with the root element. In RSS 1.0, the root element is rdf:RDF, and in RSS 2.0 it's rss. The rss element also contains a mandatory version attribute to indicate the precise RSS format in use (possible values include 0.91, 0.94, and so forth). Another major difference is that RSS 1.0 documents are namespace-qualified, while RSS 2.0 documents are not. The information contained in both documents, however, is essentially the same.
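The root-element difference described above is enough to tell the two families apart in code. The following Python sketch assumes the feed is already available as a string; the function name detect_format and the returned labels are hypothetical. Note that any RDF-rooted feed (including the older 0.90 format) would be reported as RDF-based.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the rdf: prefix used by RSS 1.0 documents.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def detect_format(feed_xml):
    """Classify a feed by its root element, as described in the text."""
    root = ET.fromstring(feed_xml)
    if root.tag == "{%s}RDF" % RDF_NS:
        return "RDF-based RSS"
    if root.tag == "rss":
        # The version attribute names the exact RDF-free format in use.
        return "RSS " + root.get("version", "unknown")
    return "unknown"


print(detect_format('<rss version="2.0"><channel/></rss>'))
```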
Both versions contain channel elements. A channel element contains three required elements: title, description, and link, as illustrated in the following code:
<channel>
<title><!-- the channel's title --></title>
<description><!-- a brief description --></description>
<link><!-- the channel's URL --></link>
<!-- optional/extensibility elements go here -->
</channel>


In addition to these required elements, RSS 1.0 defines three additional elements: image, items, and textinput, where image and textinput are optional. RSS 2.0, on the other hand, provides 16 additional elements including image, items, and textinput. Examples of these include language, copyright, managingEditor, pubDate, and category. RSS 1.0 allows for making this type of metadata available through extensibility elements defined in separate XML namespaces.
The main structural difference between the two formats has to do with the representation of item, image, and textinput nodes. In RSS 1.0, the channel element contains references to item, image, and textinput nodes that exist outside of the channel itself. This establishes an RDF association between the channel and the referenced node. In Figure 1, the channel element is associated with an image element and two item elements. In RSS 2.0, the item elements are simply serialized in the channel element (see Figure 2).
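The structural difference can be sketched in Python as follows. In the RSS 1.0 branch the code follows the channel's references to item nodes that sit outside the channel; in the RSS 2.0 branch it simply reads the items nested in the channel. Both sample documents and the function name item_titles are hypothetical and much smaller than real feeds.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RSS1 = "{http://purl.org/rss/1.0/}"

# Hypothetical RSS 1.0 document: the channel only references its item.
RSS1_SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/rss">
    <title>Example</title>
    <link>http://example.org/</link>
    <description>Hypothetical channel</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/1">
    <title>First post</title>
    <link>http://example.org/1</link>
  </item>
</rdf:RDF>"""

# Hypothetical RSS 2.0 document: the item is nested inside the channel.
RSS2_SAMPLE = """<rss version="2.0">
  <channel>
    <title>Example</title>
    <link>http://example.org/</link>
    <description>Hypothetical channel</description>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
    </item>
  </channel>
</rss>"""


def item_titles(feed_xml):
    """Return item titles in channel order for either layout."""
    root = ET.fromstring(feed_xml)
    if root.tag == RDF + "RDF":
        # Index the top-level items by their rdf:about identifier.
        by_about = {i.get(RDF + "about"): i for i in root.findall(RSS1 + "item")}
        # Follow the channel's rdf:li references in sequence order.
        path = RSS1 + "channel/" + RSS1 + "items/" + RDF + "Seq/" + RDF + "li"
        refs = [li.get(RDF + "resource") for li in root.findall(path)]
        return [by_about[r].findtext(RSS1 + "title") for r in refs if r in by_about]
    return [i.findtext("title") for i in root.findall("channel/item")]


print(item_titles(RSS1_SAMPLE), item_titles(RSS2_SAMPLE))
```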
The item element contains the actual news item information. The structure of item is similar across both versions. The item element usually contains title, link, and description elements, as shown in the following code:
<item>
<title><!-- the item's title --></title>
<link><!-- the item's URL --></link>
<description><!-- a brief description --></description>
<!-- optional/extensibility elements go here -->
</item>


In RSS 1.0, title and link are required, while description is optional. In RSS 2.0, either title or description must be present; everything else is optional. These are the only item elements defined in RSS 1.0, while RSS 2.0 provides several other optional elements including author, category, comments, enclosure, guid, pubDate, and source. RSS 1.0 makes such metadata available through extensibility elements defined in separate XML namespaces known as RSS modules. For example, in Figure 1 the item's date is represented using the Dublin Core module's <dc:date> element.
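The required-element rules for items can be expressed as a small check. This Python sketch is an illustration only: the function names are hypothetical, and for brevity it compares local element names without checking namespaces, so a module element that happens to be called title would be counted too.

```python
import xml.etree.ElementTree as ET


def local_names(item):
    """Names of the item's child elements with any namespace stripped."""
    return {child.tag.rsplit("}", 1)[-1] for child in item}


def missing_required(item, version):
    """List what an item lacks under the rules stated in the text."""
    present = local_names(item)
    if version == "1.0":
        # RSS 1.0: title and link are both required.
        return sorted({"title", "link"} - present)
    if version == "2.0":
        # RSS 2.0: at least one of title or description must be present.
        return [] if present & {"title", "description"} else ["title or description"]
    raise ValueError("unsupported version: %r" % version)


rss1_item = ET.fromstring(
    '<item xmlns="http://purl.org/rss/1.0/"><title>Only a title</title></item>')
print(missing_required(rss1_item, "1.0"))
```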
Check out the RSS 1.0 and 2.0 specifications for complete details on the different formats.

Q So, what is Atom anyway?

A As I mentioned earlier, Atom is the name of a project for developing a new Web log syndication format to address what many feel are the main problems with RSS today (a soup of confusing version numbers, not a truly open standard, inconsistent, poorly defined, and so on). Atom hopes to offer a clean version that addresses everyone's needs. It is designed to be completely vendor neutral, freely extensible by anybody, and thoroughly specified.
Many of today's blogging engines already support the current Atom syndication format. Figure 3 shows a sample Atom 0.3 feed that is equivalent to the RSS feeds shown in Figure 1 and Figure 2. Notice that the Atom feed is namespace qualified but it doesn't use RDF. This gives Atom something in common with both RSS 1.0 and RSS 2.0. It will be interesting to see how Atom's acceptance plays out in the years to come.
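As a rough illustration of a namespace-qualified, RDF-free feed, the Python sketch below reads entry titles and links from a small Atom 0.3-style document. The sample feed is hypothetical, and the namespace URI and the rel="alternate" link convention are assumptions taken from the Atom 0.3 drafts; check them against the current Atom specification before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the Atom 0.3 draft format.
ATOM03 = "{http://purl.org/atom/ns#}"

# Hypothetical feed, trimmed to the elements this sketch reads.
ATOM_SAMPLE = """<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <title>Example Web log</title>
  <link rel="alternate" type="text/html" href="http://example.org/"/>
  <entry>
    <title>First post</title>
    <link rel="alternate" type="text/html" href="http://example.org/1"/>
  </entry>
</feed>"""


def atom_entries(feed_xml):
    """Return (title, href) pairs for each entry in the feed."""
    root = ET.fromstring(feed_xml)
    result = []
    for entry in root.findall(ATOM03 + "entry"):
        href = ""
        for link in entry.findall(ATOM03 + "link"):
            if link.get("rel") == "alternate":
                href = link.get("href", "")
                break
        result.append((entry.findtext(ATOM03 + "title", default=""), href))
    return result


print(atom_entries(ATOM_SAMPLE))
```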
In addition to defining a new syndication format, Atom also hopes to define a standard archiving format and a standard Web log editing API (the Atom API). Check out The Atom Project to peruse the specifications and other Atom resources.

Q What's a blogroll?

A A blogroll is simply a collection of Web log feeds. Most bloggers provide a blogroll on their personal Web log. This allows their readers to connect with others who share similar interests or writing styles. Blogrolls facilitate building networks of respect. A blogroll can be exchanged in XML format using the Outline Processor Markup Language (OPML). Figure 4 shows a sample blogroll.
Most blogging engines will manage blogrolls for you and generate the proper XML format when readers request it. Likewise, most aggregators make it possible to import a blogroll and automatically subscribe to the contained feeds. See http://opml.scripting.com for more information on OPML.
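Importing a blogroll amounts to walking the outline elements and collecting feed addresses. The Python sketch below assumes each feed is an outline element carrying an xmlUrl attribute, which is a common blogroll convention rather than something OPML itself requires; the sample blogroll and the function name feed_urls are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical blogroll; xmlUrl is assumed to hold each feed's address.
BLOGROLL = """<opml version="1.0">
  <head><title>My blogroll</title></head>
  <body>
    <outline text="Example A" xmlUrl="http://example.org/a.xml"/>
    <outline text="A folder">
      <outline text="Example B" xmlUrl="http://example.org/b.xml"/>
    </outline>
  </body>
</opml>"""


def feed_urls(opml_xml):
    """Collect feed URLs from every outline, including nested ones."""
    root = ET.fromstring(opml_xml)
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]


print(feed_urls(BLOGROLL))
```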

Q Can you explain what referrers, trackbacks, and pingbacks are?

A Most blogging software makes it possible for readers to add comments to a Web log. It's actually more common, however, for readers to add an entry to their own Web log that links back to the original post. Bloggers like to keep track of when this happens so that new readers can follow the entire conversation.
A referrer is an external site from which a user clicked on a hyperlink to reach your site. Many blogging engines will automatically keep track of referrers as readers navigate to an entry on your Web log. Most engines will display the list of referrers at the bottom of the Web log entry so readers can navigate back to the referrer's site and see what they have to say about the entry, based on the assumption that they probably wrote something about it if they linked to it. The problem with referrers has to do with this assumption: there isn't enough information to tell if the referring page actually contains additional relevant information. In fact, spammers have already taken advantage of this loophole to redirect readers for marketing purposes.
Trackback and pingback are similar specifications developed to remedy this situation. Using trackback or pingback, other bloggers can automatically send a ping to your Web log indicating explicitly that they have written an entry that references a specific post. This type of reverse linking allows your Web log to display a list of all entries that have actually commented on your post in a more explicit manner. Most of today's blogging software supports all of these techniques. See TrackBack Technical Specification and Pingback 1.0.
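To give a feel for what a ping carries, here is a Python sketch of the two halves of a trackback exchange as I understand the TrackBack specification: a form-encoded request body with url, title, excerpt, and blog_name fields, and a small XML response whose error element is 0 on success. The function names and sample values are hypothetical, and no network request is made; consult the specification for the authoritative details.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, parse_qs


def build_ping_body(url, title="", excerpt="", blog_name=""):
    """Form-encode the fields of a trackback ping; only url is mandatory."""
    fields = {"url": url}
    for name, value in (("title", title), ("excerpt", excerpt), ("blog_name", blog_name)):
        if value:
            fields[name] = value
    return urlencode(fields)


def ping_succeeded(response_xml):
    """A response is treated as success when its <error> element is 0."""
    root = ET.fromstring(response_xml)
    return root.findtext("error", default="1").strip() == "0"


body = build_ping_body("http://example.org/my-reply", title="Re: RSS")
print(body)
print(ping_succeeded("<response><error>0</error></response>"))
```

The body would be sent with an HTTP POST to the trackback URL advertised by the post being referenced.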

Q How can I generate an RSS feed for my Web site?

A Figure 5 illustrates how to generate an RSS 2.0 feed in an .aspx page using an asp:Repeater control. This page assumes that you'll set the control's DataSource property in the codebehind file to the appropriate database resultset.
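The same idea can be sketched outside ASP.NET. The Python example below is not the code from Figure 5; it is a minimal stand-in that builds an RSS 2.0 document from a list of entries, which here play the role of the database resultset. The function name build_rss, the dictionary keys, and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, entries):
    """Build an RSS 2.0 document; entries is a list of dicts with
    'title', 'link', and 'description' keys (a stand-in for database rows)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # The three required channel elements.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for entry in entries:
        item = ET.SubElement(channel, "item")
        for name in ("title", "link", "description"):
            ET.SubElement(item, name).text = entry[name]
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    "Example site", "http://example.org/", "Hypothetical news",
    [{"title": "Hello", "link": "http://example.org/hello", "description": "First item"}],
)
print(feed)
```

Building the document through an XML API rather than string concatenation means special characters in titles and descriptions are escaped automatically.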

Q I'd like to aggregate several RSS feeds and display the information on my personal Web site. Can you explain how to do this?

A Since RSS feeds are XML files, accomplishing this is an exercise in using your favorite XML API, such as System.Xml in the Microsoft .NET Framework. Figure 6 contains the code for an ASP.NET Web user control that I wrote to aggregate the RSS feeds listed in a blogroll file (.opml). The code assumes that the opml element will contain a numberToDisplay attribute to indicate how many items from each feed you want to display.
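The approach can be sketched in Python as well. This is not the control from Figure 6, only an illustration of the same steps: read numberToDisplay from the opml element, visit each feed listed in the blogroll, and keep that many items per feed. It assumes feed addresses live in an xmlUrl attribute and that feeds use the RSS 2.0 layout; the fetch parameter, the in-memory FAKE_FEEDS table, and all sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

BLOGROLL = """<opml version="1.0" numberToDisplay="1">
  <head><title>Blogroll</title></head>
  <body>
    <outline text="Feed A" xmlUrl="http://example.org/a.xml"/>
    <outline text="Feed B" xmlUrl="http://example.org/b.xml"/>
  </body>
</opml>"""


def _feed(*titles):
    """Build a tiny RSS 2.0 document for the demo."""
    items = "".join("<item><title>%s</title></item>" % t for t in titles)
    return '<rss version="2.0"><channel>%s</channel></rss>' % items


# Stand-in for the network: maps feed URL to feed content.
FAKE_FEEDS = {
    "http://example.org/a.xml": _feed("A1", "A2"),
    "http://example.org/b.xml": _feed("B1", "B2"),
}


def aggregate(opml_xml, fetch):
    """Return (feed name, item title) pairs, limited per feed by numberToDisplay.

    fetch is any callable that takes a URL and returns the feed document.
    """
    root = ET.fromstring(opml_xml)
    limit = int(root.get("numberToDisplay", "5"))  # default of 5 is an assumption
    results = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if not url:
            continue
        feed = ET.fromstring(fetch(url))
        for item in feed.findall("channel/item")[:limit]:
            results.append((outline.get("text", ""), item.findtext("title", default="")))
    return results


print(aggregate(BLOGROLL, FAKE_FEEDS.__getitem__))
```

Passing the fetch function in keeps the sketch runnable offline; a real page would supply a function that downloads the URL and would also want caching and error handling for unreachable feeds.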


Figure 7 ASP.NET Web User Control

You can drop this control into any .aspx page and it will display items from the various feeds listed in the blogroll. Figure 7 shows this control in action on the Utah .NET User Group Web site.

Q Are there any Web service APIs for interacting with Web logs?

A Many blogging engines provide their own proprietary Web service interface for interacting with a Web log programmatically, but I wouldn't say a standard has emerged yet.
Both .Text and dasBlog provide some .asmx endpoints that provide editing functionality via SOAP, but their interfaces are different. Blogger.com provides an interactive API (Blogger API) based on XML-RPC. Userland Software enhanced the Blogger API and called it the MetaWeblog API. These are probably the most widely recognized Web log APIs today, but still not all Web log engines support them. There is also a separate API for adding comments called the Comment API, but again, it's not universally supported.
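Because the Blogger and MetaWeblog APIs are XML-RPC based, a call is just a small XML payload. The Python sketch below builds the request for metaWeblog.newPost as I understand that API (blog id, user name, password, a struct describing the post, and a publish flag) without sending it anywhere. The function name, credentials, and post content are hypothetical, and individual engines may differ in which struct members they honor.

```python
import xmlrpc.client


def new_post_request(blog_id, username, password, title, body, publish=True):
    """Serialize a metaWeblog.newPost call as an XML-RPC request body."""
    # The post struct borrows RSS 2.0 item names: title and description.
    post = {"title": title, "description": body}
    return xmlrpc.client.dumps(
        (blog_id, username, password, post, publish),
        methodname="metaWeblog.newPost",
    )


payload = new_post_request("1", "demo-user", "demo-password", "Hello", "First post via API")
print(payload)
```

To actually publish, the payload would be posted to the engine's XML-RPC endpoint, for example through xmlrpc.client.ServerProxy pointed at that endpoint.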
The Atom group is currently working to resolve this mess. The Atom API defines a standard Web log API for publishing and editing all Web log content. You can check out their work at The Atom Project.

Send your questions and comments for Aaron to xmlfiles@microsoft.com.

------

Reply topic: OPML 1.0 Specification | Author: hofman | Rank: Captain | Posted: 2004-07-17 12:16:18
http://www.opml.org/spec


OPML 1.0 Specification


9/15/00 DW

About this document

This document describes a format for storing outlines in XML 1.0 called Outline Processor Markup Language or OPML.

For the purposes of this document, an outline is a tree, where each node contains a set of named attributes with string values.

Timeline

Outlines have been a popular way to organize information on computers for a long time. While the history of outlining software is unclear, a rough timeline is possible.

Probably the first outliner was developed by Doug Engelbart, as part of the Augment system in the 1960s.

Living Videotext, 1981-87, developed several popular outliners for personal computers. They are archived on a UserLand website, outliners.com.

Frontier, first shipped in 1992, is built around outlining. The text, menu and script editors in Frontier are outliners, as is the object database browser.

XML 1.0, the format that OPML is based on, is a recommendation of the W3C.

Radio UserLand, first shipped in March 2001, is an outliner whose native file format is OPML.

OPML is used for directories in Manila.

Examples

Outlines can be used for specifications, legal briefs, product plans, presentations, screenplays, directories, diaries, discussion groups, chat systems and stories.

Outliners are programs that allow you to read, edit and reorganize outlines.

Examples of OPML documents: play list, specification, presentation.

Goals of the OPML format

The purpose of this format is to provide a way to exchange information between outliners and Internet services that can be browsed or controlled through an outliner.

The design goal is to have a transparently simple, self-documenting, extensible and human readable format that's capable of representing a wide variety of data that's easily browsed and edited. As the format evolves this goal will be preserved. It should be possible for a reasonably technical person to fully understand the format with a quick read of a single Web page.

It's an open format, meaning that other outliner vendors and service developers are free to use the format to be compatible with Radio UserLand or for any other purpose.

What is an <opml>?

<opml> is an XML element, with a single required attribute, version; a <head> element and a <body> element, both of which are required.

The version attribute is a version string, of the form, x.y, where x and y are both numeric strings.

What is a <head>?

A <head> contains zero or more optional elements, described below.

<title> is the title of the document.

<dateCreated> is a date-time, indicating when the document was created.

<dateModified> is a date-time, indicating when the document was last modified.

<ownerName> is a string, the owner of the document.

<ownerEmail> is a string, the email address of the owner of the document.

<expansionState> is a comma-separated list of line numbers that are expanded. The line numbers in the list tell you which headlines to expand. The order is important. For each element in the list, X, starting at the first summit, navigate flatdown X times and expand. Repeat for each element in the list.

<vertScrollState> is a number, saying which line of the outline is displayed on the top line of the window. This number is calculated with the expansion state already applied.

<windowTop> is a number, the pixel location of the top edge of the window.

<windowLeft> is a number, the pixel location of the left edge of the window.

<windowBottom> is a number, the pixel location of the bottom edge of the window.

<windowRight> is a number, the pixel location of the right edge of the window.

<head> notes

All the sub-elements of <head> may be ignored by the processor. If an outline is opened within another outline, the processor must ignore the windowXxx elements, those elements only control the size and position of outlines that are opened in their own windows.

All date-times conform to the Date and Time Specification of RFC 822.
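For readers handling these date-times in code, here is a small Python sketch using the standard library's e-mail date helpers, which understand RFC 822 style dates. The sample value is hypothetical.

```python
from datetime import timedelta
from email.utils import format_datetime, parsedate_to_datetime

# Hypothetical <dateCreated> value in RFC 822 style.
created_text = "Fri, 15 Sep 2000 09:01:23 GMT"

# Parse into a timezone-aware datetime.
created = parsedate_to_datetime(created_text)
print(created.isoformat())

# Format it back for writing into <dateModified> or similar.
print(format_datetime(created, usegmt=True))
```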

If you load an OPML document into your client, you may choose to respect expansionState, or not. We're not in any way trying to dictate user experience. The expansionState info is there because it's needed in certain contexts. It's easy to imagine contexts where it would make sense to completely ignore it.

What is a <body>?

A <body> contains one or more <outline> elements.
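The structural rules stated so far (a version attribute of the form x.y, a required head and body, and at least one outline in the body) can be checked mechanically. The following Python sketch is one way to do that; the function name opml_problems and the wording of the messages are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def opml_problems(opml_xml):
    """Return a list of violations of the basic structural rules."""
    root = ET.fromstring(opml_xml)
    problems = []
    if root.tag != "opml":
        problems.append("root element is not <opml>")
    # version must look like x.y, where x and y are numeric strings.
    if not re.fullmatch(r"\d+\.\d+", root.get("version", "")):
        problems.append("version attribute is missing or not of the form x.y")
    if root.find("head") is None:
        problems.append("<head> is missing")
    body = root.find("body")
    if body is None:
        problems.append("<body> is missing")
    elif body.find("outline") is None:
        problems.append("<body> contains no <outline>")
    return problems


print(opml_problems('<opml version="1.0"><head/><body><outline text="x"/></body></opml>'))
print(opml_problems('<opml version="1"><head/><body/></opml>'))
```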

What is an <outline>?

An <outline> is an XML element, possibly containing one or more attributes, and containing any number of <outline> sub-elements.

Common attributes

text is the string of characters that's displayed when the outline is being browsed or edited. There is no specific limit on the length of the text attribute.

type is a string, it says how the other attributes of the <outline> are interpreted.

isComment is a string, either "true" or "false", indicating whether the outline is commented or not. By convention if an outline is commented, all subordinate outlines are considered to be commented as well. If it's not present, the value is false.

isBreakpoint is a string, either "true" or "false", indicating whether a breakpoint is set on this outline. This attribute is mainly necessary for outlines used to edit scripts that execute. If it's not present, the value is false.
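As an illustration of how a processor might read these attributes, the Python sketch below walks an outline tree and applies the convention that a commented outline makes all of its subordinate outlines commented too. The sample document and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<opml version="1.0">
  <head><title>Sample</title></head>
  <body>
    <outline text="Plan">
      <outline text="Draft" isComment="true">
        <outline text="Old idea"/>
      </outline>
      <outline text="Final"/>
    </outline>
  </body>
</opml>"""


def walk(outline, depth=0, parent_commented=False):
    """Yield (depth, text, commented) for an outline and its descendants."""
    # A missing isComment attribute means false; a commented parent
    # makes every subordinate outline commented as well.
    commented = parent_commented or outline.get("isComment", "false") == "true"
    yield depth, outline.get("text", ""), commented
    for child in outline.findall("outline"):
        yield from walk(child, depth + 1, commented)


def flatten(opml_xml):
    """Flatten the whole body into a list of (depth, text, commented) rows."""
    body = ET.fromstring(opml_xml).find("body")
    rows = []
    for top in body.findall("outline"):
        rows.extend(walk(top))
    return rows


for depth, text, commented in flatten(SAMPLE):
    print("  " * depth + text + (" (commented)" if commented else ""))
```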

Compatibility

Before the 1.0 format was frozen the top-level element of the format was called outlineDocument. Radio UserLand will continue to read such documents.

Limits

There are no documented limits to the number of attributes an <outline> element can have, or the number of <outline> elements it can contain.

Notes

OPML is a file format, not a protocol. When you click on a link in an HTML document it doesn't in any way change the document stored on the server. OPML is used in much the same way.

Wayne Steele did a DTD for OPML 1.0. Thank you.

In general, the mimetype for an OPML document, when accessed over HTTP, is text/xml. This allows Web browsers to use XML formatting conventions to display an OPML document. Radio UserLand's built-in HTTP server looks at the Accept header of the request to determine how it processes an OPML document. If the Accept header says that the client understands text/x-opml, we return the unprocessed XML text. If it is not present, we return the text in the outline with the mimetype text/html.

Copyright and disclaimer

© Copyright 2000 UserLand Software, Inc. All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and these paragraphs are included on all such copies and derivative works.

This document may not be modified in any way, such as by removing the copyright notice or references to UserLand or other organizations. Further, while these copyright restrictions apply to the written OPML specification, no claim of ownership is made by UserLand to the format it describes. Any party may, for commercial or non-commercial purposes, implement this protocol without royalty or license fee to UserLand. The limited permissions granted herein are perpetual and will not be revoked by UserLand or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and USERLAND DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Posted by hofman, 2005-11-19 22:46:29
