I'm trying to understand the differences between PubSubHubbub and the rssCloud interface.
Here are my current notes. Comments welcome, I will refine these notes later. For now, I'm going to the park :-D
Discovery
- With PubSubHubbub you put the hub URL in your Atom/RSS feed:
<link rel="hub" href="http://myhub.example.com/endpoint" />
- With rssCloud you put the hub URL and a few other parameters in your RSS feed:
<cloud domain="radio.xmlstoragesystem.com" port="80" path="/RPC2" registerProcedure="xmlStorageSystem.rssPleaseNotify" protocol="xml-rpc" />
- Description of the parameters :
- (cloud/hub) domain is the equivalent of PubSubHubbub link's href (if you add the path parameter below..)
- (cloud/hub) port : can be part of PubSubHubbub's endpoint URL.
- (cloud/hub) path : add that to the domain above to get the full cloud/hub endpoint path..
- (cloud/hub) registerProcedure : seems only useful in the XML-RPC/SOAP version; not sure what it should be with http-post. I'd say it's useless if we consider HTTP POST only, like PubSubHubbub does. There is a single "procedure" anyway for what we're trying to do: registering.
- (cloud/hub) protocol : the rssCloud spec specifies soap, xml-rpc and http-post.
=> domain, port, path and protocol, concatenated, are basically PubSubHubbub link's href. Having distinct parameters makes sense for SOAP/XML-RPC, not much for HTTP POST.
So not much difference here.
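To make the equivalence concrete, here is a minimal Python sketch that pulls the hub endpoint out of either kind of feed. The function name `hub_endpoint`, the sample feeds and the choice of the `http` scheme for the cloud URL are my own assumptions for illustration; the `<cloud>` element itself carries no scheme.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def hub_endpoint(feed_xml):
    """Return the hub/cloud endpoint URL advertised by a feed, or None."""
    root = ET.fromstring(feed_xml)
    # PubSubHubbub: <link rel="hub" href="..."/>, Atom-namespaced or not.
    for elem in root.iter():
        if elem.tag in (ATOM_NS + "link", "link") and elem.get("rel") == "hub":
            return elem.get("href")
    # rssCloud: concatenate domain, port and path of the <cloud> element.
    cloud = root.find("./channel/cloud")
    if cloud is not None:
        # Assumption: plain http, since <cloud> has no scheme attribute.
        return "http://%s:%s%s" % (
            cloud.get("domain"), cloud.get("port"), cloud.get("path"))
    return None


atom_feed = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<link rel="hub" href="http://myhub.example.com/endpoint" />'
    '</feed>'
)
rss_feed = (
    '<rss version="2.0"><channel>'
    '<cloud domain="radio.xmlstoragesystem.com" port="80" path="/RPC2" '
    'registerProcedure="xmlStorageSystem.rssPleaseNotify" protocol="xml-rpc" />'
    '</channel></rss>'
)
print(hub_endpoint(atom_feed))
print(hub_endpoint(rss_feed))
```

Both calls end up with a single URL, which is the point: for HTTP POST the separate cloud attributes carry nothing a URL could not.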
Subscription
- PubSubHubbub : You must send a POST to the Hub URL, with these parameters:
- mode (subscribe/unsubscribe)
- callback (The subscriber's callback URL)
- topic (The topic URL that the subscriber wishes to subscribe to.)
- verify (sync/async) -- Keyword describing verification modes supported by this subscribe
- verify_token (Optional)
- lease_seconds (Optional)
- rssCloud : A workstation (subscriber) calls the cloud (hub) to register, the procedure takes five parameters:
- the name of the procedure that the cloud should call to notify the workstation of changes (so, a callback)
- the TCP port the workstation is listening on
- the path to its responder
- string indicating which protocol to use (xml-rpc or soap, case-sensitive)
- and a list of urls of RSS files to be watched : this is the equivalent of PubSubHubbub's topic parameter.
=> The first 4 are the equivalent of PubSubHubbub's callback URL. But note that in rssCloud, the cloud (hub) must infer the IP/domain of the subscriber from the request itself ("The cloud can determine the IP address of the caller from the request. A workstation cannot make a registration call on behalf of another.")
Differences between PubSubHubbub and rssCloud :
- rssCloud has no means of unsubscribing. It relies on automatic expiration ("By convention registrations expire after 25 hours. Workstations should register every 24 hours for each subscription to keep them current.")
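As an illustration, a PubSubHubbub subscription request body could be assembled as below. This is only a sketch, not a reference client: the helper name `subscription_body` and the example URLs are made up, and the `hub.` prefix on the parameter names follows the PubSubHubbub spec rather than the shortened names used in my notes.

```python
from urllib.parse import urlencode, parse_qs


def subscription_body(callback, topic, mode="subscribe", verify="sync",
                      verify_token=None, lease_seconds=None):
    """Build the form-encoded body a subscriber POSTs to the hub URL."""
    if mode not in ("subscribe", "unsubscribe"):
        raise ValueError("mode must be 'subscribe' or 'unsubscribe'")
    params = [
        ("hub.mode", mode),
        ("hub.callback", callback),
        ("hub.topic", topic),
        ("hub.verify", verify),
    ]
    # Optional parameters are only sent when provided.
    if verify_token is not None:
        params.append(("hub.verify_token", verify_token))
    if lease_seconds is not None:
        params.append(("hub.lease_seconds", str(lease_seconds)))
    return urlencode(params)


body = subscription_body(
    callback="http://subscriber.example.com/callback",  # hypothetical URL
    topic="http://publisher.example.com/feed.atom",     # hypothetical URL
    verify_token="s3cret",
)
print(body)
print(parse_qs(body)["hub.topic"])
```

Unsubscribing reuses the same request with `mode="unsubscribe"`, which is exactly the part rssCloud has no counterpart for.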
Subscription verification
- PubSubHubbub : Hub sends a GET request to the subscriber's callback URL with these parameters:
- mode
- topic
- challenge
- lease_seconds
- verify_token
- rssCloud just relies on getting the IP of the subscriber when this one subscribes.
Differences: PubSubHubbub has a subscriber verification mechanism based on a challenge and an optional verify_token. rssCloud just takes the subscriber's IP from the subscription request.
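On the subscriber side, the challenge check could look roughly like this. It is a sketch under assumptions: the function name `handle_verification` is hypothetical, the parameters use the `hub.` prefix from the PubSubHubbub spec, and the behaviour (echo the challenge with a 2xx status to accept, answer 404 to refuse) is my reading of that spec.

```python
from urllib.parse import parse_qs


def handle_verification(query_string, expected_topic, expected_token=None):
    """Answer a hub's verification GET; returns (status_code, body)."""
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    # Refuse subscriptions we never asked for.
    if params.get("hub.topic") != expected_topic:
        return 404, ""
    if expected_token is not None and params.get("hub.verify_token") != expected_token:
        return 404, ""
    challenge = params.get("hub.challenge")
    if challenge is None:
        return 404, ""
    # Echoing the challenge back confirms that we want this subscription.
    return 200, challenge


query = ("hub.mode=subscribe"
         "&hub.topic=http%3A%2F%2Fpublisher.example.com%2Ffeed.atom"
         "&hub.challenge=abc123&hub.verify_token=s3cret")
print(handle_verification(query, "http://publisher.example.com/feed.atom", "s3cret"))
```

Because the hub calls the callback URL itself, anyone can subscribe on behalf of another host only if that host agrees, whereas rssCloud ties the registration to the caller's IP.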
New Content Notification
- PubSubHubbub : "A publisher pings the hub with the topic URL(s) which have been updated and the hub schedules those topics to be fetched and delivered"
The hub MUST accept a POST request to the hub URL containing the notification.
- mode
- url
- rssCloud does not specify any ping facility; Weblogs.com's ping protocol already exists for that.
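For illustration, the publisher's ping could be built like this. The helper name `publish_ping_body` and the feed URL are hypothetical; the `hub.mode=publish` value and the repeatable `hub.url` parameter are taken from my reading of the PubSubHubbub spec.

```python
from urllib.parse import urlencode


def publish_ping_body(topic_urls):
    """Form-encoded body telling the hub which topic URLs were updated."""
    params = [("hub.mode", "publish")]
    # One hub.url parameter per updated topic.
    params.extend(("hub.url", url) for url in topic_urls)
    return urlencode(params)


ping = publish_ping_body(["http://publisher.example.com/feed.atom"])
print(ping)
```

The ping only names the topics; the hub still fetches the feeds itself before delivering anything.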
Content Fetch
- PubSubHubbub :"When the hub wishes to retrieve new content for a topic, the hub sends an HTTP GET request to the topic URL. The request SHOULD include a header field X-Hub-Subscribers whose value is an integer number, possibly approximate, of subscribers on behalf of which the feed is being fetched."
- rssCloud does not say anything about how the cloud/hub fetches the feed.
Note that Google Reader already has a way of reporting subscriber counts.
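A hub's fetch with the subscriber count header could be prepared as follows. This is a sketch only: `build_fetch_request` and the URL are hypothetical names, and the request is built but never sent.

```python
from urllib.request import Request


def build_fetch_request(topic_url, subscriber_count):
    """Prepare (but do not send) the hub's GET request for a topic."""
    headers = {"X-Hub-Subscribers": str(subscriber_count)}
    return Request(topic_url, headers=headers)


req = build_fetch_request("http://publisher.example.com/feed.atom", 42)
# urllib normalises header capitalisation; HTTP header names are
# case-insensitive, so the publisher sees the same header either way.
print(req.get_method(), req.full_url, req.header_items())
```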
Content Distribution
- PubSubHubbub : "A content distribution request is an HTTP POST request from hub to the subscriber's callback URL. This request has a Content-Type of application/atom+xml and its request body is an Atom feed document with the list of new and changed items."
- rssCloud : "when a subscribed-to channel changes the cloud calls back to the procedure named in the registration call with one parameter, the url of the channel that changed. At that point the workstation could read the channel, or notify other workstations that the channel has changed, clear a cache, send an email or do nothing."
PubSubHubbub directly posts the content that changed in the subscribed feeds. There is no need for the subscriber software to do a fetch of the original feed to get the new content.
With rssCloud, the subscriber has to fetch the new feed to get the new content.
Both have pros and cons, I guess.
Note that the rssCloud spec mentions that a subscriber could notify other subscribers (federation?).
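The difference in what the subscriber has to do can be sketched in a few lines of Python. Everything here is illustrative: the function names, the sample documents and the `fetch` callable (which stands in for a real HTTP GET so the sketch needs no network) are my own assumptions.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def entries_from_push(atom_body):
    """PubSubHubbub style: the new entries arrive in the POST body itself."""
    feed = ET.fromstring(atom_body)
    return [entry.findtext(ATOM + "title") for entry in feed.iter(ATOM + "entry")]


def entries_from_notification(changed_url, fetch):
    """rssCloud style: only the URL arrives, so the subscriber fetches the feed.

    `fetch` is any callable returning the feed XML for a URL.
    """
    rss = ET.fromstring(fetch(changed_url))
    return [item.findtext("title") for item in rss.iter("item")]


pushed = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<entry><title>New post</title></entry>'
    '</feed>'
)
stored_feeds = {
    "http://publisher.example.com/rss.xml": (
        '<rss version="2.0"><channel>'
        '<item><title>New post</title></item>'
        '<item><title>Old post</title></item>'
        '</channel></rss>'
    )
}
print(entries_from_push(pushed))
print(entries_from_notification("http://publisher.example.com/rss.xml", stored_feeds.get))
```

In the second case the subscriber gets the whole feed back and still has to work out which items are actually new.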
Conclusion
For now I think PubSubHubbub is very similar to the RSS cloud element. PubSubHubbub acknowledges the prior art, but it wrongly states that rssCloud only deals with pub and not sub. The RSS spec says: "Its purpose is to allow processes to register with a cloud to be notified of updates to the channel, implementing a lightweight publish-subscribe protocol for RSS feeds."
PubSubHubbub is an alternative to weblogs.com's changes.xml for dispatching RSS update notifications.
FriendFeed's SUP protocol also replaces changes.xml, keeping the same polling system. (Not sure yet what SUP does that changes.xml did not already do?)
The RSS cloud element never really took off (it is implemented in at least Radio Userland). PubSubHubbub seems to be catching on, bringing this protocol from spec to working software (implemented at least in Google Reader and Blogger). PubSubHubbub's presentations explain in a friendly way how it all works.
To me, all this describes the same RSS pub/sub mechanism, which I like by the way.
Thank you for the very thorough comparison! You are correct that the PriorArt wiki page (http://code.google.com/p/pubsubhubbub/wiki/PriorArt) does not clearly explain why rssCloud does not solve the subscription problem completely. I have just updated the wiki to explain this better.
In a nutshell, rssCloud's subscriptions are merely a way of redistributing pings to subscribers. We think this is still on the publishing side of the problem and does not simplify the life of a subscriber. With rssCloud, subscribers must re-fetch the feed to see if it's changed. In contrast, Hubbub delivers the actual changes to the subscriber so they have no more work to do. This makes it much easier to subscribe, and has some nice properties when it comes to scalability.
Anyways, thanks for the great write-up and let me know if the wiki makes more sense now!
Posted by: Brett Slatkin | 13.07.2009 at 02:42
Great comparison, really helped me understand the differences. Thanks.
Posted by: twitter.com/derek | 08.09.2009 at 04:52
Great article. Another interesting post - http://grack.com/blog/2009/09/07/pubsubhubbub-vs-rsscloud/. Overall it looks like the choice is between making the hub just a notification channel or having it take up the responsibility for content push. I like the latter as it simplifies the client workflow.
Posted by: Ravikant Cherukuri | 08.09.2009 at 22:49