While thinking about how news propagates through an information network, Dyfrgi and I came up with the (perhaps not original, but new to us) idea of distributed, pluggable RSS filter modules. They would be somewhat like what Localfeeds is, except with optional control lines. I like to think of them the way Galan thinks of LADSPA plugins: a network of connected modules with separate control lines.
You could run a site that hosts one of these filters, say one that does Bayesian filtering. The filter is presented with an OPML file of your feed collection (or simply the output feed of the plugin below it if you're chaining them). The filter reads the file, buffers the contents according to your refresh rules, filters them, and generates an output feed. The output feed has embedded XML-RPC discovery information to allow the upstream aggregator to control the plugins.
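As a rough illustration of the first step, here is a minimal Python sketch of how a filter might pull the feed URLs out of an OPML file. It assumes the common convention that subscriptions are `outline` elements with an `xmlUrl` attribute; the function name and the sample document are made up for this example.

```python
import xml.etree.ElementTree as ET

def feed_urls_from_opml(opml_text):
    """Return the xmlUrl of every subscription in an OPML document."""
    root = ET.fromstring(opml_text)
    # Subscriptions are <outline> elements carrying an xmlUrl attribute;
    # outlines without one are treated as folders and skipped.
    return [o.attrib["xmlUrl"] for o in root.iter("outline") if "xmlUrl" in o.attrib]

# Hypothetical feed collection, for illustration only.
SAMPLE_OPML = """<opml version="1.1">
  <body>
    <outline text="News">
      <outline text="Articles" type="rss" xmlUrl="http://news-site.com/articles.rss"/>
    </outline>
    <outline text="Other" type="rss" xmlUrl="http://example.com/other.rss"/>
  </body>
</opml>"""

print(feed_urls_from_opml(SAMPLE_OPML))
```

Fetching, buffering and the Bayesian scoring itself are left out; this only shows where the list of input feeds would come from.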
In the examples below, I'm using HTTP GET-style queries where there could be XML-RPC. It doesn't much matter what transport the control uses, provided that the filter and the aggregator agree on it. The user/pass could be replaced by a standard session ID model.
Something like this, for an OPML feed collection:
http://example.com/filters/bayesian?register=http://foo.bar.cow/my/feeds.opml;user=bar;pass=foo1234
Or like this, for a single feed:
http://example.com/filters/bayesian?register=http://news-site.com/articles.rss;user=bar;pass=foo1234;refresh=30
To fetch the filtered feed:
http://example.com/filters/bayesian?getfeed=http://foo.bar.cow/my/feeds.opml;user=bar;pass=foo1234
To update the filter with a score for an article:
http://example.com/filters/bayesian?update=http://foo.bar.cow/my/feeds.opml;user=[...];articleid=123;score=+2
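The control URLs above could be assembled by the aggregator with something like the following Python sketch. The helper `control_url` is hypothetical, not part of any existing library. One detail it handles: a literal `+` in a query string is commonly decoded as a space by form-style parsers, so a score of `+2` is percent-encoded here.

```python
from urllib.parse import quote

def control_url(base, action, target, params):
    """Build a ';'-separated control URL (hypothetical helper)."""
    # Keep ':' and '/' readable so the target looks like the examples above;
    # other reserved characters, including '+', are percent-encoded.
    parts = [f"{action}={quote(target, safe=':/')}"]
    parts += [f"{key}={quote(str(value), safe='')}" for key, value in params.items()]
    return base + "?" + ";".join(parts)

BASE = "http://example.com/filters/bayesian"

register = control_url(BASE, "register", "http://news-site.com/articles.rss",
                       {"user": "bar", "pass": "foo1234", "refresh": 30})
update = control_url(BASE, "update", "http://foo.bar.cow/my/feeds.opml",
                     {"user": "bar", "articleid": 123, "score": "+2"})
print(register)
print(update)
```

Parameters are passed as a dict rather than keyword arguments because `pass` is a reserved word in Python.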
Dyfrgi noted that there are two models this can take:
13:12 <@dyfrgi> It can either a) filter when an aggregator requests a feed,
calling down all the way to the final element, which needs to
fetch the actual RSS, or b) have something polling the RSS
regularly and cache the filtered representation for later
dispersal.
Localfeeds takes the b) model, while a chain of filters might be better suited to the a) model. With the a) model, the user's aggregator polls the output feed of the filter, which in turn polls and filters its input RSS feed. Every filter in a chain then runs within the same request, so there is no need to worry about delays from caching the filtered feed at each stage.
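To make the difference between the two models concrete, here is a small Python sketch. The class names, the list-of-strings "feed" and the 1800-second refresh are all assumptions for illustration; a real filter would fetch and parse RSS over HTTP.

```python
import time

class PullFilter:
    """Model a): fetch from the source every time the feed is requested."""
    def __init__(self, source, keep):
        self.source = source  # callable returning a list of items
        self.keep = keep      # predicate deciding which items pass

    def get_feed(self):
        return [item for item in self.source() if self.keep(item)]

class CachedFilter(PullFilter):
    """Model b): reuse the filtered result until `refresh` seconds pass."""
    def __init__(self, source, keep, refresh, clock=time.monotonic):
        super().__init__(source, keep)
        self.refresh = refresh
        self.clock = clock
        self._cached = None
        self._fetched_at = None

    def get_feed(self):
        now = self.clock()
        if self._fetched_at is None or now - self._fetched_at >= self.refresh:
            self._cached = super().get_feed()
            self._fetched_at = now
        return self._cached

articles = ["rss news", "cat pictures"]
origin = lambda: list(articles)  # stands in for fetching the actual RSS

# Two pull filters chained: a request to `chain` calls all the way down.
no_cats = PullFilter(origin, lambda a: "cat" not in a)
chain = PullFilter(no_cats.get_feed, lambda a: "rss" in a)
cached = CachedFilter(origin, lambda a: "rss" in a, refresh=1800)

chain.get_feed()
cached.get_feed()
articles.append("rss filter chains")
print(chain.get_feed())   # sees the new article immediately
print(cached.get_feed())  # still the cached copy until the refresh expires
```

The trade-off is visible in the sketch: the pull chain is always fresh but hits the origin on every request, while the cached filter spares the origin at the cost of staleness.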
One other problem is authentication chaining. You don't want every filter to know the credentials for each filter down the line. One way around this is a convention where a filter serves its output RSS feed given only a username, but requires the password for anything that manipulates it.
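A minimal Python sketch of that read-by-username, write-by-password split might look like this. The `FilterAccount` class is hypothetical, the stored scores stand in for the filtered feed, and the unsalted SHA-256 hash is only a placeholder for whatever a real service would use.

```python
import hashlib
import hmac

class FilterAccount:
    """Feed readable by username alone; changes require the password."""
    def __init__(self, user, password):
        self.user = user
        self._digest = self._hash(password)
        self.scores = {}  # article id -> accumulated score

    @staticmethod
    def _hash(password):
        # Placeholder hashing for the sketch, not a recommendation.
        return hashlib.sha256(password.encode("utf-8")).digest()

    def get_feed(self, user):
        # Reading needs no password, so upstream filters only need the name.
        if user != self.user:
            raise PermissionError("unknown user")
        return sorted(self.scores.items())

    def update(self, user, password, article_id, score):
        ok = user == self.user and hmac.compare_digest(
            self._hash(password), self._digest)
        if not ok:
            raise PermissionError("password required to modify the filter")
        self.scores[article_id] = self.scores.get(article_id, 0) + score

account = FilterAccount("bar", "foo1234")
account.update("bar", "foo1234", 123, +2)
print(account.get_feed("bar"))  # readable with the username only
try:
    account.update("bar", "wrong", 123, -5)
except PermissionError as err:
    print("rejected:", err)
```

With this split, a filter further up the chain can be handed just the username to pull the feed, while only the user's own aggregator holds the password needed to train or reconfigure the filter.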
I searched for related ideas and found Content Pipeline, which has some good ideas for potential filters.
All original sound, text and graphics on this site (staticfree.info) are licensed under a
Creative Commons License.
Re: Filter chains
hey thats pretty cool