Ian (lovingboth) wrote,

lazyweb: filtering rss feeds

There is an RSS feed I am interested in, but it carries a high level of crap I am not interested in. I would like to avoid having to look at the crap.

Does anyone use POPFile (an email / newsgroup Bayes classifier, usually used to detect spam) and nntp//rss to do this? In theory, the latter should take the RSS feed and present it as a newsgroup, for the former to look at and classify according to the 'buckets' I set up. As you may guess :) it's not working for me at the moment...
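For the curious, the 'buckets' idea itself is simple enough to sketch in Python. This is only a rough illustration under my own assumptions, not how POPFile works internally: the names `BucketClassifier`, `tokenize` and `rss_items` are made up for the example, the classifier is a plain multinomial naive Bayes with add-one smoothing, and the feed is assumed to be ordinary RSS 2.0 with `<item>`, `<title>` and `<description>` elements.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class BucketClassifier:
    """Hypothetical multi-bucket naive Bayes classifier (add-one smoothing)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.item_counts = Counter()             # bucket -> training items seen
        self.vocabulary = set()                  # every word seen in training

    def train(self, bucket, text):
        """Record one example text as belonging to the given bucket."""
        words = tokenize(text)
        self.word_counts[bucket].update(words)
        self.item_counts[bucket] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        """Return the most probable bucket, or None if nothing is trained."""
        if not self.item_counts:
            return None
        total_items = sum(self.item_counts.values())
        vocab_size = len(self.vocabulary) or 1
        words = tokenize(text)
        best_bucket, best_score = None, -math.inf
        for bucket, n_items in self.item_counts.items():
            counts = self.word_counts[bucket]
            total_words = sum(counts.values())
            # Log prior for the bucket plus smoothed log likelihood per word.
            score = math.log(n_items / total_items)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket


def rss_items(xml_text):
    """Yield (title, description) for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        yield (item.findtext("title", default=""),
               item.findtext("description", default=""))
```

In use, you would call `train()` a few times per bucket (as many buckets as you like, not just yes/no), then run `classify(title + " " + description)` over whatever `rss_items()` yields for the downloaded feed. Fetching the feed and writing the sorted items back out somewhere readable are left out of the sketch.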

Are there any other ways to do it?

Edit: Oh, yes, I know about Sux0r.org, but public implementations of that want to concentrate on 'more academic' feeds than the one I am looking at. I might try their software though... Commercial services exist (FeedZero and FeedScrub) but I want, as ever, to do this for free.

Edit2: Ah, it looks like you can at least try the commercial services. Let's see how they do. Annoyingly, both only categorise things into 'yes/no' rather than an arbitrary number of categories (I'd like at least four...)

Edit3: It looks like FeedScrub cannot cope with long URLs for feeds. In a busy feed with lots of crap - exactly the sort of feed you need this for - it also has a habit of a) not checking the feed often enough (I cannot find a setting for how often it checks) and b) deciding everything in it is crap, and leaving you to look through the 'I'm not going to show you these, they are crap' feed. If you do that via their website, it only shows you about ten items (I cannot find an 'ok, show me more' button). And when you try to train it from there, it asks for your password regardless of whether you are logged in, and then apparently doesn't do anything.

Let's see how easy it is to install Sux0r...
