Hosted screenscraping: HTML to RSS with Dapper

There are several screen-scraping services out there, but Dapper is one that’s both versatile and visual. With a bit of trial and error, anyone can transform HTML web pages (or more precisely, changes in web pages) into email notifications, a start page widget, RSS or another syndication format. Take this example:

Jan, who I happen to know as the guy who wants to be the first “Jan” in Google, has this blog where he writes about a variety of subjects… I would like to subscribe to his Ruby postings, but there’s no tag feed. I am sure he would be able to come up with one if I asked him, but with Dapper, it took only 4 minutes to create an RSS feed from the tag page:

The screencast is simple and straightforward and bypasses most of Dapper’s features; my only goal was to create a simple “Dapp” that produces this feed.
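If you would rather see the idea in code, here is a minimal Python sketch of the same kind of HTML-to-RSS conversion done by hand. It is not how Dapper works internally. It assumes, purely for illustration, that each post title on the tag page is a link inside an `<h2>` heading; the sample markup, the `example.com` URLs and the names `PostLinkParser` and `html_to_rss` are all made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


class PostLinkParser(HTMLParser):
    """Collect (title, url) pairs from <a> tags nested inside <h2> headings.

    The <h2><a> layout is an assumption about the page, not a general rule.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.items = []
        self._in_heading = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_heading = True
        elif tag == "a" and self._in_heading:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while we are inside a heading link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.items.append((title, urljoin(self.base_url, self._href)))
            self._href = None
        elif tag == "h2":
            self._in_heading = False


def html_to_rss(html, base_url, feed_title):
    """Build a bare-bones RSS 2.0 document from the scraped links."""
    parser = PostLinkParser(base_url)
    parser.feed(html)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Scraped from " + base_url
    for title, url in parser.items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample page; a real run would fetch the tag page instead.
SAMPLE = """
<html><body>
<h2><a href="/2007/ruby-one">First Ruby post</a></h2>
<p>Some text with <a href="/about">an unrelated link</a>.</p>
<h2><a href="/2007/ruby-two">Second Ruby post</a></h2>
</body></html>
"""

feed_xml = html_to_rss(SAMPLE, "http://example.com/tag/ruby", "Ruby posts")
print(feed_xml)
```

The obvious drawback of doing this yourself is that the parser breaks as soon as the page layout changes, and you still need somewhere to host and schedule the script. That is exactly the chore a hosted service takes off your hands.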

More on Dapper

If this makes you curious and you want to learn more about Dapper, have a look at the more comprehensive introductory demo, parameterizable Dapps, or refining your Dapps. You can also browse the Dapps published by other users. There is more Dapper coverage on TechCrunch, Mashable and ReadWriteWeb.

Other screen-scraping services

From my del.icio.us selection:

  1. Feed43 (Feed For Free) : Convert any web page to news feed on the fly

    “converts free-form HTML or XML documents to valid RSS feeds by extracting snippets of text or HTML by means of applying search patterns, and then joining these snippets together using output templates to form user-friendly content of feed’s items.”: searches by patterns in the HTML

  2. Feedity – RSS Web Feed Generator for Web Pages without Syndication

    “Another html to rss screenscraping service”: similar to Feed43

  3. openkapow

    “Combination of 1. RoboMaker, a desktop visual scripting tool, with which you define screenscraping scripts. 2. OpenKapow, an online service where you can host and share your scripts as REST, RSS, ATOM or HTML services”: a more heavy-duty editing environment and hosting service for developers

  4. Page2RSS – Create an RSS feed from any web site

    “It is a service that helps you monitor web sites that do not publish feeds. It will pull the updates from any site and deliver them right to your favorite RSS reader.”: produces a notification feed for any changes on a page

  5. ChangeNotes.com – Monitor web site changes

  6. WatchThatPage – Monitor web pages extract new information

    Two email notification services that monitor changes on your selected pages
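The change-monitoring services at the end of this list all boil down to the same idea: remember what a page looked like last time and compare. A rough Python sketch of that idea follows. It is my own illustration, not the way any of these services is actually implemented, and the names `fingerprint` and `PageWatcher` are hypothetical. Fetching is left out; a real monitor would download the page on a schedule (for instance with `urllib.request.urlopen`) and send an email or add a feed item when `check` returns True.

```python
import hashlib
import re


def fingerprint(html):
    """Hash the visible text of a page, ignoring tags and whitespace runs."""
    text = re.sub(r"<[^>]+>", " ", html)   # crude tag stripping, fine for a sketch
    text = " ".join(text.split())          # collapse all whitespace
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class PageWatcher:
    """Remember the last fingerprint per URL and report changes."""

    def __init__(self):
        self._seen = {}

    def check(self, url, html):
        """Return True if the page text changed since the previous check.

        The first check of a URL only records a baseline and returns False.
        """
        new = fingerprint(html)
        old = self._seen.get(url)
        self._seen[url] = new
        return old is not None and old != new


watcher = PageWatcher()
url = "http://example.com/news"  # placeholder URL

first = watcher.check(url, "<p>Hello</p>")            # baseline
same = watcher.check(url, "<p>Hello  </p>\n")         # whitespace-only difference
changed = watcher.check(url, "<p>Hello world</p>")    # real text change
print(first, same, changed)
```

Hashing only the stripped text keeps the watcher from firing on pure markup or whitespace changes, but it will still fire on rotating ads, timestamps and the like.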

If you know of other services, post them in the comments.

11 Responses to “Hosted screenscraping: HTML to RSS with Dapper”

  1. fritser says » Create your own feeds from html Says:

    […] Van Hecke has a new post up about Dapper.net. It’s an excellent service that allows you to create feeds out of html […]

  2. Simon Says:

    I like Feedity better than Dapper. It is much simpler!

  3. HTML Guru Says:

    I tried using Dapper but didn’t have too much success. I think it tries to be too user-friendly and so loses the ability to strip information out accurately.

  4. Pascal Says:

    You might be better off with Yahoo Pipes nowadays, especially if you’re familiar with html markup. I’ve been planning to write more on that, just lack time.

  5. Converting a news page without feeds to RSS: Dapper screencast, take two | Weblog Pascal Van Hecke Says:

    […] a while ago I showed a journalist friend my screencast about Dapper HTML-to-RSS screen scraping. He is a heavy Netvibes user, but curses all too often when an organization in 2008 […]

  6. web man Says:

    I would like to suggest a great new site that organizes your RSS feeds.
    It employs a bayesian filter for RSS feeds where you can train the filter what you like and
    what you don’t like. It’s free, try it at http://www.filteredrss.com.

  7. html to rss » secret marketing links Says:

    […] more difficult to use but much more effective is http://www.dapper.net  . See the little video from Jan & Pascal that explains how it works […]

  8. blog Says:

    I like your posts

  9. cliverd Says:

    I like Feedity better than Dapper.

  10. MiraJ Says:

    I like Feedity better than Dapper. It is much simpler!

  11. Bev Hobkirk Says:

    I arrived here from Yahoo after seeking out an alternative to WatchThatPage since it has become unavailable. I ended up joining ChangeDetect. This is a great service; unfortunately, contrary to the claims on their site, it isn’t free. Nevertheless, at least the site is effective for monitoring web content updates…