FeedSifter — Search Within the Feed

Do you ever subscribe to RSS feeds that have huge amounts of information, just to get the occasional post that mentions a particular topic or two? Yeah, me too. FeedSifter is just the tool for us. Enter an RSS feed URL and one or more words or phrases, and it will build you a version of the RSS feed that contains only entries matching one or more of your requested words. It allows for basic Boolean searching. Words or phrases entered on one line are joined by “AND”; words or phrases on separate lines are joined by “OR.”
A few examples:

The resulting page is, itself, an RSS feed that you can subscribe to in your aggregator or save as a live bookmark in your browser. You can also incorporate the sifted feed into a web page using Google’s RSS embedding tool. I can see an obvious use for this tool at the library reference desk: it offers an easy way to set up a quick-and-dirty current awareness feed for patrons, based on news services or journal tables of contents, that can tell them when something new has been published in a narrow area.
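
The AND/OR rule described above (terms on one line are ANDed, separate lines are ORed) can be sketched in a few lines of Python. This is only an illustration of the matching logic, not FeedSifter's actual code. Several details are my assumptions: terms on a line are separated by commas, matching is a case-insensitive substring test, and each entry is a dictionary with hypothetical "title" and "summary" keys.

```python
from typing import Dict, List


def parse_query(query_text: str) -> List[List[str]]:
    """Turn query text into OR-groups of AND-terms (one group per non-empty line).

    Assumption: terms on the same line are separated by commas.
    """
    groups = []
    for line in query_text.splitlines():
        terms = [t.strip().lower() for t in line.split(",") if t.strip()]
        if terms:
            groups.append(terms)
    return groups


def entry_matches(entry_text: str, groups: List[List[str]]) -> bool:
    """True if every term of at least one group appears in the entry text."""
    text = entry_text.lower()
    return any(all(term in text for term in group) for group in groups)


def sift(entries: List[Dict[str, str]], query_text: str) -> List[Dict[str, str]]:
    """Keep only the entries whose title or summary satisfies the query."""
    groups = parse_query(query_text)
    return [
        e for e in entries
        if entry_matches(e["title"] + " " + e["summary"], groups)
    ]


# Hypothetical entries, for illustration only.
entries = [
    {"title": "New RSS tools for libraries", "summary": "A roundup of feed readers."},
    {"title": "Budget meeting notes", "summary": "Nothing about syndication."},
    {"title": "Open source at the reference desk", "summary": "Staff try new software."},
]

# Line 1: "rss" AND "libraries"; line 2: OR the phrase "open source".
query = "rss, libraries\nopen source"

for entry in sift(entries, query):
    print(entry["title"])
```

With this query, the first and third entries are kept and the second is dropped.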

TinyPaste Offers Short URLs for Long Quotes

TinyPaste is a tool that does for blocks of text what TinyURL does for URLs: it gives you a nice, short URL to pass along, rather than the full-length one for the page. (A TinyURL example: http://tinyurl.com/3f94fe is much shorter than the full URL for the page you get to.)
So TinyPaste lets you copy a block of text, paste it into a form at tinypaste.com, and get a similarly short URL in return. See http://tinypaste.com/5172c — which is the entire text of this blog post. There is also a Firefox extension that makes TinyPaste available from the right-click menu, so any text you see in your browser can be highlighted and turned into a TinyPaste URL.
TinyPaste is handy for getting long blocks of text into services like Twitter or a Facebook status (by putting in the TinyPaste URL rather than the full text), but it comes with several drawbacks. All formatting (other than line breaks) disappears completely. So do links. And most disturbing, to me, is the utter lack of indication of where the original came from. In the web page version it is, of course, possible to manually insert the URL or other attribution into the text before creating the TinyPaste URL. For the Firefox extension, though, attribution could (and I think should) be added automatically.

[Via Lifehacker.]

Code4Lib Journal’s Third Issue Available

Issue 3 of the Code4Lib Journal was published today:

Feedbooks: RSS to PDF for Offline Reading

Feedbooks is a site that turns an RSS feed — your own or your favorite daily read — into a PDF file for offline reading. A sample of Feedbooks’ PDF options for RSS4Lib’s feed can be found here: rss2pdf. It includes PDF files formatted for A4 paper, the Cybook & Sony Reader, iLiad, and “Custom PDF,” which is in this case standard-sized U.S. paper.
The tool works quickly, generating a PDF on the fly. First, set up an account at Feedbooks. Then, create a “News” item and enter the RSS feed you wish to subscribe to. The system defaults to A4 paper size; there is not a default 8.5″ x 11″ size. (You can set the Custom PDF page size to this by selecting the “custom settings” link and entering the page size in millimeters: 216 by 279 millimeters.) The resulting PDF file can then be downloaded; a link is provided for bookmarking.
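
The millimeter figures above follow from the standard conversion of 25.4 millimeters per inch, rounded to whole millimeters. A minimal sketch of that arithmetic (the function name is just for illustration, and I am assuming whole millimeters are what you would type into the form):

```python
MM_PER_INCH = 25.4  # exact, by definition of the inch


def inches_to_mm(inches: float) -> int:
    """Convert inches to the nearest whole millimeter."""
    return round(inches * MM_PER_INCH)


# U.S. letter paper: 8.5 x 11 inches.
width_mm = inches_to_mm(8.5)   # 215.9 mm before rounding
height_mm = inches_to_mm(11)   # 279.4 mm before rounding
print(width_mm, height_mm)
```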

Other customizations are font (from a handful of common fonts), font size, and line height. The font choice only applies to the item content, not to the item title. Items are displayed one per page, which leaves a lot of white space (a solution I prefer to that used by FeedJournal, which I reviewed in December 2007). The PDF download has a table of contents with page numbers, though the page numbers themselves are not displayed on subsequent pages. I also noticed that some posts in the Feedbooks PDF version lost their paragraph breaks and were presented as one long block of text. The site seems to reproduce all the items in the RSS feed; the RSS4Lib RSS feed has 15 items, all of which are in the Feedbooks PDF.

[Via The Distant Librarian.]

AP, Bloggers, and Fair Use

The Associated Press has stepped back from its original position on copyright and the blogosphere and will be developing a (hopefully) more nuanced policy. According to an article in the June 16 issue of The New York Times, “The Associated Press … said that it will, for the first time, attempt to define clear standards as to how much of its articles and broadcasts bloggers and Web sites can excerpt without infringing on The A.P.’s copyright.”
The recent controversy arose when AP requested that the Drudge Retort (a left-leaning response to Matt Drudge’s conservative Drudge Report) remove seven portions of its syndicated news stories from its web site. (I should note that the excerpts varied in length from 39 to 79 words; the excerpt I have in the previous paragraph is a hopefully safe 37 words.)
The AP’s move to better define fair use when it comes to blogging about news is a welcome one. As I discussed last month, there is a vast gap between what publishers desire and what common practice defines in the realm of copyright. The doctrine of “Fair Use” is “unclear and not easily defined.” (This according to the U.S. Copyright Office itself!) Fair use is usually decided in the courts, after the fact. Bloggers have taken their stand through their actions: for better or worse, a significant portion of bloggers view fair use liberally. Publishers have, as fits their economic interest, taken a more restrictive view. It is refreshing to see a major publisher declare its interest in finding a middle ground that it can endorse.

Blogging the Iowa Floods

The University of Iowa launched a flood blog to inform students, faculty, staff, and the public about the flooding on the Iowa City campus. The University of Iowa Flooding Blog provides a strong example of using web technologies to aid communication during a disaster. Linked from the blog are Flickr photostreams of campus images, a headlines feed from the Iowa City Press-Citizen, and, of course, updates and news about campus: which buildings are currently flooded or at risk, where there is power, what is open, and what is closed.

Google Has an RSS Embedding Tool

Paul Pival at The Distant Librarian noted a tool from Google that I had not been aware of: the Dynamic Feed Control Wizard, which generates a JavaScript snippet that you can embed in your web site to pull in RSS headlines and summaries from your favorite sites.

Give it a keyword and it will find matching feeds; pick one, and it gives you the JavaScript to display that feed in a vertical or horizontal box on your site. I’m showing a screen shot of the set-up interface here:

Screen shot of Google's RSS embedding tool

This is a very handy tool for grabbing a single feed and placing it on a web page; it could be very useful for libraries that do not have an abundance of technical expertise but have a blog they want to include elsewhere on the site.

One part of Google’s interface confused me (odd, as Google is usually so clever at interfaces). When entering a “Feed Expression”, Google will not take an RSS feed’s URL. It expects you to type keywords describing your feed, and it will figure out the actual feed URL. So entering my feed’s URL (http://www.rss4lib.com/index.xml) did nothing; entering simply “RSS4Lib” produced the feed.

Sample

Visit the full text of this entry to see what the code looks like embedded in a web page: Google Has an RSS Embedding Tool.


Estimate Your Blog’s Feed Subscriber Base

Figuring out how much bang you’re getting for your blogging effort is not as simple as it first seems. While it’s relatively straightforward to calculate readership of static Web pages on your server, making a similar estimate of syndicated content readership is much trickier. (For an exploration of that topic, see my 2007 blog post, “Counting RSS Subscribers,” and the feedstats application I built to estimate how many people might be reading RSS4Lib.)
The trick, of course, is that when you “free” your content via syndication formats, it becomes harder to tell at a glance how much those syndicated files are being read. This is a problem for libraries, as for any other business operation, because library managers need information about the effect of their publicity and public relations efforts to justify them.
With the goal of creating a tool to help bloggers, library bloggers in particular, quantify their feed readership, I created a version of my older RSS4Lib-specific tool for general use: YourStats.
YourStats parses a log file and generates an estimate of total “direct” readership (that is, readership of the feed itself). It summarizes readership reported by aggregators like Bloglines and counts unique IP addresses for PC-based readers (such as Firefox or NetNewsWire). Of course, individual blog posts often travel far afield, being reproduced in other systems that do not report readership numbers. Other tools, such as Magpie (the RSS feed cacher) or Yahoo! Pipes, almost certainly redistribute your content to many other readers but do so in completely opaque ways. This sort of readership is not included in YourStats.
A couple notes:

  • If you use Movable Type or WordPress to power your blog, and your RSS and Atom feeds have the default names, the application will take your log file and process it, giving results for both RSS (index.xml or feed=rss2) and Atom (atom.xml or feed=atom) readership.
  • You can specify a different feed by entering its directory path and filename (for example, if your blog’s RSS feed is at http://your.blog.com/stuff/feed.xml, you would enter /stuff/feed.xml).
  • YourStats only handles Apache standard log files.
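
The counting approach described above can be sketched roughly as follows. This is not YourStats's actual code, just a minimal illustration under stated assumptions: the log is in Apache "combined" format (which includes the user agent), aggregators announce their readership in the user-agent string as "N subscribers", the latest count per aggregator is approximated by the largest one seen, and every other user agent counts once per unique IP address. The function name, the sample log lines, and the IP addresses are all made up for the example.

```python
import re
from typing import Dict, Iterable, Set

# Apache "combined" log line: IP, identity, user, [time], "request",
# status, size, "referer", "user agent".
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: aggregators report readership as "N subscribers" in the agent.
SUBSCRIBERS = re.compile(r"(\d+) subscribers?", re.IGNORECASE)


def estimate_readers(lines: Iterable[str], feed_path: str = "/index.xml") -> int:
    """Estimate direct feed readership from access-log lines."""
    aggregator_counts: Dict[str, int] = {}
    direct_ips: Set[str] = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match.group("path") != feed_path:
            continue  # unparseable line, or a request for some other file
        agent = match.group("agent")
        reported = SUBSCRIBERS.search(agent)
        if reported:
            # Key on the agent name before the version, e.g. "Bloglines".
            name = agent.split("/")[0]
            count = int(reported.group(1))
            aggregator_counts[name] = max(aggregator_counts.get(name, 0), count)
        else:
            direct_ips.add(match.group("ip"))
    return sum(aggregator_counts.values()) + len(direct_ips)


# Made-up sample log lines, for illustration only.
sample = [
    '10.0.0.1 - - [16/Jun/2008:10:00:00 -0400] "GET /index.xml HTTP/1.1" 200 5120 "-" "Bloglines/3.1 (http://www.bloglines.com; 12 subscribers)"',
    '10.0.0.1 - - [16/Jun/2008:11:00:00 -0400] "GET /index.xml HTTP/1.1" 200 5120 "-" "Bloglines/3.1 (http://www.bloglines.com; 14 subscribers)"',
    '10.0.0.2 - - [16/Jun/2008:11:05:00 -0400] "GET /index.xml HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux) Firefox/3.0"',
    '10.0.0.2 - - [16/Jun/2008:12:05:00 -0400] "GET /index.xml HTTP/1.1" 304 - "-" "Mozilla/5.0 (X11; Linux) Firefox/3.0"',
    '10.0.0.3 - - [16/Jun/2008:12:30:00 -0400] "GET /index.xml HTTP/1.1" 200 5120 "-" "NetNewsWire/3.1 (Mac OS X)"',
    '10.0.0.4 - - [16/Jun/2008:12:45:00 -0400] "GET /about.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (X11; Linux) Firefox/3.0"',
]

print(estimate_readers(sample))
```

Note that this sketch compares the request path exactly, so a query-string feed such as feed=rss2 would need the full request path (including the query string) passed as feed_path.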

I hope YourStats will give you useful numbers about your blog’s readership. Try it and let me know what you think.

FeedHub Review

I’ve been testing out FeedHub, a tool that organizes and filters your RSS feeds based on topics (“memes,” in FeedHub’s parlance) you express interest in. So far, I’ve found that it helps put things I want to read nearer the top of the list, although it doesn’t do as good a job as my own system of reading certain feeds first.
To get started with FeedHub, you need to set up an account and import an OPML file from your favorite aggregator. The signup procedure was a bit confusing; it wasn’t obvious to me that uploading an OPML file was a required part of the process. Once I figured that out, though, setting up an account was a breeze.
I exported an OPML file from Bloglines (over 120 feeds) and FeedHub started digesting it. The process took several hours; I was warned that it would be time consuming, and it was. When I came back to FeedHub the next day, it had finished its processing and offered me a new RSS feed to which I needed to subscribe to see my filtered feeds. So I added this new feed into Bloglines to monitor it.
Within the FeedHub feed, each post is reproduced along with an indication of relevance, the memes FeedHub has assigned to the post, and a thumbs up/thumbs down rating icon. Sample (for a recent post in OCLC’s WorldCat blog, “Rating and review features updated, more cover art added”):

FeedHub RSS Entry

Once you click the appropriate rating thumb, a new window appears in which FeedHub displays its updated relevancy for that article and a series of memes that it has found for that particular post. You can then provide an importance for each meme in your overall reading interests. For example:

FeedHub Meme Rating

This post has been assigned two memes by FeedHub, “tools” and “virtual reality”. (The underlying software, mSpoke, is responsible for assigning memes to posts.) Next to each meme is a series of graphics, reflecting five options for rating that meme: “No opinion,” “No thanks,” “Sometimes,” “Usually,” and “Yes, please.” I’ve previously given the “tools” meme a “Usually” rating but haven’t previously expressed an opinion about the “virtual reality” meme.
FeedHub can display all the memes it’s gathered to describe my reading interests. Here are the memes I’ve rated as “Yes, please” and “Usually:”

FeedHub Meme Rating

You can drag and drop individual memes from category to category, making it simple to update FeedHub’s settings. You can also add other memes via a search interface, or add memes that are based on source (e.g., “in popular feeds”, “in TechCrunch”, etc.) or that reflect social web sources (“on the del.icio.us hotlist”, “popular on Digg”, etc.).
So what’s the net effect of FeedHub on my blog reading? It’s been mixed in the week I’ve been testing it. I think I would have been happier had I given it a more homogeneous set of RSS feeds rather than everything — library-related, technology-related, blogs of friends that I follow, news, etc. I think it has a hard time initially figuring out what I actually want to read because the sources are so disparate. I have since pruned my FeedHub subscription list to be just library- and technology-related feeds, which seems to have improved its fidelity. (Or it could be that I’ve simply voted on more items, giving it a better sense of what I actually like.)
FeedHub also notes which blog posts I click through to read at the source site. Since most blog posts are published in their entirety in the RSS feed, I’m not sure how useful this is; I rarely go to the blog’s site to read a post.
Overall, I’m intrigued by the tool and plan to keep using it. However, I’m not yet ready to give up reading my feeds individually in favor of FeedHub’s filtered approach.