RSS4Lib

JavaScript RSS Box Viewer

I stumbled across yet another RSS embedding tool, the prosaically named JavaScript RSS Box Viewer. (See the “related posts” section below for my descriptions of several other similar tools.) RSS Box Viewer gives you a great deal of control over the output of your feeds (you can set the number of items to show, the width of the box, compact/expanded view, colors for the frame around the box, etc., etc.). Here, in fact, is a simple sample of the RSS4Lib feed that shows the most recent 3 headlines:

A few minor quibbles… The color palettes are “web safe”, which means you can’t match exactly the color scheme on the site. The web page where you configure your box doesn’t handle wide formats for the box very well — so if you want to preview your RSS feed wider than about 200 pixels, the preview overlaps the form you fill out (at least, for the Mac versions of Safari and Firefox). And the form requires you to enter an RSS feed’s URL, not the URL of the site — there’s no autodiscovery.
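For the programmers out there: autodiscovery usually means looking in a page's HTML for a `<link rel="alternate">` tag that advertises a feed. Here is a minimal sketch of that idea using only Python's standard library. The sample page, the `example.org` URLs, and the names `FeedLinkFinder` and `discover_feeds` are made up for illustration; this is not how RSS Box Viewer (or any particular tool) does it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect the href of every <link rel="alternate"> tag that advertises a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        is_feed = attributes.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attributes.get("href"):
            # Resolve relative hrefs against the page's own URL.
            self.feeds.append(urljoin(self.base_url, attributes["href"]))


def discover_feeds(html, base_url):
    """Return the feed URLs advertised in an HTML document (hypothetical helper)."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


# Made-up page for illustration only.
SAMPLE_PAGE = """<html><head>
<title>Example blog</title>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS" href="/index.xml">
</head><body></body></html>"""

print(discover_feeds(SAMPLE_PAGE, "http://www.example.org/"))
```

A configuration form could run something like this on whatever site URL the user enters and offer the feeds it finds, falling back to asking for the feed URL only when none are advertised.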

But in terms of ease of use, this seems as powerful as the hosted version of Feed2JS and as flexible as Google’s similar tool.

Facebook Notes Redirects Your Feeds

I jumped on the Facebook bandwagon as it was pulling out of town and created a Facebook page for RSS4Lib (become a fan!). In the process, as I was adding the RSS feed for this blog using the Notes tool, I noticed something more than a little annoying: RSS feeds added to a Facebook page using Facebook’s Notes application are rewritten to drive all traffic from that version of the feed to Facebook, not your own site. While it is clearly in Facebook’s financial interest to bring more traffic to Facebook, Facebook does so without explicit permission.

When you set up an RSS feed into Facebook notes, you are asked to agree to a brief terms and conditions that says, in its entirety, “By entering a URL, you represent that you have the right to permit us to reproduce this content on the Facebook site and that the content is not obscene or illegal.”

However, Facebook’s concept of “reproduce on the Facebook site” and mine are somewhat different. While I fully understood that my blog posts would be presented inside Facebook (as they are on the RSS4Lib Notes page), I am surprised that the associated RSS feed includes rewritten channel and item data. As an example, take a look at the feed’s channel data:

<channel>
<title>RSS4Lib: Innovative Ways Libraries Use RSS's Facebook Notes</title>
<link>http://www.facebook.com/notes.php?id=81126379633</link>
<description>RSS4Lib: Innovative Ways Libraries Use RSS's Facebook Notes</description>
<language>en-us</language>
<category domain="Facebook">NotesFeed</category>
<generator>Facebook Syndication</generator>
<docs>http://www.rssboard.org/rss-specification</docs>
<managingEditor>http://www.facebook.com/pages/RSS4Lib-Innovative-Ways-Libraries-Use-RSS/81126379633</managingEditor>
<webMaster>webmaster@facebook.com</webMaster>
...
</channel>

The feed’s link goes to Facebook (http://www.facebook.com/notes.php?id=81126379633). That page provides reproductions of recent posts. Clicking on a post title, within Facebook, brings up that page in another Facebook page. There is a tiny link at the bottom of the page to “View original post”.

The individual items in the RSS feed are likewise rewritten:

<item>
<guid>http://www.rss4lib.com/2009/05/feedmil_finds_feeds.html</guid>
<title>Feedmil Finds Feeds</title>
<link>http://www.facebook.com/note.php?note_id=82829822943</link>
<description>Full Text of Post Goes Here</description>
<pubDate>Wed, 13 May 2009 16:39:36 +0000</pubDate>
<author>RSS4Lib: Innovative Ways Libraries Use RSS</author>
<dc:creator>RSS4Lib: Innovative Ways Libraries Use RSS</dc:creator>
<source url="http://www.rss4lib.com/index.xml">http://www.rss4lib.com/2009/05/feedmil_finds_feeds.html</source>
</item>

They rewrite the link. They change the author from what it is in the original post, “rss4lib@gmail.com (Ken Varnum)”, assign a creator that is not the author cited in the original post, and link to an RSS feed as the source. (Facebook does display the URL of the post, but clicking it goes to the feed. Depending on your web browser, it may not be helpful behavior to get an XML file.) They don’t provide attribution for individual posts on the site.
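If you want to check your own feeds for this kind of rewriting, here is a rough sketch that compares the host in an item's link with the host in its guid. It uses a trimmed copy of the item shown above (I left out the dc:creator element so the fragment parses without a namespace declaration). The function name `link_was_rewritten` is my own, and the check assumes the guid is a URL, which RSS does not require, so treat a result as a hint rather than proof.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Trimmed copy of the item from the Facebook Notes feed shown above.
SAMPLE_ITEM = """<item>
<guid>http://www.rss4lib.com/2009/05/feedmil_finds_feeds.html</guid>
<title>Feedmil Finds Feeds</title>
<link>http://www.facebook.com/note.php?note_id=82829822943</link>
<source url="http://www.rss4lib.com/index.xml">http://www.rss4lib.com/2009/05/feedmil_finds_feeds.html</source>
</item>"""


def link_was_rewritten(item_xml):
    """Return True when the item's link and guid point at different hosts.

    Assumes the guid is a URL; if either host is missing, returns False.
    """
    item = ET.fromstring(item_xml)
    link_host = urlparse(item.findtext("link", default="").strip()).netloc
    guid_host = urlparse(item.findtext("guid", default="").strip()).netloc
    return bool(link_host and guid_host and link_host != guid_host)


print(link_was_rewritten(SAMPLE_ITEM))  # prints True
```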

The way Facebook is using my content does not fit my understanding of the Creative Commons “Attribution Non Commercial License” I have applied. Among other things, it states that:

You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work)
You may not use this work for commercial purposes.

I’m willing to give on point 2 (yes, I understand that by reproducing my blog on Facebook’s site I’m contributing to their commercial gain), but on point 1, I did not waive my right to appropriate attribution as specified in the license on the blog by agreeing to “reproduce” the blog on their site. If this counts as “remixing,” which the Attribution Non Commercial license allows, the license requires that the licensee “takes reasonable steps to clearly label, demarcate or otherwise identify that changes were made to the original Work.” This has not been done.

This sort of misuse of content happens all the time, of course, but rarely so blatantly.

Feedmil Finds Feeds

A new feed-finding search engine, Feedmil, has made an appearance. Feedmil is a feed-only search engine with some clever interface features to help you narrow down your search.
Feedmil’s Google-inspired front page asks, “what are you into?” and provides a sliding control so that you can adjust the results from “surprising” to “well known” — at either end. Want only well known feeds? Move the left end of the slider to the right. As soon as you let go of the slider, your search starts — keeping you from adjusting both ends. I found it a bit surprising that the search started as soon as I moved a slider.

Feedmil gives you many ways to limit or refocus your search once it’s presented the initial results. Here are the results of a search for “rss library” (I was hoping to pull up my own blog, which it did, though not in the first place that I crave…):

There are several filtering options running across the top: Feed type (starts at ‘all feeds’, but also lets you narrow searches down to blog feeds, microblog feeds, podcasts, public media feeds, and social media feeds); sort options (Feedmil rank, quality, and relevance), and language.
On the right, there’s a “topic significance” section that lets you select how much weight each of the topics (as determined by Feedmil) should have. Playing with these sliders reorders the search results; as with the front page, as soon as you let go of a slider, the display changes. If you want to restrict results to only one extreme or the other, simply move the slider all the way over.
Disturbingly, the results are displayed differently even if you don’t move the slider at all. For example, here’s the above search before and after clicking (but not moving) the “library catalog” slider:

I need to spend some more time using this tool, but I’m favorably impressed with my first look (aside from the odd interface issues noted above).

Via What I Learned Today.

Journal Tables of Contents in the Catalog

I’ve written several times about TicTOCs (most recently in TicTOCs: It’s About Time). TicTOCs is a JISC-funded free service that collects RSS feeds for journal tables of contents. If you go to the TicTOCs site you can search for journals (by title, ISSN, etc.), and find the RSS feed for that journal’s table of contents. They also offer a downloadable list of all the journals in their index, providing title, URL of the RSS feed, and ISSN. This should allow easy importing into a library catalog, federated search tool, or link resolver.
At least one library has implemented this feature. Peter van Boheemen, in his blog WebQuery @ Wageningen UR, describes how he added TicTOCs journal feeds to his catalog. A journal with a table of contents listed in TicTOCs has a link on the right side of the display to “Show recent articles” (this example is Die Naturwissenschaften from the Wageningen UR catalog):


Clicking on that link displays the table of contents for the most recent issue of the journal in the lower part of the screen. Each article is linked to the full text (in this particular example, directly to SpringerLink, but it could just as easily go through a library’s proxy server or link resolver to find a copy licensed for that library):


Is your library using TicTOCs like this? Share your site in the comments.

Fair Use and Quoting Commercial Content

Say you’re blogging about a topic and you want to quote an excerpt of a commercial publication in your post. How much can you quote, verbatim, under the doctrine of “fair use”?
This is the subject of an article in Sunday’s New York Times: “Copyright Challenge for Sites That Excerpt.” Here’s how the Times article sums up the problem:

The legal disputes are emblematic of a larger question that has emerged from the Internet’s link economy. The editors of many Web sites, including ones operated by the Times Company, post excerpts from competitors’ content from time to time. At what point does excerpting from an article become illegal copying?

Courts have not provided much of an answer. In the United States, the copyright law provides a four-point definition of fair use, which takes into consideration the purpose (commercial vs. educational) and the substantiality of the excerpt.

[Yes, I’m aware of the irony of quoting an 88-word passage from an article about the dangers of quoting lengthy passages from commercial publications in a blog post.]

The concept of Fair Use is a tricky and slippery one. The courts usually determine what is “fair” after the fact, when someone complains. A short excerpt that doesn’t reproduce the heart of the original almost certainly is fair use; a significant excerpt that reproduces the main point of the article probably is not. Reproducing the entire original clearly fails the test.

The culture of the blogosphere has been to define fair use quite broadly; the courts have rarely, if ever, had a voice in the matter. Yet. I suspect that, as the economic woes facing the corporate world drag on, commercial sites will increasingly perceive extensive quoting of their content on other sites as an economic problem. Whether such extensive quoting of published content has a real or imagined effect on revenue — and whether that effect is positive or negative — remains to be seen.

AP, Bloggers, and Fair Use (June 2008)
RSS Feeds & Copyright (May 2008)

Research Blogging — Connecting the Blogosphere to “The Literature” a Link at a Time

The Research Blogging web site is a nexus for peer-reviewed literature and serious academic blogging about it. (No, that’s not a contradiction in terms.) Research Blogging helps readers find critical analysis of scientific reporting by pulling these scholarly blog posts together. Many of these blog posts in reaction to published articles are written by experts in the field and can help you put the article in context. From the Research Blogging site:

Do you like to read about new developments in science and other fields? Are you tired of “science by press release”? ResearchBlogging.org is your place. Research Blogging allows readers to easily find blog posts about serious peer-reviewed research, instead of just news reports and press releases.

While the majority of posts indexed by Research Blogging are in the hard sciences, there are a reasonable number in the area of information and library science, broadly construed.
To join the commentary, you must register. Once you’ve done that, you get a code snippet to include in your posts that are about peer-reviewed articles. Research Blogging then adds your post to its index. Being a scholarly effort, there is peer review of posts indexed by the site — if other registered research bloggers feel your post does not follow the site’s guidelines, it is removed from the index. This keeps the content relevant to the site’s mission.

Yahoo Pulls Plug on RSS Advertising Tool

Yahoo! has announced it will no longer provide advertisements in RSS feeds. Like other wholesalers of online advertising, Yahoo! offered feed creators the chance to put advertisements in their RSS feeds so that they would appear at the end of an item in the feed. Yahoo!’s solution, unlike Google’s, for example, did not require that the feed publisher offer subscriptions through a Yahoo!-controlled server. You kept your RSS feed where it was and used some HTML in the feed template to insert the advertisement. (Google, after purchasing FeedBurner, has content creators redirect their feeds through its servers.)
I’m not necessarily upset that a source of ads in feeds is going away. Yahoo! may not be doing well and may be focusing on its serious revenue sources. It’s been reported that other long-standing Yahoo! tools (Briefcase, for example) are also going away. Then again, Yahoo!’s retreat from this market might be indicative of how the perceived value of RSS feeds is changing. If there’s not sufficient revenue from RSS-based advertising to keep a major, though second-tier, player in the game, what does that mean for publishing via RSS?

Mr. RSS Goes to Washington

Have you seen the just-relaunched www.whitehouse.gov? Take a moment and look:


From the tone of the welcome message from the White House’s Director of New Media, this blog is intended to be relatively informal. It’s clearly a “new media” site — aimed at an entirely more connected audience than the past version.
If you look closely at the image, you’ll see a very prominent weblog — just below the picture, on the left. Not just a blog, but an RSS-enabled blog. And — this makes my little RSS heart go pitter-pat — the HTML source of the blog post shows that there is not just one RSS feed, but there are six. Here they are:

Now that’s a lot of RSS! (Although at the moment, the agenda, press, and video feeds appear to be empty.) And in case you want a single feed, here’s the RSS4Lib Whitehouse.gov all-in-one feed, via Yahoo! Pipes.
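If you would rather build an all-in-one feed yourself, the core of the job is small: collect the items from each feed and sort them by date. Here is a minimal local sketch of that idea (it is not a description of how Yahoo! Pipes works). The two sample feeds, their `example.org` links, and the function name `merge_feeds` are placeholders I made up; a real version would fetch the actual feed URLs and would need to cope with missing or inconsistent dates.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two made-up feeds, for illustration only.
FEED_ONE = """<rss version="2.0"><channel><title>Feed one</title>
<item><title>Item A</title><link>http://example.org/a</link>
<pubDate>Wed, 21 Jan 2009 12:00:00 +0000</pubDate></item>
</channel></rss>"""

FEED_TWO = """<rss version="2.0"><channel><title>Feed two</title>
<item><title>Item B</title><link>http://example.org/b</link>
<pubDate>Thu, 22 Jan 2009 09:00:00 +0000</pubDate></item>
</channel></rss>"""


def merge_feeds(feed_documents):
    """Return (date, title, link) tuples from every feed, newest first."""
    merged = []
    for document in feed_documents:
        root = ET.fromstring(document)
        for item in root.iter("item"):
            published = item.findtext("pubDate")
            if not published:
                continue  # skip undated items; they cannot be sorted
            merged.append((
                parsedate_to_datetime(published),  # RFC 822 date -> datetime
                item.findtext("title", default=""),
                item.findtext("link", default=""),
            ))
    merged.sort(key=lambda entry: entry[0], reverse=True)
    return merged


for published, title, link in merge_feeds([FEED_ONE, FEED_TWO]):
    print(published.isoformat(), title, link)
```

The sample dates all carry an explicit `+0000` offset on purpose: mixing dates with and without time zone information would make the sort fail, so a more robust version should normalize them first.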

TicTOCs: It’s about Time

The JISC ticTOCS service has been formally launched after a significant trial period. (I first wrote about this service in July 2007.) The ticTOCs service aggregates the tables of contents (TOCs) from 11,470 scholarly journals from 422 publishers, for a total of 296,186 full-text articles. (Of course, you or your institution must have access to the full text of these journals to view them; the table of contents, though, is free.)
The idea behind ticTOCs is to make finding and subscribing to table of contents RSS feeds a simple process. This free service is long overdue. Getting lists of tables of contents from journal publishers is time-consuming, if it is possible at all. Being able to pull together feeds across journals in one OPML file will prove helpful to libraries wanting to deliver current awareness services, have more up-to-date subject guides (with a list of recent articles in that topic’s ‘hot’ journals), or to augment catalog records.
The site lets you identify journals of interest by topic, by title, or by publisher, subscribe to their tables of contents (the “TOCs”) by checking (“ticking”) a box, and then getting an aggregated feed of articles and abstracts to review. You can add all journals matching a search (subject, title word, or publisher) to your profile with a single click, or add individual titles.

You can export the subscription list of tables of contents as an OPML file to add to your favorite reader. For example, here is an OPML file of the 25 journals in the Computers — Internet category.
TicTOCs displays each journal with its title, the standard icon for the RSS feed, and a menu to add that feed to either Bloglines or Google Reader. Articles are shown by title; a link at the top of the display allows you to show the full abstract provided by the publisher. And each article includes a link to add the citation to RefWorks.

TicTOCs also opens the door to other advanced services. For one example, once you have an OPML file for the RSS feeds for a group of journals, that list of feeds could be run through Yahoo! Pipes or another similar tool to filter for keywords. For another, the OPML file from ticTOCs could be edited to redirect all full-text links through the library’s proxy server, allowing that library’s users to get to the full text articles without any hassle at all.
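To make the keyword-filtering idea a little more concrete, here is a rough sketch of the two steps involved: pulling the feed URLs out of an OPML file, and keeping only the feed items whose titles mention a keyword. Everything in the samples (journal names, `example.org` URLs, article titles) is invented, and `feed_urls` and `matching_titles` are hypothetical helper names. I am also assuming the OPML stores each feed in an `xmlUrl` attribute, which is the usual convention but something to verify against the actual ticTOCs export.

```python
import xml.etree.ElementTree as ET

# Made-up OPML subscription list, for illustration only.
SAMPLE_OPML = """<opml version="1.0">
<head><title>Sample journal list</title></head>
<body>
<outline text="Journal One" type="rss" xmlUrl="http://example.org/journal-one/rss"/>
<outline text="Journal Two" type="rss" xmlUrl="http://example.org/journal-two/rss"/>
</body>
</opml>"""

# Made-up table of contents feed, for illustration only.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Journal One</title>
<item><title>Metadata quality in repositories</title></item>
<item><title>Soil chemistry update</title></item>
</channel></rss>"""


def feed_urls(opml_text):
    """List the xmlUrl of every outline in an OPML document (assumed attribute)."""
    root = ET.fromstring(opml_text)
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]


def matching_titles(rss_text, keywords):
    """Return item titles containing any keyword, ignoring case."""
    wanted = [keyword.lower() for keyword in keywords]
    titles = []
    for item in ET.fromstring(rss_text).iter("item"):
        title = item.findtext("title", default="")
        if any(keyword in title.lower() for keyword in wanted):
            titles.append(title)
    return titles


print(feed_urls(SAMPLE_OPML))
print(matching_titles(SAMPLE_FEED, ["metadata"]))
```

In practice you would fetch each URL returned by `feed_urls` and pass the response text to `matching_titles`; I have left the network step out so the sketch runs on its own.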
Future developments I, for one, would like to see include (of course) more publishers — ~~where’s Elsevier~~ — and a simple way to query ticTOCs with a journal’s ISSN or EISSN and get back the canonical RSS feed. Such a service would let libraries more easily add an RSS feed for a journal to that journal’s entry in the local library catalog. It would also be helpful, at an institutional level, to have automatic rewriting of full-text URLs in table of contents feeds that included the library’s proxy server.
This service will save librarians time and, more importantly, save patrons time.

Correction

There are, in fact, 1870 Elsevier journal titles in ticTOCs — thanks to Roddy MacLeod for pointing out my error.

Updated 12 Feb 2009

For you programmers out there, ticTOCs now offers a downloadable file of journal titles, ISSNs, and RSS feed URLs. Not quite an API, but a good start. See the ticTOCs news site for details or get the ticTOCs data set for yourself.
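With a file like that in hand, the ISSN lookup I wished for above is only a few lines of code. The sketch below is built on assumptions: I am guessing at a tab-separated layout with title, feed URL, and ISSN in that order, so check the real file before relying on it. The sample rows, including the ISSNs, are invented and do not describe real journals, and `normalize_issn` and `build_issn_index` are hypothetical helper names.

```python
import csv
import io

# Invented rows in an ASSUMED layout: title <TAB> feed URL <TAB> ISSN.
SAMPLE_DATA = (
    "Example Journal A\thttp://example.org/a/rss\t1234-5678\n"
    "Example Journal B\thttp://example.org/b/rss\t8765-4321\n"
)


def normalize_issn(issn):
    """Strip the hyphen and uppercase the check character so lookups match."""
    return issn.replace("-", "").strip().upper()


def build_issn_index(text):
    """Map normalized ISSN -> (title, feed URL) from tab-separated text."""
    index = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        title, feed_url, issn = (field.strip() for field in row[:3])
        index[normalize_issn(issn)] = (title, feed_url)
    return index


index = build_issn_index(SAMPLE_DATA)
print(index.get(normalize_issn("1234-5678")))
```

A catalog could run a lookup like this against the ISSN in each journal record and show a feed link whenever a match comes back.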

Code4Lib Journal’s Fifth Issue

Issue 5 of the Code4Lib Journal was published this afternoon. And if I do say so myself (as a member of the journal’s Editorial Committee), it’s another excellent one. Here’s the table of contents:

Editorial Introduction – Issue 5 by Emily Lynema
‡biblios: An Open Source Cataloging Editor by Chris Catalfo
User-Centred Design and Agile Development: Rebuilding the Swedish National Union Catalogue by Henrik Lindström and Martin Malmsten
Reaching Users Through Facebook: A Guide to Implementing Facebook Athenaeum by Wayne Graham
Affinity Strings: Enterprise Data for Resource Recommendations by Cody Hanson, Shane Nackerud, and Kristi Jensen
Identifying FRBR Work-Level Data in MARC Bibliographic Records for Manifestations of Moving Images by Kelley McGrath and Lynne Bisko
Rasmuson Library DVD Browser: Fun with Screen Scraping and Drupal by Ilana Kingsley and Mark Morlino
Reviving Digital Projects by Dianne Dietrich, Jennifer Doty, Jen Green and Nicole Scholtz
Generating Metadata on a Shoestring sans Programmer, with Our Good Friend, Excel (or Any Spreadsheet) by Jill Strass
SPECIAL REPORT: Creating Conference Video by Noel F. Peden
COLUMN: We Love Open Source Software. No, You Can’t Have Our Code by Dale Askey