Tag-lover’s Delight

TagCloud.com is an automated folksonomy tool launched about a month ago. TagCloud searches any number of RSS feeds you specify and creates tag clouds. I’ve created a few cool tag clouds using their service.

For starters, let’s look at Gotzeblogged’s tag cloud (shorter version), which only takes in my blog feed. The result is great. The tag cloud highlights enterprise architecture, interoperability, and other relevant tags, and even identifies concepts like crossing the chasm.

What about my personomy: how does TagCloud.com handle that? I’ve been rolling my own tag cloud for a while, so I was curious what TagCloud would come up with. I decided to break my old OPML feed (sorry if anyone was using it; let me know) and create a Bloglines-compatible OPML feed from GotzeTagged with links to the hundreds of feeds I provide there. I fed this OPML feed to TagCloud, and after struggling for a while with their rather poor error messages (and especially with relearning OPML!), I managed to create a digestible feed, and hence: TagCloud: GotzeTagged Cloud. Again, I’m impressed, and I promptly added the TagCloud.com cloud to my GotzeTagged Tag Cloud Page, where the automated folksonomy is a great supplement to my “hand-tagged” personomy cloud.
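For anyone else relearning OPML, here is a minimal sketch of generating a subscription list from a set of feeds. The function name `build_opml` and the example.org URLs are made up for illustration; the `outline` attributes (`type`, `text`, `title`, `xmlUrl`) follow the common subscription-list convention, and I am only assuming that this is what TagCloud and Bloglines expect.

```python
import xml.etree.ElementTree as ET


def build_opml(title, feeds):
    """Build an OPML subscription list.

    feeds is a list of (name, xml_url) pairs. Returns the OPML as a string.
    """
    opml = ET.Element("opml", version="1.1")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, xml_url in feeds:
        # One <outline> per feed; xmlUrl points at the RSS feed itself.
        ET.SubElement(body, "outline", type="rss", text=name,
                      title=name, xmlUrl=xml_url)
    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical feeds, not the real GotzeTagged list.
    print(build_opml("Example feeds", [
        ("Feed A", "http://example.org/a.xml"),
        ("Feed B", "http://example.org/b.xml"),
    ]))
```

Letting an XML library write the file handles attribute escaping for you, which is one less thing to go wrong when a service's error messages are unhelpful.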

While playing with TagCloud.com, I ran into some issues with non-English content. When I imported my public OPML feed from Bloglines, I got garbage tags, since I subscribe there to a number of feeds in Danish and other non-English languages. I removed all non-English feeds from my Bloglines-import cloud, which then became quite useful, showing enterprise as the dominant theme, followed by enterprise architecture and service.

Although there are several issues with non-English content, I did manage to create a surprisingly good Danish cloud: Danske nyheder, which is based on a few central Danish news feeds. To get there, I had to create a list of Danish stop words (based on, but expanded from, this list, which is currently unavailable; use the Google cache).
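To show why the stop words matter, here is a rough sketch of counting terms with a stop-word filter. The word set below is a tiny hypothetical sample, not my actual list, and `count_terms` is an illustrative name; I don't know how TagCloud applies stop words internally.

```python
import re
from collections import Counter

# A tiny illustrative sample of Danish stop words (not the full list).
DANISH_STOP_WORDS = {
    "og", "i", "at", "det", "en", "den", "til", "er",
    "som", "på", "de", "med", "for", "af",
}


def count_terms(texts, stop_words):
    """Count words across texts, skipping stop words and one-letter tokens."""
    counts = Counter()
    for text in texts:
        # \w matches Unicode letters in Python 3, so æ, ø and å are kept.
        for word in re.findall(r"\w+", text.lower()):
            if len(word) > 1 and word not in stop_words:
                counts[word] += 1
    return counts


if __name__ == "__main__":
    headlines = [
        "Regeringen og Folketinget er på ferie",
        "Regeringen til møde",
    ]
    print(count_terms(headlines, DANISH_STOP_WORDS).most_common(3))
```

Without the filter, words like "og" and "er" would dominate any Danish cloud simply because they appear in nearly every sentence.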

TagCloud.com is a very cool tool. But what is REALLY cool about it is the way it has been built: it grabs the relevant feeds and stuffs them in a MySQL database, then calls Yahoo’s Content Analysis web service to extract the keywords from the feed content (article titles and descriptions), and, as the developers at IonZoft say, “Everything else is gravy”.
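The pipeline described above can be sketched in a few lines. This is my own guess at the shape of it, not IonZoft's code: SQLite stands in for MySQL, `extract_keywords` is a naive placeholder for the Yahoo call (it does not use Yahoo's API at all), and all function names and the font-size scaling are assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET
from collections import Counter


def parse_items(rss_xml):
    """Yield (title, description) for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        yield (item.findtext("title", default=""),
               item.findtext("description", default=""))


def store_items(conn, items):
    """Stuff the feed items into a database table (SQLite here, not MySQL)."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (title TEXT, description TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", items)
    conn.commit()


def extract_keywords(text):
    """Placeholder for the external keyword service: alphabetic words over 3 letters."""
    return [w for w in text.lower().split() if len(w) > 3 and w.isalpha()]


def build_cloud(conn, min_size=10, max_size=32):
    """Map each keyword to a font size scaled linearly by its frequency."""
    counts = Counter()
    for title, description in conn.execute("SELECT title, description FROM items"):
        counts.update(extract_keywords(f"{title} {description}"))
    if not counts:
        return {}
    top = max(counts.values())
    return {tag: min_size + (max_size - min_size) * n // top
            for tag, n in counts.items()}
```

Usage would be along the lines of `store_items(conn, parse_items(rss_text))` followed by `build_cloud(conn)`, with `conn = sqlite3.connect(":memory:")` for a quick experiment. The "gravy" is then mostly rendering the resulting sizes as HTML.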

BTW, it’s the Yahoo Content Analysis Service that has I18N problems, I think. But for English content, it’s great. Probably the best web service out there!

Check the Technorati search on TagCloud and on Yahoo content analysis to follow what the blogosphere says about them.

Speaking of Technorati, I guess TagCloud will experience issues similar to Technorati’s, both in finding a business model (or VC funding, which Technorati found) and with scalability in uncertain times like these past few days. Reliability and QoS are issues too, and RSS spam etc. is becoming a problem. The “supply chain” is also an issue: TagCloud relies on Yahoo, and both rely on the availability of feeds. Nevertheless, I really hope we’ll see sustainable and prosperous business models emerging, because there is a lot of potential here.
