I’ve been running Blogmon since February 2008; you can find more information about this micro-service here. In essence, it monitors the progress of blogs and bloggers over time using Technorati data (mainly rank).
Most of the time we can easily find the top bloggers out there, but there are many great bloggers in the middle of the pack, and beginners who are marching quickly up the blogosphere. I hope that Blogmon helps expose the rapid movements and progress of a few of these bloggers. I sometimes view it as my own dynamic Favorites list that occasionally reminds me of blogs worth more visits. I’m happy to share it with you.
In 2008 Blogmon monitored over 1,000 blogs; some were discovered by me or shared by friends through Twitter and other social networks, and others were found programmatically by crawling the web. The service reported 658 updates (as of today) through the Twitter @Blogmon account, along with information about their progress over the year.
Here are Blogmon’s 10 best bloggers for 2008:
| Blog | Blogger | Baseline Rank | Current Rank | Gain from Baseline |
| --- | --- | --- | --- | --- |
| Startup Meme – Technology Startup and Latest Tech News | Bilal Hameed | 1,910,875 | 11,089 | 99.4% |
| Parenting ideas from dads – dad-o-matic | Founded by the legendary Chris Brogan and authored by friends | 1,943,068 | 29,490 | 98.5% |
| Doug Haslam | Doug Haslam | 1,072,771 | 18,984 | 98.2% |
| Reflections of Time | Milton Chai | 4,978,471 | 93,172 | 98.1% |
| Cool Mom Guide | Julie Maloney | 1,546,784 | 44,421 | 97.1% |
| Enter the Octopus | Matt Staggs | 736,965 | 24,853 | 96.6% |
| The Lessnau Lounge – Finding the American dream | John Lessnau | 605,911 | 32,945 | 94.6% |
| Six Revisions – Web Development and Design Information | Jacob Gube | 17,159 | 939 | 94.5% |
| Social media PR from Press Release PR owner Danny Brown | Danny Brown | 1,286,970 | 103,392 | 92.0% |
I have one more super-excellent blog on Blogmon’s top list – yes, good guess :) Since Chris is well known (yet still working very hard, as you can see below), I wanted to leave room for one more new blogger who is storming up the blogosphere. Yes, Danny, it is you – keep up the good work.
| Blog | Blogger | Baseline Rank | Current Rank | Gain from Baseline |
| --- | --- | --- | --- | --- |
| Social media business strategy and more | Chris Brogan | 1,512 | 61 | 96.0% |
The idea behind Blogmon is monitoring progress. Most of the time you can only see a snapshot of the current state, which tells you nothing about the direction (up, down, or stagnant). Blogmon keeps historical information and can compute both the speed and the direction (velocity), with the intention of exposing talented, hard-working bloggers before everybody else knows about them.
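To make the velocity idea concrete, here is a minimal sketch of how speed and direction could be computed from historical rank snapshots; the function name and data shape are my own assumptions for illustration, not BlogMon’s actual code:

```python
from datetime import date

def rank_velocity(snapshots):
    """Estimate rank velocity from a list of (date, rank) snapshots.

    A lower Technorati rank is better, so a falling rank number means
    the blog is moving up. Returns percent gained per day:
    positive = improving, negative = declining, near zero = stagnant.
    """
    snapshots = sorted(snapshots)                # oldest first
    (d0, r0), (d1, r1) = snapshots[0], snapshots[-1]
    days = (d1 - d0).days or 1                   # guard against same-day scans
    gain_pct = (r0 - r1) / r0 * 100              # positive when rank improves
    return gain_pct / days

history = [(date(2008, 2, 1), 10000), (date(2008, 3, 1), 5000)]
print(round(rank_velocity(history), 2))  # 50% gain over 29 days ≈ 1.72 %/day
```

Keeping only the first and last snapshot gives the average velocity; the same history could also be used pairwise to see whether the movement is speeding up or slowing down.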
Happy new year and have a great #bl09ing year!
Anyway, for a better 2009 (the lyrics of Creed’s “One”)
If you want to know more about BlogMon, please read my BlogMon – The only way is up! post. You can find daily updates on Twitter @BlogMon. You can also see periodically aggregated results reported on The A-List tab, the Archive tab, and the Perfect Record tab.
I was very happy to see this week that @BlogMon passed the 50-follower mark. This is great considering the fact that it does not follow others.
I recently added a new pattern looking for blogs that cross tiers. Based on Technorati rank, I look for blogs that cross into the top 10,000, 1,000, 100, and 10.
Here is an example of an update to the @Blogmon Twitter account reporting this change:
#CrossToTop10000, http://thecurvature.com , Owner: Cara, Gain:47.70 %, Since:9/16/2008, Rank:9961, Tags:”misogyny”,”patriarchy”
This pattern was fairly easy to implement; the only challenge was dealing with Technorati data abnormalities. From time to time the rank is totally off, as in the following scan records for wpthemesplugin:
For now, I solved it by comparing against the previous minimal rank rather than just the last one.
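A sketch of how the tier-crossing check might combine with that workaround; the names and logic are my own illustration of the idea, not the actual implementation:

```python
# Tiers checked from most selective to least, so a blog that jumps
# several tiers in one scan is credited with the deepest one.
TIERS = (10, 100, 1000, 10000)

def crossed_tier(previous_ranks, new_rank):
    """Return the top-N tier a blog just crossed into, or None.

    To guard against Technorati's occasional bad data points, the
    comparison uses the best (minimal) previously seen rank rather
    than only the most recent scan.
    """
    best_prev = min(previous_ranks)
    for tier in TIERS:
        if best_prev > tier >= new_rank:
            return tier
    return None

# A glitchy last scan (rank 2,000,000) no longer masks the crossing:
print(crossed_tier([15000, 12000, 2000000], 9961))  # 10000
```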
I think it should be very exciting to be listed in the top X blogs. This is definitely a milestone worth reporting.
So, I hope to see you at @BlogMon. You can submit your blog for monitoring by going to the BlogMon home page or by following @BlogMon on Twitter.
I see more and more people raising the question of whether we need a blog search engine, especially when Google does such a great job finding good content from blogs as well as from web sites. It seems that Google itself is not investing much in its blog search either. So, in this post I will explain what I think the duties of a blog search engine should be, and why I still see a need for one.
Blog search engines (should) serve multiple purposes
- Finding great bloggers, blogs and blog posts
- Recognizing great bloggers, blogs and blog posts – rank.
- Categorizing blogs and bloggers in multiple ways, not limited to content type. Categorize blogs by their objectives: personal blogging is not the same as corporate blogging, professional blogging, subject expertise, politics, going green, art, or others. It is not just about what the blogger writes about but also about what the blogger is trying to achieve.
- Monitoring blog and blogger progress – is this blog alive? a shooting star?
- Web-now – see Twitter Search Trending Topics, Twingly’s Hot right now or Technorati’s what’s percolating in blogs now
- Alerts – a list of new blogs in a given category that are doing well
- Community building – increasing cooperation among bloggers (e.g. you should read this blog)
What do we need to know?
- The top bloggers in a category
- The top blogs in a category
- The top blog post in a category
Who needs it?
- The readers – to know what to read, what is going on in real-time
- The blogger
  - To present a case to a sponsor
  - To know whom to look up to
  - To see and share the blogger’s progress
- The business
  - To know where to buy ad real estate or whom to sponsor
  - PR – where to spend effort effectively
The challenges of blog search engines today.
Using the reaction-counting method for ranking, a service needs to distinguish human actions from automated (bot) ones in order to be accurate. So far this is not working well, and it adds another question mark around the validity of blog search engines.
Here are some examples of both:
- Human reactions
  - A blog post reacting to another
  - An update on Twitter or Jaiku
  - Digging on Digg
  - Submitting to a social bookmarking site
  - Posting on a social network
  - Blogger communities
- Bot reactions
The number of sites that offer posting of human blog reactions is growing faster than crawling capabilities can keep up, and some of these sites do not offer access to crawlers at all.
The service should also remove the “me” links from the count, i.e., links from all the social objects under the same owner.
A couple of thoughts
Maybe someone could think of another way to rank blogs and bloggers. Measuring traffic is probably a more accurate approach (Alexa). Traffic is relative to the category; I assume that a blog about technology will get more traffic than a blog about biology. The rank should be within a category, not across all blogs (or not just across all blogs).
In my opinion there is a need for a blog/blogger search engine, but the emphasis of the search capability should be less on finding content (leave that to Google) and more on discovering leading blogs and bloggers.
It does not need to be a free service, at least not for businesses. A premium or sponsored-account model could work as well.
What is BlogMon?
Here you can find results from a small application I wrote for monitoring Technorati rank changes over time. I find these blogs and bloggers in two ways:
- The first is the same way you add links to your favorites when you like the content: I browse, I like it, I add it to the list of blogs to scan. If you follow me on Twitter @kerendg and your blog is listed in your Twitter About section, I will be tempted to monitor it.
- The second is programmatic: “crawling” the data in a certain way that helps me find more great resources.
I see it as having “dynamic Favorites” – a list of blogs worth coming back to. This is a great way to find who is consistently getting the crowd’s attention. I know it seems simple, but I have had my share of complexity dealing with multiple constraints and “interesting” data.
The patterns reported via Twitter (on a daily basis)
I post daily results to my Twitter @Blogmon account:
- Rank Change: more than a 25% gain in rank since the baseline, and at least 7 scans (snapshots)
- New High: more than a 35% gain in rank since the first Rank Change report. The application reports the first occurrence, and then, if the blog reaches a new high again, it reports only once every 30 days since the last report.
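The two reporting rules above can be sketched roughly like this; the function names and signatures are my own illustration, not the actual implementation:

```python
from datetime import date

def should_report_rank_change(baseline_rank, current_rank, num_scans):
    """Rank Change rule: >25% gain from the baseline and at least 7 scans."""
    gain = (baseline_rank - current_rank) / baseline_rank * 100
    return gain > 25 and num_scans >= 7

def should_report_new_high(first_report_rank, current_rank, last_report, today):
    """New High rule: >35% gain since the first Rank Change report,
    throttled to at most one report every 30 days."""
    gain = (first_report_rank - current_rank) / first_report_rank * 100
    throttled = last_report is not None and (today - last_report).days < 30
    return gain > 35 and not throttled

print(should_report_rank_change(100000, 70000, 8))  # True: 30% gain, 8 scans
```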
The information reported to BlogMon results web site
The A-List consists of blogs with a Technorati rank under 1,000 (in other words, the subset of the top 1,000 bloggers that I monitor).
I publish monthly results to BlogMon Results for the A-List, and for the rest here. The rest are bloggers whose rank moved up more than 50% since the baseline (the first time I started monitoring them).
Speed: calculated as the percentage gain in rank divided by the number of days it took to achieve it.
Perfect Records – these are the rare blogs that move up consistently. The conditions are at least 9 scans (snapshots) and more than a 20% gain in rank.
You’ve got nothing to lose – the only way is up!
Joining in – if you’d like me to monitor your blog, please submit your name and blog URL in the small form on the BlogMon home page. EMAIL IS OPTIONAL!!! – I don’t need it to monitor your blog. I do need your blog URL; please add it in the Comments section.
Your blog needs to be claimed on Technorati.
I only report if your blog is going up, so if you are not doing so great right now, your blog will not appear in the statistics – there is nothing to lose by joining in. If a blog does not go up for more than 30 days, I slow down its scans from every other day to every week. So, if your blog is making good progress, you can show it off (or charge your sponsors more).
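That back-off rule is simple enough to sketch; `next_scan_interval` and its inputs are names I made up for illustration:

```python
from datetime import date, timedelta

def next_scan_interval(last_improvement, today):
    """Scan every other day while a blog keeps improving; drop to a
    weekly scan once it has gone more than 30 days without a gain."""
    stalled = (today - last_improvement).days > 30
    return timedelta(days=7 if stalled else 2)

print(next_scan_interval(date(2008, 8, 1), date(2008, 9, 15)))  # 7 days, 0:00:00
```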
I’m using Weebly to build the Blogmon Results web site. It is a great service that saves me a ton of time. I use Google Docs to generate the spreadsheets; I can publish them online and then embed each table as an iFrame inside the web pages – another great time saver.
- I monitor 794 blogs. I scan 252 of them weekly and the rest every other day.
- Most of the blogs sit between rank 1 and 100,000
- In August I had 70 blogs that were making good progress (more than a 50% gain over the baseline rank).
- In August I had 22 blogs on the A-List
Any ideas on how to make it more useful and interesting?
After a few months of monitoring Technorati rank progress for a few hundred blogs, I was looking for a way to compare the ones making significant positive rank shifts. The way Technorati rank works, if the value goes down, the blog’s position is improving: there are fewer blogs ahead of it on the way to the top. It is hard to compare rank moves when the variance in rank is so huge.
So I looked at blogs that had more than a 50% positive rank change, and I started looking at the speed of their progress.
The way I calculated the (average) speed is:
- Speed = Percentage gain (from baseline) / Duration
- Duration = last rank update date – first rank update date (in whole days)
- The unit of this calculation is percentage change in rank per day (percent per day).
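In code, the calculation above might look like this (a direct transcription of the formula, with names of my own choosing):

```python
from datetime import date

def average_speed(baseline_rank, last_rank, first_scan, last_scan):
    """Average speed in percent of rank gained per day."""
    gain_pct = (baseline_rank - last_rank) / baseline_rank * 100
    duration_days = (last_scan - first_scan).days  # whole days
    return gain_pct / duration_days

# 50% gain over 50 whole days -> 1.0 percent per day
print(average_speed(200000, 100000, date(2008, 2, 1), date(2008, 3, 22)))
```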
You can see the full table with the results here (using Google Docs):
Blogger’s Speed results – 6/6/2008 (if WP supported (i)Frames, this table could be embedded inside this post, sigh).
Here is a subset of the table:
| URL | Speed | Gain | # Days | Start Rank |
| --- | --- | --- | --- | --- |
I highlighted (bolded) a few of the lines to show how speedy these bloggers are. A couple of bloggers, like Doug Haslam and Jennifer Leggio (Mediaphyter), are showing consistent improvement even if they are not the fastest in their group (they have beautiful, monotonically rising curves).
The table is sorted by the starting rank (baseline). The blank lines were added between blocks of bloggers that started in a similar rank range.
Note: I assume that the Technorati ranking system is not linear and that things somehow move faster where the rank is a really large number (i.e., at the bottom of the blogosphere). This makes it a little harder to compare blogs with great disparity in rank ranges, yet it only emphasizes how great the bloggers are who manage to move fast even at the top. This is not exact math, so please take it with a grain of salt. Maybe one day, when somebody comes up with a different ranking system, they can take my approach into consideration.
A couple of thoughts:
- Speed within a category – it would be great to compare how different bloggers are doing within their domain of interest/expertise.
- Acceleration/deceleration – with this method I calculated the average speed between the baseline rank and the latest one. What I don’t show here is whose speed is accelerating and whose is slowing down. This could be monitored as well – one more thing I could post to BlogMon (Twitter).
Since I did not make the links in the table clickable, I added them below (you can also click the links in the Google Docs table):
http://travellperkins.com, http://www.multichannelmetrics.com, http://blog.francinekizner.com, http://doughaslam.com, http://www.four20.net, , http://mediaphyter.wordpress.com, http://www.seotops.com, http://www.purplecar.net/, http://www.twitterholics.com, , http://learntoduck.com, http://daisysdeadair.blogspot.com, http://dossy.org/, http://www.prfekt.se, http://gobigalways.com, http://www.socialmediaexplorer.com, http://bing-thegreeninme.blogspot.com, http://sixrevisions.com, http://www.veronicabelmont.com, http://daily.mahalo.com/, http://thenextweb.org, http://www.bloggerbuster.com, http://blog.twitter.com, http://laughingsquid.com, http://www.37signals.com/svn/, http://refueled.net, http://www.blogher.com, http://blogs.abcnews.com/politicalpunch, http://www.designspongeonline.com/, http://blog.makezine.com, http://www.mixx.com
So who do you want to put your bet on?
In his post “Could Someone Explain Technorati,” Chris Brogan wonders about the consistency, accuracy, and reliability of the Technorati service. I can’t explain the behavior of the system over there, but I can share some of my experience dealing with the various challenges of using online APIs (web services) and data. The objective here is to help other mashupers better prepare for future integration efforts across multiple web services. Since it appears that the mashuper community is growing faster than the web service providers, I’m sure more fellow API consumers can share some stories of their own; I would be happy to hear them.
I see three participants’ perspectives in this “love triangle”: the web site visitor, the mashuper (the API consumer), and the service provider.
My visitor experience:
Chris Brogan talks about his experience from the user perspective in his post. I have nothing to add here, except that as a service provider, satisfying my loyal community would be my top concern. Maybe the way to deal with the case from Chris’s post is by monitoring for exceptions (drastic rises or falls in rank/authority).
My mashup experience:
As I mentioned in some of my earlier posts (here, here, and here), I’m working on a small project for finding productive bloggers by monitoring for consistent improvements in their Technorati rank. So, on a frequent basis, I now monitor the rank of over 800 bloggers. I post some of the results to a designated Twitter account: blogmon.
The first set of challenges is dealing with volatile data:
- Sometimes there is no authority in the results (inboundblogs).
- Sometimes there is no valid last-update date in the results: <lastupdate>1970-01-01 00:00:00 GMT</lastupdate>
- Most of the time there is no author (the user did not add it)
- Sometimes there are no tags (the user did not add them)
- Sometimes, as Chris mentioned, the rank is off for a short period of time
For example see Seth Godin’s Blog rank history:
| Last Update | Rank | Authority |
| --- | --- | --- |
| 2/12/2008 | 19 | 8599 |
| 2/25/2008 | 18 | 8697 |
| 3/17/2008 | 19 | 8658 |
| 3/22/2008 | 16 | 8827 |
| 4/10/2008 | 15 | 8946 |
| 4/19/2008 | 16 | 8882 |
| 4/23/2008 | 17 | 8819 |
| 5/12/2008 | 17 | 8828 |
| 5/14/2008 | 16 | 8863 |
| 5/20/2008 | 15 | 8890 |
These are the details that a consumer of volatile online data must plan for and find ways to compensate for:
- Check the validity of the date
- Don’t count on just the last result, i.e., search for the last valid result and monitor over time.
- Be prepared to post partial results (e.g., no top tags or author).
- Most important: guard your data, i.e., protect what you take from the service and store in your own records.
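Taken together, those guards might look something like the sketch below; the record layout and field names are assumptions about a parsed API response, not Technorati’s actual schema:

```python
EPOCH = "1970-01-01 00:00:00 GMT"

def clean_scan(record):
    """Validate one parsed scan record before storing it.

    A missing author or missing tags are tolerated (partial results
    are fine); a missing rank or an epoch 'lastupdate' makes the
    record unusable. Returns a sanitized dict, or None to skip.
    """
    rank = record.get("rank")
    if rank is None or int(rank) <= 0:
        return None
    if record.get("lastupdate", EPOCH) == EPOCH:
        return None
    return {
        "rank": int(rank),
        "lastupdate": record["lastupdate"],
        "author": record.get("author", ""),  # often legitimately absent
        "tags": record.get("tags", []),
    }

print(clean_scan({"rank": "19", "lastupdate": EPOCH}))  # None: invalid date
```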
The next set of challenges has to do with the web service’s behavior:
- I got the following error once or twice: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
- Some API requests come back with:
<META HTTP-EQUIV=”REFRESH” CONTENT=”2; URL=http://api.technorati.com/bloginfo?url=****&key=****&version=0.9&start=1&limit=30&claim=0&highlight=0″>
**I intentionally masked the URL, title, image and my developer Key with ****
This result can crash your system if not handled.
- Finally – and I get this one a lot :)
<?xml … “http://api.technorati.com/dtd/tapi-002.xml”>
<error>You have used up your daily allotment of Technorati API queries.</error>
- I can’t picture my dev world without exception handling – in this specific case it is the ultimate protection against unexpected web service behavior. So guard every call, XML-result load, and data parse by wrapping them in a try/catch block.
- Logging – log expected and unexpected behavior for later analysis and recovery.
- Build the system so exceptions are caught and logged, but execution can move on to the next task.
- This is something I learned from a smart army officer: “If there is a doubt, there is no doubt” – basically, it is better not to report at all than to report inaccurate data.
- Find ways to minimize API calls – e.g., I ask for tags only when I find a blog worth reporting on.
- A thought: I’m not an expert in XML and DTDs, but could it be that using a DTD slows down the web service? If you know more about it, please share. Is it really necessary on read-only calls?
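As a concrete illustration of the guard-everything advice, here is a Python sketch (the original tool appears to be .NET, judging by the error messages, so this is a translation of the idea, not the actual code):

```python
import logging
import xml.etree.ElementTree as ET

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("blogmon")

def guarded_parse(xml_text):
    """Parse one API response defensively.

    A META refresh page, a quota error, or a half-delivered payload
    is logged and skipped (returns None) instead of crashing the
    whole scan, so execution can move on to the next blog.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        log.warning("unparseable response skipped: %s", exc)
        return None
    error = root.find(".//error")
    if error is not None:
        log.warning("API error: %s", error.text)
        return None
    return root

print(guarded_parse("<tapi><error>daily allotment used up</error></tapi>"))  # None
```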
About the service:
I can’t talk much about what a web service provider feels or experiences (I’m sure that Ian Kallen from Technorati has a lot to share on this subject), but I want to say a few things:
- Please don’t get this post wrong: I’m a fan of Technorati – I use it, deeply appreciate their service, and am thankful for the option of using the APIs. As I said earlier, the intention is to share my experience and to help you better prepare for such an effort.
- I guess it is hard to estimate the load on the system with such growth in the number of mashupers out there, so my heart is with them.
- There are two more threats that a web service provider needs to protect itself from, and I’m sure those consume some energy: protecting the hard-gathered data and its environments from abuse and from malicious attacks.
One last comment: ironically, I have had no problems with Twitter so far :) but I’m aware of the pain that some Twitter API users suffer occasionally.
As I continue playing with the small application I’m writing for monitoring positive shifts in bloggers’ Technorati rank, I’ve realized that I’m actually finding bloggers writing about almost everything. The only common thread I could find so far is that they are just consistently great.
The tool scans and builds historical data for over 700 blogs so far. I build this growing list of blog URLs from my favorites (i.e., humanly picked in multiple social ways) and with the crawling algorithm I previously explained in this post.
I won’t get into the operational details (and there are plenty of them), but I manage to get a lot done without exceeding Technorati’s daily limit of 500 API calls.
I output the results to the BlogMon Twitter user for now, so please, you are invited to be a follower.
Example of outputs:
Short-term pattern: http://wpthemesplugin.com, rank gain: 18.10 %, since: 4/12/2008, Top Tags: “wordpress”, “themes”
Long-term pattern: http://mediaphyter.wordpress.com, rank gain: 76.10 %, since: 2/1/2008, Top Tags: “Social Media”, “Security”
As you can see, I log the URL, the rank gain, since when, and the top two tags, to give you an idea of what each blog is all about. I found that in most cases this is good enough. Do you agree?
Why am I doing this?
- First, it keeps me engaged with mashup opportunities, and there are lots of those available today.
- Second, I enjoy doing it.
- Finally, you may find it useful in some way – you can leave a comment on these blogs and maybe get some traffic to your own website/blog. I will be happy to hear if you did.
I may be tempted to mash up more web data sources/services in the future, or to explore discrepancies between Alexa data and Technorati rank.
I’m also using Popfly, a great early-stage service developed by Microsoft, to build and deploy a small (too small and simple at this time) application to my Facebook profile, called BlogTwitt. BlogTwitt shows the most recently posted updates to the BlogMon Twitter user that I use for outputting the daily findings from the application I’m working on.
At this point I could not share this application – I don’t know why, so I left a message on the Popfly Facebook wall. As of this writing I have gotten no answer. I do appreciate what they are trying to do, saving me the time of learning and working with the Facebook API.
I think I will soon write a post about Popfly and the challenges of writing a good mashup. I encourage people who are just starting their mashup thought process to look at this tool, and also at Yahoo Pipes (fantastic interface), to play, understand, get ideas, and brainstorm with the numerous web services (APIs) available out there. It is like working in a software solution architect group for a company that offers multiple products, finding new ways to increase the value of the existing modules by symbiotically integrating them into new offerings.
Finally, I don’t think this is Software plus Services the way Microsoft tries to sell it; I see it as Service plus Service (the service is built of software, duh). Maybe Service × N.
As always, I would love to hear your thoughts so please use the comment section.
Update: I forgot to mention that what I like about using Twitter rather than my blog to post results is that it does not add to a blog’s reaction count. So it flies under Technorati’s radar and does not impact the rank (avoiding the observer effect). That may change one day when they realize that a tweet containing a blog’s URL is actually a blog reaction.