Archive for July, 2008

What Did Cuil change?

For many people, Google is the bleeding-edge technology company: the place where the best engineers and scholars work, or wish they worked.

1. At this point it is hard to see a scenario where Cuil wins over Google, but in my mind they did change something. Their indexing technology and scalability seem to make Google’s look old, wasteful, and maybe even obsolete. I don’t know if this is actually true, because a lot of people have been disappointed by Cuil’s search results so far, yet this could be a relevance problem rather than an indexing one. So, in my mind, Cuil’s accomplishment so far is putting a question mark around Google’s technology and the way it is perceived.

2. When I asked “Is there a way around Google?” I did not consider that someone would seriously choose to battle Google on the index-size front. I thought that the right course of action was to find a cheap data source to index and prove your technology on it. You can’t help but admire what Cuil is doing. Cuil weakened Google’s power to intimidate and created a (small) crack in one of its barriers to entry. I see more to come…

Web presence – piecing together an Identity

July 31, 2008 3 comments

People leave information out all the time: no blog About page, no employer name, no picture, no blogger name, a Twitter account without a web page link. Sometimes a simple link is not enough to piece it together. Your network can help in finding connections too, or it can confuse people if your connections are spread over more than one social network and account. In some cases the omission is intentional and no harm is done, but in other cases, when it happens by mistake, it can lead to lost traffic and lost opportunities. From what I see, and in my experience, if the information is not right there, only a few people will bother looking for it. Isn’t bringing these connections forward and bridging information gaps the role of the new Social Graph Search Engines?

This post will cover:

Finding out how objects are connected across multiple web applications. Overcoming cases when the information falls between the web cracks or is deliberately missing. Looking beyond the trivial context of web links (URL), friends, fans or followers.

  • Understanding the problems – some examples
  • Looking at what tools can help piece the missing information together
  • Bringing it forward – making it easily available when needed

Understanding the problems

Example 1 – me and my blog:

I did not add my blog URL to my LinkedIn profile. I did not add my employer’s name to my blog About page. I did it intentionally; I like to keep them separate for now. Omitting these two pieces of information seems to work so far: no social graph search engine that I’ve seen bridges the gap. Ironically, a simple search for my name on Google will reveal the connection (warning: there are a couple more Keren Dagans out there, and neither has anything to do with software or technology). The connection in this case between me, my blog and my employer is my identity (similar profile info such as name, picture and location).
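The idea that similar profile info can act as the connecting link can be sketched in a few lines. This is only a rough illustration, not how any existing search engine works: the `same_identity` helper, the field names and the two-match threshold are all assumptions of mine, and comparing pictures is left out entirely.

```python
def same_identity(profile_a, profile_b, fields=("name", "location"), min_matches=2):
    """Guess whether two profiles describe the same person.

    Each profile is a plain dict. A field counts as a match only when both
    profiles fill it in and the values are equal, ignoring case and
    surrounding whitespace. (Hypothetical helper, for illustration only.)
    """
    matches = 0
    for field_name in fields:
        a = profile_a.get(field_name, "").strip().lower()
        b = profile_b.get(field_name, "").strip().lower()
        if a and a == b:
            matches += 1
    return matches >= min_matches
```

With the default threshold a shared name alone is not enough, which matters when a couple of other people carry the same name.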

Example 2 – disconnected social networks:

I keep my Facebook and LinkedIn networks separate; only a couple of relatives overlap. I use Facebook for personal connections and LinkedIn for professional ones. I sometimes use Facebook to announce my new blog posts. It seems that social search engines can link my friends across the networks, yet again there is no association between me and my blog. Some information inside one’s activity can help make the connection.

Example 3 – multiple presence using disconnected accounts:

Case 1: My “personal” Twitter account is @kerendg. The other day I submitted a query to Twitter Search, looking for references to my other Twitter account, @BlogMon. I found out that Stowe Boyd (stoweboyd) was asking “who is running BlogMon?”. The link to my blog is on the BlogMon web page, and can easily be found using Twhirl or Twitter Search. I don’t know if he ever got an answer to this question or not. Soon after, I started following @stoweboyd from my @kerendg account (I don’t follow from @BlogMon). He has not contacted me or followed me on Twitter to this day, maybe because it is hard to make the connection or maybe because it is just not that important. Your blog is an important piece of the new identity (FYI – WordPress uses your blog URL for your OpenID).

Case 2: Some entrepreneurs run both the start-up’s blog and their own blog. Only in a few cases is there a link from the corporate web site to the entrepreneur’s (see Mashery -> Blogroll for the Praxis blog run by their CEO, Oren Michels – even this is not easy to connect). It is the same story with Twitter accounts (one for the business and one for personal updates). In LinkedIn, the organization is a connection. This is true across networks, in addition to your role.

Example 4  – blog action:

Case 1 – comments: I don’t get too many comments on my blog; I can only wish for more. Yet I did get some comments from people with a vast web presence. Is this some kind of connection? Did I Digg or save one of their blog posts? Did I mention their companies? If you give me your attention, I see it as another type of connection.

I also leave comments occasionally, mostly on the same 4 or 5 blogs. This information can help in understanding my preferences. Similar interests are another piece of the puzzle. So is past activity on my blog. Being a frequent reader is yet another type of connection with the blogger.

Case 2 – traffic source and blog reaction: I look at the WordPress BlogStats page. The Referrers section shows traffic coming from two types of sources:

  • Unidentified – there is no way to track it back to the person who looked at my web site.
    • Traffic from search engines
    • Traffic from comments that I left (and there is no reply)
    • Traffic from “similar post” links
    • Traffic from incoming links to my blog from other blogs
    • WordPress tags
  • Identified – tracked back, with some luck
    • Traffic coming from some sort of a network like Twitter, Jaiku, Digg, reddit, StumbleUpon, Twine, Pijoo.
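The split above can be sketched as a small classifier over referrer URLs. This is only an illustration under my own assumptions: the referrers would have to be copied out of the stats page by hand, the domain list is hypothetical and incomplete, and `classify_referrer` is not part of WordPress.

```python
from urllib.parse import urlparse

# Hypothetical list of network domains where a visit can sometimes be traced to a person.
NETWORK_DOMAINS = {"twitter.com", "jaiku.com", "digg.com", "reddit.com", "stumbleupon.com"}


def classify_referrer(url: str) -> str:
    """Return "identified" if the referrer host belongs to a known network, else "unidentified"."""
    host = (urlparse(url).hostname or "").lower()
    # Strip a leading "www." so www.digg.com and digg.com match the same entry.
    if host.startswith("www."):
        host = host[4:]
    for domain in NETWORK_DOMAINS:
        # Accept the domain itself or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return "identified"
    return "unidentified"
```

Anything that is not on the list, including search engines and plain incoming links, falls into the unidentified bucket.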

Since there is nothing to do about the first type of traffic (non-invasively), there is nothing to add here. But for the second type, let the search begin… trying to find the source. Who dugg my blog? Who saved it in reddit? Is there a reply to my comment (like in TechCrunch comment threads)? Who mentioned it on Jaiku or Twitter? The information is scattered all over the net.

Most of my Twitter followers came through my blog. I added as friends people who acted upon my posts in other networks. I care about where a reaction originated. Even when it is actually possible to track it back to the source, it involves a lot of leg work. A blog reaction is another type of connection. As I wrote before in “What is a blog reaction these days?”, I don’t use “Blog Reaction” in the narrow sense of someone writing a counter blog post (a pingback is a trivial link).

Some of the tools that are available today for piecing it together:

  • Google search (Web) – search for the person’s name and organization; this is enough to discover presence across multiple networks.
  • LinkedIn – search for the person’s name and organization, then compare both profiles against the other details to make sure that this is the same person.
  • Google Alerts – add your link, blog name, your name, Twitter @account and link, and watch for references.
  • Technorati – look for blog info and fans.
  • Twitter – this is a process:
    • If the information is coming from the WordPress Referrer then you can follow it (unless it is coming from your own account and is not worth following).
    • If not, you can search for your link as is, but it is better to also try shortening it using TinyLink, http://is.gd/, http://snurl.com or other URL shortening services. Use Twitter Search for that. Tip: select “all languages”. I found that if this setting is not on, a link search returns nothing even if the update is in English, the default.
    • Use Twitter Search to search for your name and @account too.
  • Delver – network graph; this service can save some of the leg work of checking multiple networks.
  • Jaiku – same as Twitter. What is nice about Jaiku is that Google Alerts picks up on conversations.
  • Flickr – some people, like me, don’t have an account but do have tagged pictures submitted by friends.
  • WordPress Referrer – this is the starting point.
  • Digg, reddit, del.icio.us and the like – go to your profile page and see who dugg/saved/rated your post.
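The Twitter part of this process boils down to running the same search with several terms: the link as is, its shortened forms, your name and your @account. A minimal sketch of collecting those terms follows. The `build_search_terms` helper is hypothetical, and the shortened links are assumed to have been created by hand beforehand, since I am not assuming any shortening service API here.

```python
def build_search_terms(name, account, link, short_links=()):
    """Collect de-duplicated search terms for a mention search, preserving order.

    short_links holds already shortened versions of link, made by hand
    on whichever URL shortening services you want to cover.
    """
    candidates = [link, *short_links, name, "@" + account.lstrip("@")]
    seen = set()
    terms = []
    for term in candidates:
        term = term.strip()
        # Skip empty entries and anything already collected.
        if term and term not in seen:
            seen.add(term)
            terms.append(term)
    return terms
```

Each returned term would then be pasted into Twitter Search (or Jaiku, or Google Alerts) one at a time.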

Do you know about more tools?

Now, wouldn’t it be nice if there were one tool that did all that and brought this information forward when it is most relevant?

Bringing it forward

If this information could be gathered by a single tool, then my ideal solution would be something like SnapShots. When I click on any account or link from anywhere on the web, present me with the graph. Show me this entity’s web presence. Show me how we can connect: this person’s blog, other accounts. The information could be context sensitive, e.g. if I’m in Twitter, show me all Twitter accounts for the same entity, and show me first whether we are connected through Twitter.

Alternatively, send me alerts about subtle semantic links to me and my blog. Something like:

  • This individual
    • from this location
    • working at
    • in this role
    • owns this blog
    • x degree from you on Y network
  • Acted:
    • was once at your blog before
    • looked at your blog on this post
    • looked at your profile in LinkedIn
    • looked at your other Twitter account
    • responded to your comment
    • dugg your blog.
  • Options
    • Do you want to make a “trivial” link and connect?
    • Go to his blog
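The alert outlined above could be carried around as a small record. The sketch below is just one way to shape it; the `Alert` class, its field names and the text layout are my own assumptions, not a format used by any of the services mentioned.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alert:
    # Who the alert is about. All fields are illustrative and optional except the name.
    name: str
    location: str = ""
    employer: str = ""
    role: str = ""
    blog_url: str = ""
    degree: int = 0      # degrees of separation from you
    network: str = ""    # network on which that degree was measured
    actions: List[str] = field(default_factory=list)

    def summary(self) -> str:
        """Render the alert as plain text lines, skipping fields left empty."""
        lines = [f"This individual: {self.name}"]
        if self.location:
            lines.append(f"  from {self.location}")
        if self.employer:
            lines.append(f"  working at {self.employer}")
        if self.role:
            lines.append(f"  in the role of {self.role}")
        if self.blog_url:
            lines.append(f"  owns the blog {self.blog_url}")
        if self.network:
            lines.append(f"  {self.degree} degree(s) from you on {self.network}")
        if self.actions:
            lines.append("Acted:")
            lines.extend(f"  {action}" for action in self.actions)
        return "\n".join(lines)
```

The options at the end of the alert (connect, go to the blog) are left out because they depend on how the alert is delivered.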

Summary

In this post I tried to explain that if we want to build a complete social graph it is not enough to look only at the “trivial” links. They are just the beginning, and they will not present a full picture of one’s web presence and identity. In order to construct a useful graph there is a need to look at other types of links. These links are scattered across multiple social networks and services. They are part of the enhanced meaning of one’s profile attributes, including activities and relationships.

I see three steps in this direction. The first is piecing together someone’s identity by drawing on information from multiple sources. The second is using this information to find new ways in which people are connected, i.e. building the complete and rich social graph. The third is presenting it when relevant.

I did not cover the use cases for having such information at hand. Off the top of my head I can think of a few:

  • It could be handy for QA of web presence – especially if the entity is a business
  • It could be handy for web-sites to understand their crowd
  • It could be handy for business development

Task based search – using the right tool for finding the right information.

July 27, 2008 1 comment

There are multiple ways today to find information on the web, and there are different kinds of information to search for. The search experience can be overwhelming, frustrating, long and tiring, or fun, efficient and successful. It is helpful to think through the search keywords, the search objectives, the type of information and the source of the information beforehand. In this post I list my most frequently used search tools. I also add a table mapping some of the possible search tasks to the tool that I think is best for accomplishing each one.

My top 9 search engines:

  1. Google – what did you expect?
  2. Del.icio.us – the social bookmarking web site
  3. Twitter search (formerly Summize) – dipping into Twitter’s archives
  4. Twingly – spam free information stored in blogs
  5. Technorati – blogs, tags, rank
  6. Delver – social graph and search engine
  7. Xoost – social search engine
  8. Stumpedia – social search engine
  9. LinkedIn – yes, the networking tool

Mapping search tasks to proper search tool

Search tool: Google
Tasks: terms and buzz words (Google is my Wikipedia index), maps and directions, images, stock tickers, businesses near me, products, spelling and idiom checks, time (around the world), and more.
Notes: I usually start my searches here.

Search tool: Del.icio.us
Tasks:
  • Searching for free stuff, for real.
  • Searching for technical information (software, in my line of duty).
  • When I get too many poor results from Google.
Notes: Google just fails when you type the word “free”. You get too many results promoting non-free stuff. I have found multiple times that I get the best results using this web site. The wisdom of the crowd works for me in this case.

Search tool: Twitter search
Tasks:
  • What is hot now?
  • Does anybody care about a certain subject (yes, including me or my stuff)?
  • Is it a good/bad product (movie, computer, etc…)?
Notes: Don’t leave the first page too quickly. By examining the Trending Topics I just know what’s on people’s minds today. Sometimes it is necessary to drill down to the conversation itself (by clicking the link) to understand the listed term.

Search tool: Twingly
Tasks:
  • What is hot now in the rest of the world (outside the US)?
  • When I’m tired of spam in Google search results.
Notes: Twingly’s “Hot right now” list is a little biased towards Europe – and that’s a good thing. It is early, but they recently added a Blog profile, so in the future I will use it to look for blog information.

Search tool: Technorati
Tasks:
  • Blog information like post reactions and tag cloud (getting a general impression of what this blog connects to).
  • Location in the blogosphere, looking at its rank and authority.
  • Top 100 blogs
  • What’s “percolating” now?
Notes: I rarely use the tag search capability for content. Maybe I should use it more – not sure.

Search tool: Delver
Tasks: Who’s connected with whom, through whom?
Notes: It is just the beginning, so it is not as rich as other, more mature search engines, but the multiple times that I used it to actually search for information (not connections) I got excellent and very clean results. I tried using it to search for information about individuals too, and I got LinkedIn bio info.

Search tool: Xoost and Stumpedia
Tasks:
  • Who knows how to search well?
  • Tell me something that I don’t know.
  • Show me something that I have not seen before.
  • Recommend me something.
Notes: These two cover the “I don’t know what I don’t know” problem. I can also look at what other people are searching for and what they like about other people’s search results.

Search tool: LinkedIn
Tasks:
  • Searching for information about a candidate.
  • I have not done it myself, but a friend told me that he can learn a lot about companies’ business development activity through LinkedIn, I guess by monitoring target people’s new connections.
  • If you are looking for a job it is also a great tool for learning about the new employer.
  • The Q&A section is a fantastic way to learn new stuff (and what people care about).
Notes: One of the first things that I do once I get a new resume is check the candidate’s profile page in LinkedIn. I can also check to see if we are somehow connected.

I hope that by writing this post I can help people to become aware of their search activity and the available options today on the web. I will be happy to hear about more search tasks, objectives and tools.

The three elements for successful Mashup sign-on process

July 25, 2008 1 comment

In this post I will discuss the three elements of the Mashup sign-on process: Security (SSO), Access Control and Single Identity. I see a lot written and done about each individually, but I think that it is not always clear which solution maps to which problem.

If you are familiar with this subject then you can skip to the next paragraph (or skip this post entirely). For those who are not familiar with mashups and how to get them working for you, please read this short preface. There are many online services today: social networks like Facebook, bookmarking services like ma.gnolia, news sites like Digg, media sharing sites like Flickr and YouTube, and more. In the screen capture of the form below you can find 43 such services. These services provide online APIs (a way for the outside world to request data from a service and to execute its functionality remotely). This allows the development of new services on top of them, called mashups. The new service interacts with the underlying services and adds value through the unique mix it creates. The first form below is taken from FriendFeed, a mashup application that helps you keep track of your friends’ activities across many web services. In this form you are asked to select the services that you permit the application to pull information from or push information to (in the case of FriendFeed, pulling only). For the application to be able to access your data, the system needs to know who you are, i.e. your user name (login). In some cases it will ask you for your password too.

FriendFeed version:

[Screen capture: Services-Sign-Up]

MyBlogLog version:

[Screen capture: Services-Sign-Up2]

These forms raise three hot issues in the growing environment of open APIs and mashups. If you want to see how rapidly this world is growing, look at this excellent source of information: ProgrammableWeb.

  • Security: not having login and password information stored in multiple places. Single sign-on (SSO)
  • Access Control: having control over what the service can do with my data. Defining security policy.
  • Single Identity: not having to re-enter my profile and friends’ information all over again. Data Portability.

Security

Every service offers a sign-up process where you type in your login and password. Companies like Google, Microsoft and Yahoo that offer multiple online applications provide a kind of single sign-on mechanism: once you’re signed in to one service you can safely go to the next one without logging in again. The available solution for web sites that do not belong to the same company is OpenID, and here is an example of how to use it from WordPress. It is, in a way, the solution for single sign-on on the web today. Not all services support it yet, but the adoption seems promising. If you want to see a decent number of available options for authenticating across services, just click on the “Sign in using” drop-down list on ma.gnolia’s login page.

Access Control

When I allow a service to access my data from another service, I don’t have a way of telling the source what I allow it to provide. I can’t tell the service whether I allow it just to read my data or also to update it (e.g. updating my Twitter status). Today this is mostly determined by the APIs. Where there is a way to configure it (to some extent in Facebook), it is not consistent across the web. I know that there is an effort by multiple leading software companies to deal with this. For more information read the page about the new OAuth protocol.
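To make the read versus update distinction concrete, here is a minimal sketch of the kind of per-service policy I would like to be able to define. It is not the OAuth protocol and does not follow any real API; the `AccessPolicy` class and the permission names are assumptions made up for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AccessPolicy:
    # Maps a consumer service name to the set of permissions the user granted it.
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, consumer: str, *permissions: str) -> None:
        """Allow the consumer to perform the given actions on my data."""
        self.grants.setdefault(consumer, set()).update(permissions)

    def revoke(self, consumer: str, permission: str) -> None:
        """Withdraw one permission; unknown consumers or permissions are ignored."""
        self.grants.get(consumer, set()).discard(permission)

    def is_allowed(self, consumer: str, permission: str) -> bool:
        """Anything not explicitly granted is denied."""
        return permission in self.grants.get(consumer, set())
```

For example, `policy.grant("friendfeed", "read")` would let a mashup pull my updates while `policy.is_allowed("friendfeed", "update_status")` stays false until I say otherwise.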

Single Identity

The term profile today refers to way more than your name, address and email. I think that Facebook took it the farthest, including your media preferences, activities and your choice of applications. But most importantly it includes your contacts, i.e. your network. Most social network services are built on the premise that your experience of and satisfaction with the site are directly related to the size of your network. Yet no one wants to re-type his personal information and re-build his network. Some claim, and I agree, that this data should not belong to anyone but you. The Data Portability initiative is trying to eliminate the need to recreate your online identity and profile over and over again by defining new open standards that will allow services to port it at your request. This is a great step and I can only hope to see it implemented across the web soon.

If you are new to the subject but not new to using mashup applications I hope that you’ll find this post helpful – maybe now you can start using the OpenID option instead of your login. If you are about to start a new service or mashup I hope that this will help you to think about how to make it easy for us to interact with it.

Do you see more ways for improving this process?

Few tipping points

July 22, 2008 2 comments
  • Delver should add Mybloglog to their “Locate your Profile” section – it would help them build a wider social graph by drawing from a rich network. It may help them finally find my blog and associate it with my profile. If not, at least let me add/claim it myself.
  • Muxtape is cool. Its simplicity, like Twitter’s, is attractive. And just like Twitter it needs something like Summize (now Twitter Search) for finding cool and matching mp3 mixtapes. I’m very curious to see where this service is going.
  • Xoost and Stumpedia, two social search engines that are powered by humans, need some way of showing who has good searching skills in a specific area. It would be great to be able to ask for help with a search task from someone who has already trawled the web looking for information in a certain domain of knowledge. It would be nice, too, to be able to say thank you for a great find.
  • Twhirl would save me the trip to Twitter’s web page to look up my follower count if it added this value somewhere in the Followers tab. I cherish any new follower of my BlogMon Twitter account, and this is the only reason that I visit the Twitter web page today.
  • TechCrunch needs to hire someone to process comments full time :). There are 685 comments and counting on this latest post: We Want A Dead Simple Web Tablet For $200. Help Us Build It. It is amazing to see that when you try to change something there is a strong reaction, for better and for worse.

A call for Web 2.0 development framework – it is time for some reusable modules

July 17, 2008 1 comment

As someone who is a beta tester for more than a few web sites now, I come across many implementations of solutions to some very common problems.

Almost every Web 2.0 type of offering provides the following basic list of services:

  • Registration – including email confirmation and Captcha
  • Login – including “remember password” and “forgot your password” options
  • External contact import and management – for viral distribution
  • Internal contact management – showing users (popular, recently joined, you may know them).
  • Inter web-site communication means – e.g. messages, wall to wall, IM, notes, comments.
  • Profile editing – a web page for editing basic profile information such as name, address and birth date, uploading a picture and claiming a blog (or web-site) will answer 80% of the requirements. It would be nice if the framework allowed some custom attributes for site-specific information like “favorite quote”.
  • Feeds and subscription, search
  • API – see what I wrote about outsourcing it

So, I hope to see that Microsoft, Google, Sun Microsystems or Adobe will be kind enough to come up with a library including some of the basic modules answering the requirements listed above.

The benefit:

  • It will allow start-up companies to focus on their core offering.
  • It will save development cycles spent solving the same problems again and again.
  • Having a standard solution will save us, the users, from learning a new mechanism with every new web-site.
  • It will provide a more robust implementation because it will use industry-proven and efficient design patterns for solving these common problems.
  • It will help one of the above software vendors lure developers to its technology if they can start from a “higher ground” with a framework like this.

What I have in mind is something like WordPress: a platform that offers both rich enough default implementations and powerful, advanced, fully customized solutions. WordPress also offers both a hosted option and a self-hosted and self-managed option.

I know that there are existing modules today that solve a few of the use cases listed above, but to the best of my knowledge there is no single platform that can jump-start a Web 2.0 offering built today.

Taking the Search personally

Are we going to see a new alternative added to the growing list of online content search solutions (I wrote about some of them here)?

So what do we have so far? Popularity search engine (Google), semantic search engine, location-aware search engine, blog search engine, spam-free blog search engine, social search engine, conversational search engine, visual search engine, passive search engine, social bookmarking search engine. Did I miss any?

The new alternative is a profile-based search engine. I’m not talking about Google Web History, the personalized search solution that uses historical searches to better qualify search results. It is a great idea, though. I’m talking about using information about me, from my profile, to better fit results to my search query. For instance, if I’m a programmer searching for the keyword Hibernate, I don’t want to see results about bears. If I’m a doctor querying about Viagra, I don’t want to get all the spam in the world from now till eternity. If I’m 20 years old, my world is different from the life of anyone else (if you call it a life after 30:)). Now, these examples use static profile attributes, yet there are many dynamic qualifiers that can be used to improve search results. One has already been implemented in location-aware search engines, which use your whereabouts. Another is your network dynamics, which is what Delver is trying to use.
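A very rough sketch of the idea: take results that some engine has already ranked and move the ones that share words with my profile to the top. This is my own toy illustration, assuming results arrive as (title, snippet) pairs and the profile is reduced to a handful of keywords; a real engine would obviously need far more than word overlap.

```python
def rerank(results, profile_terms):
    """Order results so those sharing more terms with the profile come first.

    results: list of (title, snippet) tuples, already ranked by some engine.
    profile_terms: iterable of keywords taken from the user's profile.
    """
    terms = set(term.lower() for term in profile_terms)

    def overlap(result):
        title, snippet = result
        words = set((title + " " + snippet).lower().split())
        return len(words & terms)

    # sorted() is stable, so results with equal overlap keep the engine's order.
    return sorted(results, key=overlap, reverse=True)
```

With a programmer’s profile, a Hibernate result that mentions Java or persistence would move ahead of one about bears in winter.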

It was the Web 2.0 applications, with Facebook as one of the leading examples, that helped us realize that one’s profile is not just name, email and contact. Your network (friends and groups), media choices (pictures, movies, books, and games), and activities (feed, expanding your network, and participation) are also part of who you are. So, why not use them to help us find relevant content?

Google Engine Vs. Amazon EC2 – using Google Trends

Will India help in Google Engine adoption? I would suggest taking it seriously. There are many eyes in Bangalore watching…

[Chart: GoogleEngineVsAmazonEC2-GoogleTrend]

By the way, thank you Google for a great new tool.

Using SUMMIZE part I – real-time snapshot of culture and news (July 11th, 2008)

Summize is a “conversational search” tool pulling updates from Twitter. It allows searching (near) real-time through Twitter entries.

The home page is a Google-style page with a single edit box for entering the search term. Minimizing the functionality presented on the first page is very common these days, but I’m not sure that it serves this powerful tool well. In addition there is a list of Trending Topics showing very smartly selected items that are frequently mentioned.

Trending topics: iPhone, iTunes, Hellboy, App Store, Slurpee, Hancock, Loopt, IndyMac, GPS, Twittelator

The list above was taken on July 11th, 2008, the day Apple released its iPhone 3G, Dark Horse Entertainment released its Hellboy II movie and Seven Eleven served free Slurpees (7/11). We can also learn that the movie Hancock is still on people’s minds or watch lists (released on July 2nd). Is Loopt the next big thing? IndyMac bank crashed – the financial sector is still in big trouble (7/11 was a bad stock market day).

Now, a real-time tag cloud generating tool like Twitscoop should find these too, but either I did not see it or it just changes too fast.

This is what I consider a great summary of that day’s agenda. Don’t you think?

It is very American oriented, I know, yet some of it has relevance to the rest of the world too.

A few comments on the home page minimalism:

  • I think that some of the Advanced Search capabilities should be brought forward. I really like the Attitudes section.
  • Maybe by showing examples for some relevant queries they could help people see how to use this important tool.
  • Trending links could be cool too – I guess that this could be a challenge due to the use of multiple URL shortening services.

In the next post I will try looking beyond single “subject” search. Possibly looking at multiple subjects and looking for the associated “verb” too. Maybe there is a chance to learn more about the crowd, using Summize. Can you see how the business world can benefit from using Summize?

I will leave you with two Summize queries about the iPhone 3G for July 11th: positive and negative.

I’m not sure what can be learned from it but it is fascinating to see how a digital gadget is able to draw so much excitement, disappointment, frustration and appreciation. Summize provides a nice way for seeing it.

FriendFeed exposes the need for FeedFriend

July 8, 2008 1 comment

FriendFeed is an application for sharing much of your online activity and for following others and their feeds. As of today you can add up to 41 services, collecting the feed from each into your feed stream. The value of this is still in question, but that is not the subject of this post.

Now, one can assume that if you share something in one place and many of your followers can see it, this should be enough to market your blog, but apparently it is not. One can assume that this could be a great time saver for a blogger marketing his recent blog post by cutting down the leg work going around and posting the blog post link in multiple places (some of the 41 options).

The reality shows the opposite. I still see some of this service’s heavy users going around adding the links everywhere, including the same marketing shpiel. How do I see this? Ironically, by using FriendFeed. Because I follow them and run Twhirl on my desktop, I keep seeing the same light-weight pop-up windows with the notification about those users’ activity, and only the source is different.

I assume that the reason for that “busy work” is the fact that much of the target audience is not yet on FriendFeed, so these bloggers can’t rely on FriendFeed alone to carry the message around.

So, I see an option to change the direction of the feed as well, i.e. FeedFriend: a publishing service that can add the blog post link with a message simultaneously to multiple services (41 is a good number) such as Digg, del.icio.us, StumbleUpon, Twitter, Jaiku, reddit and more.
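In code, FeedFriend is little more than a fan-out loop. The sketch below is hypothetical from top to bottom: each publisher is just a function that takes a link and a message, and a real one would have to call the corresponding service’s API, which I am not assuming anything about here.

```python
from typing import Callable, Dict, List, Tuple

# A publisher takes (link, message) and posts it somewhere.
Publisher = Callable[[str, str], None]


def publish_everywhere(
    link: str, message: str, publishers: Dict[str, Publisher]
) -> Tuple[List[str], List[str]]:
    """Send one link and message to every registered service.

    Returns two lists of service names: those that succeeded and those that failed.
    """
    succeeded, failed = [], []
    for name, publish in publishers.items():
        try:
            publish(link, message)
            succeeded.append(name)
        except Exception:
            # One failing service should not stop the others.
            failed.append(name)
    return succeeded, failed
```

The blogger writes the message once, and the list of failures tells him where he still has to do the leg work by hand.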

It would be helpful too if FriendFeed or Twhirl filtered duplicate notifications, so that I would not see the same blogger market the same post once for each social network, bookmarking or status service.
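The filtering itself could be as simple as remembering which (author, link) pairs have already been shown. This is a sketch under my own assumptions about what a notification looks like (a dict with "author", "link" and "source" keys); I don’t know how FriendFeed or Twhirl represent them internally.

```python
def filter_duplicates(notifications):
    """Keep the first notification per (author, link) pair and drop later repeats.

    notifications: iterable of dicts with "author", "link" and "source" keys.
    """
    seen = set()
    unique = []
    for note in notifications:
        # The source is deliberately left out of the key: the same post
        # announced through Digg and through Twitter counts as one.
        key = (note["author"], note["link"])
        if key not in seen:
            seen.add(key)
            unique.append(note)
    return unique
```

Shortened links would still slip through as separate posts unless they are expanded first, which is the same URL shortening challenge all over again.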

Your thoughts….
