Posts Tagged ‘search engine’

The web is like an onion – the layer effect!

August 20, 2010 1 comment

The web is like an onion, built from the inside out, adding more and more layers in the endless race for web traffic and ad revenue.

  • One buys a domain and somebody creates a short link
  • One builds a website and somebody adds a dashboard
  • One wants to be found and many create search engines plus ads
  • One starts a blog and somebody creates an ad network
  • One builds a mobile app and somebody creates an app ad network
  • One wants to be present and connected and many build social networks

The fight over content has many fronts: JavaScript tags, short-link databases, smart devices, apps, search engines, your “friends”. I’m sure there are other fronts that we are yet to see, maybe because we are still inside the onion.

Picture credit: Darwin Bell

Do you think that you can live without Google?

March 25, 2009 1 comment

Here is my latest guest post on the AltSearchEngines blog.

Google’s search engine is the infrastructure of the 21st century.

A quick summary:

  • Search is a very large task
  • Search is costly
  • Search has become essential to the modern economy
  • Google is effective but it is a monopoly

It is similar to infrastructure on a large scale like roads, train tracks, ports, and utilities – all things that are essential to the smooth running of our economy.

Today it is so mission critical that we need to watch it closely or maybe even break it up.


Eight good reasons for using headup (Firefox add-on)

January 25, 2009 6 comments

Headup – the Semantic Web Firefox add-on

I recently started using Headup. I’ve been looking for this kind of add-on for some time now. When bits of information are missing from people’s profile pages, product specs, media, and other online content, it is crucial to combine multiple data sources to piece together a complete picture. Headup does this!

Using its smart semantic mapping of entities and relationships Headup gathers and links information from multiple online sources. To complete the picture it then personalizes the results using your presence on multiple web services like Gmail, Twitter, Facebook, Digg, etc.
Headup is not only innovative in its semantic approach to linking data, it also integrates nicely with your Firefox browser and offers you a few ways to access the data it discovers. One example is Google searches: after installing Headup you can expect to see your search term annotated “Headup:[search term]” with a thin orange underline at the top of Google’s results. When your mouse hovers over the term, a clickable circular plus-sign loader will allow you to open Headup’s overlay interface.


The starting point – googling eagle eye.


The complete picture – headup-ing eagle eye

I recommend you visit Headup‘s website to learn how to use it, but as a whole it’s pretty intuitive, and I prefer dedicating this post to the reasons you should get it:

My eight reasons for using the Headup Semantic Web Firefox add-on:

  1. Because hyperlinks simply aren’t enough – Relying merely on arbitrarily selected outbound links that send you off to find info related to the page you are browsing is limiting. There are more relationships among the different entities on the page that could be leveraged to retrieve associated information. Headup has already mapped out these semantic links and makes them available to you in a neat and accessible interface. The experience doesn’t end with search results.
  2. Because you can save valuable search time – Both the user interface and the way information is presented require fewer clicks to complete an in-depth search through multiple search sources.
  3. Because the information comes to you – Search can be an exhausting task. In many cases it involves either recursively drilling down into multiple levels, or traversing the search vertical up and down for additional information. Google itself is aware of this potentially laborious process and is making an effort to bring associated information to the first SERP: recently, when I googled the term “movie”, I got three results that were movies playing in theaters in my area. Headup provides multiple data types by default: running Headup on “Pink Floyd” will get you a summary of the term, the band’s albums, photos depicting it, the band’s songs with their lyrics to read along, related news, blogs, and web activities, and much more.
  4. Because it brings down the chances you’ll miss key information – “Headuping” people is a terrific way to learn more about them. I “Headup-ed” my friend Bill Cammack on Facebook and immediately discovered that he’s a video editor with an Emmy award to his name. In this case the extra information regarding the Emmy award was brought in from Bill’s LinkedIn profile.
  5. Because you can learn and find information you didn’t expect – If the example from my previous item wasn’t proof enough, here’s another example: I ran Headup on “Kill Bill” (what can I say? – I’m a Tarantino fan) and discovered this blog post published today (1-2-2009): “More Kill Bill on the way” – Tell me this isn’t cool!!
  6. Because it’s personalized – When configuring Headup after download, or later via the “Settings” option, you can choose to connect Headup to the online services you are subscribed to. Headup connects to a wide variety of web services like Gmail, Delicious, Twitter, Facebook, FriendFeed, Digg, etc. The information Headup retrieves from these services allows it to personalize the info it discovers for you: if you Headup a firm you’ll see friends of yours who work there. If you Headup a band you’ll see who in your network likes them. This is another example of how Headup is not just a search tool but a browsing experience.
  7. Because you don’t lose your starting point – Headup is designed as an overlay window that keeps your starting web page, and anything else you have open on your desktop, visible beneath the interface’s Silverlight frame. Inside Headup you can drill down endlessly, but when you’re done you are back where you started.
  8. Because your information is safe – from Headup’s Privacy Policy – “In plain English”:
“We here at Headup treasure our privacy and that’s exactly why we made every effort to create a browser add-on that would live up to user privacy standards we would be comfortable with. We’d be embarrassed to let you download an add-on we wouldn’t download ourselves.”
**You don’t need to sign up to use Headup, and your information is stored on your machine only**

**Bonus: one additional reason – because on some pages it ROCKS! Try it on and you’ll see why it ROCKS…literally! By the way, the Headup user interface lets you watch videos and listen to music like a regular media player.**

My questions for the Headup team

I plan on occasionally checking Headup’s blog for updates. At this point Headup supports Firefox on Windows and on Macs but I know that they plan to support more browsers in the future. I think that at this point the key thing to focus on is that the Headup concept works.

I do have a few questions for the Headup team:

  1. Do you plan on adding vertically derived classifications? I can see some use cases for health (and maybe even for software development). Just as Headup was able to map out “Actors”, “Films by the same director”, “Web Activities”, “Related News”, “Trailers”, etc. for a “Film” type entity, I can see it applied in a similar fashion to a “Health” type entity – retrieving things like “Cases”, “Treatments”, “Clinics”, “Pharmaceuticals”, “News Groups”, etc…
  2. Do you see enterprise usage for Headup? I still need to give it more thought but having Headup in my email could be cool. Another possible implementation is supporting corporate CMS tools.

Epilogue – Is Headup’s “Top Down” approach the face of the future Semantic Web?

The Semantic Web promises to make information understandable by machines. If you follow Alex Iskold‘s excellent series on the Semantic Web on ReadWriteWeb, you are aware of the multiple approaches to making this happen. The top-down method implemented by Headup helps bring the future to us a little sooner. I think Headup is giving us a taste of what future browsers will look like in an age when they, and other tools, will be able to understand more than just hyperlinks. When using Headup it feels like I’m doing more than “browsing” or “searching” – I feel like I’m experiencing a new web!

One last thing: using Headup on some objects didn’t yield complete results. Don’t judge them too harshly for it; instead, please focus on the concept. My experience with Headup so far is that in most cases the relevancy of the information provided was more than reasonable. I think that for a small company just out of alpha, what has been accomplished in the short time the company has existed is impressive and promises that improvements will come fast.

I’m using Headup and gave you the eight reasons I have for doing so. If you are using it too, I’d be happy to hear why…

Webnomena – eight prominent Web 2.0 phenomena

October 13, 2008 1 comment

This blog’s name is Webnomena – web + phenomena (plural of phenomenon) – and this is where I share my observations living the web. This post sums up ten months of observations and blogging. I’m not sure what lies ahead, but at times like these we can count on there being some changes. Some of these phenomena will vanish and new ones will appear. So, before moving on to the very next thing (or maybe backward), I chose some of the most prominent phenomena signifying the Web 2.0 era.


Many to many communication – the old media consisted of only a few broadcasting to the masses. It was distant and regulated. You could, with some effort, comment back on a news article, yet there was very little chance that your comment would appear in the following edition. Your ability to interact with the newspaperman was limited, if it existed at all. Social media changed all that. You can be the news. You can comment almost anywhere, instantly. You can interact with news generators. You can follow and listen to the same source on multiple channels (blog RSS feed, micro-blog, media streaming (pictures, video, and audio – BlogTalkRadio), and comments). It is also easy to find out what is going on right now, i.e. Web-now: reading the news and the meta-news in real time, at the same time. News becomes more conversational, bi-directional. This kind of media is impossible to recruit or regulate – this change comes with both the good and the bad.


Google Trends for the term social media

A search engine is not just a search engine anymore – it is a spell checker, idiom checker, translator, maps and directions, alerts, a knowledge base, a price-comparison tool, a research tool, a people, business, and relationship finder, a content organizer, a content-visualizing tool, a meaning extractor, an entertainment and event planner; it is task-based, supports micro-formats for integration with other tools, offers an API, and places ads by context. And above all it can be a money-making machine!! So much data, and so much more metadata. I’m not sure what to call it anymore, but if you want to take on Google it is hard to imagine someone winning by just providing a new, magnificent search engine.

Selfless sharing as a strategy – sharing without holding back is the heart of social media. You stay ahead, read a lot of blogs, learn and experience new tools, technologies, and means of communication, monitor changes across the web, and then share your information->knowledge->expertise->thoughts->feelings->goals with the community.


Google Trends for the term blog

Digital autobiography (digital life stream) – my, my, my: activity feed, timeline, location, preferences. The meaning of the term profile in the old days was your slowly changing private attributes like address, age, and marital status. Now, it means a whole lot more. In the social network world the term profile is dynamic and includes your activity, your friends, your friends’ friends, your relationships, your chosen tools and games, and more. There are new ways available for finding influencers and new marketing campaign strategies.

Ultimate empowerment – Anyone can do it and we all have access. Build your own [search, content, network, dashboard, scalable web-app (cloud computing)]. The new web world is more like a cafeteria than a restaurant. It is up to you to go and get/build what you need. Do you want that power or do you want to be served? Not everyone is excited about being empowered, not everyone can be, not everyone will thrive in this kind of environment! Also, cafeteria food is not always great. This is why it is cheaper:)

Graph awareness – the graph is no longer just a data structure that only computer geeks care about. People are aware of their network(s). The graph is now a strategy; it provides both power and knowledge. It started with LinkedIn, then Facebook, and continued with Twitter and Jaiku. It is a way to filter out poor content and a means for finding great content. Some social graph phenomena are: crowdsourcing, influence, cliques, and more. One big question is still unanswered: who owns the graph?

Digital community work – the new social media helps to overcome fear of strangers. In a way it forces you to interact. It is almost impossible to make it alone! Your success very much depends on your web relationships. What is the optimal allocation of investment between writing, SEO tuning, and building a great, supportive community? This is where blog networks make their case.

Online without offline – it is possible to make it without leaving your computer!? No conferences, no meet-ups, no leg work. Also, what should come first, or what is more important: the offline or the online activity? It is not clear anymore, even if you do both. Offline: you do something and then you go tell your online friends. Online: you create opportunities to meet friends offline around certain online subjects. There are so many ways to build your web presence – “I blog, therefore I am”. Besides blogging there are numerous self-marketing and self-branding tools out there. FriendFeed is one example. A sub-phenomenon of online presence is the striving for self-scalability: dealing with inbox zero, growing social commitments, knowing the right time allocations between web tasks (research, reporting, and community building), avoiding distractions, learning to say NO. The alternative is to work yourself to death.

Other sub-phenomena:

Attention starvation – too many puppies fighting for traffic. This leads to poor comment strategies: the first looks something like “great post..”, the second is comment trolling, and the third is building bots for spam comments and splogs. The real dark side is viruses and worms. The gray side is link bait and viral marketing.

Proven scalability patterns – incorporating some or all of the scalability techniques, like decentralization, virtualization, asynchronous operation (push vs. pull), keeping building blocks small, and isolation (of responsibilities and resources), is essential for success at web scale. We did see some examples of products with significant growing pains. We saw some great developments in this area too, like ping services, RSS, and recently what Gnip is now building.

Mashup – companies open their APIs as a growth strategy. The value for the community: the sum is greater than its parts. Examples: Twhirl, FriendFeed.

There are more web phenomena in relation to social media that I did not cover, like the new developments in the location-aware mobile web; others are obsessive rating, widgets, and viral marketing. I’m sure that there are more, and this is great news for my Webnomena blog:)

backtype provides new ways for serving the blogosphere

September 26, 2008 9 comments

Google Alerts sent me an email today with information about another reference to my blog name, Webnomena. This time it came from a service called Backtype. I had to follow the link! Every now and then I come across a new tool that fills my mind with numerous options and ideas. Twitter and Twitter Search were two of those, MeeID is another, and now backtype is causing the same effect. Smart, flexible, simple, and useful.

In short, this is a service that crawls and collects comments from blogs and then organizes them for you, using the comment’s URL. If the URL is your blog, then all your comments from around the web are now in one place. More than that, you can see other bloggers and their comments too. The key feature is that you can follow other people’s comments (Twitter style). Brian Solis from PR 2.0 wrote a great post about the true value of backtype in “BackType Unearths Blog Comments to Identify Relevant Conversations”.

There are more features that this service offers. The backtype blog also tells us that the founders, Mike Montano and Christopher Golda, are hard at work adding more cool stuff.


Yet, in this post I want to focus on how backtype supports the blogosphere.

The tasks that backtype can help us with:

  • Finding what the professional bloggers are reading and caring about – for instance here is where Om Malik hangs out.
  • Improving comments’ quality – maybe we will see fewer instances of “great post, now please come and visit my blog”
  • Finding implicit connections to complete one’s social graph picture. A comment is a one-way connection between the reader and the blogger. If the blogger responded, or left a comment on the reader’s blog, then the connection is now bi-directional. They may not be linked as friends, fans, or followers though. Social graph search engines like Delver and Nsyght can use this information for adding more connections (from like-minded people) and enriching their search content pool.
  • Complementing the Web-Conversing-Now dashboard – joining Twitter Search’s Trending Topics in telling us what is hot now.
  • Finding smart people – I noticed some cases where a comment was better than the target post.
  • PR – using micro-sites like backtype to build a great web presence that helps your business – Danny Brown provides insights on this new trend in his recent blog post, Are Micro Sites the Next Wave of Business Promotion?
  • Listening – having more ears on the web in addition to Google Alerts, Twitter Search RSS feeds, and more.
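The bi-directional connection idea above can be sketched in a few lines. This is a toy illustration (in Python, with entirely hypothetical data and names), not backtype’s actual method: treat each comment as a directed edge and look for reciprocal pairs.

```python
# Hypothetical comment records: each edge is (commenter, blog_author).
comment_edges = {
    ("alice", "bob"),   # alice commented on bob's blog
    ("bob", "alice"),   # bob commented back on alice's blog
    ("carol", "bob"),   # one-way: bob never commented on carol's blog
}

# A connection is bi-directional when the reverse edge also exists.
mutual = {tuple(sorted(edge)) for edge in comment_edges
          if (edge[1], edge[0]) in comment_edges}

print(mutual)  # {('alice', 'bob')}
```

A social graph search engine could feed exactly these reciprocal pairs into its connection pool, while keeping the one-way edges as weaker signals.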

Brian Solis wrote:

The process of listening isn’t only relegated to the research and analysis of individual reputations. Listening is also instrumental in the creation of new communications and service initiatives as well unearthing the specific conversations that matter to your brand – for gathering data and also discovering opportunities to respond

I wonder if backtype has a plan to open its API for building new services around it!?

How do you plan to use this service for serving your objectives?

Reversing the data pyramid – raising metadata awareness

August 6, 2008 Leave a comment

Are we overloading the system (web) with metadata?

Data explosion

First we digitized our knowledge and businesses, then our conversations (the one-on-one IM model expanded into the micro-blogging broadcast model that is now searchable); lately we digitized our relationships and preferences using social networks. Yet the most prominent change, one that is hard not to notice on the web, is the explosion of metadata: the data about the data.

  • Links to pictures, videos, blog posts and more are tagged, annotated and rated.
  • Conversations are starred (favorited) and #hashtagged.
  • Blog info contains author, Tags, Tag Cloud, rank, authority and more.
  • Public bookmarks and news items allow searching for the person who saved/submitted/dugg/rated them.
  • Profile data has expanded beyond static information like name, address, and DOB to include experiences.
  • Each one of us now owns a News feed generated by our mini network’s activities.
  • The FOAF and XFN protocols add metadata programmatically about relationships between individuals.
  • More relationship- and meaning-defining protocols are being developed and adopted by semantic search engines and social networks. You can find a lot of information today about the data.

Almost any object on the web today is wrapped with meta information about it.

An interesting question is: what is the current metadata/data ratio, and how quickly does it grow? But the most important question, in my opinion, is: will it help us find and organize what we need, when we need it? Will we see an inversion of the current data pyramid, or will it only bury important information under too much data about the data?

Data reduction

This extra data that we, and many new applications, generate today promises to organize the data better for our needs. Yet, at the same time, it overwhelmingly increases the amount of information out there.

So my question is: does it also increase the burden of searching, processing, organizing, filtering, and presenting valuable information for us, or does it make those tasks more efficient?

Taking the Search personally

Are we going to see a new alternative added to the growing list of online content search solutions (I wrote about some of them here)?

So what do we have so far? popularity search engine (Google), semantic search engine, location aware search engine, blog search engine, spam free blog search engine, social search engine, conversational search engine, visual search engine, passive search engine, social bookmarking search engine. Did I miss any?

The new alternative is a profile-based search engine. I’m not talking about Google Web History, a personalized-search solution that uses your historical searches to better qualify search results – a great idea, though. I’m talking about using information about me, from my profile, to better fit results to my search query. For instance, if I’m a programmer searching with the keyword Hibernate, I don’t want to see results about bears. If I’m a doctor querying about Viagra, I don’t want to get all the spam in the world from now till eternity. If I’m 20 years old my world is different from the life of anyone else (if you call it a life after 30:)). Now, these examples use static profile attributes, yet there are many dynamic qualifiers that can be used to improve search results. One example was implemented already in location-aware search engines using your whereabouts. Another is using your network dynamics, like what Delver is trying to do.
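To make the Hibernate-vs-bears idea concrete, here is a toy sketch of profile-based re-ranking. Everything here – the function, the result data, the profile terms – is hypothetical, not any real engine’s API: results whose text overlaps with the searcher’s profile terms simply float to the top.

```python
# Toy profile-based re-ranking: boost results whose text overlaps
# with terms taken from the searcher's profile. All data is made up.
def rerank(results, profile_terms):
    """results: list of (title, snippet); profile_terms: words describing the user."""
    def score(item):
        text = " ".join(item).lower()
        return sum(term.lower() in text for term in profile_terms)
    return sorted(results, key=score, reverse=True)

results = [
    ("Hibernate habits of black bears", "How bears prepare for winter sleep"),
    ("Hibernate ORM getting started", "Map Java classes to database tables"),
]

# A programmer's profile pushes the ORM result to the top:
print(rerank(results, ["programmer", "Java", "database"])[0][0])
# -> Hibernate ORM getting started
```

A real engine would of course use ranking signals far richer than substring overlap, but the shape of the idea is the same: the query stays fixed and the profile re-orders the results.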

It was the Web 2.0 applications, with Facebook as one of their leading examples, that helped us realize that one’s profile is not just a name, an email, and a contact. Your network (friends and groups), media choices (pictures, movies, books, and games), and activities (feed, expanding your network, and participation) are also part of who you are. So, why not use them to help us find relevant content?

Is there a way around Google?

June 20, 2008 1 comment

In this blog post I will cover the different ways that entrepreneurs building search engine solutions are positioned to take on the Google challenge. As you’ll soon see, there are many fronts to this battle.

The Challenge

I have to admit that now, when I look for something, I go to Google first. My homepage is Google, but mostly I use the Google toolbar. Google is there whenever I use the web from the browser. Too close. This is my “active search”. In 90% or more of cases I’m satisfied with Google’s results for my “active search”. Most of the time I find what I need on the very first results page. I use Google as a spell checker, idiom checker, map, and business and people finder. I use it when I’m looking for images, videos, stock tickers (then I go to Yahoo Finance), and more.

When I “listen” to the web I have more options. I do use Google Alerts but not exclusively.  I have Alerts set for some keywords, yes, including my name, blog name and URL but not just those. I have more “ears” on the web like FriendFeed, Twitter, many RSS feeds, Newsletters and more. This is my “passive search“.

So, how can you lure people to use your search solution (so you can sell ads and make money)?

What are the additional dimensions/fronts to this battle over content discovery?

Let’s first look at some of the existing fronts:

  • Active search
    • Microsoft – new IE installs default to Live Search as the homepage – failed (around 9%). Cashback?
    • Yahoo – if they were better they would not have had to strike a deal with Google, but I think that this is as good as it gets (around 20%).
  • Building a developer community – Yahoo – the new SearchMonkey initiative is empowering and will probably help Yahoo push their search as a service while building a developer community. Some claim that they need to open it even more.
  • Social bookmarking – not bad, but not so great either. It seems to work better when the search part is a by-product of the social network’s content organization rather than the original objective. Tags can help in finding information with some crowd wisdom. They can shorten search time too. Yet it seems better for some categories than others (it is great for developers).
  • Sophisticated search algorithms – if aimed at “active search”, then I’m not sure. I get what I want from Google 90% of the time. I may go there after giving up on Google, but not as my first choice. Maybe after it proves better many times I would consider it, but I have not seen one that good yet. Have you? I have not played yet with Powerset, which uses natural language processing to understand meaning, so there is hope.
  • Better UI for the search results – again, if aimed at “active search”, then this is not the way to take on Google.
  • Blog search engines – aiming at the long tail using an advertising network seems to be the next attempt taken by Technorati. Probably serving the blogger community and presenting what is hot now on the service’s web site was not the answer (as a business model). Apparently, most people don’t see the difference between a blog and a web site when they go looking for information (or even after finding it in a blog). So they Google (in most cases not using Google’s blog search engine). I want to see what the creative people at Twingly will come up with. So far, it looks great. What I see on this front is more opportunity, because it is not bound to textual search only. Bloggers, blogs, and posts have relationships, i.e. metadata that has value beyond the content. Blogging is discussing, teaching, preaching, mentoring, provoking, guru-ing, sharing. Systems that know how to capture this metadata, store it properly, and leverage it will have a chance to create an advantage over Google in matching ads to blog readers.
  • Vertical search – we need to see how this goes but based on this blog post talking about GenieKnows it has some momentum.
  • Passive search – using alerts, feeds (using Pipes), and filters like Filtrbox – this can keep me away from Google for a while, until I need an ad hoc search. The same goes for posting questions, but I found that it depends on the community. Finding what you want can be slow, but for tough questions this can be a great option. Yedda is one example.
  • Search results aggregation with a social touch – I like Xoost‘s premise that some searchers are better than others and that sharing searches could save time for everybody. The tool searches across multiple search engines – today: Google, Yahoo, and Microsoft Live Search. I’m still trying to figure out all the features in this service (and there are lots of them), but this seems to be the beginning of a long-term relationship. Look at some of the feedback after you join in; you can learn a lot from it. One warning: the pages are overloaded with options, and that makes it hard for first-timers to know where to start, so give it some time. Once you get used to it, the Channels tie it all together nicely.

Other fronts not covered here:

  • Mobile search engines – I have yet to explore this domain. I know that Google is investing a lot of effort in this area. Powerset also did something here, yet it is bound only to Wikipedia for now.
  • Social search engines – there are so many of them (more than 40) – do you see a leader?
  • Search focused on products’ discussions – like Omgili.
  • Enterprise CMS search engines – not my cup of tea, and actually Google is getting there, replacing old solutions.

Additional fronts to consider:

  • Distance – get in between me and Google. When I launch the browser it is already too late for a different search engine today. A few options:
    • Use Adobe Air for building a light desktop app (that looks like a web page).
    • Integrate with Twhirl or other “distributed discussion” desktop client.
    • Sneak in through my blog – like what Zemanta is doing (in a way this is both active and passive).
  • Don’t let me leave my “passive search” page/tool/widget. A few options:
    • Maybe by going social.
    • Maybe building a widget that turns my “active” searches into “passive” ones (no matter what engine I use).

Do you see more fronts on which to take on or bypass Google?

Is it a search and destroy….capital mission?

More on short URLs – and are we going to see a new search engine developed by Short URL Redirection Services?

June 13, 2008 11 comments


I’m still intrigued by this subject and can’t stop coming up with more questions and thoughts, even after what I wrote in my previous post.

I took my blog URL and looked at the results from a few URL-shortening services.

Original URL:

Here are the results:

  • not so useful when you get a longer URL than the original!!
  • the most popular
  • this one is interesting – does it give you some control over the hash key?
  • used by Twhirl
  • the shortest available today

As you can see, the length of the domain name is key to a short URL. Most services use around 1–5 characters (A–Z and 0–9) to hash the long URLs (which should be enough for a while).
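To see why a handful of base-36 characters goes a long way, here is a quick back-of-the-envelope calculation:

```python
# 26 letters + 10 digits = 36 symbols per position, so an N-character
# key addresses 36**N distinct URLs.
for length in range(1, 6):
    print(length, 36 ** length)

# Keys of up to 5 characters combined:
total = sum(36 ** n for n in range(1, 6))
print(total)  # 62193780 -- over 62 million short URLs
```

So even before a service resorts to longer keys, five case-insensitive alphanumeric characters cover tens of millions of links.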

Google could use their recently acquired, super-short domain name for shortening URLs:

They are using it to redirect to the Chinese localized version of their search page. I can only guess that it makes more money than a tiny-links service would.

There is also the shortest domain that I know about: but it does not beat  – how did they get this domain registered?

If you are looking for information about short domain names, look at this very interesting post.

Here is a list of Free Short URL Redirection Services.

Another point: if I’m right, the hyphen (-) is also a valid character in domain names. I’m not sure, but I think a name cannot start or end with the dash. Why don’t these services use this extra character (37 symbols are better than 36)? Is it too complicated because of the start/end constraints? Is it not worth it? I’m just curious.
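A quick sketch of both points – the start/end constraint (which matches the classic hostname-label rule of 26 letters + 10 digits, hyphen allowed only in the middle) and the size of the extra key space a 37th symbol would buy:

```python
import re

# Classic hostname-label rule: hyphens are allowed, but not as the
# first or last character. A shortener adopting '-' as a 37th symbol
# would have to enforce this on every generated key.
VALID_KEY = re.compile(r'^[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$')

print(bool(VALID_KEY.match("ab-3c")))  # True:  hyphen in the middle is fine
print(bool(VALID_KEY.match("-ab3c")))  # False: cannot start with a hyphen
print(bool(VALID_KEY.match("ab3c-")))  # False: cannot end with one

# The raw gain for 5-character keys, before filtering out invalid ones:
print(37 ** 5 - 36 ** 5)  # 8877781 extra combinations
```

The extra few million combinations probably do not justify the edge-case handling, which may well be the answer to the question above.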

Final thoughts: do you think that what is inside each of these services’ databases is more valuable than what a crawler can come up with? Think about it:

  • These URLs are picked by humans (lots of them on Twitter and Plurk).
  • They can keep statistics for how many times each URL was requested.
  • They can build a search engine using these links without the need for building a sophisticated crawler for discovering new URLs.
  • They can see what is interlinked inside this tiny linksphere.
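The statistics point above is nearly free for a shortener to implement: every redirect request reveals which short code (and therefore which target URL) is hot. A minimal sketch, with hypothetical names and data:

```python
from collections import Counter

# Every redirect request tells the service which short code is hot.
hits = Counter()

def redirect(short_code, table):
    """Resolve a short code and record the request (table is hypothetical)."""
    hits[short_code] += 1
    return table.get(short_code)

table = {"a1": "", "b2": ""}
for code in ["a1", "b2", "a1", "a1"]:
    redirect(code, table)

print(hits.most_common(1))  # [('a1', 3)]
```

Aggregated over millions of clicks, this counter is exactly the human-curated popularity signal a crawler cannot easily reproduce.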

Am I making a big deal out of a tiny subject:)?

