Archive

Posts Tagged ‘google’

The Ultimate Goal of Digital Branding

One letter….

image

Now that was a surprise!

image

15 simple ways to search Google for an instant answer

Categories: Product, Software Tags:

When it comes to life expectancy the world is not flat yet

July 10, 2010 2 comments

I recently discovered that Google shows Life Expectancy graphs for many countries around the globe.

I assumed there was a big gap in average life expectancy between the developed and the third world countries. The data did confirm my assumption, but it became far more apparent when I actually saw it in a Google graph.

In Japan, the country with the highest average longevity in the world according to the World Bank's World Development Indicators data, people live to 83 on average.

On the other end of the world, in Afghanistan, the average life expectancy is only 44 years.

ALMOST HALF!

Life expectancy - ranges

It is also very interesting to see the growth rate. In Japan LE grew from 68 to 83 during the years 1960-2008 (~22%), whereas in Afghanistan LE grew from 31 to 44 (~42%) during the same period. Yet, as you can see from the graph above, it is harder to add more years as the average grows.

It is important to monitor the growth rate for each country, as an indicator of improving health conditions in each region.
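As a quick check of the growth figures above, here is a minimal Python sketch. The start and end values are the ones quoted in this post; the function name `percent_growth` is my own and only for illustration.

```python
def percent_growth(start, end):
    """Relative growth from start to end, as a percentage of start."""
    return (end - start) / start * 100

# Life expectancy in years, 1960 and 2008, as quoted in the post.
japan = percent_growth(68, 83)
afghanistan = percent_growth(31, 44)

print(f"Japan:       +{83 - 68} years, {japan:.1f}% growth")
print(f"Afghanistan: +{44 - 31} years, {afghanistan:.1f}% growth")
```

Note that the absolute gains are close (15 and 13 years), while the relative growth differs a lot because Afghanistan started from a much lower base.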

Here is another picture showing more countries and their corresponding Life Expectancy graphs:

Life expectancy - all

Here you can see huge growth for China during the 1960s and, sadly, a huge drop for some troubled areas in Africa, mainly due to HIV/AIDS infections. – some hope here.

Finally, average life expectancy does not seem to be fully correlated with financial success, as you can see in the next picture for Iceland and Greece, two recently troubled economies.

Life expectancy - Economy

Probably, beyond the initial crucial conditions, other factors such as work-life balance, the health care system, crime rate, dining habits and others contribute to the health of the entire population and increase the average life expectancy.

Another source of Life Expectancy data is Wikipedia – List of countries by life expectancy. This page shows data for the years 2005-2010, and the country with the lowest average LE is Swaziland with 39.6 years.

The world is getting flatter over time, but there are still huge gaps between different regions of the world because basic human needs are not being met.

Google’s search engine is the 21st century infrastructure.

June 11, 2010 5 comments

Google’s search engine is the 21st century infrastructure.

Search is infrastructure

When we think about infrastructure on a large scale we think about roads, train tracks, ports, and utilities – all things that are essential to the smooth running of our economy. Online searching has become so essential to our lives today that I think that we should add it to the traditional world infrastructure list.

Building and maintaining a search engine is so expensive and labor intensive that it requires the same kind of planning and upkeep that, say, the Golden Gate Bridge does.

I see two similarities between traditional infrastructure and search engines. The first is that a search engine is a mission critical system. The second is that the cost of building and maintaining a good search engine is enormous, just as the costs are for ports, railroad tracks, and the electrical grid.

Mission critical system

Can you imagine a week without Google? Think for a moment how many times a day you use a search engine for a task. Life would be much harder without it. We use a search engine to find a place, a person or a job. It is the same when looking for information about a disease, a company or a product. Modern search engines also help us find directions, contact info, stock quotes and innumerable other things. I can’t think of a day when I don't use a search engine (mostly Google but others too). Metaphorically, search engines take us from one place to another (like planes, trains and boats), and if they are well designed and maintained they can save us an enormous amount of time and energy. If they are not, they can be a big waste of time!

The mighty task

The web is big and expanding. In February of 2007, the Netcraft Web Server Survey found 108,810,358 distinct websites (not pages). In March of 2009 (only two years later) the number had more than doubled, to 224,749,695. The number of web pages would be a more accurate measure than the number of websites, but I think that the numbers above tell us enough about the size of the web.

New blogs are popping up every day, and blogs can post in some cases multiple times a day. With the recent introduction of microblogging services like Twitter and other personal life streaming tools, content is growing even more rapidly. The information is also dynamic: websites go down and pages are being constantly modified. Blogs allow people to leave comments over time. Content is much more than text and can include video, audio, and images.

A search consists of many steps. It usually starts with crawling – getting the data. This is a mighty task that requires building an army of web crawlers to spider the web. It requires a crawling plan that uses sophisticated algorithms both to look for new content and to keep the stored content up to date. It also demands an immense amount of storage space and heavy computation resources.
The other tasks include indexing, linguistic processing and ranking (for relevance and popularity). (If you are interested in learning how Google scales this process by breaking down tasks even further, read the following blog post about Google Architecture)
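To make the steps above more concrete, here is a toy Python sketch of crawling, indexing and ranking. It is only an illustration under heavy assumptions: the three-page "web" in `PAGES` is made up, nothing is fetched over the network, ranking is a plain word count, and none of this reflects how Google or any real search engine is built.

```python
from collections import defaultdict, deque
import re

# Hypothetical miniature "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("search engines crawl the web", ["b.example", "c.example"]),
    "b.example": ("crawlers store pages and index pages", ["c.example"]),
    "c.example": ("ranking orders results for a search", ["a.example"]),
}

def crawl(seed):
    """Breadth-first crawl from seed; returns {url: text} for reachable pages."""
    seen, queue, fetched = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        fetched[url] = text
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched

def tokenize(text):
    """Very crude linguistic processing: lowercase words only."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(docs):
    """Inverted index: term -> {url: number of occurrences}."""
    index = defaultdict(dict)
    for url, text in docs.items():
        for term in tokenize(text):
            index[term][url] = index[term].get(url, 0) + 1
    return index

def search(index, query):
    """Rank pages by the summed counts of the query terms (ties by URL)."""
    scores = defaultdict(int)
    for term in tokenize(query):
        for url, count in index.get(term, {}).items():
            scores[url] += count
    return sorted(scores, key=lambda u: (-scores[u], u))

docs = crawl("a.example")
index = build_index(docs)
results = search(index, "index pages")
print(results)
```

Even in this toy version every step needs its own storage and computation, which gives a feel for why the same steps become so expensive at web scale.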

It is impossible to compare entirely, but it seems like building and maintaining a large-scale search engine is as hard as building a new power station and probably costs as much too.

Living with Monopoly

The purpose of this section is to get you thinking about my analogy and what it might mean.

The Monopoly question – do we need more than one search engine?

In some ways, a search engine industry might fit the definition of what’s known as a “Natural monopoly” (wikipedia):

  1. “…it is the assertion about an industry, that multiple firms providing a good or service is less efficient (more costly to a nation or economy) than would be the case if a single firm provided a good or service.”
  2. “It is said that this is the result of high fixed costs of entering an industry which causes long run average costs to decline as output expands”

Google could be defined as a natural monopoly.  It now has more than a 70% market share.
The first definition raises the question: why do we need more than one search engine provider? The second could explain why only one provider may survive.

Why don’t we need more than this one?

I’m personally not concerned about Google’s monopoly power to set rates. As a consumer I don’t feel any pricing power:) but maybe the companies that pay for ads do.

I do have a couple of concerns. The first is about the cost to the country and the world of maintaining a search engine, or of duplicating that effort on a large scale.
The second is that because it is such an important and world critical system, more stakeholders around the globe should be paying attention.

High Energy cost

Here is an excerpt from Data Center Energy Forecast – Executive Summary – July 29, 2008.

“As of 2006, the electricity use attributable to the nation’s servers and data centers is estimated at about 61 billion kilowatt-hours (kWh), or 1.5 percent of total U.S. electricity consumption. Between 2000 and 2006 electricity use more than doubled, amounting to about $4.5 billion in electricity costs. This amount was more than the electricity consumed by color televisions in the U.S. It was equivalent to the electricity consumed by 5.8 million average U.S. households (which represent 5% of the U.S. housing stock). And it was similar to the amount of electricity used by the entire U.S. transportation manufacturing industry (including the manufacture of automobiles, aircraft, trucks, and ships)”
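The figures in the excerpt can be related to each other with a little arithmetic. The Python sketch below uses only the numbers quoted above; the derived values (price per kWh, consumption per household, implied U.S. total) are my own back-of-the-envelope results and do not come from the report itself.

```python
# Figures taken from the excerpt above (United States, 2006).
datacenter_kwh = 61e9         # electricity used by servers and data centers
electricity_cost_usd = 4.5e9  # electricity cost attributed to that use
households = 5.8e6            # equivalent number of average U.S. households
share_of_us_total = 0.015     # 1.5 percent of total U.S. consumption

# Derived values (rough estimates, not stated in the report).
price_per_kwh = electricity_cost_usd / datacenter_kwh
kwh_per_household = datacenter_kwh / households
implied_us_total_kwh = datacenter_kwh / share_of_us_total

print(f"Implied price:       ${price_per_kwh:.3f} per kWh")
print(f"Per household:       {kwh_per_household:,.0f} kWh per year")
print(f"Implied U.S. total:  {implied_us_total_kwh / 1e9:,.0f} billion kWh")
```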

Google is making an effort to reduce its data centers’ energy bills. My concern is that having multiple Google-size search engine companies around seems as wasteful as pulling multiple power lines to every home. I also think that the energy consumption should be distributed across the globe, since the search engine serves the entire world and not only one country.

What will happen if Google goes belly up?

I know that this seems radical and almost unimaginable at this point, but what if one day advertisers find another place to buy ad-space other than SERPs? Our lives are so dependent on Internet search technology that if no one can pay for the cost of maintaining one, that would have a direct impact on the world economy.

Maybe we need a different solution?

To reiterate:

  • Search is a very large task
  • Search is costly
  • Search has become essential to the modern economy
  • Google is effective but it is a monopoly

Today it is so mission critical that we need to watch it closely or maybe even break it up.

Regulations

One way to deal with a mission-critical natural monopoly is to turn it into some sort of government-granted monopoly. In this case it is not the government but some sort of world organization that can enforce regulations and demands like:

  • More energy efficient data centers
  • Better storage solutions
  • Crawl to cover more ground – deep web
  • Accounting governance and building cash reserves.

I know that this might sound like a radical idea. Please remember, the purpose of this article is not to support a return to a controlled market but to make us aware of the cost, power and dependencies associated with search engines.

Explore alternative search technologies (similar to exploring alternative energy sources)

In addition to possible regulations, there are other ways to address the functions that a natural monopoly like Google currently serves:

  • Split the search tasks, such as crawling, storage and indexing, and distribute them across multiple vendors.
  • Create better crawling algorithms. Cuil claimed to have found a more efficient and scalable way to crawl the web (it is not about Cuil, it is about the idea).
  • Real-time search (conversational search) – If you believe that real-time search is the future, then you already know that there may be no need to deploy such huge crawling tasks in order to find great content. Let the crowd do the job.
  • p2p – distribute the crawling, indexing, ranking and storage across many search users. This technology mitigates the single point of failure risk and leverages existing unused computational resources.
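As an illustration of the p2p idea in the last bullet, here is a minimal Python sketch of one way crawl work could be split across peers: hash each URL and use the hash to pick a peer. The peer names and URLs are hypothetical, and this is just one simple partitioning scheme, not a description of any existing p2p search engine.

```python
import hashlib

def assign_peer(url, peers):
    """Pick a peer by hashing the URL; the same URL always maps to the same peer."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return peers[int(digest, 16) % len(peers)]

def partition(urls, peers):
    """Split a list of URLs into per-peer work lists."""
    work = {peer: [] for peer in peers}
    for url in urls:
        work[assign_peer(url, peers)].append(url)
    return work

# Hypothetical peers and URLs, for illustration only.
peers = ["peer-1", "peer-2", "peer-3"]
urls = [f"http://site{i}.example/" for i in range(12)]
work = partition(urls, peers)

for peer, assigned in work.items():
    print(peer, len(assigned))
```

One known weakness of plain modulo hashing is that most URLs move to a different peer whenever the number of peers changes; schemes such as consistent hashing are commonly used to limit that reshuffling.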

Summary

The new president of the United States, Barack Obama, is leading his 21st Century New Deal with the hope that big investment in the country’s infrastructure will spur economic growth and prosperity. Online search has become a mission critical task in our lives. It has an impact on the world economy and on energy consumption, and I think that it should not be overlooked. To the traditional infrastructure list of transportation, telecommunication and energy we should add the 21st century infrastructure: the online search engine.
In the same way that nations monitor the condition of their infrastructure, they should be looking at search engine implementations and technologies.

A few points that I would like you to take from this post are:

  • A search engine is more than software
  • The tasks of building and maintaining a new search engine on a large scale have an impact on society
  • Search is a global objective
  • We are heavily dependent on this technology
  • Google is a monopoly – for better or worse.

Do you share my opinion that search engines have an impact on the world economy?
Do you agree with me that Google is a mission critical system today?
Should we be worried if someone might duplicate the task of keeping a large portion of the web crawled, stored and indexed?

**This blog post was published before on AltSearchEngine.com (my guest post) and it is no longer available so I decided to publish it here again.

Picture credit to my favorite artist Ron Shoshani


On a blogging break – playing with Google App Engine

April 7, 2009 1 comment

I took my head out of Twitter and I’m taking a short break from blogging. I’m playing with Google App Engine. So far Google’s documentation is very helpful, so getting started was fairly straightforward.

Here are my ramp-up tasks:

  • Read through the Getting Started section
  • Ramped up on Python – a very cool and easy-to-use scripting language
  • Learned to work with JSON using simplejson – it works nicely with Python
  • I’m now adopting Django, a web framework for Python that is new to me
  • And I’m getting up to speed with a new data storage concept

All are great technologies.
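For readers who have not used JSON from Python, here is a minimal sketch of a round trip. It uses the standard library `json` module as a stand-in, on the assumption that you may not have simplejson installed; simplejson offers the same `dumps` and `loads` functions. The `post` dictionary is made-up sample data.

```python
import json

# Made-up sample data, for illustration only.
post = {"title": "On a blogging break", "tags": ["google", "python"], "comments": 1}

encoded = json.dumps(post, sort_keys=True)  # Python dict -> JSON text
decoded = json.loads(encoded)               # JSON text -> Python dict

print(encoded)
print(decoded["tags"])
```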

I’m also testing the PyDev plugin, which lets me use the Eclipse IDE to develop for Google App Engine – here are the instructions – so far so good.

Useful links:

Google App Engine and misc

Python

If you have additional useful links relevant to the technologies listed above please let me know.

*I plan to update the additional useful sources from time to time as I find more content


Do you think that you can live without Google?

March 25, 2009 1 comment

Here is my latest guest post on the AltSearchEngines blog.

Google’s search engine is the 21st century infrastructure.

A quick summary:

  • Search is a very large task
  • Search is costly
  • Search has become essential to the modern economy
  • Google is effective but it is a monopoly

It is similar to infrastructure on a large scale like roads, train tracks, ports, and utilities – all things that are essential to the smooth running of our economy.

Today it is so mission critical that we need to watch it closely or maybe even break it up.


Twitter killed the RSS reader

March 20, 2009 1 comment

First I stopped using my favorites, then Digg, Delicious and other social rating/bookmarking websites, and now I find myself using my RSS reader (Google Reader or Netvibes) less and less. I find great content on Twitter and Twitter search Trending Topics, and recently even better content using Twitter-based search tools. These are services that mine links from Twitter updates using different algorithms and post them in an organized fashion. I will refer to these, Feedly, MicroPlaza and others, as real-time news search services.

RSS reader limitations:

Limited selection – it takes time to find and build a selection of great blogs. What if a selected blog has not produced any good content lately?

Scalability – it takes time to organize feeds into tabs or folders. Also, some readers grew slower as I added more content (some more than others).

Social rating/bookmarking websites

I do use Delicious for bookmarking great information and sometimes for search, but I rarely visit the Popular Bookmarks page. Submitting content to Digg is too slow, and I think that rating is not as powerful as retweeting.

Email subscription

There are some blogs that I follow constantly and I find the email subscription option to work best. This way I know for sure that I’m not missing new content on a daily basis.

The new feed

I now count on Twitter and a growing number of real-time news search websites to feed my curiosity with links.

Feedly – the irony is that Feedly is actually tapping into your Google Reader feeds and tags, but it also brings in content from other sources, including Twitter. You can even see hot topics via Twitter, i.e. trending tags and hashtags. Read more here

MicroPlaza – this service looks at popular links posted on Twitter by the people I’m following (my timeline view). You can also see popular links posted on the public timeline. There is a new feature called Tribe. It is still in the works, but it allows me to filter and organize popular links by grouping (enrolling) the people I follow on Twitter into different Tribes. I wish I could use Twellow or WeFollow to speed up organizing my personal list into categories and use them as Tribes in MicroPlaza, but this is still a better filter than the TweetDeck grouping option. In MicroPlaza I only see the popular links from the tribe and not the other useless chatty noise – this is a great filter. There are more features, and I do plan to cover this service more thoroughly in another post, but here I want to focus on the new Trend.

MicroPlaza

There is a growing number of similar services out there. I’m monitoring an additional one but I won’t mention the name yet (giving them a chance to improve). The key feature for me is the quality of the links: how good is the information that the service manages to mine from all the noise on Twitter? The speed is important too. So far the two mentioned above are doing a fantastic job.

Using the Twitter timeline as the content source pool, employing millions of human web crawlers, filtering by the people I trust (follow) and applying other mining techniques seems like an improved method for finding the best content out there. It truly gives me an edge over an RSS feed reader.

Did you stop using your RSS reader too?

I owe thanks to Sagee Ben-Zedeff for helping me become aware of this change in my habits and of the new Trend. This is another great thing about Twitter – I now reflect more rapidly:)
