In this post I will discuss the three elements of the mashup sign-on process: Security (SSO), Access Control, and Single Identity. I see a lot written and done about each individually, but I think it is not always clear which solution maps to which problem.
If you are familiar with this subject then you can skip to the next paragraph (or this post entirely). For those not familiar with mashups and how to get them working for you, please read this short preface. There are many online services today: social networks like Facebook, bookmarking services like ma.gnolia, news like Digg, media sharing sites like Flickr and YouTube, and more. In the screen capture of the form below you can find 43 such services. These services provide online APIs (a way for the outside world to request data from, and execute functionality on, the service remotely). This allows the development of new services on top, called mashups. The new service interacts with the underlying services and adds value through the unique mix it creates. The first form below is taken from FriendFeed, a mashup application that helps you keep track of your friends’ activities across many web services. In this form you are asked to select the services that you permit the current application to pull information from or push information to (in the case of FriendFeed, pulling only). In order for the application to access your data, the system needs to know who you are, i.e. your user name (login). In some cases it will ask you for your password too.
This form raises three hot issues in the growing environment of open APIs and mashups. If you want to see how rapidly this world is growing, look at this excellent source of information: ProgrammableWeb
- Security: not having login and password information stored in multiple places. Single sign-on (SSO)
- Access Control: having control over what the service can do with my data. Defining security policy.
- Single Identity: not having to re-enter my profile and friends’ information all over again. Data Portability.
Every service offers a sign-up process where you type in your login and password. Companies like Google, Microsoft and Yahoo, which offer multiple online applications, provide a kind of single sign-on mechanism: once you’re signed in to one service you can safely move to the next one without logging in again. The available solution for web sites that do not belong to the same company is OpenID, and here is an example of how to use it from WordPress. It is, in a way, the solution for single sign-on on the web today. Not all services support it yet, but adoption seems promising. If you want to see a decent number of available options for authenticating across services, just click on the “Sign in using” drop-down list on ma.gnolia’s login page.
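To make the OpenID hand-off a little more concrete, here is a minimal sketch of the step where the relying party (your service) builds the redirect that sends the user to their identity provider. The endpoint and URLs are invented for illustration; a real implementation also performs discovery on the claimed identifier and verifies the signed response that comes back.

```python
from urllib.parse import urlencode

# Hypothetical provider endpoint, for illustration only.
OP_ENDPOINT = "https://openid.example.com/auth"

def build_checkid_setup_url(claimed_id, return_to, realm):
    """Build the OpenID 2.0 checkid_setup redirect URL that the
    relying party sends the user's browser to."""
    params = {
        "openid.ns": "http://specs.openid.net/auth/2.0",
        "openid.mode": "checkid_setup",
        "openid.claimed_id": claimed_id,
        "openid.identity": claimed_id,
        "openid.return_to": return_to,  # where the provider sends the user back
        "openid.realm": realm,          # the site the user is approving
    }
    return OP_ENDPOINT + "?" + urlencode(params)

url = build_checkid_setup_url(
    "https://alice.example.org/",
    "https://mashup.example.net/openid/return",
    "https://mashup.example.net/",
)
print(url)
```

The point is that the user only ever types a password at the provider, never at the mashup.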
When I allow a service to access my data from another service, I have no way of telling the source what I allow it to provide. I can’t tell the service whether I allow it only to read my data or also to update information (e.g. updating my Twitter status). Today this is mostly determined by the APIs. Where there is a way to configure it (to some extent in Facebook), it is not consistent across the web. I know there is an effort by multiple leading software companies to deal with this. For more information, read the page about the new OAuth protocol.
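As a taste of how OAuth avoids sharing passwords between services, here is a simplified sketch of its request-signing step: the consumer signs each call with HMAC-SHA1 over a normalized base string, so the source service can verify who is asking without the user’s password ever changing hands. This is a simplification of the spec (no Authorization header formatting, no nonce or timestamp generation), and the parameter values below are made up.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def sign_request(method, url, params, consumer_secret, token_secret=""):
    """Compute an OAuth 1.0-style HMAC-SHA1 signature (simplified sketch)."""
    enc = lambda s: quote(s, safe="")
    # Sort and percent-encode the parameters into the normalized string.
    normalized = "&".join(f"{enc(k)}={enc(v)}" for k, v in sorted(params.items()))
    base_string = "&".join([method.upper(), enc(url), enc(normalized)])
    # The signing key combines the consumer secret and the token secret.
    key = f"{enc(consumer_secret)}&{enc(token_secret)}".encode()
    digest = hmac.new(key, base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

sig = sign_request(
    "GET",
    "https://api.example.com/statuses",
    {"oauth_consumer_key": "key", "oauth_nonce": "abc", "oauth_timestamp": "1"},
    "consumer-secret",
    "token-secret",
)
print(sig)
```

Because the signature is tied to a token, the source service can revoke one mashup’s access without affecting any other.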
The term profile today refers to far more than your name, address and email. I think Facebook has taken it the farthest, including your media preferences, activities and your choice of applications. Most importantly, it includes your contacts, i.e. your network. It is at the basis of most social network services that your experience of and satisfaction with a site are in direct relationship to the size of your network. Yet no one wants to re-type their personal information and rebuild their network. Some claim, and I agree, that this data should not belong to anyone but you. The Data Portability initiative is trying to eliminate the need to recreate your online identity and profile over and over again by defining new open standards that will allow services to port it at your request. This is a great step, and I can only hope to see it implemented across the web soon.
If you are new to the subject but not new to using mashup applications, I hope you’ll find this post helpful – maybe now you can start using the OpenID option instead of your login. If you are about to start a new service or mashup, I hope this will help you think about how to make it easy for us to interact with it.
Do you see more ways for improving this process?
So you built a new Web 2.0-like service. It gets some traction and people are crowding in. The site just published an open API. Before you know it, the system crumbles under the weight of its own success. Sound familiar?
If your API is “read only”, i.e. an option to pull some of the data out of the system, you’re only in half the trouble. Where your API allows “writing” too, i.e. modifying system records in the database, things get really interesting. An example of read only: Technorati provides blog, blogger and post information to Internet bots and badges. Twitter is an example of both a read and a write API: bots built using the Twitter API can get members’ status updates as well as automate posting status messages.
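One way to keep the write side under control is to tie each API token to an explicit scope, so a read-only integration can pull status updates but never post them. A toy sketch of the idea (the token names and scope labels are invented):

```python
# Hypothetical token scopes illustrating read vs. read-write access.
TOKEN_SCOPES = {
    "token-read": {"read"},
    "token-readwrite": {"read", "write"},
}

def authorize(token, action):
    """Return True if the token's scopes permit the requested action."""
    return action in TOKEN_SCOPES.get(token, set())

# A read-only bot can fetch statuses but is refused on updates.
assert authorize("token-read", "read")
assert not authorize("token-read", "write")
assert authorize("token-readwrite", "write")
```

Unknown tokens get an empty scope set, so they are denied everything by default.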
By the way, both of the above examples are Data Producers that are working night and day to scale their open API support.
The problems are generally the same, and so are the solutions: performance (throughput and latency), hardware sizing and cost, traffic pattern predictability, load balancing, throttling, caching, stateless web nodes, multi-casting, table partitioning (having a skilled = $$$ DBA for building a high-availability database), backup, API formats (there are too many of them), message queuing, redundancy, recovery, security, quality of service (premium services), statistics, logging, error handling, monitoring, abuse protection, you name it.
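Several items on that list, throttling in particular, are small pieces of code that every API team ends up rewriting. As one sketch (the class and parameter names are my own), a token-bucket limiter allows a steady request rate with short bursts:

```python
import time

class TokenBucket:
    """Simple token-bucket throttle: allow roughly `rate` calls per
    second, with bursts of up to `capacity` calls."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity          # start with a full bucket
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens in proportion to the time elapsed since last call.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1, capacity=2)
print(bucket.allow(), bucket.allow(), bucket.allow())  # burst of 2, then refused
```

Each API consumer would get its own bucket, keyed by token or IP.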
Gnip is a new start-up founded by Eric Marcoullier that is working to address some of these common problems. Reading their blog shows how much thought and sweat is put into addressing these common scalability problems. They also aim at other pain points in the open API arena, like consistent APIs and identity discovery. Having a consistent/normalized entity ID across multiple web services could remove one of the biggest obstacles today to using WYSIWYG mashup tools like Popfly – but that is for another post :).
Let’s assume that the “read” part is getting better thanks to services like Gnip (“ping” in reverse), the same way that blogging platforms improved new-post indexing using ping services. Now bots and mashup services don’t need to busy-wait on the API. What about the “write”? What can be done to make it reusable and scalable?
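The ping idea can be shown with a toy in-process hub: consumers register a callback once, and the producer pushes each new item to them, instead of every consumer polling the API in a loop. (The class and method names here are invented for illustration; a real system would deliver over HTTP.)

```python
class PingHub:
    """Minimal publish/subscribe hub: producers push, consumers listen."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        # A consumer registers once instead of polling repeatedly.
        self.subscribers.append(callback)

    def publish(self, item):
        # The producer notifies every subscriber about the new item.
        for callback in self.subscribers:
            callback(item)

received = []
hub = PingHub()
hub.subscribe(received.append)
hub.publish({"user": "alice", "status": "hello"})
print(received)  # the consumer got the update without polling
```

With a hub like this in the middle, the Data Producer answers one notification call per event instead of thousands of redundant polls.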
I think it all comes down to a new opportunity here: outsourcing the entire open API development and support effort, and saving a bundle. Here are some ways to save on this effort through consolidation.
- Reusing hardware through hosting solutions whether physical or virtual like Amazon EC2 or Google App Engine
- Reusing technologies implementation and integration like using memcached, terracotta and many more
- Reusing expertise – Database Administrator and Security experts
- Protocol and meta data standards
- Monitoring tools and techniques
Saving: blood, sweat, tears, grief and reputation (in other words avoiding embarrassment).
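For the “reusing technologies” point, here is what the caching layer amounts to in miniature. A real deployment would use a shared cache such as memcached across web nodes; this sketch uses Python’s in-process `functools.lru_cache` and a fake database counter just to show the effect:

```python
import functools

calls = {"db": 0}  # count the pretend database round trips

@functools.lru_cache(maxsize=1024)
def fetch_profile(user_id):
    """Look up a profile; repeated calls are served from the cache."""
    calls["db"] += 1  # in a real service, this would be a database hit
    return {"id": user_id, "name": f"user-{user_id}"}

fetch_profile("42")
fetch_profile("42")   # second call is answered from the cache
print(calls["db"])    # → 1
```

The same one-liner idea, pointed at a shared cache instead of process memory, is most of what stands between an open API and a melted database.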
The bottom line is that, in my opinion, Gnip takes it a good distance forward, but there is room for another reusable, consistent and scalable layer between the Data Producer and Gnip.
What do you think?