One, none, a hundred thousand

Monday, December 20th, 2010 by Roberto Saracco

I just had a discussion with Wolfgang Wahlster, Director and CEO of DFKI, who is a member of the Deutsche Telekom Strategic Board and also a consultant to the German Ministry of Innovation. We briefly discussed how the growth of storage capacity in devices is changing the idea of a centralised database and the overall architecture of data management.

The world is webbed into data

Some are starting to say that the idea of having ONE Data Center to which any device can connect to get the data it needs is fading away. Since devices have so much storage capacity, they can host within themselves all the data they may need for their processing. Hence the HUNDRED THOUSAND Data Centers that we can consider as the future of data architecture.

On the other hand, I would say, you cannot imagine having a hundred thousand Data Centers independent of one another. Data will necessarily be duplicated, and the copies need to be synchronized if some coherent processing is required. In a sense, then, we have NONE: there is no Data Center to pinpoint, since the Data Center exists only as a figment of our imagination (this is also fitting because it echoes a masterpiece by Pirandello, the Italian writer: One, none, a hundred thousand…).

Can we imagine a massive “perfect” distributed virtual Data Center? Surely not. But a sufficiently adequate one, probably yes. I guess not all data are born (and, more importantly, used) equal. If you are planning to see a movie, any copy of that movie is fine. You do not expect copies in different locations to hold different plots of that movie. No synchronization, therefore, is required.

If, on the other hand, you are interested in the traffic situation in an area, you need to be sure that the copy you are accessing is adequately up to date. Now, what does “adequately” mean? It all depends on the kind of use you have for those data. If the use is planning a transportation system, a copy that was updated a month ago is perfect. If you need the information to decide the route of a trip that will take you through that point in 10 minutes' time, you need a picture that has been updated in the last few minutes (you surely do not need the situation of the last 2 milliseconds). In other cases you are looking at telecommunications traffic in a radio cell to decide where to route packets, and there you need a microsecond refresh of the traffic.
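To make the idea of “adequately up to date” more concrete, here is a minimal sketch in Python. It is only an illustration of the reasoning above, not a real system: the names (Replica, MAX_AGE_SECONDS, pick_replica) are hypothetical, and the numeric budgets are my own assumptions derived loosely from the examples in the text (for instance, “a few minutes” is taken to be 5 minutes and “a month” to be 30 days).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """A copy of some data held somewhere (hypothetical model)."""
    name: str
    age_seconds: float  # time elapsed since this copy was last synchronized


# Assumed freshness budgets, in seconds, for each kind of use.
# The uses come from the text; the exact numbers are illustrative guesses.
MAX_AGE_SECONDS = {
    "watch_movie": float("inf"),              # any copy is fine, no sync needed
    "plan_transport_system": 30 * 24 * 3600,  # a month-old copy is adequate
    "route_trip": 5 * 60,                     # updated in the last few minutes
    "route_packets": 1e-6,                    # microsecond refresh
}


def is_adequate(replica: Replica, use: str) -> bool:
    """A copy is adequate when its age fits the budget of the intended use."""
    return replica.age_seconds <= MAX_AGE_SECONDS[use]


def pick_replica(replicas: List[Replica], use: str) -> Optional[Replica]:
    """Return the first adequate copy, assuming the list is ordered from
    cheapest to most expensive to reach. None means that no copy is fresh
    enough and a synchronization is required first."""
    for replica in replicas:
        if is_adequate(replica, use):
            return replica
    return None


if __name__ == "__main__":
    copies = [Replica("on_device", 3600.0), Replica("nearby_node", 120.0)]
    for use in MAX_AGE_SECONDS:
        chosen = pick_replica(copies, use)
        print(use, "->", chosen.name if chosen else "synchronize first")
```

The point of the sketch is that the same two copies are perfectly good for one use and useless for another: freshness is a property of the pairing between data and use, not of the data alone.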

I feel that as we move forward into the Information Society we will have to learn to deal with massively distributed data centers operating as a loosely connected virtual database. The characteristics of data will not exist in isolation; rather, they will derive from the function of use and will change over time. In turn, the function of use depends on the application and on the user. This is what creates value for the end user, and this is what is bringing us from Web 2.0 to Web 3.0.

It is a data architecture that, also from a conceptual point of view, brings us into the ecosystem paradigm. In biological ecosystems we really do see that the data architecture is based on the ONE, NONE, a HUNDRED THOUSAND paradigm.