Archive for August, 2011

120,000,000 GB: Now, that’s a big storage!

Wednesday, August 31st, 2011 by Roberto Saracco

IBM has announced the construction of a 120 PB storage system for one of its clients. It will be a single storage unit, although it consists of 200,000 hard disk drives working together. This far exceeds the largest single storage unit built so far, a mere 15 PB.

Building such a large storage system presents many challenges, and these have been met by IBM Almaden researchers. The two most crucial aspects are access speed and reliability.

Through a proprietary architecture and a way of splitting each file over many disks, the speed is kept high in spite of the huge amount of data. A trial has shown the capability of indexing 10 billion files in 43 minutes; that is way better than the previous record, when 1 billion files were indexed in 3 hours.
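As a quick sanity check of the figures quoted in this post, here is a small back-of-the-envelope sketch. It assumes decimal units (1 PB = 10^15 bytes) and an even spread of capacity across the drives; neither assumption comes from IBM's announcement.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Assumption: decimal units (1 PB = 10**15 bytes, 1 GB = 10**9 bytes).
PB = 10**15
GB = 10**9

total_bytes = 120 * PB
drives = 200_000

total_gb = total_bytes // GB                # capacity expressed in GB
per_drive_gb = total_bytes // drives // GB  # average capacity per drive, in GB

# Indexing throughput: the new trial versus the previous record.
new_rate = 10e9 / (43 * 60)   # files per second (10 billion files in 43 minutes)
old_rate = 1e9 / (3 * 3600)   # files per second (1 billion files in 3 hours)
speedup = new_rate / old_rate

print(f"{total_gb:,} GB in total, about {per_drive_gb} GB per drive")
print(f"new: {new_rate:,.0f} files/s, old: {old_rate:,.0f} files/s, ratio: {speedup:.1f}x")
```

The total matches the 120,000,000 GB of the title, and the per-drive average is only an average: the real system presumably reserves part of the raw capacity for replication.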

The (promised) reliability is also amazing: your data will be safe for a million years (so it is likely you won’t be there to complain when the system fails). Ensuring reliability often decreases access speed. Here again some ingenious solutions have been found, with replication of data and dynamic replication at different speeds when a failure is detected.

IBM record card storage in the 1950s. The records would now fit on a flash pen, with plenty of room to spare

Only a few years ago it would have been nonsense to talk about a data centre having 100 PB; now we are talking about a single storage unit having 120 PB. But what really amazes me is the fact that there is demand for such storage. To put this in perspective, 120 PB is enough to store the full Wayback Machine Web Archive 60 times!

And I still remember, back in the 80s, when the first Macintosh with a 10 MB disk was making the news!

Will service providers prefer “Personal Data Stores” to “Data Aggregators”?

Tuesday, August 30th, 2011 by Corrado Moiso

In my post on August 5th, I discussed, from the end-users’ viewpoint, the advantages of introducing a “Personal Data Store”, i.e., an environment enabling individuals to collect and manage their personal data, so as to create a user-centric ecosystem where individuals can play an active role.

Unfortunately, such advantages for end-users are not a sufficient motivation to promote a successful transformation of the current model of handling personal data, for instance the data generated during activities on the Web: these data are collected by entities providing services, aggregated by data brokers and sold to their customers. This chain operates with limited or no participation of the end-users. In order to move away from this approach, businesses must see a return on the costs of adopting new technologies, otherwise nothing will change.

Some recent analyses (e.g., see “Personal Data Ecosystem Consortium”, http://personaldataecosystem.org/, or the document “Personal Data: The Emergence of a New Asset Class” of the World Economic Forum, http://www3.weforum.org/docs/WEF_ITTC_PersonalDataNewAsset_Report_2011.pdf) show that the current model is no longer sustainable. On one side, governments are regulating these activities, e.g., by introducing “Do Not Track” policies, anonymization constraints, or limitations on the amount of information that can be kept about individuals and on how long it can be kept. On the other side, this is not the best approach to get useful information about individuals: the information collected in this way provides only a partial view of them. Only individuals are able to create a complete “digital footprint”, by collecting information about themselves from different sources, aggregating it, and offering it in a controlled way.

The adoption of a “Personal Data Store” could overcome the drawbacks of the current model: a “Personal Data Store” would provide individuals with the capacity to collect, aggregate and manage their own data. They could create their digital footprints, by storing as much data as they want for as long as they want, and offer them to service providers in a controlled way, according to their own policies and rules. Such digital footprints would offer new opportunities for service providers, by enabling the delivery of new classes of services exploiting higher quality information on either single individuals or groups of individuals.

The following figure, derived from a drawing by Marc Davis (http://marcdavis.me/), one of the authors of the “Rethinking Personal Data” document published by the World Economic Forum, offers a pictorial representation of the possible evolution of the handling of “Personal Data”.

“The red dot shows us what’s happening today: some data aggregators are necessarily self-regulating by limiting the amount of time they keep data, and governments are limiting data retention and anonymization practices” (see the red arrow). “The blue dot shows us what would happen if people were given the capacity to store and manage their own data – if they could keep as much data as they wanted for as long as they wanted…”

In order to have a successful migration from today’s situation to a user-centric ecosystem, I think that individuals should be able to create “digital footprints” that are as rich as possible, by enriching their “Personal Data Stores” with almost automatic support. Only in this way would service providers be encouraged to endorse the new approach, preferring it to the current model based on “data aggregators”.

Discovering Data Patterns by using a “Chaos” of Particles

Monday, August 29th, 2011 by Antonio Manzalini

We are living in a world of data. Discovering patterns in huge amounts of data plays an essential role in mining associations, correlations, and many other interesting relationships among data; moreover it helps a lot in data indexing, classification, clustering, and other knowledge extraction tasks. In order to achieve this, data mining techniques make use of computational approaches from statistics, machine learning and pattern recognition. Recently Swarm Intelligence (SI) techniques, which only a few years ago were just a curiosity, have been attracting growing interest. SI approaches are based on distributed intelligent paradigms for solving optimization problems: SI took its inspiration from the biological swarming behaviors observed in flocks of birds, schools of fish, or swarms of bees, ants or particles.

Swarm Intelligence

In Particle Swarm Optimization (PSO), for example, a number of simple entities (the particles) are placed in the search space of some problem or function, and each particle evaluates the objective function at its current location. Populations of particles are organized according to some sort of communication structure or topology, often thought of as a social network. Each particle determines its movement through the search space by combining some aspect of the history of its own current and best (fitness) locations with those of one or more members of the swarm, with some random perturbations. The next iteration takes place after all particles have been moved. Eventually the swarm as a whole, like a flock of birds collectively foraging for food, is likely to move close to an optimum of the fitness function. Obviously a single particle by itself has no way to solve the problem; problem solving is a population-wide phenomenon, emerging from the interactions of the particles. For example, the data to be analysed are assigned to so-called datoids (which can be imagined as birds, each carrying a piece of data on its back) forming a swarm in the n-dimensional space.
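To make the description above a bit more concrete, here is a minimal sketch of a PSO loop. It is not the algorithm of any specific paper: the fully connected ("global best") topology, the toy objective `sphere`, and the parameter values `w`, `c1`, `c2` are all assumptions chosen for illustration.

```python
import random

def sphere(x):
    """Toy objective (an assumption for this sketch): minimum 0 at the origin."""
    return sum(v * v for v in x)

def pso(objective, dim=2, n_particles=20, iterations=100,
        w=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    rng = random.Random(seed)
    # Place the particles at random locations in the search space.
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest_pos = [p[:] for p in pos]              # each particle's own best location
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest_pos, gbest_val = pbest_pos[g][:], pbest_val[g]  # best known to the swarm

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()  # random perturbations
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest_val[i], pbest_pos[i] = val, pos[i][:]
        # The swarm-wide best is refreshed only after all particles have moved.
        g = min(range(n_particles), key=lambda i: pbest_val[i])
        if pbest_val[g] < gbest_val:
            gbest_pos, gbest_val = pbest_pos[g][:], pbest_val[g]
    return gbest_pos, gbest_val

best_pos, best_val = pso(sphere)
print(best_pos, best_val)
```

No single particle "knows" where the minimum is; the swarm gets close to it only because each particle mixes its own memory with what the others have found.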

Excellent results, also in terms of convergence, are achieved by using chaos as a source of apparently random perturbations. Chaos is a typical behaviour of non-linear dynamical systems: it is non-periodic, non-converging and bounded; moreover it has a very sensitive dependence upon its initial conditions and parameters.
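A tiny illustration of these properties, using the logistic map as the chaotic system. The logistic map is my choice for this sketch (it is a textbook example of chaos); it is not necessarily the chaotic source used in PSO variants.

```python
def logistic_map(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x); with r = 4 the map is chaotic on [0, 1]."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two trajectories whose starting points differ by one part in ten billion.
a = logistic_map(0.3, 60)
b = logistic_map(0.3 + 1e-10, 60)

first_gap = abs(a[1] - b[1])                         # still tiny after one step
max_gap = max(abs(u - v) for u, v in zip(a, b))      # grows to order 1 later on

print(f"gap after 1 step: {first_gap:.2e}, largest gap within 60 steps: {max_gap:.2f}")
print(f"all values stay within [0, 1]: {all(0.0 <= x <= 1.0 for x in a)}")
```

The sequence never leaves the unit interval (bounded), yet two almost identical starting points end up on completely different trajectories, which is what makes it usable as a deterministic stand-in for random perturbations.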

Have a look at this paper http://www.ipcsit.com/vol3/045L015.pdf

In the end the beauty and effectiveness of PSO are amazing, but they should not surprise us, as we are just trying to mimic Nature. As another real example, think about the interactions of electrons in a network of resistors, which can be described, as is well known, in terms of random walks of the particles. This simple behaviour at the microscopic level leads to the solution of a complex optimization problem at the macroscopic level: Kelvin showed that the pattern of potentials in a network of resistors is exactly the one that minimizes heat dissipation for a given level of current flow. I wonder whether this, or the chaos behind superconductivity phase transitions, could inspire other SI-PSO techniques.

Don’t mix the Cloud with Supercomputers…

Sunday, August 28th, 2011 by Roberto Saracco

Often I hear people talking of the “network” as the “computer” and the Cloud as its implementation.

Now I happened to read some data on supercomputers that I’d like to share with you and that convinced me of the enormous distance between what can be done by a Cloud and what can be done by a supercomputer.

In the Cloud you can have enormous processing power by adding as many computers as you please. However, this processing power cannot be equated to speed of processing. It is like saying that the Amazon river has a tremendous water transport capacity, although the “speed” of the water it transports is a few meters per second. If your goal is to get a water stream at 100 meters per second, the Amazon is not a solution.

Now, going back to the Cloud and the Supercomputer.

IBM Blue Waters, as fast as it gets... as of today

The IBM Roadrunner (2008) has a 1 PFlops capacity, that is, it can perform 1 million billion floating point operations per second. It is not a single processing unit but a cluster of units (as is the case for any supercomputer today), so you cannot equate capacity and speed. However, it has 50 thousand channels, each working at 5 GBps, connecting the various processors. In a Cloud you have channels with a connection speed that is at least 5,000 times slower.

If you take Blue Waters, the most recent IBM supercomputer delivered this year, it has ten times the capacity (10 PFlops) and 5 million optical cables, each running at 10 GBps. By 2014 supercomputers are expected to reach 1 EFlops (1 billion billion floating point operations per second) and optical connections will total 1 billion, at a channel data rate approaching 50 GBps. It is interesting to note that the cost of the optical interconnection in today’s Blue Waters is 10% of the total supercomputer cost, whilst in the EFlops supercomputer it is estimated to be 40% of the total.
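To get a feeling for these numbers, here is a rough sketch that multiplies channel count by channel rate. Two assumptions are mine: “GBps” is read as gigabytes per second, and all channels are taken as busy at full rate at the same time, which gives a naive upper bound and not a measured interconnect bandwidth.

```python
# (peak flops, number of channels, bytes per second per channel), as quoted in the post.
# Assumption: "GBps" means gigabytes per second.
SYSTEMS = {
    "Roadrunner (2008)": (1e15, 50_000, 5e9),
    "Blue Waters (2011)": (10e15, 5_000_000, 10e9),
    "EFlops projection": (1e18, 1_000_000_000, 50e9),
}

def aggregate_bandwidth(channels, rate):
    """Naive upper bound: every channel busy at full rate at the same time."""
    return channels * rate

summary = {}
for name, (flops, channels, rate) in SYSTEMS.items():
    total = aggregate_bandwidth(channels, rate)
    summary[name] = (total, total / flops)
    print(f"{name}: {total:.2e} B/s aggregate, {total / flops:.2f} bytes per flop")

# A Cloud channel "at least 5,000 times slower" than a Roadrunner channel.
cloud_channel_rate = 5e9 / 5000
print(f"Cloud channel: at most {cloud_channel_rate:.0e} B/s")
```

Under these assumptions the interconnect grows much faster than the processing capacity from one generation to the next, which fits with its rising share of the total cost.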

This kind of speed and infrastructure and, even more important, this pace of evolution are a world apart from the Cloud. The two will lead parallel lives, with neither superseding the other.

Have you gisted today?

Saturday, August 27th, 2011 by Roberto Saracco

Getting a gist means getting a rough idea about something. You may not be an expert in a certain field and by listening to somebody giving a lecture you are not becoming an expert but you get a “gist” of what that field is about.

Google translator has become the tool for getting a gist...

Now “to gist” is starting to be used to mean “getting a feeling” of what a text, written in a language you don’t know, is about. And to get “the gist” more and more people are using Google Translator.

According to Technology Review, the use of Google Translator is growing significantly and it has started to be used by the US Government to check the content of messages and web pages. It does not provide a good translation, but it is sufficient to understand what a particular text is talking about, and if something looks suspicious then a human translator steps in.

Today, Google Translator provides a “gist” in 60 languages. It is still quite far from a human translator.

Look at this sentence in Italian (from one of our most famous books, I Promessi Sposi):

Quel ramo del lago di Como
che volge a mezzogiorno
tra due catene non interrotte di monti
tutto a seni e a golfi

It gets translated into this:

That branch of Lake Como
that is coming at noon
between two unbroken chains of mountains
in all breasts and a round of golf

The first and third lines are pretty good but the second and fourth are completely off the mark (they should be translated as something like: “that is orientated to the South”, and “full of promontories and bays”).

Still, if you are completely at a loss in a foreign language, as all of us are for the majority of them, using Google Translator is a good step forward. You can read a web site in Arabic or Chinese, take a look at what Libyan newspapers are saying and so on. This is something that was impossible just a few years ago and that is already changing our grasp of the world.

OK, let me stop here, now I need to gist some blogs in Chinese.

Cisco to focus on M2M opportunity

Friday, August 26th, 2011 by Gualtiero

The Internet of Things is a whole new world as shown in this Beecham Research diagram

Recently Cisco too has announced dedicated routers for the M2M market, stating that it believes it will become an important mass market. This is just the latest in a series of recent initiatives in the M2M market, both in the US and in Europe.

In April of this year Ericsson (another OEM!) announced the acquisition of long-time M2M platform provider Telenor Connexion, while in July TeliaSonera announced that it had signed a cooperation agreement with France Telecom-Orange and Deutsche Telekom to increase the quality of service and interoperability of M2M services.

In May of this year the operator T-Mobile USA announced that it had cast off its M2M operational business to long-time service partner Raco Wireless, although in July T-Mobile USA struck a partnership with asset protection providers IContain and Asset Protection Products LLC (APP) to help reduce operating costs in the US$7 billion US rent-to-own (RTO) sector.

These and other initiatives signal that the M2M market is deemed ready to become a truly mass market, and players (from hardware providers to M2M specialists, passing through telco operators and system integrators) are trying to position themselves to reap the benefits.

Does everyone have a clear strategy? For telco operators, a few ideas:

  • The market is exploding now (Yankee Group forecasts that SIM volumes will almost triple from 23 million in 2011 to 61 million in 2015, a CAGR of 22%), so the time to have a clear vision of the M2M market is now; tomorrow could be too late… Cisco stated it wanted to position itself as a trusted hardware provider, but others have more ambitious plans…
  • Since this is a complex market, where it is highly improbable that one company has all the competencies required for a complete M2M offer, it is essential to form partnerships and reinforce a balanced ecosystem for all the actors involved. Each party should be able to bring its distinctive skillset (Telco operators should be able to leverage their tradition of network building & management) to build a complete offer.
  • Telco operators can leverage their experience in the Cloud, while continuing to develop their distinctive offers for that market, since many of the problems of an M2M service can be solved and/or alleviated by using Cloud technology. One offer could reinforce the other.
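For readers who want to check the growth figures in the first point, here is a minimal sketch of the compound annual growth rate (CAGR) calculation. The SIM volumes are those quoted above; how many compounding periods the forecast uses is not stated, so both four and five periods are shown as assumptions.

```python
def cagr(start, end, periods):
    """Compound annual growth rate over the given number of yearly periods."""
    return (end / start) ** (1.0 / periods) - 1.0

start_sims, end_sims = 23, 61        # millions of SIMs, 2011 and 2015 as quoted

growth = end_sims / start_sims       # overall multiple ("almost triple")
four_periods = cagr(start_sims, end_sims, 4)   # 2011 -> 2015 as four yearly steps
five_periods = cagr(start_sims, end_sims, 5)   # if the base year is one year earlier

print(f"overall growth: {growth:.2f}x")
print(f"CAGR over 4 periods: {four_periods:.1%}")
print(f"CAGR over 5 periods: {five_periods:.1%}")
```

The quoted 22% is close to the five-period result, so one possible reading is that the forecast compounds over five periods; counted strictly from 2011 to 2015 the rate would come out higher.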

Any other ideas?

Bits and Atoms

Thursday, August 25th, 2011 by Roberto Saracco

Every day we are getting closer to the fusion of atoms and bits. Augmented reality gets better and better and, coupled with better cell phones and tablets like the iPad 2, can provide a growing sense of immersion.

Junaio is an example of a company that is exploring the evolution of cell phones and tablets to change our perception of the world. Take a look at the clip and also browse their web site.

In the coming years we will move to an environment populated by surfaces that are able to display information and hence to connect bits and atoms. Portable displays, like those of cell phones or tablets, will also become magnifying lenses through which we can see bits and atoms going hand in hand.

I have already discussed this in a number of other posts and I will do it again as technology advances and the ingenuity of researchers provides us with new ways of exploiting existing technology.

Designing smart material on the blackboard

Wednesday, August 24th, 2011 by Roberto Saracco

The design of smart materials has been moved forward by the collective endeavor of many researchers but the process is mostly based on serendipitous discovery. This happens but takes a lot of time.

The new smart material created through design and simulation

This is why I was so interested in reading the news of the feat achieved by researchers at Stanford and Harvard with a completely new approach to designing smart materials. They started on the blackboard (actually they used plenty of computer power and time) to analyze in theory how a given organic (which means carbon based) semiconductor could be improved. They found that seven different ways of altering the starting organic semiconductor were promising, on paper. Through simulation they managed to focus on one as the most promising. At that point they moved from the blackboard to the lab to develop it and, behold, they got an organic semiconductor that is 30 times faster than the semiconductors used today in LCDs (and these are silicon based).

You can see the result in the photo on the side. The white segment represents a length of 10 microns.

The interest in organic semiconductors stems from their flexibility. If you create a display made of them you can roll it up and carry it around in your pocket. So far there are a number of screens that you can roll up, but their speed is too low to display moving images. Not so with this new organic semiconductor. So let’s get ready to have the iPad in our pocket.

What is even more interesting, for me at least, is to see how the way of designing smart materials is changing. Again it is a matter of economics. It costs so much less to design and simulate through computers than in a real lab that it makes the design of smart materials much more affordable.

And if you just look around, you’ll see plenty of smart materials, from the silk of a spider web to the capillary strength of a giant redwood that can bring water up to its highest branches. We’re just not able to design this kind of thing, yet.

It is not just more, it is different stuff!

Tuesday, August 23rd, 2011 by Roberto Saracco

Data have been growing as a result of more ways to intercept the external world (sensors of many types), and more data are becoming available as communications get cheaper and more pervasive and storage manages to handle PBs. Another important aspect is that the whole machinery of our daily life, and of business, is now digital, thus thriving on data and generating more data as a fallout of the ongoing processes.

There are many studies showing that the growth in stored data, in transmitted data, in generated data, in consumed data and in inferred data is exponential, and such it will remain for this decade and a few years of the next one. However, we are reaching a point where saying that more data is available is meaningless. It is no longer about more, but rather about the fabric of data that is being generated. In a way, to use an image, we have seen data growing like threads, then we have seen interconnections arising among threads, forming nets. Now we are starting to lose sight of the nets, as the interconnections get so tight that we are starting to see fabric and objects.

Representation of a social graph

Out of these objects, as represented in the figure, we are capturing new information that is more related to the characteristics of this virtual object than to any specific data set. If you want to expand this vision, think of data as the representation of cells in a living being. Once you get enough cells and their interconnections you can start losing sight of the individual cells, and even of the organs they may form, and start to look at the whole as an organism and begin studying the behavior of the whole organism, forgetting that it is composed of individual cells.

This sort of vision is starting to be applied to cities, communities, power grids and enterprises, and we qualify this vision of the whole, and the actions that can be taken, as “Smart”: Smart cities, Smart communities, Smart grids, Smart enterprises….

Looking at the whole of these big data requires new approaches to data management and data mining, and this is what is being discussed these days in San Diego at the ACM Knowledge Discovery and Data Mining Conference.

As Technology Review points out, there is now a widespread interest in mining data, where once this was the preserve of scientific fields. Today many commercial enterprises have a worldwide market counting millions of users, and they feel they can provide much better services if they understand the data their users generate. Netflix has offered one million dollars to whoever can provide it with a smarter analysis of its customers, so as to give them better suggestions on what to watch next.

Social media tools, like Facebook, are ideally positioned to harvest user behavior data and to make inferences out of them. And so are Telecom Operators.

Making inferences is a very powerful way to gain understanding but it is also potentially invasive of privacy. This is why Operators are so careful in playing this game. There is a need for regulation and accountability, for technology to understand what is going on and to opt out.

However, my feeling is that if we want to move from cities to smart cities, from communities to smart communities, from homes to smart homes and so on, we need to leverage data. And, of course, the challenge is to do it “smartly”.

Spintronics and straintronics?

Monday, August 22nd, 2011 by Roberto Saracco

One of the things we seldom realize when comparing a computer and a brain is the difference in energy consumption. It is huge. The fact is, all of Nature’s creations are parsimonious in energy consumption. Cells perform an amazing variety of “constructions” (building proteins) and use an infinitesimal amount of energy.

On the contrary, whatever we create requires an enormous amount of energy (in comparison with Nature’s creation) to operate.

Assembly of FeMg8 leading to a magnetic super atom

But now scientists are starting to use Nature’s tricks to create very low consumption devices. Researchers at the Virginia Commonwealth University have combined two disciplines, spintronics and straintronics, to create a circuit able to store a bit using a billionth of a billionth of a joule!

To give a comparison, a Flash memory card consumes 1,000 billion times more!
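Written out as numbers, the comparison looks like this. The sketch assumes that both figures refer to the energy needed to store a single bit, and uses exact fractions to avoid rounding noise with such small values.

```python
from fractions import Fraction

# "A billionth of a billionth of a joule" per stored bit.
new_circuit_j_per_bit = Fraction(1, 10**9) * Fraction(1, 10**9)

# "1,000 billion times more" for Flash (assumed to be per bit as well).
flash_j_per_bit = new_circuit_j_per_bit * 1000 * 10**9

bits_per_gigabyte = 8 * 10**9
new_circuit_j_per_gb = new_circuit_j_per_bit * bits_per_gigabyte
flash_j_per_gb = flash_j_per_bit * bits_per_gigabyte

print(f"new circuit: {float(new_circuit_j_per_bit):.0e} J per bit")
print(f"Flash:       {float(flash_j_per_bit):.0e} J per bit")
print(f"writing 1 GB: {float(new_circuit_j_per_gb):.0e} J versus {float(flash_j_per_gb):.0e} J")
```

So, under this reading, the gap is between an attojoule and a microjoule per bit.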

The results have been published in Applied Physics Letters, and if you are a fan of spintronics and straintronics, just enjoy the reading.

If you are not, but are curious to know at least what these two words are about: spintronics is an emerging technology that exploits the spin (rotation) of the electron and the related magnetic moment to store information; straintronics is an even younger technology exploiting the piezoelectric effect to generate electrical power out of strain imposed on some materials.

There is still a lot of work ahead, but it just shows how far we are from reaching the limit of technology evolution. Actually, rather than saying everything has already been invented, it is more appropriate to say that we have really just begun!