Well, actually not your car today, but your car in the next decade. At least this seems clear if you consider that the Google self-driving car consumes 750MB of data per second! That is the volume of data produced by all the sensors fitted to the car, and it provides the basis for the on board computers to navigate an urban environment.
What Google Car sees…
The car is equipped with a number of cameras that feed the on board computer. Is that a ball rolling out between those two parked cars? Watch out! There might be a kid running after the ball…
In the picture above (credit: Google) there is a spatial representation of the model developed by the on board computers as the car is making a left turn. All pictures are composed into a 3D model containing the size and potential behaviour of any object. A tree occupies a given space but it stays where it is, whilst a truck can move and occupy a different space and it is better that such a space does not overlap with the one that will be occupied by the car!
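To make the idea of "space that will be occupied" a bit more concrete, here is a minimal sketch of my own, not Google's actual model. It assumes every object is a flat rectangle on the ground moving at constant speed; the class `TrackedObject`, the function `first_conflict` and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    """Rectangular footprint on the ground plus a constant velocity (illustrative only)."""
    x: float          # centre, metres
    y: float
    width: float      # extent along x, metres
    length: float     # extent along y, metres
    vx: float = 0.0   # metres per second; a tree keeps 0.0
    vy: float = 0.0

    def at(self, t: float) -> "TrackedObject":
        """Footprint after t seconds, assuming the velocity does not change."""
        return TrackedObject(self.x + self.vx * t, self.y + self.vy * t,
                             self.width, self.length, self.vx, self.vy)


def overlaps(a: TrackedObject, b: TrackedObject) -> bool:
    """True if the two rectangles share some ground."""
    return (abs(a.x - b.x) * 2 < a.width + b.width and
            abs(a.y - b.y) * 2 < a.length + b.length)


def first_conflict(car, other, horizon=5.0, step=0.1):
    """First time in seconds at which the footprints overlap, or None within the horizon."""
    for i in range(round(horizon / step) + 1):
        t = i * step
        if overlaps(car.at(t), other.at(t)):
            return t
    return None


car = TrackedObject(x=0, y=0, width=2, length=4.5, vy=10)
tree = TrackedObject(x=5, y=30, width=1, length=1)             # stays where it is
truck = TrackedObject(x=-20, y=30, width=8, length=2.5, vx=8)  # crosses the car's path

print("tree:", first_conflict(car, tree))
print("truck:", first_conflict(car, truck))
```

The tree never gets in the way, while the truck crosses the car's path a few seconds ahead, which is exactly the kind of speculation the car has to make all the time.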
So, it is not enough to see, the car has to understand and speculate…
And doing this requires an amazing amount of data and of data crunching. Just think about hundreds of cars munching data and exchanging it (one car cannot see behind a corner, or a bend, but it can radio its 3D understanding of the surrounding space to any other car in the vicinity). That is real broadband!
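Just to get a feel for the numbers, here is a back-of-the-envelope calculation. The only figure taken from above is the 750MB per second; the fleet of 100 cars is my own assumption, and of course most of this data is crunched on board rather than transmitted.

```python
MB_PER_SECOND = 750  # sensor data rate quoted above


def data_volume_tb(seconds, cars=1, mb_per_second=MB_PER_SECOND):
    """Total data in terabytes, using decimal units (1 TB = 1,000,000 MB)."""
    return seconds * cars * mb_per_second / 1_000_000


one_car_hour = data_volume_tb(3600)
fleet_hour = data_volume_tb(3600, cars=100)  # hypothetical fleet of 100 cars

print(f"One car, one hour: {one_car_hour:.1f} TB")
print(f"100 cars, one hour: {fleet_hour:.0f} TB")
```

One hour of driving comes to about 2.7 TB for a single car, so even sharing a tiny fraction of it among cars adds up quickly.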
Researchers at the Selangor Malaysian University have developed software that lets a computer understand emotion by looking at the lips of a person.
Interestingly, they are using a genetic algorithm to enable learning as more and more lips are “read”. They have classified expressions into six main categories and can detect whether you are happy, sad, angry, disgusted or surprised, or whether you are just “running idle” – neutral.
The image captured by a camera is analysed by the software, which identifies the upper and lower lip and places each one inside an ellipse, measuring the distortion with respect to the ellipse. These data are used to “calculate” the emotional status.
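I have not seen the researchers' code, so take the following only as a rough sketch of what "measuring the distortion with respect to the ellipse" could look like. The function `ellipse_deviation`, the synthetic lip contours and the choice of metric are all my own assumptions, not the published method.

```python
import math


def ellipse_deviation(points, cx, cy, a, b):
    """Mean absolute deviation of contour points from the ellipse
    ((x - cx) / a)^2 + ((y - cy) / b)^2 = 1. Zero means a perfect fit."""
    total = 0.0
    for x, y in points:
        total += abs(((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 - 1.0)
    return total / len(points)


def contour(a, b, n=36):
    """Synthetic lip outline: n points on an ellipse with semi-axes a and b."""
    return [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
            for k in range(n)]


# Reference ellipse: 30 pixels wide, 10 pixels high (made-up values).
relaxed = ellipse_deviation(contour(30, 10), 0, 0, 30, 10)  # lips match the ellipse
pressed = ellipse_deviation(contour(30, 5), 0, 0, 30, 10)   # lips pressed together

print(relaxed, pressed)
```

A number like this, computed for the upper and the lower lip, is the sort of feature a learning algorithm could then map to one of the emotion categories.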
The team of researchers is led by Karthigayan Muthukaruppan. In the article published in the International Journal of Artificial Intelligence and Soft Computing they acknowledge that our ability as human beings to recognise emotions by looking at other people’s faces is much more sophisticated, and that we make use of more “hints” than just lips, but still this is a first step and the results based on lip analysis are pretty good.
They expect that such emotion recognition will become part of everyday computer interfaces and that it will let applications adapt their interactions to the emotions being detected.
I also find it interesting that innovation is now coming from all over the world!
Most of the world we perceive generates far more signals than the ones we actually capture. Hence, if new sensors can intercept the information we cannot see, we can get a better view and understanding of the world.
This is what MIT researchers have done by extracting information from normal video capture using a process called Eulerian Video Magnification.
In simple terms, a computer analyses tiny nuances in each frame, identifying changes over different frames that are so subtle they are not perceived by our eyes.
As an example, our heart beats about once a second, and as it beats the blood pumped into the arteries creates tiny ripples on our skin and minute changes in the redness of the skin, too weak to be seen by us but sufficient to be detected by the computer applying Eulerian Video Magnification. See an example in the figure, where the redness of the skin is amplified to become visible.
The shape of the ripples can tell a lot about how the heart is beating, what the blood pressure is and how healthy the arteries and veins are. It can also pinpoint some pathologies, particularly if it is possible to compare historical data.
It doesn’t take a special camera to pick up the frames; any normal mass market camera will work. The trick is in the software, which is able to amplify tiny differences and therefore enables other software to do the analyses.
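To give an idea of the amplification step, here is a toy sketch on the intensity of a single pixel over time. It is a simplification of mine: as far as I understand, the real process also filters by frequency band and works at several spatial scales, which I leave out. The function `magnify`, the factor of 50 and the synthetic signal are assumptions for illustration.

```python
import math


def magnify(signal, alpha=50.0):
    """Amplify the deviations of a per-pixel intensity series from its mean."""
    mean = sum(signal) / len(signal)
    return [mean + alpha * (value - mean) for value in signal]


def peak_to_peak(signal):
    """Difference between the largest and the smallest value."""
    return max(signal) - min(signal)


FPS = 30  # frames per second of an ordinary webcam (assumed)

# Synthetic "skin redness": a steady level plus a tiny 1 Hz pulse, 3 seconds long.
skin_redness = [120.0 + 0.2 * math.sin(2 * math.pi * frame / FPS)
                for frame in range(3 * FPS)]
amplified = magnify(skin_redness)

print(round(peak_to_peak(skin_redness), 2), round(peak_to_peak(amplified), 2))
```

A swing of less than half an intensity level, invisible to us, becomes a swing of about twenty levels, which is easy to see and easy for other software to analyse.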
In the future we might expect to see a growing number of data analyses based on this technology, and a doctor will really be able “to see” our heart beating, just by looking, through the application on his desk, at a rendering derived from the images of our face or arms that his webcam is taking right then.
Also, imagine what a saving in cost this implies. No more queues at the lab for exams: it is all done on the spot, in the doctor’s office. And, moving down the lane of time, why not imagine one of the (several) webcams in our home picking up telltale signs of a problem and raising a red flag to our doctor, who will get in touch with us if something might go wrong, well before it does!
Predicting and avoiding illumination of a single drop. In this experiment, the drop is first imaged with 5ms camera exposure time (shown above magenta dashed line). The system latency is shown above the cyan dashed line. Images shown are composites of the 14 frames needed for the drop to traverse the field of view (FOV). Since the experiments are repeatable, the ground-truth image on the left shows drops illuminated throughout the entire FOV (a). The drop is falling with near constant velocity at 16 pixels per frame, so prediction is straightforward (shown as yellow boxes) (b). The light would be projected during the dark frames. (Credit: CMU)
Well, researchers’ imagination never ceases to amaze me. I ran into a prototype by researchers working at the CMU Robotics Institute, who have created a system to cancel the rain, or snowflakes, from your vision as you drive your car! Just flick a switch and, like magic, the rain disappears from your windshield.
The idea is based on the fact that when we drive through a squall, or a snow storm, the light of the headlight beams is reflected by the raindrops and snowflakes. This reflection makes them visible to our eyes and creates a sort of halo, a persistence, on our retina that confuses the image.
The researchers placed a camera on the windshield and connected it to a computer that “sees” the raindrops and snowflakes. It then continuously redirects the car beams to avoid illuminating the drops or flakes. They have shown that their system is able to detect a raindrop, predict its movement and redirect the beam in just 13ms.
Our eyes are not able to see the blinking of the light as it gets redirected, and we get a clear image of the road since it is no longer obstructed by the illuminated raindrops.
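The prediction part is simple enough to sketch. The 16 pixels per frame and the 14 frames to cross the field of view come from the caption above; everything else (the function names, a drop 4 pixels tall, switching off whole rows of the beam) is my own assumption and not the CMU implementation.

```python
PIXELS_PER_FRAME = 16   # fall speed quoted in the caption above
FRAMES_ACROSS_FOV = 14  # frames the drop needs to cross the field of view
FOV_ROWS = PIXELS_PER_FRAME * FRAMES_ACROSS_FOV


def predict_rows(first_row, frames_ahead, drop_height=4, velocity=PIXELS_PER_FRAME):
    """Image rows the drop should cover `frames_ahead` frames from now,
    assuming it keeps falling at constant velocity."""
    top = first_row + velocity * frames_ahead
    return range(top, top + drop_height)


def beam_mask(fov_rows, dark_rows):
    """One flag per image row: True where the beam stays on, False where it goes dark."""
    dark = set(dark_rows)
    return [row not in dark for row in range(fov_rows)]


# A drop seen at the top of the image; the beam is updated 3 frames later (assumed latency).
predicted = predict_rows(first_row=0, frames_ahead=3)
mask = beam_mask(FOV_ROWS, predicted)

print(list(predicted), mask.count(False))
```

Since the drop moves at near constant speed, knowing where it was a few frames ago is enough to know which part of the beam to darken now.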
The effect is amazing, and you should take a look at the video showing what it is like driving in a rainstorm without and with this system:
I don’t know about you but I wasn’t particularly good. Being able to “do it” was plenty of satisfaction and I never tried to beat the clock. I remember there were championships where the goal was beating the clock; solving it was a given. And I remember being impressed watching these guys solve the puzzle in less than a minute.
But now I saw this:
Cubestormer II is the name of the robot that can solve a Rubik’s cube in less than 6 seconds. And what is amazing as well is that it is not a rocket science robot. Its “brain” is a Samsung Galaxy cell phone, its bones and muscles are Lego Mindstorms. You can get both at your department store for a few hundred bucks.
The robot uses the cell phone both to inspect the cube faces through the cell phone camera and to compute the moves required to solve it. Then it sends the commands to the Mindstorms for execution. Looking at the faces, calculating the strategy and implementing it requires slightly more than 5 seconds. The best human can do that in half a second more, but that does not include the time he needs to look at the faces and work out a strategy; it is just the pure rotational activity.
This news piqued my interest at two levels.
First, it is just amazing to see what a combination of a cell phone with a toy can do. Just 5 years ago doing this kind of thing would have cost some hundreds of thousands of dollars.
Second, we can easily imagine what this kind of evolution will bring: through a cell phone we will be able to connect the physical world with the Web and rely on sophisticated services to interpret atoms and provide services to act on them. Welcome to the Internet With Things!
Thursday, January 27th, 2011 by Juliana Maria Magalhães Christino
According to Martin Lindstrom, in his book “Buy-ology”, 85% of our consumer habits are unconscious (see an interview with the author, where he gives some interesting examples).
If most of the time we don’t know why we do or buy things, how accurate can the surveys made in the last decades be?
For sure they had and still have their value in market segmentation and product launches, but more is needed, especially if we think that a very big number of new products (more than 40% in the USA) disappear before the end of their first year of existence, and almost all of them seemed fitting for potential customers in previous surveys with experiments or questionnaires.
Many efforts are being made towards a deeper understanding of our consumer behavior. The neuromarketing approach is one of them, and it goes straight and literally inside our minds. Although it is potentially very accurate, this technique is very complicated and expensive, and it is carried out in artificial scenarios.
So the question remains: how could we have more trustworthy insights about consumers in order not only to describe, but also to predict their behaviors?
Some interesting research is being done at the MIT Media Lab, and I particularly liked two projects that can help in this matter:
The first one is being developed by the Human Dynamics Laboratory, led by Alex Pentland. One of the main features of this research is its ability to quantify the non-verbal aspects of human face-to-face interactions, what they call “Honest Signals”. These have been shown to be highly predictive of the outcome of interpersonal exchanges of the most diverse kinds, not only but mainly because they are collected in natural environments where real-life situations happen (for further information: http://hd.media.mit.edu/)
And the second one is similar to the technology discussed in the post Watch out, the ads is watching you… (December 22nd, 2010 by Roberto Saracco). It is being developed by the Affective Computing Laboratory, led by Rosalind W. Picard. It argues that people express and communicate their mental states through facial expressions, vocal nuances, gestures, and other non-verbal channels. Because of that, they developed a computational model that enables real-time analysis, tagging, and inference of cognitive-affective mental states from facial video. Applications range from measuring people’s experiences to a training tool for people with autism spectrum disorders and nonverbal learning disabilities, but I think it could also be used as a managerial tool (a version of this system is being made available commercially by the spin-off Affectiva, indexing emotion from faces).
In my opinion these kinds of improvements in the field of consumer behavior research could be interesting and useful in helping us gain better control of what we buy and why and, on the other side, in helping companies develop products that are more valuable in the eyes of consumers.
I remember reading a story, some years ago, of a child going up to the attic and rummaging through old boxes full of old things. Among these he found a strange panel with plenty of buttons, each with a faded letter on its top. The panel had a wire connected to it with a sort of plug at the end. He took it but he had no idea what it could be. So he took it down to his granddad and asked him: what is this? “A keyboard” he replied.
Is this the future of the keyboard?
This story came back to my mind reading the outcome of a panel discussion on the future of interfaces at CES 2011 in Las Vegas:
Since the early days of the Personal Computer we have been interacting with it through a keyboard and a mouse, and the reason we are still using them today, according to the panel (and I share this view), is that we are still using a PC.
However, we are starting to use several other computerized devices, like cell phones and car navigators. Whereas we have not been complaining about using a keyboard on a PC, we have complained several times about using one on a cell phone. Our fingers often do not fit the tiny pushbuttons. Carrying around a full size keyboard is not practical either. Hence the grumbling and the hope for something better.
Voice controlled devices are now available but are far from perfect. Touch screens are also an everyday experience, but touching tiny images of simulated pushbuttons is not nice either.
However, now there is a growing need for a different interface and that is the real driver to explore alternatives to the keyboard and mouse.
These alternatives are likely to be based on voice and image recognition, at least within this decade. In the more distant future, maybe a wireless brain-to-machine connection will move from science fiction to the mass market. What we have seen so far in the labs (controlling prosthetic arms or a pointer on a screen with our eyes or just by our thoughts) is fascinating but far from real application in the mass market.
My bet is that we will see the keyboard and mouse disappear once the PC disappears completely and processing becomes embedded in objects whose surface will then become the interface.
As I mentioned in some recent posts, Augmented Reality is moving to the mass market by exploiting the capabilities of our smartphones. However, the industry is also moving ahead with specialized devices that can provide a more immersive experience, like AR goggles.
Goggles for Augmented Reality
Clearly, this sort of device is not for everyday use, but I can imagine that once in a while it can be used to look at that monument with…new eyes. Also, such devices may find application in education, to get more information on a certain specimen. Studies are going on in this area to evaluate how effective immersive augmented reality is in studying certain topics.
Of course these devices are only as good as the content that is being provided and overlaid onto the real world. A seamless experience requires an understanding of the world around the viewer and a precise mapping of the various objects, to align the augmented information on the right spot. There is still a lot of work that needs to be done in the area of image recognition and rendering, but the progress has been amazing in these last two years.
Big screens have started to appear in shops with clips running to attract the attention of the shoppers. Are they really effective? This is the question being asked by owners and this is something researchers are working on to find a good answer.
Big screen to grab your attention and a hidden camera to watch you!
A new company, CognoVision http://www.cognovision.com/, is ready to provide it. They are offering video screens with an embedded camera. Their software analyses the images captured by the camera and is able to detect faces and determine the gender and the approximate age. Based on this information the appropriate clip is selected and shown, therefore increasing the probability of engaging the customer.
But it doesn’t end there. The camera will keep watching you as you are watching the clip and the software will be able to detect if you are pleased with what you see and possibly customize in real time the video based on your reactions.
A little bit intrusive? Yes, for sure, but in a way advertisement is intrusive, and the more effective it is the more intrusive it is! Marketeers team up with sociologists and psychologists to dig into our minds and look for ways to etch their message into our brains, conditioning our behavior.
It has been said that any sufficiently advanced technology is indistinguishable from magic. And I just stumbled onto something that really looks like magic, at least to me.
Satellite photography has become a given. We look at places on the Internet using Google Maps to see where that hotel is located, what is in its surroundings and so on.
I just discovered that satellite images can provide much more information than I ever suspected.
A series of images of a place taken at different times, processed with radio analysis techniques, can reveal changes that would be impossible to detect on the ground. An example is the processing of images of a portion of the Sahara desert. The tiny differences in the sand surface can be analyzed to detect that a truck was there or that a camp was set up, and from several hints it is possible to understand how many people were there and when.
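I do not know the details of the actual processing, which is surely far more refined, but the basic idea of comparing two images of the same place taken at different times can be sketched like this. The function `changed_cells`, the threshold and the tiny grids of made-up values are assumptions for illustration only.

```python
def changed_cells(before, after, threshold=0.1):
    """Return the (row, column) of every cell whose value changed by more than threshold."""
    return [(r, c)
            for r, (row_before, row_after) in enumerate(zip(before, after))
            for c, (b, a) in enumerate(zip(row_before, row_after))
            if abs(a - b) > threshold]


# Two made-up 2x3 "images" of the same patch of sand, taken on different days.
day_one = [[0.50, 0.50, 0.50],
           [0.50, 0.50, 0.50]]
day_two = [[0.50, 0.52, 0.50],   # small variation, e.g. wind
           [0.80, 0.50, 0.50]]   # large variation, e.g. tyre tracks

print(changed_cells(day_one, day_two))
```

Small variations are ignored as noise, while the cell that changed a lot is flagged for a closer look.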
As you can imagine, this kind of processing is being perfected by the military, but it can have applications in many fields: controlling (illegal) immigration paths, understanding the movements of herds, detecting changes in resources (like water) by exploiting the skill of animals in finding them, calculating the deep displacement of tectonic plates following an earthquake to assess the risk of mudslides in the area, and so on.
Archeologists are taking advantage of these technologies to discover ancient trails of civilizations long disappeared, looking beneath the sands of the desert and through the rainforests of the tropics.