Posts Tagged ‘DNA’

A new kind of computer: you!

Saturday, April 6th, 2013 by Roberto Saracco
A XOR gate that can be performed inside a cell.

Our bodies are made up of tens of trillions of cells, and now researchers at Stanford have managed to create a (sort of) computer that goes inside a cell and uses molecules available in the cell itself to perform computations.

Any computer has to carry out three basic operations: store data, transmit data and process data. The Stanford team made the news last year by succeeding in performing the first two operations using cellular processes and molecules. Now they have announced, in a paper published on March 28th in Science, that they have succeeded in implementing the missing third part: a functioning processing unit within the cell.

One of the basic processing units in a computer is the XOR logic gate. Strictly speaking XOR is not universal on its own (gates such as NAND or NOR are), but together with a few other simple gates it lets you build any operation you can dream of. If you want to dig into the properties of XOR and how to combine it with other gates to perform any kind of processing you can read the explanation on Wikipedia.
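To make the role of a gate like XOR concrete, here is a minimal Python sketch. It is my own illustration, not code from the Stanford paper, and the function names are hypothetical. It defines XOR, pairs it with an AND gate, and wires the two into a half adder, the first step towards arithmetic.

```python
def xor(a: int, b: int) -> int:
    """XOR gate: the output is 1 when exactly one input is 1."""
    return a ^ b


def and_gate(a: int, b: int) -> int:
    """AND gate: the output is 1 only when both inputs are 1."""
    return a & b


def half_adder(a: int, b: int):
    """Add two bits: XOR gives the sum bit, AND gives the carry bit."""
    return xor(a, b), and_gate(a, b)


# Print the truth table of XOR next to the half adder built from it.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "xor:", xor(a, b), "sum/carry:", half_adder(a, b))
```

The point of the sketch is that a handful of very simple gates, once they can be chained, is enough to start doing arithmetic.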

You can see the basic scheme of this “processor” on the left. What the researchers did was to find a way to use DNA and RNA to create this logic gate. They have called it a “transcriptor”, and it is a sort of biological transistor. It works using a well-calibrated mixture of enzymes, and the researchers have chosen those that are normally found in bacteria and other living cells, so that the transcriptor can indeed operate inside a living cell.

A very weak signal that on its own may not lead to the activation of a gene gets amplified when processed by the transcriptor (similarly to what happens in a transistor, which amplifies an electrical signal) and can therefore activate the gene(s).

According to the researchers this transcriptor can be used to assess the presence of a certain compound in a cell and remember it for later use, or to tell a cell, on command, to produce a certain protein by activating a specific gene, and so on. Actually, they claim that the range of applications is so broad that it cannot be defined by a single group, and this is why they have opened up the results of their work by placing them in the public domain, to let other teams further polish the method and come up with other applications.

Why has DNA been chosen by Nature?

Wednesday, January 30th, 2013 by Antonio Manzalini

DNA is the well-known macromolecule with a double helix, encoding all genetic information in a language with 64 three-letter words built from an alphabet of four different letters. The symbols used are A, C, G and T, standing for adenine, cytosine, guanine and thymine (thymine T is replaced by uracil U in RNA). Since the discovery of the molecular structure of DNA in 1953 by Watson and Crick, a lot of progress has been made in studying the ensembles of molecular structures of the genetic code.
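The figure of 64 words follows directly from the size of the alphabet: four letters taken three at a time give 4 × 4 × 4 combinations. A minimal Python sketch that enumerates them (the variable names are mine):

```python
from itertools import product

ALPHABET = "ACGT"  # adenine, cytosine, guanine, thymine

# Every ordered triplet of letters is one "word" (codon) of the language.
codons = ["".join(triplet) for triplet in product(ALPHABET, repeat=3)]

print(len(codons))   # number of three-letter words
print(codons[:8])    # the first few words, in alphabetical order
```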

Scientists are investigating why this special language has been chosen by Nature.

As we have read in the last post, there are efforts to mimic this language in informatics: an avenue towards DNA-based computing and bioinformatics. On this matter, let me go back again to symmetry.

DNA has two helical strands, which run anti-parallel to each other: this is an inherent symmetry, which is highly important in the replication process of DNA. Furthermore, they say that, with respect to the third nucleotide, the genetic code has an exact T-C permutation symmetry and an almost exact A-G permutation symmetry. Given the enormous importance of spontaneous symmetry breaking in several physical phenomena, these symmetries in the genetic code are even more amazing!
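A quick way to check third-letter symmetries of this kind is to swap two letters in the third position of every codon and see whether the encoded amino acid changes. The sketch below does that against the standard codon table (NCBI translation table 1). Using that table is my assumption, and the papers may rely on a different variant of the code; the function name is also mine. Under the standard table the T/C swap never changes the amino acid, while the A/G swap changes it in two of the sixteen codon families.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with the third letter varying fastest.
# "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def broken_families(x: str, y: str):
    """Two-letter prefixes where swapping x and y in third position changes the meaning."""
    return [
        first + second
        for first, second in product(BASES, repeat=2)
        if CODE[first + second + x] != CODE[first + second + y]
    ]


print("T<->C exceptions:", broken_families("T", "C"))
print("A<->G exceptions:", broken_families("A", "G"))
```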

In 1993 this paper proposed explaining the degeneracy of the genetic code as the result of a symmetry breaking process. This can be compared with explaining the positioning of the chemical elements in the periodic table as a consequence of an underlying dynamical symmetry (which, in turn, is reflected in the electronic shell structure of atoms). Have a look at this recent paper to read more details about this fascinating perspective. Universal characteristics of symmetry breaking are even here, in the language of Nature.

Since the discovery of DNA huge progress has been made in unveiling the genetic code, and the rate of discoveries in this field is accelerating day by day, thanks also to the growing amounts of processed genetic data: a multi-disciplinary approach capable of integrating Mathematics, Physics, Biology, Informatics and Engineering could lead to a breakthrough, profoundly changing ICT horizons.

A powerful programming language: the DNA

Tuesday, January 29th, 2013 by Roberto Saracco

This is the century of the Brain, at least this is what many scientists claim, but this decade is the decade of the genome. We (they) have been able to improve the sequencing process at a rate faster than Moore’s law, and the expectation is to be able to sequence a full genome in less than an hour at a cost of less than 100$ by the end of this decade.

Schematics of a DNA strand

This will create a big database with millions of genomes that can be analysed to understand how variation in the genome can lead to problems and diseases. And here comes the point: harvesting a tremendous amount of data and increasing our knowledge is good, but in order to have an impact on our well-being we need to be able to manipulate the genome to fix issues and also to increase its capabilities. Although the latter seems a second order priority (and clearly creates ethical issues), this is what is happening right now and what we will see increasing in this decade.

There are already several techniques for splicing, that is for introducing specific genes inside a chromosome. It is thanks to them that we have been able to create bacteria producing methane or bacteria eating oil to fight pollution in the ocean after an oil spill. However, today, splicing is very difficult.

Hence the interest in this news from researchers at MIT and Rockefeller University.

Researchers are proposing to use a natural bacterial system, made of a protein called Cas9 and RNA, that is able to recognise and cut viral DNA. Cas9 binds to RNA sequences that can target a specific location in the genome, and it cuts the DNA at that location.

Once the DNA strand is cut it can be recombined without a certain gene or a new gene can be inserted. In other words, rather than attacking the problem of changing the DNA itself researchers are proposing to use RNA as a tool for doing that.

With this approach one can easily program the DNA to instruct the production of any protein, like inserting a subroutine into a program.

The RNA complex is very precise: only an exact match with the DNA location will cause the cutting. The “power” of this programming language is not the variety of basic instructions (there are only 4 symbols that are assembled in triplets) but the billions of already written programs that have been tested through hundreds of millions of years in the field and finely tuned!

We can expect by the end of this decade to see many programmers at work, although when you ask them if they are using Python or older languages like C++ they will answer: I use RNA and DNA and it works pretty well!
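For the programmers among the readers, the “exact match, then cut and insert” idea can be sketched as a plain string operation. This is a toy model under loud assumptions: the genome and the guide are short strings written with the same DNA letters, the function `cut_and_insert` is a name I made up, and none of the real biochemistry (how the cell repairs the cut, for instance) is represented.

```python
def cut_and_insert(genome: str, guide: str, payload: str) -> str:
    """Toy model: cut right after the single exact match of `guide` and splice in `payload`."""
    first = genome.find(guide)
    if first == -1:
        # Only an exact match triggers the cut.
        raise ValueError("no exact match: nothing is cut")
    if genome.find(guide, first + 1) != -1:
        # An ambiguous target would mean cutting in the wrong place.
        raise ValueError("guide matches more than one site")
    cut = first + len(guide)
    return genome[:cut] + payload + genome[cut:]


# Hypothetical sequences, chosen only for illustration.
print(cut_and_insert("AAGCTTACGGATCCA", "TTACGG", "CCC"))
```

A guide that differs from the target by even one letter raises an error instead of cutting, which is the toy equivalent of the precision described above.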

Suppose you need to store something and retrieve it 10,000 years later…

Thursday, January 24th, 2013 by Roberto Saracco

Well, of course, the first thing that comes to mind is that you are unlikely to be blamed if the storage can no longer be read (or if you are, you likely won’t care!).

However, our world is increasingly described through bits, and the present storage media are unlikely to survive even 100 years. Compare this to the masterpieces of painters or writers. Can you imagine a world without the heritage of Van Gogh or Shakespeare? Indeed, if Van Gogh and Shakespeare had chosen (not that they could) to store their masterpieces in jpg or doc files, we would not be able to enjoy them today.

Comparison of DNA vs MagTape cost taking into account the cost of writing with DNA and the time of content life

Scientists have been working on this issue, which is becoming ever more pressing since today our digital store has reached the zettabyte (ZB) range. What can be used to preserve digital information forever (or at least 10,000 years)?

Just knock on Nature’s door.

Researchers at the EMBL-European BioInformatics Institute have succeeded in finding an effective way to write DNA code to store information.

DNA is a very robust way to store information: we can read the DNA of animals that became extinct long ago, tens or hundreds of thousands of years ago.

With all the advances of the last decade, reading DNA sequences has become easy and cheap. What has remained very difficult is writing specific sequences of DNA, since the number of errors during the writing procedure is too high.

The researchers at the EMBL-EBI have succeeded in developing a technique that is much more robust so that even 4 consecutive errors can be intercepted and recovered.
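To give a feeling of what “writing DNA code” means, here is a minimal Python sketch of one possible mapping from bytes to DNA letters. It is an illustration under my own assumptions and not the EMBL-EBI method: each byte becomes six base-3 digits, and each digit picks one of the three letters that differ from the previous one, so the same letter never appears twice in a row (I am assuming such runs are a typical source of errors). The redundancy needed to recover from errors is not shown.

```python
BASES = "ACGT"
START = "A"  # arbitrary reference letter, never written to the strand


def to_trits(data: bytes):
    """Turn each byte into 6 base-3 digits (3**6 = 729 covers 256 values)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))  # most significant digit first
    return trits


def encode(data: bytes) -> str:
    """Each digit selects one of the 3 letters different from the previous letter."""
    previous, strand = START, []
    for trit in to_trits(data):
        choices = [b for b in BASES if b != previous]
        previous = choices[trit]
        strand.append(previous)
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Invert `encode`: recover the digits, then regroup them into bytes."""
    previous, trits = START, []
    for base in strand:
        choices = [b for b in BASES if b != previous]
        trits.append(choices.index(base))
        previous = base
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for trit in trits[i:i + 6]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


print(encode(b"DNA"))
```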

Do not expect, however, to get rid of your hard disk anytime soon and replace it with a cup of DNA, even though that cup would be able to store over 100 million hours of HD video.

As shown in the graph, the cost is still very high, and it needs to decrease at least a hundredfold to become competitive with magnetic tape. But of course, cost reduction is what we have become used to in ICT!

Tagging molecules …

Thursday, September 27th, 2012 by Roberto Saracco

Science is progressing through painstaking observation and measurements. This started in the XVI century and continues today. We have invented amazing and ingenious ways to observe and measure the universe and the tiniest particle. Any new “tool” increasing our observation and measurement capability is bound to open up new discoveries and out of them new applications to our daily life…

Fluorescent colour particles attach to DNA strands

Hence, I was interested in reading the news that scientists have invented an extremely effective way to discover what molecules are present in a cell (or any other minute aggregation) and how they connect to one another.

To observe molecules in a cell researchers have to disseminate in the cell fluorescent particles that bind to specific molecules. Each particle can only bind to a specific molecule, so that if one observes a “red” dot that means the presence of a specific molecule.

The problem with this approach is that one can only detect as many molecule types as there are different coloured particles. Unfortunately we only have three or four colours to play with, and therefore it is only possible to detect four different types of molecules at the same time.

This is where the invention of the researchers of the Wyss Institute for Biologically Inspired Engineering at Harvard comes in handy.

They have found a way to attach coloured particles to DNA strands, which makes it possible to create an almost infinite set of patterns that can be detected. This solves the problem of observing many different types of molecules at the same time.
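Why do a few colours become so many patterns once they sit along a strand? With k colours in n positions there are k to the power n ordered patterns. The sketch below counts them for three colours and four positions (numbers I picked for illustration, not taken from the paper), and also counts how many remain if a strand cannot be told apart from its mirror image, which is an assumption about how the pattern is read.

```python
from itertools import product

COLOURS = ("red", "green", "blue")
POSITIONS = 4  # hypothetical number of coloured spots along one strand

# Every ordered arrangement of colours along the strand is one barcode.
patterns = list(product(COLOURS, repeat=POSITIONS))

# If a strand can be read from either end, a pattern and its reverse coincide.
undirected = {min(p, p[::-1]) for p in patterns}

print(len(COLOURS), "colours alone:", len(COLOURS), "molecule types")
print("ordered patterns:", len(patterns))
print("patterns up to reversal:", len(undirected))
```

Adding one more position multiplies the number of ordered patterns by the number of colours, which is where the “almost infinite” comes from.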

You may want to take a look at the paper to grasp the full details.

What matters to me is that this is another step in the direction of learning more about our cells, how they are composed at the molecular level and how they work. This is important as we are moving to the genomic era of medicine and need to understand the finely tuned steps that make life tick and that sometimes go astray. Because if we understand that, we are on our way to fixing problems.

Clearly, this additional understanding will pass through the manipulation of big data. Consider that a liver cell (to take just an example) contains about 8 billion protein molecules and you see what I mean.

Hi-C: looking at the DNA structure

Thursday, March 22nd, 2012 by Roberto Saracco

DNA: a fractal globule

It looks like a ball of thread, but, mathematically speaking, it is a very special ball: a fractal globule. And it represents the DNA forming a chromosome.

So far we have been used to seeing DNA as “the double helix”, a ribbon containing the pairs with the code of life. Now scientists have managed to look at the real 3D structure of the DNA forming the chromosome and have discovered that the ribbon is bent and wrapped to form a sort of sphere, but a very particular one. You might have expected it. The various portions of the “ribbon” need to be readily accessible to be copied, and this requires that the part of the ribbon you are interested in is reachable. This would not be the case in a normal ball of thread, but this is exactly what happens in fractal globules.

In spite of the very complex appearance, as shown in the figure, each portion of the ribbon can be easily isolated and once used can return to its original position in the ball. It is not knotted (in a way it is similar to the Peano curve).

The result was obtained by researchers from the Broad Institute (Harvard and MIT) and has been reported in the Harvard Gazette. To discover the structure of the DNA in the chromosome the researchers have used a new probing technique, Hi-C, that basically measures how often different portions of the DNA come into contact with one another. Portions that are near one another are likely to touch more frequently, and this is certainly the case for portions in nearby places on the DNA ribbon. However, the researchers have discovered many places with frequent contacts between portions that in principle should have been quite distant from one another along the DNA ribbon. By computing the distance in terms of contact frequency the researchers have been able to reconstruct the structure of the DNA, and it turns out that it is a fractal globule.

What I think is interesting is to notice how the processing of data can reveal the inner structure of our world, beyond the physical limits of optics. It is a sort of magic: detecting quantities and being able to derive meaning out of them. Get ready to see much more along these lines in our Digital Societies. Data are really at the core of everything.
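A flat, toy version of this idea can be played with in a few lines of Python. The sketch below is not the researchers' analysis: it lays a “ribbon” of 64 points along a Hilbert curve (a close relative of the Peano curve, chosen here only because the mapping is short to write) and then lists the pairs of points that touch in space although they are far apart along the ribbon. The grid size and the threshold of three steps are my own choices.

```python
def d2xy(side: int, d: int):
    """Map position d along a Hilbert curve to (x, y) on a side x side grid.

    `side` must be a power of two.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x  # rotate the quadrant
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def adjacent(p, q) -> bool:
    """True when two grid cells share an edge."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1]) == 1


SIDE = 8
points = [d2xy(SIDE, d) for d in range(SIDE * SIDE)]

# Pairs that touch in space but are at least 3 steps apart along the ribbon.
far_contacts = [
    (i, j)
    for i in range(len(points))
    for j in range(i + 3, len(points))
    if adjacent(points[i], points[j])
]

print(len(far_contacts), "contacts between distant parts of the ribbon")
print(far_contacts[:5])
```

Neighbours along the ribbon always touch, as expected, but the list of distant pairs that also touch is what carries the information about how the ribbon is folded.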

A USB Memory Stick? No, it’s a genome sequencer!

Thursday, March 8th, 2012 by Roberto Saracco

Several times in this blog I have posted news showing the acceleration of the technologies for sequencing the human genome. The target is to arrive, by the end of this decade, at every one of us (most of us) having our own genome sequence in our pocket. To achieve this result sequencing should become cheaper (down to 100 euros) and faster (like a blood test).

A USB Stick for Molecular analyses

This is “just” another one, but it really shows that we are on track. This USB device is able to detect DNA molecules. A single device does not sequence the whole genome, but once you plug it into a laptop and wire together twenty of them you will be able to sequence a whole human genome in … 15 minutes! That’s amazing.

The device is produced by Oxford Nanopore. They will begin selling this USB sequencer at the end of 2012 at a price of 900$.

Just think of the implications. A machine like this would be able to cut dramatically the price of genome sequencing (maybe not down to 100 euros but surely well under 1,000; you need to factor in the cost of the reagents required). Besides, its speed can make sequencing a normal procedure. Just remember that the first genome sequencing took 10 years and several billion $.

I can see the time, soon, when we will have our genome in our medical record and drugs will be prescribed based on the genome. Plus, proactive medicine will take into account the likelihood of suffering from certain diseases and will let us monitor for early telltale signs and take prompt action.

There will also be, as in any new technology, issues on privacy and potential misuse. Still, it is a new exciting frontier to explore!

Our DNA in the pocket

Saturday, January 21st, 2012 by Roberto Saracco

In the last few years I have written, from time to time, about the decoding of the genome, a feat that was accomplished at the turn of this century at a cost of several billion $ over a period of ten years.

The chip prototype for the genome decoding

The evolution of technology promised to bring down the price to a point where one could imagine every person having her own genome decoded. The expectation for this was by 2020. Now we are right on track, with a Connecticut-based biotech company, Ion Torrent, announcing a chip, ready for use in 2013, able to decode a person’s genome in a day at a cost of 1,000$. It may not be within the (affordable) reach of most people but it is clearly a step in the right direction. A million-fold slashing of cost in 14 years makes the forecast of a cost in the range of 10-100$ within the following 5 years perfectly reasonable.

So, I guess we can take for granted that from an affordability standpoint we will be able to decode the genome of any newborn in her first week of life by the end of this decade. That code will remain associated with her clinical record throughout her life. That will enable a completely new approach to cure and prevention, as well as create a worldwide database of the human genome.
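The arithmetic behind that forecast can be checked in a few lines. The sketch below takes the figures from the text (a million-fold drop in 14 years, 1,000$ as the starting point) and assumes, which is my simplification, that the cost keeps halving at the same steady pace for five more years.

```python
import math

COST_DROP = 1_000_000   # a million-fold reduction ...
YEARS = 14              # ... over 14 years
COST_NOW = 1000         # dollars per genome with the announced chip

# A million-fold drop is about 20 halvings, so how often does the cost halve?
halvings = math.log2(COST_DROP)
halving_time = YEARS / halvings

# Extrapolate five more years at the same pace.
cost_in_5_years = COST_NOW / 2 ** (5 / halving_time)

print(f"cost halves every {halving_time:.2f} years")
print(f"projected cost after 5 more years: {cost_in_5_years:.0f} $")
```

The cost halving in well under a year compares with the roughly two years usually quoted for Moore's law, and if the pace holds the projection lands at the low end of the 10-100$ range.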

This, from a technology point of view, leads directly into the Big Data and from there to the amazing wealth that can be leveraged. Today the availability of the genome decoding has allowed scientists to understand a bit of what all codons mean. Just an infinitesimal fraction of the knowledge that is hidden in the genome. Progress is slow because the copy of the genome one researcher is looking at is “generally” representative of the “human” genome but not quite. It is actually from the differences present in the genome as we move from one person to another that we can grow our understanding.

Having billions of genomes decoded opens up a completely new approach to creating knowledge, based on statistical analyses and then on comparative analyses.

I can just imagine the huge number of services that will be offered once we have each person’s genome decoded and all genome data shared. And I am pretty sure that I am greatly underestimating their number.

ICT and Bio will be more and more inter-related and we are going to see a profound change in our life because of that.

Understanding the DNA program…

Tuesday, November 8th, 2011 by Roberto Saracco

All the variety of life, as far as we can tell, stems from instructions coded in the DNA. We have been able to understand the coding of proteins but so far we have been missing the big picture. How is it that a certain DNA string ends up producing a whale and another, pretty similar code by the way, ends up producing us or an earthworm?

Mechanisms that determine the form of the living being

Well, researchers at the École Polytechnique Fédérale de Lausanne and at the University of Geneva have unraveled a part of the mystery.

During the development of an embryo, everything happens at a specific moment. In about 48 hours, it will grow from the top to the bottom, one slice at a time – scientists call this the embryo’s segmentation. “We’re made up of thirty-odd horizontal slices,” explains Denis Duboule, a professor at EPFL and Unige. “These slices correspond more or less to the number of vertebrae we have.”

Every hour and a half, a new segment is built. The genes corresponding to the cervical vertebrae, the thoracic vertebrae, the lumbar vertebrae and the tailbone become activated at exactly the right moment one after another. “If the timing is not followed to the letter, you’ll end up with ribs coming off your lumbar vertebrae,” jokes Duboule. How do the genes know how to launch themselves into action in such a perfectly synchronized manner? “We assumed that the DNA played the role of a kind of clock. But we didn’t understand how.”

All the development in this early stage, where the foundation of what the organism will look like is laid, happens with clockwork precision.

What is interesting, beyond the fact that any discovery like this stimulates my curiosity, is that understanding the code means expanding our capabilities of interacting with bio substances and also potentially using some of those mechanisms in our building with nanotechnologies. The next decade will see an explosion of new materials designed in the lab to fit the needs of engineers (and ourselves as end users). It will be the decade of micro fabrication, and although there are concerns, we should notice that we will get a bit closer to working in the natural way!

Penicillin wiped out my hard drive!

Friday, January 14th, 2011 by Ottone Maurizio Grasso

In a foreseeable future we might accidentally wipe our video collection or our photographic history because we forgot to feed the memory that contained them!

Since Watson and Crick’s discovery of the double helix structure of DNA, with its four-nucleobase alphabet, we’ve been fascinated by this way of encoding information. In recent times several laboratories have undertaken efforts to use this biological instrument for purposes other than supporting life. Biostorage, the idea of storing any kind of information in living organisms, is quite a new field: the first attempts in this direction are just 10 years old.

By Photo by Eric Erbe, digital colorization by Christopher Pooley, both of USDA, ARS, EMU. (ARS Image Gallery Image Number K11077-1 (highres)) [Public domain], via Wikimedia Commons

In 2007 short sequences of text were successfully encoded in bacteria by a team of Keio University in Tokyo [1]. Now a team from the Chinese University of Hong Kong (CUHK) has managed to store more complex data (like video) in E. coli bacteria. They did so by developing a method of compressing data, splitting it into chunks and distributing it between different bacterial cells, which helps to overcome limits on storage capacity. They are also able to “map” the DNA so information can be easily located [2]. The information density reachable by this approach is astonishing: researchers say that one gram of bacteria could store the same amount of information as 450 hard disks of 2 terabytes each.

Just imagine how much traditional archives would shrink if stored on such a medium: the shelving of the national archives of developed countries usually extends for hundreds of kilometers but would fit on a few petri dishes stored in a refrigerator!
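The “compress, split, distribute, map” recipe has a familiar software shape. The sketch below is a rough analogy in Python and not the CUHK method: data is compressed with zlib, cut into small numbered chunks (each chunk standing in for what one bacterial cell might carry), shuffled, and then put back together using the chunk numbers as the “map”. The chunk size and function names are my own.

```python
import random
import zlib


def split_into_chunks(data: bytes, chunk_size: int):
    """Compress the data, then cut it into numbered chunks of at most chunk_size bytes."""
    compressed = zlib.compress(data)
    return [
        (start // chunk_size, compressed[start:start + chunk_size])
        for start in range(0, len(compressed), chunk_size)
    ]


def reassemble(chunks) -> bytes:
    """Use the chunk numbers as a map to restore the order, then decompress."""
    ordered = b"".join(payload for _, payload in sorted(chunks))
    return zlib.decompress(ordered)


message = b"Penicillin wiped out my hard drive! " * 20
chunks = split_into_chunks(message, chunk_size=8)

random.shuffle(chunks)  # the cells carrying the chunks are in no particular order
restored = reassemble(chunks)

print(len(message), "bytes stored as", len(chunks), "chunks")
print(restored == message)
```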

Besides that, as bacteria reproduce this would mean that this information could be preserved as long as we keep these organisms alive.

We’re obviously far away from being able to use this marvelous storage medium outside of a biology laboratory, but the road is being paved for the moment when an accidental drop of penicillin could wreak havoc on your hard drive.


[1] Yachie, N., Sekiyama, K., Sugahara, J., Ohashi, Y. and Tomita, M. (2007), Alignment-Based Approach for Durable Data Storage into Living Organisms. Biotechnology Progress, 23: 501–505. doi: 10.1021/bp060261y

[2] Lou S, Ni B, Lo LY, Tsui SK, Chan TF, and Leung KS: ABMapper: a suffix array-based tool for multi-location searching and splice-junction mapping. Bioinformatics. 2010

[2] Lou S, Li JW, Qin H, Yim AK, Lo LY, Ni B, Leung KS, Tsui SK, and Chan TF: Detection of splicing events and multiread locations from RNA-seq data based on a geometric-tail (GT) distribution of intron length. BMC Bioinformatics. 2010