A new journalist reads in many languages and reports in real time
Sunday, July 8th, 2012 by Roberto Saracco
A DARPA research program, Auto-Text to Knowledge, is about to release its results. By the end of this year the project will provide the Pentagon with near-real-time summaries of articles being published all over the world in many languages. The summaries will be in English and will read like ones written by a good journalist who can get the gist of what is being discussed in an article and condense it into a quick brief.
The point is that no journalist is involved. The work is done by computers that keep scanning news sites, analyze the articles and summarize them, and also establish cross-connections among them to get the overall picture.
We already have programs, like FlipBoard, that do a very good job of extracting articles from many sources and publishing them in an online, magazine-like form. Now it will also be possible to have summaries of those articles for a quick scan.
The technology involved is quite complex and pushes the boundaries of computer intelligence. The computer has to make sense of what is being written and has to put it into context. Furthermore, to compare what is being written in different places about a certain topic, it has to understand the different points of view. There is basically no limit to the sophistication involved (it may go beyond the capabilities of a human journalist…). Think about comparing a piece of political news as reported by a Syrian newspaper (tied to the present government) with the same news as reported by a European newspaper that opposes the Syrian government. The facts might be the same, but the way they are reported may be completely different!
I am really curious to see the extent of “smartness” that is achieved in this very ambitious project. Notice that they target “knowledge”, and that is a very tricky area. You understand immediately what knowledge is, but as soon as you start looking into it, it becomes very fuzzy indeed…
It is clear that this is the way to the future of browsing the web: having automatic tools that can do the browsing and present a summary of what is out there. The question is, how can we trust them? Already today Google claims to be transparent in its reporting of links in response to your query, but is that really the case? Placing a link on the 10256th page is probably the most efficient way of hiding it, rather than disclosing it! Formally they can claim they are giving the information to you, but in practice, by choosing what goes on the first (few) page(s), they are steering you to very specific places… and it is their decision, not yours.
Tools that will summarize news will pose even greater issues, particularly if they claim to provide us with “knowledge”. New possibilities are around the corner, and new challenges we never considered before will need to be faced.


