One, two, … a thousand variants on the “Personal Data Store” theme
Wednesday, November 2nd, 2011 by Corrado Moiso
Several companies and projects are proposing concrete solutions that give individuals a set of capabilities to collect, manage, share and use their “Personal Data” (for an incomplete list see http://blogs.law.harvard.edu/vrm/2011/02/15/state-of-the-vroom/). In my last posts, I discussed how these “Personal Data Stores” (or, better, “Personal Data Services”) are a key element for building an individual-centric ecosystem among all the actors involved in the production and use of data about persons.
Unfortunately, at the moment, there is no clear and stable vision of the functions that a “Personal Data Store” should provide. Different products and prototypes provide different sets of capabilities, motivated by different requirements and usage scenarios. One of the topics under discussion about the ecosystem built around personal data concerns the possible business models and how they are affected by the different types of “Personal Data Stores” (e.g., see the session on “Personal Data Ecosystem Consortium” during the last Internet Identity Workshop).
In this post, I would like to contribute to the discussion by proposing a possible decomposition of these capabilities into five groups, and by relating each of them to a different application scenario.
- features to enable individuals to create and manage their “digital footprint”: these should include functions for a “Data Space” for the storage of the “digital footprint”, the collection of personal data from different sources, their organization (e.g., enriching data with metadata), search/retrieval, and visualization; it is important that an open/public data model is adopted for retrieving and organizing the data and that some functions are performed automatically, such as the collection of data from different sources (e.g., personal devices) and the generation of metadata (e.g., by means of tools performing semantic analysis or data mining); at the moment, I think that several of the “Personal Data Store” proposals partially cover these characteristics (e.g., see http://mydex.org/, http://lockerproject.org/, http://themineproject.org/);
- features to enable the execution of personal applications exploiting the data in the “digital footprint”: these must offer a set of mechanisms enabling applications to access the data according to an open data model (e.g., query, read/write operations, event notification according to a pub/sub model), and a trusted environment (e.g., a sandbox) for the deployment, management and execution of “personal” applications (e.g., applications which help individuals improve their lives, such as applications for “personal information management” or “personal task management”); an example is provided by Kynetx (http://docs.kynetx.com/ );
- features to enable a controlled sharing of the data in the “digital footprint” in the context of some service delivery: these capabilities concern the definition of a (temporary or permanent) relationship between an individual and a third party (other individuals, enterprises, service providers, public organizations, etc.) in the context of which personal data are accessed, shared or synchronized (examples are functions implementing XDI-based data views); in order to correctly establish and control the relationships, these functions should interact closely with a (federated) identity framework; moreover, these functions should include mechanisms to support policies on data use control, i.e., policies constraining how data disclosed by an individual can be used by a 3rd party; an example is the XDI-native Personal Data Store developed in the open source Project Danube (http://projectdanube.org/);
- features to create and handle aggregations of data: these functions are in charge of managing the relationships between a (homogeneous) group of individuals and a third party (e.g., a public organization, a data broker, etc.); they create aggregations of the data disclosed by each of the group members, according to different aggregation criteria, by applying neutralization/anonymization filters on sensitive data, and by improving data sets by reducing “statistical” effects, etc.;
- features to deal with negotiations on personal data: these features should enable individuals to negotiate the conditions on the disclosure of their data to 3rd parties, in order to get some economic or social advantages; they should also include functions to enable the negotiation of aggregations of data offered by different users, and the distribution of benefits to the contributing users; these functions could be supported by the definition of algorithms to evaluate the value of the data offered by the individuals or grouped in some aggregation, and to automate the negotiations (e.g., according to some auction model) between the individuals disclosing the data and the actors (aggregators, service providers, etc.) interested in using them; an example is the solution prototyped by statz (www.statz.com), which also covers the functions in the previous group.
In general, these groups of features could be seen as layers: the functions in a layer rely on the functions in the lower layers, and the layers could be introduced incrementally.
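To make the two lowest layers more tangible, here is a minimal sketch of a data space that offers query, write and pub/sub notification to personal applications. Everything in it (the `DataSpace` class, its method names, the flat key/category layout) is a hypothetical illustration of mine, not the API of any of the products mentioned above.

```python
from collections import defaultdict
from typing import Any, Callable


class DataSpace:
    """Hypothetical in-memory data space with query and pub/sub access."""

    def __init__(self) -> None:
        # key -> (category, value); this flat layout is an assumption.
        self._records: dict = {}
        # category -> callbacks to notify when a record in it is written.
        self._subscribers = defaultdict(list)

    def subscribe(self, category: str, callback: Callable[[str, Any], None]) -> None:
        """Register a personal application for events on one category."""
        self._subscribers[category].append(callback)

    def write(self, key: str, category: str, value: Any) -> None:
        """Store a record and notify the subscribers of its category."""
        self._records[key] = (category, value)
        for callback in self._subscribers[category]:
            callback(key, value)

    def query(self, category: str) -> dict:
        """Return all records of a category as a key -> value mapping."""
        return {k: v for k, (c, v) in self._records.items() if c == category}
```

A personal application would call `subscribe` once and then react to each `write` in its category, while the layers above (sharing, aggregation, negotiation) would build on `query`.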
In the following I would like to share some preliminary considerations on the application scenarios (and related business opportunities) enabled by each of the previously introduced groups of features.
The set of features to create a “Data Space for digital footprint” can provide individuals with the benefits that organizations have enjoyed for years after the introduction of information management systems (e.g., enabling the real-time control of their processes, the creation of CRM systems, the introduction of data warehouses, and the exploitation of data mining algorithms). Different providers could offer individuals services that differ in terms of functional coverage, level of automation and configuration, and interaction with external systems.
The full exploitation of the digital footprint can only be achieved by opening its access to applications developed by 3rd parties. This is provided by the capabilities for “Trusted environment for personal applications”, whose introduction enables the creation of an application ecosystem similar to the one built according to the “Application Stores” models for smartphones. Individuals can select the relevant applications offered in an Application Store, buy them, and download and deploy them on the execution environment associated with their data space.
The third group of functions, for “Controlled sharing of individual’s personal data” with 3rd parties, enables scenarios similar to the ones investigated in the “Vendor Relationship Management” initiative to create relationships between individuals and organizations (either enterprises or public organizations) aligned with the principles of “transparency, fairness and user control”. Pre-defined templates could be adopted in order to ease the definition of relationships (i.e., the set of shared data, the allowed uses of them, the duration of the relationship, etc.) among individuals and providers of services (both in the real and in the digital worlds). Moreover, the establishment of relationships among individuals could generate new initiatives of federated social networking. These application scenarios should be integrated with those enabled by the introduction of federated identity management frameworks.
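As a rough illustration of such a relationship template, the sketch below records the three elements named above (shared data, allowed uses, duration) and checks them before any data is disclosed. The `Relationship` class, the `disclose` function and the field names are assumptions made for this example; a real system would rely on XDI link contracts or a comparable mechanism rather than this toy check.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Relationship:
    """Hypothetical template for a sharing relationship with a 3rd party."""
    third_party: str
    shared_fields: frozenset   # which data items may be disclosed
    allowed_uses: frozenset    # purposes the 3rd party may use them for
    expires: date              # last day on which the relationship is valid


def disclose(record: dict, rel: Relationship, purpose: str, today: date) -> dict:
    """Return only the fields covered by the relationship, or refuse."""
    if today > rel.expires:
        raise PermissionError("relationship has expired")
    if purpose not in rel.allowed_uses:
        raise PermissionError(f"use '{purpose}' is not allowed")
    return {k: v for k, v in record.items() if k in rel.shared_fields}
```

For example, a relationship with an energy supplier that shares only `address` and `kwh` for the purpose `billing` would return those two fields for a billing request and raise `PermissionError` for a marketing request.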
The functions on “Data aggregations” enable application scenarios where the aggregated view of individuals is relevant. An example is the creation of an aggregated view of personal data in the relationship between the individuals living in a given city and the municipality (a related example is provided by the recent DCC initiative of the UK government about the handling of data on energy usage). Individuals could disclose, for the creation of an aggregated view, all the data relevant for improving the management of the territory, possibly requiring the application of anonymization and neutralization filters to protect some sensitive data.
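A minimal sketch of this kind of aggregation, using the energy usage example: individual readings are grouped by district, identifiers are dropped, and districts with too few contributors are suppressed so that no single household can be singled out. The function name, the record fields and the small-group suppression rule are my assumptions; the post does not prescribe a specific anonymization technique.

```python
from collections import defaultdict
from statistics import mean


def aggregate_energy(records: list, min_group_size: int = 3) -> dict:
    """Average kWh per district, omitting districts with few contributors.

    Each record is assumed to look like
    {"person": ..., "district": ..., "kwh": ...}.
    """
    groups = defaultdict(list)
    for record in records:
        # The "person" identifier is deliberately never copied to the output.
        groups[record["district"]].append(record["kwh"])
    return {
        district: mean(values)
        for district, values in groups.items()
        if len(values) >= min_group_size  # suppress small groups
    }
```

The threshold plays the role of a very simple neutralization filter: the municipality sees only district-level averages, and only where enough residents contributed.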
Finally, the set of capabilities related to “Personal data negotiation” would create the opportunity for a richer and fairer marketplace for personal data, enabling what has been called “The Economics of Personal Data and the Economics of Privacy”. Individuals can negotiate the conditions under which 3rd parties (service providers, data brokers, …) may access some of their data, possibly with the involvement of an actor playing an intermediary role. Individuals and 3rd parties can agree on which data are disclosed, possible neutralization filters, etc., and on the benefits for individuals (in terms of money, free access to services, etc.). In this way, individuals can be more actively involved in the exploitation of their personal data (at least to achieve a greater awareness of the data disclosed to have access to free services).
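One way to picture an automated negotiation is a sealed-bid auction in which the individual sets a reserve price and the highest acceptable bid wins. This is only a sketch under that assumption: `run_auction`, the bid format and the first-price rule are hypothetical choices of mine, and other auction models would fit the scenario equally well.

```python
from typing import Optional, Tuple


def run_auction(bids: dict, reserve_price: float) -> Optional[Tuple[str, float]]:
    """Pick the highest bid that meets the individual's reserve price.

    `bids` maps a bidder name to the amount offered for access to the data.
    Returns (winner, amount), or None if no bid reaches the reserve price.
    """
    eligible = {bidder: amount for bidder, amount in bids.items()
                if amount >= reserve_price}
    if not eligible:
        return None  # the individual keeps the data undisclosed
    winner = max(eligible, key=eligible.get)
    return winner, eligible[winner]
```

An intermediary could run the same function on behalf of many individuals, or over an aggregation of data, and then distribute the winning amount among the contributors.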
In this analysis I assumed that the different sets of functions are provided by a single type of entity, i.e., the “Personal Data Store” providers. A deeper analysis could consider more complex configurations, where some of the groups of functions are provided by a different actor. For instance, the functions on “data aggregation” could be provided by a “Data Aggregator”, which accesses the individuals’ data by means of relationships supported by the functions in the “controlled information sharing” layer implemented in the “Personal Data Store”. Moreover, other actors could be involved in order to provide basic enabling capabilities, such as providers of storage.
In my view, in order to better understand the business models related to these application scenarios, it is necessary to investigate how the different actors involved in the personal data ecosystem estimate the “value” of the services implementing the identified groups of functions. Each actor can arrive at a different estimate, strongly influenced by the “value” it assigns to the handled personal data, which is determined by the possible economic, social and personal advantages that the actor can obtain through their exploitation.
Tags: identity management, personal data, privacy, service ecosystems



