Introduction

This deliverable principally consists of a working implementation (ALPHA stage) made available in source-code format and binary format compiled for several target architectures, demonstrating the extreme portability of this language interpreter.

This document accompanies the deliverable with a brief overview of characteristics and implementation choices deriving from DECODE's deliverable D3.3 "DECODE language design patterns", a living document currently designated as the whitepaper for the Zenroom implementation.

Following the publication of this deliverable, both documents will be merged into the Zenroom whitepaper, a living document nurtured by all findings published along this research path.

The following sections briefly illustrate the design choices made so far in developing a minimal language to express key privacy and integrity logic policies for data items, binding cryptographic primitives and human language into a domain-specific language (DSL).

High level and low level

This research is best understood by envisioning the dualistic path taken in the effort of making two extremely loose and far apart ends meet: the technical language of cryptography and the human language of rule expression. There is certainly no shortage of literature on the topic and it is well beyond the scope of this document to debate its values.

What this document delivers is a practical attempt, still a work in progress at the time of writing, to make these loose ends meet by means of a foundational DSL implementation. On one side, it facilitates the expression of cryptographic schemes; on the other side, still in progress, it matches the emerging semantics with the studies made on DECODE pilots and their need to clearly express credentials and entitlements.

The following implementation should therefore be seen as the common ground on which we are grafting our understanding of the pilot needs onto the cryptographic schemes implemented in DECODE.

Implementation

The implementation is named "Zenroom" and is available at https://zenroom.dyne.org. The release of this deliverable coincides with the release of version 0.5 of this software, following the progression that was started by deliverable D3.3 and that led to this deliverable across the iterations reported in its ChangeLog:

## 0.5.0
### April 2018

Fully adopted Milagro-crypto-C as underlying crypto library,
abandoning luazen at least for now. Refactored the API and language
approach to adopt a more object-oriented posture towards first-class
citizen data objects (octets) and keyrings. Full ECDH implementation
with support for multiple curve types.

Direct-syntax interpreter upgraded to Lua 5.3; dropped dependency from
lua_sandbox effectively cleaning up large portions of code.

Improved support for javascript; implemented a cryptographically
secure random generator linked to different RNG functions provided by
native platforms. Added build targets for Android and iOS, improved JS
support both for NodeJS and WASM targets.

Adopted an embedded memory-manager (umm) optionally enabled at
runtime, achieving significant speed improvements, reduction of
resources used and full control on memory allocation; adopted a
function pointer mechanism to easily include different memory managers
in the future.

Updated documentation accordingly with more examples and
tests. Half-baked RSA implementation may be abandoned in the future
unless use-cases arise.


## 0.4 (ALPHA)
### March 2018

Major improvements to standard Lua direct-syntax compatibility, port
to emscripten, osx and win targets. Documentation using LDoc and
website. Support for cjson and other embedded extensions. First
binary release, enters ALPHA stage.


## 0.3 (Prototype)
### February 2018

Build fixes for various architecture targets. Milagro integration,
test suites, continuous integration setup.

## 0.2
### December 2017

Whitepaper and improved Lua support.
Adopted luazen in place of luanacha.

## 0.1 (POC)
### November 2017

Proof of concept based on lua_sandbox

The technical documentation of Zenroom is already partially covered by its online API documentation, available at https://zenroom.dyne.org/api, which should be seen as complementary to this document.

Features

Zenroom can be primarily defined as a "virtual machine". It is self-contained and can be compiled as a completely static binary, without any dependency and ties to the host operating system.

Zenroom does not require any access to the network or to the filesystem: it simply processes data as input and returns data as output. In doing so it adopts its own experimental memory management, without even relying on the memory allocation calls provided by the OS. Its development will continue in the direction of making this component as minimal and independent as possible in covering its role, which is that of interpreting a language describing a set of rules that operate cryptographic transformations on rather complex data structures.

The namespace of reference for cryptographic primitives in Zenroom, likely to be adopted also in other DECODE component implementations, is the one provided by the Apache Milagro library, which appears to be successful in establishing itself as an efficient [@budroni2017efficient] and de-facto standard implementation library with no further dependencies, undergoing adoption for privacy-aware services [@rios2017query] and Internet of Things products [@scott2016sok].

Zenroom is extremely portable, being written following the C99 standard [@ritchie1988c] whenever possible. As of today we have ported its binaries to all major desktop platforms (Windows, Apple and GNU/Linux), to the major mobile platforms (Android and iOS) and to Java platforms via JNI. Last but not least, Zenroom successfully compiles and runs as "Javascript code" via LLVM/Emscripten, de facto placing this software as an active player in the upcoming wave of innovations bound to the adoption of the "WebAssembly" enabling technology in web browsers [@haas2017bringing].

As indicated already in D3.3 Zenroom adopts the Lua engine [@ierusalimschy1996lua] in a slightly modified form of version 5.3 as its direct-syntax parser.

Zenroom facilitates a declarative approach by implementing efficient schema validation, a crucial feature to anchor the development of taxonomies to a solid data-centric paradigm [@murata2005taxonomy]. It also allows code to be written in a lazy functional programming style based on recursion operators associated with data type definitions [@meijer1991functional].
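
As a rough illustration of the declarative approach to schema validation, the following Python sketch checks a data item against a schema that is itself expressed as plain data. It is not Zenroom's API: the schema format, the `validate` function, the `KEYRING_SCHEMA` name and the sample field names are all hypothetical, chosen only to show the idea of describing the expected shape of data instead of writing imperative checks.

```python
# Hypothetical sketch, not Zenroom's schema API.
# A schema is plain data: field names mapped to expected Python types,
# with nested dicts describing nested records.

def validate(schema, data, path=""):
    """Return a list of error strings; an empty list means the data conforms."""
    errors = []
    for field, expected in schema.items():
        where = path + field
        if field not in data:
            errors.append(f"{where}: missing")
        elif isinstance(expected, dict):
            if isinstance(data[field], dict):
                # Recurse into the nested record, extending the field path.
                errors.extend(validate(expected, data[field], where + "."))
            else:
                errors.append(f"{where}: expected a record")
        elif not isinstance(data[field], expected):
            errors.append(f"{where}: expected {expected.__name__}")
    # Reject fields that the schema does not declare.
    for field in data:
        if field not in schema:
            errors.append(f"{path + field}: unexpected")
    return errors


# Assumed example schema for a keyring-like record (illustrative only).
KEYRING_SCHEMA = {"owner": str, "keys": {"public": str, "curve": str}}

good = {"owner": "alice", "keys": {"public": "04ab", "curve": "ed25519"}}
bad = {"owner": "alice", "keys": {"public": 42}, "extra": True}

print(validate(KEYRING_SCHEMA, good))
print(validate(KEYRING_SCHEMA, bad))
```

Because the schema is data rather than code, it can be inspected, stored and exchanged alongside the items it describes, which is the property a taxonomy-driven design relies on.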

Finally, it aims at establishing as first-class citizens [@kosar2004stork] both simple cryptographic primitives and complex concepts such as keyrings belonging to people and entities.

Taxonomy of pilot entitlements

This section presents an analysis that is the foundation for future development directions of Zenroom, aimed at expressing the nouns and predicates emerging from the analysis of entitlements across pilots. The challenge ahead will be that of connecting this taxonomy with easily understandable scripts that can cover most of the configurations described.

We proceed by presenting an exemplary set of Smart Rules and related attributes for the application of DECODE's language development to pilots. The rationale for the selection of privacy by design strategies for the pilots descends from the need to apply the principles of minimization, separation, abstraction, hide, information, control, enforcement and demonstration as outlined in deliverable D1.2 (Privacy Design Strategies for the DECODE Architecture).

By endorsing an inductive methodology to pilot design, the goal is to define a general taxonomy of Smart Rules, attributes and entitlements that comply with the approach to Attribute Based Credentials defined in deliverable D1.4 (First Version of DECODE Architecture).

iDigital Decidim

In this first case, the problem to be solved is to provide safe identification for users while taking care in sharing sensitive user data to extract valuable information about city concerns that might later be used to propose data-driven policies in the city (Cf. D1.1 - Scenarios and Requirements Definition Report). Accordingly, the solution or service to be provided is a set of rules for data sharing or data “donation” by participants/owners under the rules issued by the issuer, i.e. the promoters of the Decidim platform. This dynamic can be audited by a relying party, for instance the Municipal Information Technology Institute at Barcelona City Council.

The rules, attributes and entitlements summarised below are meant to show in an intelligible form how sensitive personal data can be used for the public good, shaping the form of the “data commons”. The service is open source and can be adopted by other municipalities or companies who want to study the data for both data-driven public policymaking and private business, within a GDPR compliant environment.

In such an environment the stakeholders are all users of Decidim, research institutions interested in urban matters such as Eurecat and UB, data journalists, data service related industries and developers (Data Beers BCN, BCN Analytics Hub), app developers and hackers who want to use Metadecidim data and the DECODE platform to develop new services, and citizens who wish to learn or use data analysis techniques and share them (Cf. D1.1 - Scenarios and Requirements Definition Report).

Key enabling factors:

  • Transparency in data storage and user entitlements: enable users to control where their data is stored, choose what identifying information is shared and the granularity of access levels for that information
  • Auditable petition signing process: As a provider for enabling citizens to make collaborative decisions, there should be a way to audit and verify transactions in the system in a reliable manner

Hypothesis statement:

As a user I want to sign a petition in a secure, transparent and auditable process, and control the granularity of access to the personal information I share with my petition.

Result:

  • minimal app for users to decide data sharing rights (HOW MANY, and what are they?) and to see who accessed their data (Possible TOKEN3).
  • minimal visualisation of Decidim data and knowledge extraction to show to users that donated their data

Below is a preliminary set of Smart Rules (SR) for Decidim, as derived from the current Privacy Policy and Terms of Use published by Barcelona City Council on decidim.barcelona:

  • SR1: audit storage --> the user can audit the storage of her data within the file “Citizens Participation” located in the Decidim servers on the premises of Barcelona City Hall updated to run a DECODE node.
  • SR2: data subject credentials for registration --> national ID number, date of birth, post code to be accessed by Decidim according to the Attribute Based Credentials framework. Moreover the data subject must create a password and accept the GDPR compliant privacy policy and terms of use
  • SR3: granularity of access levels for that information
  • SR4: Audit transactions, i.e. petition vote
  • SR5: verify transactions, i.e. petition vote
  • SR6: submit a proposal
  • SR7: modify proposal
  • SR8: signal to data subjects any change in privacy policy
  • SR9: allow data subjects to amend their personal data sharing policy
  • SR10: allow data subject to cancel their subscription
  • SR11: allow registration for natural persons only if attribute “age” : decode account; age of subscriber; > 16 years
  • SR12: allow registration to legal entities such as groups, collectives, city organisations with the following credentials --> name of organisation, responsible person, telephone number of the organisation, email of the organisation, password, acceptance of GDPR compliant privacy policy and terms of use
  • SR13: illegal use of the platform (copyright, trademark infringements; publishing personal data belonging to other data subjects; sending spam or viruses; setting up pyramid schemes, ponzi schemes; commercial ads; non conformity with public decency; create multiple users to steer voting)
  • SR14: banning a participant
  • SR15: waiving Decidim's responsibility to address disputes among participants.
  • SR16: participants' Intellectual Property Rights: all user-generated content will be published under a Creative Commons license (CC-BY-SA).
  • SR17: Only the user name, a pseudonym, is public information accessible to third parties.
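
To make a rule such as SR11 more concrete, here is a minimal Python sketch of the age check it describes. It is only an illustration under stated assumptions: the function names `age_in_years` and `may_register` are hypothetical, and the threshold follows the rule as written (age strictly greater than 16 years, counted in completed years). In the spirit of Attribute Based Credentials, the check returns only a yes/no answer, so the relying party would not need to see the date of birth itself.

```python
from datetime import date

# SR11 as written: registration only if age > 16 years (assumed: completed years).
MIN_AGE_EXCLUSIVE = 16


def age_in_years(date_of_birth, today):
    """Completed years between date_of_birth and today."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def may_register(date_of_birth, today):
    """Hypothetical SR11 predicate: discloses a boolean, not the birth date."""
    return age_in_years(date_of_birth, today) > MIN_AGE_EXCLUSIVE


# Example usage with illustrative dates.
print(may_register(date(2000, 6, 15), date(2018, 4, 30)))
print(may_register(date(2001, 6, 15), date(2018, 4, 30)))
```

Whether the intended threshold is "older than 16" or "16 and older" is left open by the rule text; the constant above makes that choice explicit and easy to change.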

IoT Pilot involving CitizenSense

This pilot focuses on data produced by citizens in the context of crowdsourced scientific research that leverages data subjects' data donations. The latter come from different sensors and devices, especially smartphones, which can put at risk both the implementation of the research and the safety of the data subjects producing and donating data. In the case of this pilot the sensor consists of an Arduino chip and a minimal web interface that communicates with the Smart Citizen platform.

In particular, Barcelona's second pilot, CitizenSense, is designed to solve the problem of suitable and safe participation in the initiatives promoted under the umbrella of the “Oficina de Ciencia ciutadana”. The goal is to offer a GDPR compliant smart city service connecting the Open Data Barcelona portal to the Open Data infrastructure through DECODE to build and manage "data commons" datasets.

These datasets are currently gathered on ODI (Open Data Infrastructure), Sentilo, IRIS (Incidències, Reclamacions i Suggeriments), ASIA (Aplicatiu de Sistemes Integrats d'Atenció) and CityOS (City Operating System) from the Barcelona City Council infrastructure, and on two public sources, i.e. Smart Citizen and Inside Airbnb (Cf. D5.3 - Data analysis methods and first results from pilots).

The pilot would be a proof of concept for how a decentralised storage and access rights ledger with dynamic permissions (in the sense that citizens can revoke access) could be used to support distributed sensing projects. This includes the data sharing part, but also decentralised (or at least hashed) data storage solutions for highly non-scalable IoT sensing data streams (Cf. D1.1 - Scenarios and Requirements Definition Report).

Key enabling factors:

  • DECODE Hubs to store the data access permissions, that should be connected to the infrastructure where the actual IoT data is stored.
  • DECODE Node running on DECODE OS that would mediate access of a specific CitizenScience project to the DECODE Hubs.

Hypothesis statements:

  • As a user I want to be in control of my data.
  • As an IoT platform provider I want to give users a transparent, traceable, secure, collaborative platform (Cf. D1.1 - Scenarios and Requirements Definition Report).

Below is a preliminary set of Smart Rules (SR) for CitizenSense:

  • SR0: audit storage --> the user can audit the storage of her data, also by mining the dataset s/he co-produced in complete anon- or pseudonymity.
  • SR1: citizen revokes access to the data produced and donated by data subjects.
  • SR2: find data from citizens
  • SR3: access data from citizens
  • SR4: manage data from citizens
  • SR4: save data from citizens
  • SR5: register data donor with the possibility to recall them for new research
  • SR6: register research projects and put them in touch with data donors willing to take part in the research
  • SR7: banning a participant
  • SR8: verify integrity of data
  • SR9: data access traceability
  • SR10: Data transparency
  • SR11: Data reusability
  • SR12: expiration date - at some point, i.e. at the end of research data gathering, the validity token should expire, although a user can opt out at any moment before the token's validity expires.
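
The interplay between revocation (SR1) and expiration (SR12) can be sketched as a small data structure. The Python example below is a hedged illustration only: the `AccessToken` class, its field names and the sample identifiers are hypothetical and do not describe how DECODE or Zenroom actually represent such grants.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AccessToken:
    """Hypothetical access grant from a data donor to a research project."""
    donor: str
    project: str
    expires_on: date       # SR12: validity ends with the research data gathering
    revoked: bool = False  # SR1: the donor may opt out at any moment before expiry

    def revoke(self):
        """Mark the grant as withdrawn by the donor."""
        self.revoked = True

    def is_valid(self, today):
        # Valid only if the donor has not revoked it and the expiry date has not passed.
        return not self.revoked and today <= self.expires_on


# Example usage with illustrative names and dates.
token = AccessToken("donor-1", "air-quality-study", date(2018, 12, 31))
print(token.is_valid(date(2018, 6, 1)))
token.revoke()
print(token.is_valid(date(2018, 6, 1)))
```

Keeping both conditions in a single validity check means that a data consumer cannot accidentally honour a revoked grant that has not yet expired.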

Holiday Rental Registry / FairBnb

The pilot aims at combining the FairBnb community and Short Term Rental Register in Amsterdam. The goal is to show that DECODE can support the implementation of a city wide register. In order to achieve this goal, a web application that enables Amsterdam residents to register rental periods with the municipality will be developed.

Below is a preliminary set of Smart Rules (SR) for Holiday Rental Registry / FairBnb:

  • SR0: audit storage --> the user can audit the storage of her data, also by mining the dataset s/he co-produced in complete anon- or pseudonymity.
  • SR1: revoke access to data produced and shared by data subjects.
  • SR2: landlord submits data once
  • SR3: landlord shares data submitted many times
  • SR4: enabling FairBnb to interact with data collected by the city
  • SR5: citizen authenticates as an Amsterdam citizen against Municipal Personal Records Database
  • SR6: citizen registers address for rental on Short Term Rental Register in Amsterdam (Check address validity against Cadastre)
  • SR7: landlord registers rental periods on Short Term Rental Register in Amsterdam
  • SR8: give landlord information on the balance within the 60 days annual rental limit
  • SR9: give municipality information on the balance within the 60 days annual rental limit
  • SR10: FairBnb issues a certificate (token) to the municipality to record that two peers reached consensus for a rental transaction
  • SR11: Banning a landlord from business if s/he goes beyond the 60 days limit
  • SR12: register guest (attribute?)
  • SR13: register local business to take part in the distributed hospitality platform FairBnb
  • SR14: register citizen who is not a landlord (desirable? attribute?)
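
The balance mentioned in SR8 and SR9 is simple arithmetic over the rental periods registered under SR7. The Python sketch below shows one possible way to compute it, under assumptions that are not in the rules themselves: periods are given as (first night, checkout date) pairs, they do not overlap, and only the nights falling within the calendar year count. The names `rented_days` and `balance` are hypothetical.

```python
from datetime import date

# The annual rental limit named in SR8, SR9 and SR11.
ANNUAL_LIMIT_DAYS = 60


def rented_days(periods, year):
    """Count nights rented in `year`; each period is (first_night, checkout)."""
    total = 0
    for start, end in periods:
        # Clip the period to the calendar year before counting its nights.
        lo = max(start, date(year, 1, 1))
        hi = min(end, date(year + 1, 1, 1))
        total += max((hi - lo).days, 0)
    return total


def balance(periods, year):
    """Days still available under the limit (negative means the limit is exceeded)."""
    return ANNUAL_LIMIT_DAYS - rented_days(periods, year)


# Example usage with illustrative rental periods.
periods = [
    (date(2018, 3, 1), date(2018, 3, 11)),    # 10 nights in 2018
    (date(2018, 12, 25), date(2019, 1, 5)),   # 7 nights in 2018, 4 in 2019
]
print(balance(periods, 2018))
print(balance(periods, 2019))
```

A negative balance would be the condition under which a rule like SR11 applies; how that condition is enforced is outside the scope of this sketch.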

Gebiedonline

Gebiedonline (Neighborhood Online) is an Amsterdam based online neighbourhood platform that is cooperatively run and owned. Every decision is made within the community. The platform aims to enable people, groups and organisations to view events taking place in the area, share news, exchange and borrow products and services, and to meet people in a GDPR compliant environment.

Below is a preliminary set of Smart Rules (SR) for Gebiedonline:

  • SR0: Audit storage --> the user can audit the storage of her data, also by mining the dataset s/he co-produced in complete anon- or pseudonymity.
  • SR1: revoke access to data produced and shared by data subjects.
  • SR2: collective decision making
  • SR3: citizen registration
  • SR4: group registration
  • SR5: organization registration
  • SR6: create event
  • SR7: publish event
  • SR7: view event
  • SR8: share news
  • SR9: contact peers
  • SR10: offer product
  • SR11: borrow product
  • SR12: offer service
  • SR13: borrow service
  • SR14: banning subscriber
  • SR15: manage data sharing rights