bulk commit for progress made in the last month

adds BDD, Schema and lots more info and figures
Jaromil 2018-12-20 15:15:14 +01:00
parent 557c394716
commit bd4cb12750
9 changed files with 372 additions and 9 deletions

views/bdd.md Normal file

@ -0,0 +1,56 @@
# Behavior Driven Development
In Behavior Driven Development (BDD), the important role of software integration and unit tests is extended to serve both the purposes of designing the human-machine interaction flow (user journey in UX terms) and of laying down a common ground for interaction between designers and stakeholders. In this Agile software development methodology the software testing suite is based on natural language units that grant a common understanding for all participants and observers.
To implement BDD, the first step is to map a series of combinable, cascading sentences to actual source code; this mapping is usually done manually by programmers who know the higher-level application programming interface (API) that grants communication between the backend and the frontend of a software application. The BDD implementation can then be seen as an alternative frontend whose purpose is to shorten the distance between expression and execution by means of utterances expressed in human language.
Far from giving an exhaustive description of BDD implementations and characteristics, this brief chapter intends to summarize the features of this approach where they specifically apply to the development goals of Zencode (previously stated) and the solution provided.
Referring to the Cucumber implementation of BDD, arguably the most popular in use by the industry today and a de facto standard, the grammar of utterances is very simple and can indeed be defined as a "cascading" flow, since the lines can follow only one fixed order:
- Given
- and*
- When
- and*
- Then
This sequence is fixed and in simple terms consists of an extendable initialisation of states "Given (and)*" followed by an extendable transformation of states "When (and)*" and concluded by a non-extendable enunciation of states in their final form "Then".
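The fixed ordering described above can be sketched as a small state machine. This is only an illustration in plain Lua: `check_order` is a hypothetical helper, not a Zenroom function, and it applies the `Given (and)* When (and)* Then` order strictly, as stated in the text.

```lua
-- Hypothetical helper (not part of Zenroom): checks that the keyword
-- opening each line follows the order Given (and)* When (and)* Then.
function check_order(lines)
  local stage = "start"   -- start -> given -> when -> then
  for i, line in ipairs(lines) do
    local word = line:match("^%s*(%a+)")
    word = word and word:lower() or ""
    if word == "given" and stage == "start" then
      stage = "given"
    elseif word == "when" and stage == "given" then
      stage = "when"
    elseif word == "then" and stage == "when" then
      stage = "then"
    elseif word == "and" and (stage == "given" or stage == "when") then
      -- "and" extends the current block, so the stage does not change
    else
      return false, "unexpected '" .. word .. "' at line " .. i
    end
  end
  -- the utterance is complete only once "Then" has been reached
  return stage == "then"
end
```

A trailing `and` after `Then` is rejected here because the text describes the final enunciation as non-extendable.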
The Zenroom implementation is kept simple at this stage and does not take any "fuzzy" approach to parsing; it simply defines fixed sequences of strings and the variables that are expected to occur within them. The variables are what users can ultimately change, and are marked by two adjacent single quotes ('').
The underlying parser acts upon a positive, unique and so far non-flexible match of the whole phrase minus the variables, then executes a function that takes as many arguments as there are variables in the matched line. As a result, every single non-repeating line of the utterance has a declared function that interacts with the underlying implementation of Zenroom, whose actions are defined in its Lua subset language.
Brief examples of this implementation follow:
```lua
Given("I introduce myself as ''", function(name) whoami = name end)
Given("I am known as ''", function(name) whoami = name end)
```
The two definitions above, for lines that may occur within Zencode utterances, demonstrate how a state "who am I" (basically my own name) can be set using two different phrases, both leading to the execution of the same function, which performs a simple assignment to the variable `whoami`. This simple demonstration hints at the fact that the same pattern can be defined in multiple ways, making the Zencode DSL implementation very easy to translate across different spoken languages as well as to contextualize within specific idiolects adopted by humans.
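How a phrase containing `''` placeholders might be matched against a line can be sketched in plain Lua. This is not the Zenroom parser: `to_pattern` and `run_line` are hypothetical names, the `Given` defined here is a stand-in, and the sketch assumes that the leading keyword has already been stripped from the line.

```lua
local unpack = table.unpack or unpack   -- Lua 5.1 and 5.2+ compatibility
local steps = {}                        -- registered phrases and functions

-- Turn a phrase such as "I am known as ''" into an anchored Lua pattern
-- in which every '' placeholder captures the text between single quotes.
local function to_pattern(phrase)
  local escaped = phrase:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0")
  return "^" .. escaped:gsub("''", "'(.-)'") .. "$"
end

-- Stand-in for the Given() registration shown above.
function Given(phrase, fn)
  steps[#steps + 1] = { pattern = to_pattern(phrase), fn = fn }
end

-- Execute the first registered phrase that matches the whole line,
-- passing the quoted values as arguments; returns false if none match.
function run_line(line)
  for _, step in ipairs(steps) do
    local captures = { line:match(step.pattern) }
    if #captures > 0 then
      step.fn(unpack(captures))
      return true
    end
  end
  return false
end

Given("I introduce myself as ''", function(name) whoami = name end)
Given("I am known as ''", function(name) whoami = name end)
run_line("I am known as 'Alice'")
print(whoami)
```

The match is anchored at both ends, which mirrors the strict, non-fuzzy matching described above: a line that differs by a single word is simply not recognised.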
Furthermore, another example of implementation:
```lua
Given("that '' declares to be ''", function(who, decl)
   -- declaration: append to any previous one
   if not declared then declared = decl
   else declared = declared .." and ".. decl end
   -- remember the subject making the declaration
   whois = who
end)
Given("declares also to be ''", function(decl)
   -- the subject is stored in `whois` by the previous line
   ZEN.assert(whois and whois ~= "", "The subject making the declaration is unknown")
   -- declaration: append to any previous one
   if not declared then declared = decl
   else declared = declared .." and ".. decl end
end)
```
This example shows how it is possible to accept multiple variables and process them through more complex transformations that also contemplate the concatenation of contents to previous states. States are in fact persistent within the scope of the execution of a single utterance and will be modified in the same deterministic order in which they are expressed across lines. What is also visible within this example implementation, which we intend to open to customisations made by people who have a basic knowledge of Zenroom's API and Lua scripting, is that the `ZEN.` namespace makes available a number of utility functions to easily check states (asserts) and propagate meaningful error messages that are then part of a traceback output given to the calling application (host) on occurrence of an error.
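The combination of persistent state and assertions can be illustrated with a minimal, self-contained sketch. The names `zen_assert` and `declare` are hypothetical stand-ins written for this example; the real `ZEN.assert` may behave differently, and the host is simulated here with `pcall`.

```lua
-- Hypothetical stand-in for ZEN.assert: abort with a readable message.
function zen_assert(condition, message)
  if not condition then
    error(message, 2)
  end
end

-- Append a declaration to the state, as in the example above.
-- `state` plays the role of the globals shared across lines.
function declare(state, decl)
  zen_assert(state.whois ~= nil and state.whois ~= "",
             "The subject making the declaration is unknown")
  if state.declared then
    state.declared = state.declared .. " and " .. decl
  else
    state.declared = decl
  end
end

-- Lines are applied in order, so the state grows deterministically.
local state = { whois = "Alice" }
declare(state, "lost in Wonderland")
declare(state, "late for tea")
print(state.declared)

-- The host side: a failed assertion surfaces as an error message.
local ok, err = pcall(declare, {}, "anything")
print(ok, err)
```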
The full implementation of Zencode available at the time of publishing this document is inside the source code files `zenroom/src/lua/zencode_*` and is relatively easy to maintain for the pilots analysed in our project, as well as easy to extend to more use cases. At the dawn of piloting sessions, due to the lack of actual feedback so far given in field trials, this implementation does not address specific schemes beyond a simple Diffie-Hellman asymmetric key encryption (AES-GCM) and an even simpler symmetric encryption of ciphertext by means of a PIN and KDF transformations. On top of that, perhaps the most complex implementation of Zencode so far is the "implicit certificate" crypto scheme (Qu-Vanstone, ECQV), still limited to first order curve transformations, which applies widely to pilots requiring simple certification schemes and is illustrated in more detail in the following chapters[^ecqv].
[^ecqv]: It is important to note that while the ECQV scheme was not examined by other partners in our project, it has been chosen for its stable role in the industry and for its augmented complexity within an approachable implementation, complexity which could better inform the Zencode implementation. Without that complexity and without implementation feedback by other partners, it wouldn't have been possible to work on Zencode and bring it to what it is today, since both the Petition contract and the Coconut implementation in Zenroom are not available as of today and need to be completed in a later stage of the DECODE project.

views/conclusion.md Normal file

@ -0,0 +1,47 @@
# Zencode usage
In order to better explain the potential of the Zencode Domain Specific Language (DSL), approaches may change on a domain-specific basis, meaning an explanation will be more effective when tailored to the specific context it applies to. As we are on the quest to merge the description of an algorithm with its executive expression, we get close to the concept of a speech act that refers to a specific context and adopts a limited taxonomy which may or may not be inscribed in a larger ontology.
At the time of writing, our explanation can be based on an extended experimentation of in-vitro usage (lab tests) and a limited experimentation of in-vivo usage, mostly bound to the conceptualization of use-cases in the IoT pilot and Amsterdam's register pilot. In order to extend the coverage of Zencode to more pilots, we need to have a completed implementation of the underlying cryptographic contract, in this case the petition.
What follows is a brief visualisation of what has been realised so far. In particular, the first visualisation below refers to the implementation of an asymmetric cryptographic exchange in the fashion of the PGP implementation, based on an exchange of public/private keys and their collection into a keyring:
![Asymmetric Diffie-Hellman encryption using Zencode](encryption.dot.png)
This simplified flow diagram shows **actual Zencode** that can be executed, highlighting variables that are normally just surrounded by single quotes. Between each code block, which is executed asynchronously as required and at different times, there is a schema which indicates the shape of the output data.
What follows is another flow diagram leading to data outputs that can be reused in the one above: it is the use of ECQV implicit certificates via Zencode, which leads to obtaining public/private keypairs that are compatible with asymmetric encryption.
![Implicit certificate issuing and retrieval using Zencode](implicit_certificate.dot.png)
Future horizons of development of Zencode include further implementations supporting interoperable and extensible crypto schemes on the same EC curve that can still work with the above implementations, as well as further refinement of the parser and extension of the schema validation. From this point onwards Zencode must be informed by piloting, while it will also be refined in cooperation with legal experts to match the smart-rule statements so far identified to express consensual data processing conditions.
# Zencode Integration
The integration of Zencode so far relies on the same integration schemes present for Zenroom, with the addition of a minimal layer of boilerplate code for its execution. This is meant to facilitate flexibility in piloting, but will later be changed to lock down to the sole execution of Zencode via new specific API calls.
Therefore, for now, in addition to the C call that we have exported to Java, Go, Python and Javascript languages along with utility wrappers:
```c
int zenroom_exec(char *script, char *conf, char *keys,
char *data, int verbosity);
```
We also have the boilerplate internal to the `script` buffer:
```lua
verbosity_level = 1
ZEN:begin(verbosity_level)
ZEN:parse([[
-- your zencode here
]])
ZEN:run()
```
The execution of actual Zencode lines happens sequentially at the time of the `ZEN:run()` call. Each line, as part of the whole statement block (utterance), makes use of data types which may or may not be validated and should be present in the KEYS and DATA buffers.
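The separation between parsing and running can be illustrated with a toy object in plain Lua. `MiniZen` is a hypothetical stand-in written for this sketch, not the real `ZEN` object: it assumes that blank lines and lines starting with `--` are skipped, and it hands each queued line to a caller-supplied function instead of executing real Zencode.

```lua
-- Hypothetical miniature of the begin/parse/run boilerplate.
MiniZen = { verbosity = 0, queue = {} }

function MiniZen:begin(verbosity)
  self.verbosity = verbosity or 0
  self.queue = {}
end

-- Parsing only queues the lines: nothing is executed at this point.
function MiniZen:parse(text)
  for line in text:gmatch("[^\r\n]+") do
    local trimmed = line:match("^%s*(.-)%s*$")
    if trimmed ~= "" and trimmed:sub(1, 2) ~= "--" then
      self.queue[#self.queue + 1] = trimmed
    end
  end
end

-- Running walks the queue in order and returns how many lines were run.
function MiniZen:run(execute)
  for _, line in ipairs(self.queue) do
    execute(line)
  end
  return #self.queue
end

MiniZen:begin(1)
MiniZen:parse([[
  -- your zencode here
  Given that I am known as 'Alice'
  Then print my keyring
]])
MiniZen:run(print)
```

Keeping the two phases apart means that a whole utterance can be rejected before any of its lines has had an effect.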
A list of Zenroom/Zencode integrated implementations follows: they have been developed in relation to each pilot software implementation as needed, covering several languages. Also notable is the presence of the `zenroom` module inside the NodeJS Package Manager collection (NPM) and, of course, its extremely portable WebAssembly optimized build (universal binary).
TODO: list git repos

views/encryption.dot.png Normal file
Binary file not shown. Size: 342 KiB

Binary file not shown. Size: 375 KiB

@ -8,7 +8,7 @@ The ECQV is identifiable as a simple yet important building block within DECODE,
ECQV relates well to those DECODE pilots that need to authenticate participants according to signed credentials, where the issuance of a public key is subject to the verification of certain conditions by a Certificate Authority (CA) capable of verifying and signing those conditions. This scenario applies well to the pilot experimentations ongoing in Amsterdam for the DECODE project, where a certificate (and a keypair) is issued based on attributes that are certified by the municipal register and then used for authentication procedures operated by third parties and based on those attributes.
The limit of this implementation is the lack of decentralization, a problem that will be solved by the Coconut [cit] implementation in Zencode language, which is still a work in progress.
The limit of this implementation is the lack of threshold certification, a problem that will be solved by the Coconut [cit] implementation in Zencode language, which is still a work in progress. It should be noted, however, that only one pilot in DECODE (Amsterdam's Gebiedonline) may benefit from this feature, which is not vital to the deployment.
## Differences with traditional certificates
@ -124,10 +124,139 @@ I.print({ private = CERTprivate:octet():base64(),
public = CERTpublic:octet():base64() })
```
At last, the implementation in Zencode follows
At last, the implementation in Zencode follows, clearly showing the
simplification made possible by Zenroom for the ECQV implicit
certificate cryptographic scheme. Each of the following "scenarios"
is a block of code that can be executed independently from the
others, taking validated input and output data structures.
```
-- Zenroom 0.8.1
-- Zenroom 0.9
Scenario 'keygen': $scenario
Given that I am known as 'MadHatter'
When I create my new keypair
Then print my keyring
Scenario 'request': Make my declaration and request certificate
Given that I introduce myself as 'Alice'
and I have the 'public' key 'MadHatter' in keyring
When I declare to 'MadHatter' that I am 'lost in Wonderland'
and I issue my implicit certificate request 'declaration'
Then print all data
Scenario 'keygen': $scenario
Given that I am known as 'Alice'
and I have a 'declaration_public' 'from' 'Alice'
Then print data 'declaration_public'
Scenario 'keygen': $scenario
Given that I am known as 'Alice'
and I have a 'declaration_keypair'
Then print data 'declaration_keypair'
Scenario 'issue': Receive a declaration request and issue a certificate
Given that I am known as 'MadHatter'
and I have a 'declaration_public' 'from' 'Alice'
and I have my 'private' key in keyring
When I issue an implicit certificate for 'declaration_public'
Then print all data
Scenario 'split': Print the public section of the certificate
Given I have a 'certificate_public' 'from' 'MadHatter'
When possible
Then print data 'certificate_public'
Scenario 'split': Print the private section of the certificate
Given I have a 'certificate_private'
When possible
Then print data 'certificate_private'
Scenario 'save': Receive a certificate of a declaration and save it
Given I have a 'certificate_private' 'from' 'MadHatter'
and I have the 'private' key 'declaration_keypair' in keyring
When I verify the implicit certificate 'certificate_private'
Then I print data 'declaration'
Scenario 'keygen': $scenario
Given that I am known as 'Bob'
When I create my new keypair
Then print my keyring
Scenario 'challenge': Receive a certificate of a declaration and use it to encrypt a message
Given that I am known as 'Bob'
and I have my 'private' key in keyring
and that 'Alice' declares to be 'lost in Wonderland'
and I have a 'certificate' 'from' 'MadHatter'
When I draft the text 'Hey Alice! can you read me?'
and I use 'certificate' key to encrypt the text into 'ciphertext'
Then I print data 'ciphertext'
Scenario 'respond': Alice receives an encrypted message, decrypts it and sends an encrypted answer back to sender
Given that I am known as 'Alice'
and I have my 'private' key in keyring
When I decrypt the 'ciphertext' to 'decoded'
and I use 'certificate' key to encrypt 'decoded' into 'answer'
Then I print data 'answer'
```
The Zencode language is a DSL enforcing a strong declarative behavior underneath, and all base data structures are checked against a validation scheme upon input and output. The checks are also of a cryptographic nature: for instance, public keys are checked to make sure they are actual points on the elliptic curve in use. The data validation schemes in use so far are listed below:
```lua
_G['schemas'] = {
-- packets encoded with AES GCM
['AES-GCM'] = S.record {
checksum = S.hex,
iv = S.hex,
schema = S.Optional(S.string),
text = S.hex,
zenroom = S.Optional(S.string),
encoding = S.string,
curve = S.string,
pubkey = S.ecp
},
-- zencode_keypair
keypair = S.record {
schema = S.Optional(S.string),
private = S.Optional(S.hex),
public = S.ecp
},
-- zencode_ecqv
certificate = S.record {
schema = S.Optional(S.string),
private = S.Optional(S.big),
public = S.ecp,
hash = S.big,
from = S.string,
authkey = S.ecp
},
certificate_hash = S.Record {
schema = S.Optional(S.string),
public = S.ecp,
requester = S.string,
statement = S.string,
certifier = S.string
},
declaration = S.record {
schema = S.Optional(S.string),
from = S.string,
to = S.string,
statement = S.string,
public = S.ecp
},
declaration_keypair = S.record {
schema = S.Optional(S.string),
requester = S.string,
statement = S.string,
public = S.ecp,
private = S.hex
}
}
```
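The cryptographic check mentioned above, that a public key must be an actual point on the curve, can be illustrated on a toy curve. This is a sketch under loud assumptions: the parameters below are tiny, illustrative values and not those of any curve used by Zenroom, and `is_on_curve` is a hypothetical name; a real validation works on big integers and typically performs further checks as well.

```lua
-- Toy curve y^2 = x^3 + a*x + b over the integers modulo p.
-- Illustrative parameters only, far too small for any real use.
local curve = { p = 97, a = 2, b = 3 }

-- Returns true when (x, y) satisfies the curve equation modulo p.
function is_on_curve(x, y)
  local p = curve.p
  -- coordinates must be reduced field elements
  if x < 0 or x >= p or y < 0 or y >= p then
    return false
  end
  local lhs = (y * y) % p
  local rhs = (x * x * x + curve.a * x + curve.b) % p
  return lhs == rhs
end

print(is_on_curve(3, 6))   -- 6^2 = 36 and 3^3 + 2*3 + 3 = 36
print(is_on_curve(3, 7))   -- 7^2 = 49 does not match 36
```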


@ -1,6 +1,7 @@
introduction.md
# bdd.md
bdd.md
schema.md
# asymmetric_crypto.md
# elgamal_vote_tally.md
implicit_certificate.md
# conclusion.md
conclusion.md


@ -1,8 +1,8 @@
# Introduction
This deliverable consists of the implementation of smart-rules effectively executing cryptographic operation and data transformations using a human readable language modeled according to the taxonomy expressed in DECODE's deliverable D3.5 "Initial definition of Smart Rules and Taxonomy".
This deliverable consists of the implementation of smart-rules effectively executing cryptographic operations and data transformations using a human readable language modeled according to a taxonomy of subjects and predicates found in the pilot applications. It delivers a technology that brings together expression and execution into utterances based on translatable human language phrases. This technology is a simple, non-Turing-complete natural language interpreter (Zencode) based on a domain specific language (DSL) that can run and execute inside a very portable virtual machine (Zenroom) capable of cryptographic transformations.
Since DECODE project's inception, reaching this point of development has been my personal ambition and it is perhaps the most important practical realization of a solution for some of the techno-political implications I've illustrated in my Ph.D thesis "Algorithmic Sovereignty".
Since DECODE project's inception, reaching this point of development has been my personal ambition and it is an important solution for some of the techno-political implications I've illustrated in my Ph.D thesis titled "Algorithmic Sovereignty" [@algosov2018].
## For the awareness of algorithms
@ -10,9 +10,9 @@ The goal of this task is ultimately that of realizing a simple, non-technical, h
To articulate the importance of this quest and the relevance of the results presented, which I believe to be unique in the landscape of blockchain smart-contract languages, it is important to remind ourselves of the condition in which most people find themselves when participating in the regime of truth that is built by algorithms.
As the demand and production of well-connected vessels for the digital dimension has boomed, machine-readable code today functions as a literature informing the architecture in which human interactions happens. The telematic condition is realised by an integrated datawork continuously engaging the observer as a participant. Such a “Gesamtdatenwerk” [@Ascott_1990] may seem an abstract architecture, yet it can be deeply binding under legal, ethical and moral circumstances.
As the demand and production of well-connected vessels for the digital dimension has boomed, machine-readable code today functions as a literature informing the architecture in which human interactions happen and decisions are taken. The telematic condition is realised by an integrated datawork continuously engaging the observer as a participant. Such a “Gesamtdatenwerk” [@Ascott_1990] may seem an abstract architecture, yet it can be deeply binding under legal, ethical and moral circumstances.
The comprehension of algorithms, the awareness of the way decisions are formulated, the implications of their execution, is not just a technical condition, but a political one, for which access to information cannot be just considered a feature, but a civil right. It is important to understand this in relation to the "classical" application of algorithms executed in a centralized manner, but even more in relation to distributed computing scenarios posed by blockchain technologies, which theorize a future in which rules and contracts are executed without requiring any human agency.
The comprehension of algorithms, the awareness of the way decisions are formulated, the implications of their execution, is not just a technical condition, but a political one, for which access to information cannot be just considered a feature, but a civil right [@pelizza_governance]. It is important to understand this in relation to the "classical" application of algorithms executed in a centralized manner, but even more in relation to distributed computing scenarios posed by blockchain technologies, which theorize a future in which rules and contracts are executed irrevocably and without requiring any human agency.
The legal implications with regards to standing rights and liabilities are out of the scope here, while the focus is on ways humans, even when lacking technical literacy, can be made aware of what an algorithm does. Is it possible to establish the ground for a shared language that informs digital architects about their choices and inhabitants about the digital territory? Going past assumptions about the strong role algorithms have in governance and accountability [@Diakopoulos_2016], how can we inform digital citizens about their condition?
@ -23,3 +23,4 @@ When describing the virtualisation of economic activity in the global context, S
The analysis of legal texts and regulations here shifts into an entirely new domain; it has to refer to conditions that only algorithms can help build or destroy. Thus, referring to this theoretical framework, the research and development of a free and open source language that is intelligible to humans becomes of crucial importance and, from an ethical standpoint, DECODE, like many other projects in the same space, cannot be exempted from addressing it.
When we consider algorithms as contracts regulating relationships (between humans, between humans and nature and, nowadays more increasingly, between different contexts of nature itself) then we should adopt a representation that is close to how the human mind works and that is directly connected to the language adopted. In this thesis I interpret algorithms as the systemic product of complex relationships between contracts and relevant choices made by standing actors [@standing2014Monico]. The ability to verify which algorithms are in place for a certain result to be visualised, to understand and communicate what these algorithms do, to describe and experiment their repercussions on reality is in fact conditioning the very choices standing actors will make.


@ -39,3 +39,28 @@
publisher={Il Saggiatore},
journal={Aut/Aut, La condizione postumana}
}
@article{pelizza_governance,
author = {Pelizza, A. and Kuhlmann, S.},
title = {Mining Governance Mechanisms. Innovation policy, practice and theory facing algorithmic decision-making},
year = 2017,
publisher = {Springer, Berlin},
journal = {Handbook of Cyber-Development, Cyber-Democracy, and Cyber-Defense}
}
@inproceedings{soeken2012assisted,
title={Assisted behavior driven development using natural language processing},
author={Soeken, Mathias and Wille, Robert and Drechsler, Rolf},
booktitle={International Conference on Modelling Techniques and Tools for Computer Performance Evaluation},
pages={269--287},
year={2012},
organization={Springer}
}
@book{cucumber2012,
title={The Cucumber Book: Behavior-Driven Development for Testers and Developers},
author = {Wynne, Matt and Helles{\o}y, Aslak},
year={2012},
publisher={The Pragmatic Bookshelf}
}

views/schema.md Normal file

@ -0,0 +1,104 @@
# Declarative Schema Validation
In order to make the processing of Zencode more robust, all data used as input and output for its computations is validated according to predefined schemas. This makes the Zencode DSL a declarative language in which data recognition is operated before processing.
The data schemas are added on a per-usecase basis: they refer to specific cryptographic implementations as they are added in Zencode. Each addition is carefully evaluated to determine whether existing schemas can be extended to include the new requirements.
Schemas are expressed in a simple format using Lua scripting syntax, for example:
```lua
-- zencode_keypair
keypair = S.record {
schema = S.Optional(S.string),
private = S.Optional(S.hex),
public = S.ecp
}
```
The schema above is the smallest and most commonly used one, composed of one required field and two optional ones, and is used to validate the input and output of public/private keypairs to be used in transformations.
The only required field in the schema is the `public` key, which is validated using the `ECP` type (`S.` is an abbreviation for the `SCHEMA.` namespace). The validation of `S.ecp` is an actual cryptographic validation: Zenroom will check that the big integer number represented by the field corresponds to a valid point on the curve. If the validation fails, the execution of the Zencode script will not take place and Zenroom will return a meaningful error message indicating the wrong field.
One of the optional fields is the `private` key, which can correspond to any sequence of values, therefore no cryptographic validation is possible for it; in this case the validation used refers to the encoding of the field: `S.hex` verifies that the value is encoded as a sequence of characters that express only hexadecimal numbers (that is, the digits 0..9 and the case-insensitive letters from A to F). Other encoding tests are also available, for instance `S.base64` if that is the encoding used in the specific implementation.
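An encoding check of this kind can be sketched with Lua string patterns. The functions below are hypothetical and are not the implementation behind `S.hex` or `S.base64`; requiring an even number of hexadecimal digits (whole bytes) and standard padded base64 are assumptions made for this example.

```lua
-- Hypothetical encoding checks, written for illustration only.

-- True for non-empty strings made of hexadecimal digits (0-9, a-f, A-F).
-- Assumption: an even length is required, so that the value maps to bytes.
function is_hex(value)
  return type(value) == "string"
     and #value > 0
     and #value % 2 == 0
     and value:match("^%x+$") ~= nil
end

-- True for non-empty strings in the standard base64 alphabet,
-- padded with up to two '=' characters to a multiple of four.
function is_base64(value)
  return type(value) == "string"
     and #value > 0
     and #value % 4 == 0
     and value:match("^[A-Za-z0-9+/]+=?=?$") ~= nil
end

print(is_hex("0a1B"), is_hex("xyz1"))
print(is_base64("aGVsbG8="), is_base64("a b="))
```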
Another more complex example follows:
```lua
-- packets encoded with AES GCM
['AES-GCM'] = S.record {
checksum = S.hex,
iv = S.hex,
schema = S.Optional(S.string),
text = S.hex,
zenroom = S.Optional(S.string),
encoding = S.string,
curve = S.string,
pubkey = S.ecp
}
```
In this example no new validations are used; it just adds fields compared to the previous one: it defines a portable packet of ciphertext data that is returned as output of AES-GCM asymmetric encryption as well as accepted as input to AES-GCM decryption. A similarity between these two examples is evident: the presence of the `schema` field. This field is a sort of "introspective" indication matching the data structure to its schema specification. If this field is not present (as it is always optional) then no validation on the data structure will take place, meaning the Zencode implementation leaves the risk (and hopefully the validation task) to the host.
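The role of the optional `schema` field can be illustrated with a small record validator in plain Lua. Everything in this sketch is hypothetical: the checker functions, the `schemas` table and `validate` are simplified stand-ins, and the `public` field is only checked to be a string, whereas the real schema performs a curve-point check.

```lua
-- Field checkers return true when the value is acceptable.
local function is_string(v) return type(v) == "string" end
local function hex_field(v)
  return type(v) == "string" and v:match("^%x+$") ~= nil
end
-- Wraps a checker so that a missing (nil) field is also accepted.
local function optional(check)
  return function(v) return v == nil or check(v) end
end

-- Simplified stand-in for the schema table.
local schemas = {
  keypair = {
    schema  = optional(is_string),
    private = optional(hex_field),
    public  = is_string,   -- assumption: the real check is cryptographic
  },
}

-- Looks up the schema named by data.schema and checks every field.
-- When data.schema is absent, validation is skipped, as described above.
function validate(data)
  if data.schema == nil then
    return true, "skipped"
  end
  local spec = schemas[data.schema]
  if not spec then
    return false, "unknown schema: " .. tostring(data.schema)
  end
  for field, check in pairs(spec) do
    if not check(data[field]) then
      return false, "invalid field: " .. field
    end
  end
  return true, "validated"
end

print(validate({ schema = "keypair", public = "04ab", private = "beef" }))
print(validate({ public = "anything" }))
```

Returning a distinct "skipped" result makes it visible to the host that the data was accepted without being checked.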
This chapter ends with the schema validation data types currently implemented for symmetric and asymmetric encryption of ciphertexts as well as for implicit certificates. The schema implementation for Zencode is maintained in the source file `src/lua/zencode_schemas.lua` and can be accessed through the function `ZEN.validate(data,'schema','error')`, which is a wrapper of `ZEN.assert(validate(data,schemas['schema']),'error')`.
```lua
_G['schemas'] = {
-- packets encoded with AES GCM
['AES-GCM'] = S.record {
checksum = S.hex,
iv = S.hex,
schema = S.Optional(S.string),
text = S.hex,
zenroom = S.Optional(S.string),
encoding = S.string,
curve = S.string,
pubkey = S.ecp
},
-- zencode_keypair
keypair = S.record {
schema = S.Optional(S.string),
private = S.Optional(S.hex),
public = S.ecp
},
-- zencode_ecqv
certificate = S.record {
schema = S.Optional(S.string),
private = S.Optional(S.big),
public = S.ecp,
hash = S.big,
from = S.string,
authkey = S.ecp
},
certificate_hash = S.Record {
schema = S.Optional(S.string),
public = S.ecp,
requester = S.string,
statement = S.string,
certifier = S.string
},
declaration = S.record {
schema = S.Optional(S.string),
from = S.string,
to = S.string,
statement = S.string,
public = S.ecp
},
declaration_keypair = S.record {
schema = S.Optional(S.string),
requester = S.string,
statement = S.string,
public = S.ecp,
private = S.hex
}
}
```