From 86eb214a931990e02798f8bc6045518985c5ca79 Mon Sep 17 00:00:00 2001 From: Preview Action Date: Sun, 8 May 2022 09:55:49 +0000 Subject: [PATCH] Squashed commit of the following: commit 6f83edd9b5d3f3e29243121a9c4a17b7268c8d82 Author: Yannik Sander Date: Sun May 8 11:51:17 2022 +0200 Style improvements Co-authored-by: Martin Monperrus commit bd59fe32b0999d457b40c1f592697d13b3dfdd44 Author: Yannik Sander Date: Sun May 8 11:50:31 2022 +0200 IaaS description and application commit f7cbbc9faeb20b81e116229f3380838e6f0c4783 Author: Yannik Sander Date: Sun May 8 11:49:45 2022 +0200 Move Nickel AST to top to introduce fundamental concepts (i.e. records) commit 1e0821b1708924fd088737559eda3765230f7567 Author: Yannik Sander Date: Sun May 8 11:47:59 2022 +0200 Add sentence about IaC implementation commit 430c53e5e41d34c5107e095e16f249e53ac516a1 Author: Yannik Sander Date: Sun May 8 11:46:16 2022 +0200 Fixup: Rewrite Configuration programming languages as text based configuration commit e94524212a9df74db2911f44cbc641499672fe36 Author: Yannik Sander Date: Sun May 8 11:45:00 2022 +0200 Rewrite Configuration programming languages as text based configuration Co-authored-by: Martin Monperrus commit 8f4f497a16c48e5368a60dac0737139f905a0b0b Author: Yannik Sander Date: Sun May 8 11:40:38 2022 +0200 Destinction of workspace scoped symbols commit ccdde1e8668ca3734f865a33ffb27289f997e7cc Author: Yannik Sander Date: Sun May 8 11:39:32 2022 +0200 Define Folder/Workspace Co-authored-by: Martin Monperrus commit 102d4114e261238120d9f7d9d6b2ede03cce5eb3 Author: Yannik Sander Date: Sun May 8 11:38:09 2022 +0200 Add RPC method names commit 643d5c8c3bac589881de91bb6e180b59725b8728 Author: Yannik Sander Date: Sun May 8 11:36:58 2022 +0200 Remove subjective phrase Co-authored-by: Martin Monperrus commit d2a3e93053c7685dffa6b113610d05dc2eeda678 Author: Yannik Sander Date: Sun May 8 11:36:31 2022 +0200 Improve self containment of this chapter commit 1b7f7c6c9bc475cc975eea7693cbfb5f8f13b889 Author: 
Yannik Sander Date: Sun May 8 11:35:19 2022 +0200 Update chapter/background.md commit dd68fdd58242a2d39d7b4218f50425189bd38d81 Author: Yannik Sander Date: Sun May 8 11:33:43 2022 +0200 Define JSON-RPC after naming it commit 33d2f6f697325a6c0d2f8dc45b5af6e52e50ef55 Author: Yannik Sander Date: Sun May 8 11:31:26 2022 +0200 Define and distinct LSP Server/Client commit 9f559091a5f10eadeb6b25eaaf9df23a44f0aab8 Author: ysndr Date: Sun Apr 17 15:34:28 2022 +0200 Short description of gradual typing commit 010ac56f186873ab7edc7c82982f4795001b22d6 Author: ysndr Date: Sun Apr 17 15:34:13 2022 +0200 Describe Record Merging commit 44e266aa5add9823c56fd28df0d06e8e9c224427 Author: ysndr Date: Sun Apr 17 15:33:28 2022 +0200 Add Nickel Introduction commit a82da53fcc0884979b4dbae8e2a5719d6990ed08 Author: ysndr Date: Sun Apr 17 15:33:09 2022 +0200 Comment configuration example commit 86ab1ada4e1b1fb3b8c1c4fc60214104bce9ce5f Author: ysndr Date: Sun Apr 17 15:32:53 2022 +0200 Move file processing section from related work commit 6b6447858e380d5979d8a86e21f88433aadfc81c Author: ysndr Date: Sun Apr 17 15:32:32 2022 +0200 Fix list item commit 9cb38ea240144f2a7ab39df49641ebd1cce77a4b Author: ysndr Date: Sun Apr 17 15:32:16 2022 +0200 Cleanup Headers commit 25fdb8ffbdd75b86f72937d9715ba877ee225a91 Author: ysndr Date: Sun Apr 17 15:31:42 2022 +0200 Describe Diagnostics capability commit 2fb44ae7a6aa260368cda449f46323654e4d5572 Author: ysndr Date: Sun Apr 17 15:31:14 2022 +0200 Describe symbols capability commit f678716926d29f41e81ad71050f01c2f73d2b830 Author: ysndr Date: Mon Apr 11 15:18:14 2022 +0200 Add Go-To Methods commit 35456a752b65a59abc1c62ba94ff02509cb046e8 Author: ysndr Date: Sat Apr 9 21:37:28 2022 +0200 Fix file paths commit 32730eb4ed6d94bddcf957524f2e065206aa797b Author: ysndr Date: Sat Apr 9 21:14:03 2022 +0200 Explain hover commit 681d26836d86e9b95d565366d7ae8c6e1575f5c8 Author: ysndr Date: Sat Apr 9 21:13:40 2022 +0200 Explain completion commit 
85e4b6f995664a8f17ca23f9fb851579e9b8b5fa Author: ysndr Date: Sat Apr 9 21:13:15 2022 +0200 Improve LSP introduction commit f5265654f7ab0415210079dcb5710d4a6a9a03d3 Author: ysndr Date: Sat Apr 9 21:12:50 2022 +0200 Clean up json rpc commit 69dcab040b9e874df2dcdcb48ee4b73ea22eca1e Author: Yannik Sander Date: Fri Feb 25 14:04:37 2022 +0100 Fix code block formatting commit 5c2ee8cbb5174645c67b42e0b21b7a9685b65b9d Author: Yannik Sander Date: Mon Feb 7 16:01:30 2022 +0100 Clarifying static/dynamic access commit ce50274c4ec19b9c59c32d8ae391a8bdf8c27afe Author: Yannik Sander Date: Mon Feb 7 15:32:40 2022 +0100 Nickel record shorthand example fixes commit a4ce6ee84bd033e3d679e26035d4a12b0eb81c3e Author: Yannik Sander Date: Mon Feb 7 15:28:56 2022 +0100 Remove (commented) comparative example commit ab40ab40c09cce83557f5ac38846f3b0f19c62ec Author: Yannik Sander Date: Mon Feb 7 15:27:55 2022 +0100 Simplify metadata commit 835cfcc4b61111640a881b2da320d787c266edd5 Author: Yannik Sander Date: Mon Feb 7 15:02:04 2022 +0100 Nix example typos and empty lines commit 496702ddc408cd397c390dc85156cfb90a7b2834 Author: Yannik Sander Date: Mon Feb 7 14:59:53 2022 +0100 IaC intro commit d71ed22f93cbda1959699a97316a1ef4c143a614 Author: Yannik Sander Date: Mon Feb 7 14:41:52 2022 +0100 Rephrase "config drift" paragraph commit 3fc8853a5ad81820e1ea596a62f67258d46361fb Author: Yannik Sander Date: Mon Feb 7 12:15:59 2022 +0100 Apply suggestions from code review From PR review #4 Moved to #2 Co-authored-by: Yann Hamdaoui # Conflicts: # chapter/background.md commit 59fee450c5ec8ca82e65b2e4b475db0a0dde3dcf Author: Yannik Sander Date: Sun Feb 6 17:51:43 2022 +0100 Fix latex incompatibility commit 7a38c0db39abe5692ff63d6e9be1e646b559684a Author: Yannik Sander Date: Sun Feb 6 17:43:20 2022 +0100 Add section on contracts commit e6dd02acecd73fe76f88befedacb3717536a33d0 Author: Yannik Sander Date: Sun Feb 6 17:42:58 2022 +0100 Move down header level of gradual typing commit 
a8b020ace3e2ca28f43790b6e476c879f6e395ba Author: Yannik Sander Date: Sun Feb 6 17:42:23 2022 +0100 Fix typos and wording commit 1034f59c00d0fc51ea468b4072b2acaf74f6b248 Author: Yannik Sander Date: Sun Feb 6 15:26:40 2022 +0100 Remove Motivation (moved to introduction) commit c6791ed6657619d8d64e3040ae7a716933ded376 Author: Yannik Sander Date: Wed Jan 26 17:25:21 2022 +0100 Fix code examples to pass syntax highlighter commit 5874b6af422381edeef4e3f5588a33109162a152 Author: Yannik Sander Date: Wed Jan 26 17:24:57 2022 +0100 AST: Identifiers and let bindings commit b143283a4773dc5499021d4d5216c28c00edb2ea Author: Yannik Sander Date: Thu Jan 13 15:40:43 2022 +0100 Add Nickel AST chapter to background commit 815a0fe09ed713878576fadc5560d7beb333356a Author: Yannik Sander Date: Thu Jan 13 15:40:13 2022 +0100 Merge Nickel section into Configuration languages commit d27c9a333587ea06f25341f26a8605fa11e3d497 Author: Yannik Sander Date: Tue Jan 4 18:02:25 2022 +0100 Removing math terms and mention edit integration work commit b9a65f0b67ebd55a93d5a27b25b2b196342c9bf4 Author: Yannik Sander Date: Tue Jan 4 17:50:07 2022 +0100 Add paragraph about config security commit 584e5a54c19f162c368ac7cf41fb41f490ff7c2c Author: Yannik Sander Date: Tue Jan 4 17:49:44 2022 +0100 Do not use auxillary description for DSL commit 6f523257f1ef0dfffcabe837d1cecc761118a8f7 Author: Yannik Sander Date: Tue Jan 4 17:18:16 2022 +0100 Address more suggestions towards motivation and JSONRPC commit 49605bc283192ed881164ddefed1c6cfe1b5cbf1 Author: Yannik Sander Date: Tue Jan 4 16:52:32 2022 +0100 Address suggestions regarding LSP commit 99d5d07a733e35502566f909cc1c93e55bfe7650 Author: Yannik Sander Date: Tue Jan 4 16:47:35 2022 +0100 Apply suggestions from code review Co-authored-by: Yann Hamdaoui commit 1e0facfe8166ac3f7fdb98e75474f1f7675ec6ab Author: Yannik Sander Date: Tue Jan 4 16:31:33 2022 +0100 Fill subsection on IaC commit 525c2988e02ed186f3a98164b2475d53d27d3165 Author: Yannik Sander Date: Sun Jan 2 
16:19:43 2022 +0100 Fix some footnotes and typo commit be79a9ce96edc40c97d376e57dde4e2ef6a32a87 Author: Yannik Sander Date: Sun Jan 2 14:26:40 2022 +0100 Finish `Configuration programming languages` section commit 440fdf803396988f32caf06bb3b0a7d056275eab Author: Yannik Sander Date: Sun Jan 2 00:51:02 2022 +0100 Amend static text vs binary paragraph commit c8903d36ae3d14a256c05a9b2e2f41481d8b8838 Author: Yannik Sander Date: Sat Jan 1 18:07:26 2022 +0100 Start section on configuration programming languages commit 3cfc28ba4bd8d9339cd6c2f591dee3d3e7a74c6d Author: Yannik Sander Date: Sat Jan 1 15:55:21 2022 +0100 Fix figure references commit e1b816b121bc2ff3cf8e6623b71dd7f0be9fdcb5 Author: Yannik Sander Date: Sat Jan 1 15:31:30 2022 +0100 Finish rpc doc commit 3cc32b519d0fcc23f01bcdf025fca30773bcbcae Author: Yannik Sander Date: Sat Jan 1 15:05:58 2022 +0100 Fix code block rendering commit 0706fe34e957c6bd515e99b5adf6088f46bc301f Author: Yannik Sander Date: Fri Dec 31 17:57:41 2021 +0100 Start subsection jsron rpc commit 277674f9adc83d871b9f980cdc82d153c1d75a17 Author: Yannik Sander Date: Fri Dec 31 16:49:02 2021 +0100 Fix footnote commit 40cbc87ed0e325a6d85b936589814f57fa822220 Author: Yannik Sander Date: Fri Dec 31 16:32:35 2021 +0100 Rename rationale Motivation and extend subsection commit 87589b6d98a3bde574d7ed80888b614ac50e6e36 Author: Yannik Sander Date: Wed Dec 29 00:05:48 2021 +0100 Add to the rationale of LSP commit d36bac9bf7a2b417001256ca7da07b557ba426a9 Author: Yannik Sander Date: Tue Dec 28 21:29:57 2021 +0100 Language server Introduction commit 32aa082c4336ba2b362005d811d5646732557d03 Author: Yannik Sander Date: Tue Dec 28 20:17:57 2021 +0100 Background introduction commit c0b596532bd61cfbd2c9116159e2a86d2e4b7b9c Author: Yannik Sander Date: Tue Dec 28 20:01:15 2021 +0100 Extend chapter outline commit 7826b00e23a5f9d342a19c9a5fc01cab279be2f7 Author: Yannik Sander Date: Mon Dec 27 23:10:25 2021 +0100 Address LSP first --- chapter/background.md | 452 
++++++++++++++++++++++++------------------ 1 file changed, 259 insertions(+), 193 deletions(-) diff --git a/chapter/background.md b/chapter/background.md index 846e380e..3192ca39 100644 --- a/chapter/background.md +++ b/chapter/background.md @@ -10,6 +10,14 @@ The second part is dedicated to Nickel, elaborating on the context and use-cases ## Language Server Protocol +The Language Server Protocol is a JSON-RPC based communication specification for the interaction between an LSP client (i.e., an editor) and a server (also called language server for simplicity). +The LSP decouples the development of clients and servers, allowing developers to focus on either side. +The LSP defines several capabilities -- standardized functions which are remotely executed by the language server. +LSP clients are often implemented as editor extensions that rely on abstraction libraries to interact with the protocol and the editor interface. +Language servers analyse source code sent by the client and may implement any number of +capabilities relevant to the language. +Since the LSP is both language and editor independent, the same server implementation can serve all LSP-compliant clients, eliminating the need to redundantly recreate the same code intelligence for every editor. + Language servers are today's standard of integrating support for programming languages into code editors. Initially developed by Microsoft for the use with their polyglot editor Visual Studio Code^[https://code.visualstudio.com/] before being released to the public in 2016 by Microsoft, RedHat and Codeenvy, the LSP decouples language analysis and provision of IDE-like features from the editor. Developed under open source license on GitHub^[https://github.com/microsoft/language-server-protocol/], the protocol allows developers of editors and languages to work independently on the support for new languages. 
@@ -17,9 +25,10 @@ If supported by both server and client, the LSP now supports more than 24 langua ### JSON-RPC -JSON-RPC (v2) [@json-rpc] is a JSON based lightweight transport independent remote procedure call [@rpc] protocol used by the LSP to communicate between a language server and a client. +The LSP uses JSON-RPC to communicate between a language server and a client. +JSON-RPC (v2) [@json-rpc] is a lightweight, transport-independent, JSON-based remote procedure call [@rpc] protocol. -RPC is a well known paradigm that allows clients to virtually invoke a method at a connected process. +RPC is a paradigm that allows clients to virtually invoke a method at a connected process. The caller sends a well-defined message to a connected process which executes a procedure associated with the request, taking into account any transmitted arguments. Upon invoking a remote procedure, the client suspends the execution of its environment while it awaits an answer of the server, corresponding to a classical local procedure return. @@ -42,13 +51,23 @@ In this case, the server should respond with a list of results matching each req ### Commands and Notifications The LSP builds on top of the JSON-RPC protocol described in the previous subsection. -In total the LSP defines 33 [@lsp] "language features", i.e., source code related capabilities. -In addition, the LSP specifies different capabilities to the server to control the editor. -For instance, servers may instruct clients to show notifications or progress bars or open documents. -Similarly, the client has multiple ways of notifying the server of file changes, including renaming r deletion of files. +It defines four sets of commands: -This thesis aims to implement a fundamental set of capabilities. -The chosen capabilities are based on those identified as "key methods" by the authors of langserver [@langserver], specifically: +The largest group are commands that are relevant in the scope of the currently opened document, e.g. 
autocompletion, refactoring, inline values and more. +In total the LSP defines 33 [@lsp] such "Language Features". +Editors will notify the server about file changes and client side interaction, i.e., opening, closing and renaming files using "Document Synchronization" methods. +While most commands are defined in the document scope, i.e., a single actively edited file, the LSP allows clients to communicate changes to files in the opened project. +This so-called workspace comprises one or more root folders managed by the editor and all files contained in them. +"Workspace Features" allow the server to intercept file creation, renaming or deletion to make changes to existing sources in other files. +Use cases of these features include updating import paths, changing class names and other automations that are not necessarily local to a single file. +In addition, the LSP specifies so-called "Window Features" which allow the server to control parts of the user interface of the connected editor. +For instance, servers may instruct clients to show notifications and progress bars or open files. + + +### Description of Key Methods + +The authors of langserver.org [@langserver] identified six "key methods" of the LSP. +The methods represent a fundamental set of capabilities, specifically: 1. Code completion @@ -66,6 +85,8 @@ The chosen capabilities are based on those identified as "key methods" by the au #### Code Completion +*RPC Method: [`textDocument/completion`](https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocument_completion)* + This feature allows users to request a suggested identifier of variables or methods, concrete values or larger templates of generated code to be inserted at the position of the cursor. The completion can be invoked manually or upon entering language defined trigger characters, such as `.`, `::` or `->`. 
The completion request contains the current cursor position, allowing the language server to resolve contextual information based on an internal representation of the document. @@ -76,6 +97,8 @@ In the example ([@fig:lsp-capability-complete]) the completion feature suggests #### Hover +*RPC Method: [`textDocument/hover`](https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocument_hover)* + Hover requests are issued by editors when the user rests their mouse cursor on text in an opened document or issues a designated command in editors without mouse support. If the language server has indexed any information corresponding to the position, it can generate a report using plain text and code elements, which are then rendered by the editor. Language servers typically use this to communicate type-signatures or documentation. @@ -86,6 +109,8 @@ An example can be seen in [@fig:lsp-capability-hover]. #### Jump to Definition +*RPC Method: [`textDocument/definition`](https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocument_definition)* + The LSP allows users to navigate their code by the means of symbols by finding the definition of a requested symbol. Symbols can be for instance variable names or function calls. As seen in [@fig:lsp-capability-definition], editors can use the available information to enrich hover overlays with the hovered element's definition. @@ -95,6 +120,8 @@ As seen in [@fig:lsp-capability-definition], editors can use the available infor #### Find References +*RPC Method: [`textDocument/references`](https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocument_references)* + Finding references is the inverse operation to the previously discussed *Jump to Definition* ([cf. @sec:jump-to-definition]). 
For a given symbol definition, for example variable, function, function argument or record field the LSP provides all usages of the symbol allowing users to inspect or jump to the referencing code position. @@ -102,8 +129,12 @@ For a given symbol definition, for example variable, function, function argument #### Workspace/Document symbols -The symbols capability allows language servers to expose a list if symbols declared in the open document or workspace. -The granularity of the listed items is determined by the server. +*RPC Method: [`workspace/symbol`](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#workspace_symbol) or [`textDocument/documentSymbol`](https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocument_documentSymbol)* + +The symbols capability is defined as both a "Language Feature" and a "Workspace Feature", which mainly differ in the scope they represent. +The `textDocument/documentSymbol` command lists only symbols found in the currently opened file, while the `workspace/symbol` command takes into account all files in the opened set of folders. +The granularity of the listed items is determined by the server and may differ between the two scopes. +For instance, document symbols could be as detailed as listing any kind of method or property found in the document, while workspace symbols could take visibility rules into account and expose public entities only. Symbols are associated with a span of source code of the symbol itself and its context, for example a function name representing the function body. Moreover, the server can annotate the items with additional attributes such as symbol kinds, tags and even child-references (e.g. for the fields of a record or class). 
@@ -142,44 +173,82 @@ Requests invoke an ad-hoc resolution the results of which may be memoized for fu Lazy resolution is more prevalent in conjunction with incremental indexing, since it further reduces the work associated with file changes. This is essential in complex languages that would otherwise perform a great amount of redundant work. -## Configuration programming languages +## Text based Configuration + + +Configuration languages such as XML[@xml], JSON[@json], or YAML[@yaml] are textual representations of structural data used to configure parameters of a system or application. +The objective of such languages is to be machine readable, yet also intuitive enough for humans to write. + +### Common Configuration Languages -Nickel [@nickel], the language targeted by the language server detailed in this thesis, defines itself as "configuration language" used to automize the generation of static configuration files. +#### JSON -Static configuration languages such as XML[@xml], JSON[@json], or YAML[@yaml] are language specifications defining how to textually represent structural data used to configure parameters of a system^[some of the named languages may have been designed as a data interchange format which is absolutely compatible with also acting as a configuration language]. -Applications of configuration languages are ubiquitous especially in the vicinity of software development. While XML and JSON are often used by package managers [@npm, @maven, @composer], YAML is a popular choice for complex configurations such as CI/CD pipelines [@travis, @ghaction, @gitlab-runner] or machine configurations in software defined networks such as Kubernetes and docker compose. +According to the JSON specification [@json] the language was designed as a data-interchange format that is easy to read and write both for humans and machines. 
+Since it is a subset of the JavaScript language, its use is particularly straightforward and widespread in JavaScript based environments such as web applications. +Due to its simplicity, implementations are available for all major languages, motivating its use for configuration files. -Such static formats are used due to some significant advantages compared to other formats. -Most strikingly, the textual representation allows inspection of a configuration without the need of a separate tool but a text editor and be version controlled using VCS software like Git. -For software configuration this is well understood as being preferable over databases or other binary formats. Linux service configurations (files in `/etc`) and MacOS `*.plist` files which can be serialized as XML or a JSON-like format, especially exemplify that claim. +#### YAML -Yet, despite these formats being simple to parse and widely supported [@json], their static nature rules out any dynamic content such as generated fields, functions and the possibility to factorize and reuse. +YAML is another popular language, mainly used for configuration files. +According to its goals [@yaml-goals] it should be human readable and independent of the programming language making use of it. +It should be efficient to parse and easy to implement while being expressive and extensible. + +YAML also features a very flexible syntax which allows for many alternative ways to declare semantically equivalent data. +For example, boolean values can be written as any of the following: + +> `y|Y|yes|Yes|YES|n|N|no|No|NO` +> `true|True|TRUE|false|False|FALSE` +> `on|On|ON|off|Off|OFF` + +Since YAML uses indentation as a way to express objects or lists and does not require object keys (and even strings) to be quoted, it is considered easier to write than JSON at the expense of parser complexity. 
+Yet, YAML is compatible with JSON since a subset of its specification defines a JSON equivalent syntax that permits the use of `{..}` and `[..]` to describe objects and lists respectively. +### Applications of Configuration Languages + +Applications of configuration languages are ubiquitous especially in the vicinity of software development. +While XML and JSON are often used by package managers [@npm, @maven, @composer], YAML is a popular choice for complex configurations such as CI/CD pipelines [@travis, @ghaction, @gitlab-runner] or machine configurations in software defined networks such as Kubernetes[@kubernetes] and docker compose [@docker-compose]. + +Such formats are used due to some significant advantages compared to binary formats such as databases. +Most strikingly, the textual representation allows inspection of a configuration without any tool other than a text editor. +Moreover, textual configuration can be version controlled using VCS software like Git which allows changes to be tracked over time. +Linux service configurations (files in `/etc`) and MacOS `*.plist` files which can be serialized as XML or a JSON-like format, especially exemplify that claim. + +### Configuration *Programming* Languages + +Despite the above-mentioned formats being simple to parse and widely supported [@json], their static nature rules out any dynamic content such as generated fields, functions and the possibility to factorize and reuse. Moreover, content validation has to be developed separately, which led to the design of complementary schema specification languages like json-schema [@json-schema] or XSD [@xsd]. These qualities require an evaluated language. In fact, some applications make heavy use of config files written in the native programming language which gives them access to language features and existing analysis tools. 
Examples include JavaScript frameworks such as webpack [@webpack] or Vue [@vue] and python package management using `setuptools`[@setuptools]. +Yet, the versatility of general purpose languages poses new security risks if used in this context as configurations could now contain malicious code requiring additional verification. +Beyond this, not all languages are suited to serve as a configuration language, e.g. compiled languages. -Despite this, not all languages serve as a configuration language, e.g. compiled languages and some domains require language agnostic formats. -For particularly complex products, both language independence and advanced features are desirable. -Alternatively to generating configurations using high level languages, this demand is addressed by more domain specific languages. -Dhall [@dhall], Cue [@cue] or jsonnet [@jsonnet] are such domain specific languages (DSL), that offer varying support for string interpolation, (strict) typing, functions and validation. +However, for particularly complex applications, both advanced features and language independence are desirable. +As an alternative to using high level general purpose languages, this demand is addressed by domain specific languages (DSL). +Dhall [@dhall], Cue [@cue] or jsonnet [@jsonnet] are such domain specific languages that offer varying support for string interpolation, (strict) typing, functions and validation. +Most of these languages are used as a templating system, which means a configuration file in a more portable format is generated by evaluating the more expressive configuration source. +The added expressiveness manifests in the ability to evaluate expressions and the availability of imports, variables and functions. +These features make it possible to refactor and simplify repetitive configuration files. ### Infrastructure as Code -A prime example for the application of configuration languages are IaaS^[Infrastructure as a Service] products. 
-These solutions offer great flexibility with regard to resource provision (computing, storage, load balancing, etc.), network setup and scaling of (virtual) servers. -Although the primary interaction with those systems is imperative, maintaining entire applications' or company's environments manually comes with obvious drawbacks. +The shift to an increasing application of IaaS^[Infrastructure as a Service] products sparked a desire for declarative machine configuration in a bid to simplify the setup and reproducibility of such systems. +This configuration based setup of infrastructure is commonly summarized as infrastructure as code or IaC. +As the name suggests, IaC puts cloud configuration closer to the domain of software development [@IaC-book]. + +In principle, IaaS solutions offer great flexibility with regard to resource provision (computing, storage, load balancing, etc.), network setup and scaling of (virtual) servers. +However, since the primary interaction with those systems is imperative and based on command line or web interfaces, maintaining entire applications' or company's environments manually comes with obvious drawbacks. Changing and undoing changes to existing networks requires intricate knowledge about its topology which in turn has to be meticulously documented. Undocumented modification pose a significant risk for *config drift* which is particularly difficult to undo imperatively. Beyond that, interacting with a system through its imperative interfaces demands qualified skills of specialized engineers. -The concept of "Infrastructure as Code" (*IaC*) serves the DevOps principles. +The concept of "Infrastructure as Code" (*IaC*) aligns with the principles of DevOps. IaC tools help to overcome the need for dedicated teams for *Dev*elopment and *Op*erations by allowing to declaratively specify the dependencies, topology and virtual resources. 
Optimally, different environments for testing, staging and production can be derived from a common base and changes to configurations are atomic. As an additional benefit, configuration code is subject to common software engineering tooling; It can be statically analyzed, refactored and version controlled to ensure reproducibility. +A subset of IaC is focused on largely declarative configuration based on configuration files that are interpreted and "converted" into imperative platform dependent instructions. As a notable instance, the Nix[@nix] ecosystem even goes as far as enabling declarative system and service configuration using NixOps[@nixops]. @@ -247,171 +316,14 @@ This suggests that techniques[@aws-cloud-formation-security-tests] to automatica ### Nickel -The Nickel[@nickel] language is a configuration programming language (cf. [@sec:configuration-programming-languages]) with the aims of providing composable, verifiable and validatable configuration files. -The language draws inspiration from existing projects such as Cue [@cue], Dhall [@Dhall] and most importantly Nix [@nix]. +The Nickel[@nickel] language is a configuration programming language as defined in [@sec:configuration-programming-languages] with the aims of providing composable, verifiable and validatable configuration files. Nickel implements a pure functional language with JSON-like data types and turing-complete lambda calculus. +The language draws inspiration from existing projects such as Cue [@cue], Dhall [@Dhall] and most importantly Nix [@nix]. However, Nickel sets itself apart from the existing projects by combining and advancing their strengths. The language addresses concerns drawn from the experiences with Nix which employs a sophisticated modules system [@nixos-modules] to provide type-safe, composed (system) configuration files. Nickel implements gradual type annotations, with runtime checked contracts to ensure even complex configurations remain correct. 
Additionally, considering record merging on a language level facilitates modularization and composition of configurations. -#### Record Merging - -Nickel considers record merging as a fundamental operation that combines two records (i.e. JSON objects). -Merging is a commutative operation between two records which takes the fields of both records and returns a new record that contains the fields of both operands (cf. [@lst:nickel-merging-simple]) - -```{.nickel #lst:nickel-merging-simple caption="Merging of two records without shared fields"} -{ enable = true } & { port = 40273 } ->> -{ - enable = true, - port = 40273 -} -``` - -If both operands contain a nested record referred to under the same name, the merging operation will be applied to these records recursively (cf. [@lst:nickel-merging-recursive]). - -```{.nickel #lst:nickel-merging-recursive caption="Merging of two records without shared nested records"} -let enableGollum = { - service = { - gollum = { - enable = true - } - } -} in - -let setGollumPort = { - service = { - gollum = { - port = 40273 - } - } -} in - -enableGollum & setGollumPort - ->> -{ - service = { - gollum = { - enable = true, - port = 40273 - } - } -} -``` - -However, if both operands contain a field with the same name that is not a mergeable record, the operation fails since both fields have the same priority making it impossible for Nickel to chose one over the other (cf. [@lst:nickel-merging-failing-names]) -Specifying one of the fields as a `default` value allows a single override (cf. [@lst:nickel-merging-default] ). -In future versions of Nickel ([@nickel-rfc-0001]) it will be possible to specify priorities in even greater detail and provide custom merge functions. 
- -```{.nickel #lst:nickel-merging-failing-names caption="Failing merge of two records with common field"} -{ port = 40273 } & { port = 8080 } - ->> -error: non mergeable terms - | -1 | { port = 40273 } & { port = 8080 } - | ^^^^^ ^^^^ with this expression - | | - | cannot merge this expression -``` - - -```{.nickel #lst:nickel-merging-default caption="Succeeding merge of two records with default value for common field"} -{ port | default = 40273 } & { port = 8080 } - ->> -{ port = 8080 } -``` - -#### Gradual typing - -The typing approach followed by Nickel was introduce by Siek and Taha [@gradual-typing] as a combination of static and dynamic typing. -The choice between both type systems is traditionally debated since either approach imposes specific drawbacks. -Static typing lacks the flexibility given by fully dynamic systems yet allow to ensure greater correctness by enforcing value domains. -While dynamic typing is often used for prototyping, once an application or schema stabilizes, the ability to validate data schemas is usually preferred, often requiring the switch to a different statically typed language. -Gradual typing allows introducing statically checked types to a program while allowing other parts of the language to remain untyped and thus interpreted dynamically. - - -#### Contracts - -In addition to a static type-system Nickel integrates a contract system akin what is described in [@cant-be-blamed]. -First introduced by Findler and Felleisen, contracts allow the creation of runtime-checked subtypes. -Unlike types, contracts check an annotated value using arbitrary functions that either pass or *blame* the input. -Contracts act like assertions that are automatically checked when a value is used or passed to annotated functions. - -For instance, a contract could be used to define TCP port numbers, like shown in [@lst:nickel-sample-contract]. 
- -```{.nickel #lst:nickel-sample-contract caption="Sample Contract ensuring that a value is a valid TCP port number"} -let Port | doc "A contract for a port number" = - contracts.from_predicate ( - fun value => - builtins.is_num value && - value % 1 == 0 && - value >= 0 && - value <= 65535 - ) -in 8080 | #Port -``` - -Going along gradual typing, contracts pose a convenient alternative to the `newtype` pattern. -Instead of requiring values to be wrapped or converted into custom types, contracts are self-contained. -As a further advantage, multiple contracts can be applied to the same value as well as integrated into other higher level contracts. -An example can be observed in [@lst:nickel-sample-advanced-contract] - -```{.nickel #lst:nickel-sample-advanced-contract caption="More advaced use of contracts restricting values to an even smaller domain"} -let Port | doc "A contract for a port number" = - contracts.from_predicate ( - fun value => - builtins.is_num value && - value % 1 == 0 && - value >= 0 && - value <= 65535 - ) -in -let UnprivilegedPort = contracts.from_predicate ( - fun value => - (value | #Port) >= 1024 - ) -in -let Even = fun label value => - if value % 2 == 0 then value - else - let msg = "not an even value" in - contracts.blame_with msg label -in - -8001 | #UnprivilegedPort - | #Even -``` - -Notice how contracts also enable detailed error messages (see [@lst:nickel-sample-error-advaced-contract]) using custom blame messages. -Nickel is able to point to the exact value violating a contract as well as the contract in question. - -```{.text #lst:nickel-sample-error-advaced-contract caption="Example error message for failed contract"} -error: Blame error: contract broken by a value [not an even value]. 
- - :1:1 - | - 1 | #Even - | ----- expected type - | - - repl-input-34:22:1 - | -22 | - 8001 | #UnprivilegedPort - | ---- evaluated to this expression -23 | | | #Even - | -------------^ applied to this expression - -note: - - repl-input-34:23:8 - | -23 | | #Even - | ^^^^^ bound here -``` - - - #### Nickel AST Nickel's syntax tree is a single sum type, i.e., an enumeration of node types. @@ -422,11 +334,11 @@ Additionally, tree nodes hold information about their position in the underlying The primitive values of the Nickel language are closely related to JSON. On the leaf level, Nickel defines `Boolean`, `Number`, `String` and `Null`. -In addition to that the language implements native support for `Enum` values which are serialized as plain strings. +In addition to that the language implements native support for `Enum` values which are serialized as plain strings in less expressive formats such as JSON. Each of these are terminal leafs in the syntax tree. Completing JSON compatibility, `List` and `Record` constructs are present as well. -Records on a syntax level are HashMaps, uniquely associating an identifier with a sub-node. +Records on a syntax level are Dictionaries, uniquely associating an identifier with a sub-node. These data types constitute a static subset of Nickel which allows writing JSON compatible expressions as shown in [@lst:nickel-static]. @@ -437,13 +349,12 @@ These data types constitute a static subset of Nickel which allows writing JSON } ``` +Beyond these basic elements, Nickel implements variables and functions as well as a special syntax for attaching metadata and recursive records. -Building on that Nickel also supports variables and functions. - ##### Identifiers -The inclusion of Variables to the language, implies some sort of identifiers. +The inclusion of variables to the language, implies an understanding of identifiers. Such name bindings can be declared in multiple ways, e.g. `let` bindings, function arguments and records. 
The usage of a name is always parsed as a single `Var` node wrapping the identifier. Span information of identifiers is preserved by the parser and encoded in the `Ident` type. @@ -479,7 +390,6 @@ Functions in Nickel are curried lambda expressions. A function with multiple arguments gets broken down into nested single argument functions as seen in [@lst:nickel-args-function]. Function argument name binding therefore looks the same as in `let` bindings. - ##### Meta Information One key feature of Nickel is its gradual typing system [ref again?], which implies that values can be explicitly typed. @@ -579,8 +489,6 @@ strict digraph { } ``` - - ##### Record Shorthand Nickel supports a shorthand syntax to efficiently define nested records similarly to how nested record fields are accessed. @@ -605,3 +513,161 @@ As a comparison the example in [@lst:nickel-record-shorthand] uses the shorthand ``` Yet, on a syntax level Nickel generates a different representation. + +#### Record Merging + +Nickel considers record merging as a fundamental operation that combines two records (i.e. JSON objects). +Merging is a commutative operation between two records which takes the fields of both records and returns a new record that contains the fields of both operands (cf. [@lst:nickel-merging-simple]) + +```{.nickel #lst:nickel-merging-simple caption="Merging of two records without shared fields"} +{ enable = true } & { port = 40273 } +>> +{ + enable = true, + port = 40273 +} +``` + +If both operands contain a nested record referred to under the same name, the merging operation will be applied to these records recursively (cf. [@lst:nickel-merging-recursive]). 
+ +```{.nickel #lst:nickel-merging-recursive caption="Merging of two records with shared nested records"} +let enableGollum = { + service = { + gollum = { + enable = true + } + } +} in + +let setGollumPort = { + service = { + gollum = { + port = 40273 + } + } +} in + +enableGollum & setGollumPort + +>> +{ + service = { + gollum = { + enable = true, + port = 40273 + } + } +} +``` + +However, if both operands contain a field with the same name that is not a mergeable record, the operation fails: both fields have the same priority, making it impossible for Nickel to choose one over the other (cf. [@lst:nickel-merging-failing-names]). +Specifying one of the fields as a `default` value allows a single override (cf. [@lst:nickel-merging-default]). +In future versions of Nickel ([@nickel-rfc-0001]) it will be possible to specify priorities in even greater detail and provide custom merge functions. + +```{.nickel #lst:nickel-merging-failing-names caption="Failing merge of two records with common field"} +{ port = 40273 } & { port = 8080 } + +>> +error: non mergeable terms + | +1 | { port = 40273 } & { port = 8080 } + | ^^^^^ ^^^^ with this expression + | | + | cannot merge this expression +``` + + +```{.nickel #lst:nickel-merging-default caption="Succeeding merge of two records with default value for common field"} +{ port | default = 40273 } & { port = 8080 } + +>> +{ port = 8080 } +``` + +#### Gradual typing + +The typing approach followed by Nickel was introduced by Siek and Taha [@gradual-typing] as a combination of static and dynamic typing. +The choice between both type systems is traditionally debated since either approach imposes specific drawbacks. +Static typing lacks the flexibility given by fully dynamic systems yet ensures greater correctness by enforcing value domains.
+While dynamic typing is often used for prototyping, once an application or schema stabilizes, the ability to validate data schemas is usually preferred, often requiring the switch to a different statically typed language. +Gradual typing allows introducing statically checked types to a program while allowing other parts of the program to remain untyped and thus interpreted dynamically. + + +#### Contracts + +In addition to a static type system, Nickel integrates a contract system akin to what is described in [@cant-be-blamed]. +First introduced by Findler and Felleisen, contracts allow the creation of runtime-checked subtypes. +Unlike types, contracts check an annotated value using arbitrary functions that either pass or *blame* the input. +Contracts act like assertions that are automatically checked when a value is used or passed to annotated functions. + +For instance, a contract could be used to define TCP port numbers, as shown in [@lst:nickel-sample-contract]. + +```{.nickel #lst:nickel-sample-contract caption="Sample Contract ensuring that a value is a valid TCP port number"} +let Port | doc "A contract for a port number" = + contracts.from_predicate ( + fun value => + builtins.is_num value && + value % 1 == 0 && + value >= 0 && + value <= 65535 + ) +in 8080 | #Port +``` + +Alongside gradual typing, contracts pose a convenient alternative to the `newtype` pattern. +Instead of requiring values to be wrapped or converted into custom types, contracts are self-contained. +As a further advantage, multiple contracts can be applied to the same value as well as integrated into other higher-level contracts.
+An example can be observed in [@lst:nickel-sample-advanced-contract]. + +```{.nickel #lst:nickel-sample-advanced-contract caption="More advanced use of contracts restricting values to an even smaller domain"} +let Port | doc "A contract for a port number" = + contracts.from_predicate ( + fun value => + builtins.is_num value && + value % 1 == 0 && + value >= 0 && + value <= 65535 + ) +in +let UnprivilegedPort = contracts.from_predicate ( + fun value => + (value | #Port) >= 1024 + ) +in +let Even = fun label value => + if value % 2 == 0 then value + else + let msg = "not an even value" in + contracts.blame_with msg label +in + +8001 | #UnprivilegedPort + | #Even +``` + +Notice how contracts also enable detailed error messages (see [@lst:nickel-sample-error-advaced-contract]) using custom blame messages. +Nickel is able to point to the exact value violating a contract as well as the contract in question. + +```{.text #lst:nickel-sample-error-advaced-contract caption="Example error message for failed contract"} +error: Blame error: contract broken by a value [not an even value]. + - :1:1 + | + 1 | #Even + | ----- expected type + | + - repl-input-34:22:1 + | +22 | - 8001 | #UnprivilegedPort + | ---- evaluated to this expression +23 | | | #Even + | -------------^ applied to this expression + +note: + - repl-input-34:23:8 + | +23 | | #Even + | ^^^^^ bound here +``` + + +