YamlRepository - File System Backend

Introduction

Sometimes you may want to run a minimal deployment of RDMP to achieve a specific task (e.g. load data). Or you may want your RDMP configuration to remain largely static (e.g. running a specific configuration repeatedly within virtual containers).

For these use cases, setting up a Microsoft SQL Server backend for the platform databases would be inefficient. You can instead use a file system backend for RDMP.

Quick Start

To run RDMP with a file system backend, pass the --dir option to the command line or the Windows GUI client (e.g. via a shortcut to ResearchDataManagementPlatform.exe).

Simple commands work out of the box, e.g.:

./rdmp CreateNewEmptyCatalogue --dir ./rdmp-yaml
./rdmp ls Catalogue --dir ./rdmp-yaml

The terminal UI is fully compatible with YamlRepository:

./rdmp gui --dir ./rdmp-yaml

If you encounter rendering issues with the TUI, add the --usc flag (particularly when working over a remote connection such as SSH).

For most activities (e.g. data load) you will require a logging (audit) database. This can be any DBMS supported by RDMP (SQL Server, MySQL, etc.).

./rdmp CreateNewExternalDatabaseServer LiveLoggingServer_ID "DatabaseType:MicrosoftSQLServer:Name:MyLoggingDb:Server=(localdb)\MSSQLLocalDB;Trusted_Connection=True;" --dir ./rdmp-yaml

Advantages

Using a file system backend for RDMP allows the use of tools such as git for change tracking. The YAML files generated by RDMP are all human readable and produce sensible diffs over time.
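As a minimal sketch of change tracking, the directory passed to --dir can be placed under git. The variable name RDMP_DIR and the commit message are illustrative assumptions, not RDMP conventions, and the commit step assumes a git identity (user.name and user.email) is already configured.

```shell
#!/bin/sh
# Sketch: keep an RDMP YAML directory under git.
# Assumption: RDMP_DIR (hypothetical name) is the directory passed to --dir.
RDMP_DIR="${RDMP_DIR:-./rdmp-yaml}"
mkdir -p "$RDMP_DIR"

# Create the repository; running git init on an existing repository is harmless.
git -C "$RDMP_DIR" init -q

# Stage every change made by RDMP since the last snapshot.
git -C "$RDMP_DIR" add -A

# Commit only when something is staged, so repeated runs do not fail.
if ! git -C "$RDMP_DIR" diff --cached --quiet; then
    git -C "$RDMP_DIR" commit -q -m "Snapshot RDMP configuration"
fi
```

Running this after each configuration change builds a history that can be reviewed with the usual git log and git diff commands.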

Using a file system backend removes the requirement to have a SQL Server instance available.

Limitations

YamlRepository does not support multiple concurrent writers. IDs for new objects are allocated from the current system state, so if several separate processes create objects at the same time their allocations can clash, which leads to corruption.
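One way to guard against accidental concurrent writers is an advisory lock around every write command. The sketch below is not an RDMP feature: the with_write_lock function, the RDMP_DIR variable and the lock directory name are all hypothetical, and the lock only protects processes that are started through the wrapper.

```shell
#!/bin/sh
# Sketch: advisory single-writer lock around write commands.
# Assumptions: RDMP_DIR, LOCK_DIR and with_write_lock are hypothetical names.
RDMP_DIR="${RDMP_DIR:-./rdmp-yaml}"
LOCK_DIR="${RDMP_DIR}.lock"

with_write_lock() {
    # mkdir either creates the directory or fails, so only one caller can win.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "Another write process holds $LOCK_DIR" >&2
        return 1
    fi
    # Run the wrapped command and remember its exit status.
    "$@"
    status=$?
    # Release the lock. If the process is killed first, remove LOCK_DIR by hand.
    rmdir "$LOCK_DIR"
    return "$status"
}
```

A write command would then be run as, for example, with_write_lock ./rdmp CreateNewEmptyCatalogue --dir ./rdmp-yaml, while read-only commands can be run without the wrapper.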

Multiple concurrent read-only processes are permitted. The easiest way to achieve this is to remove write permissions on the directory (e.g. ./rdmp-yaml) for the accounts that run those processes. Ensure that configuration changes can only be made by a single admin user account running a single 'write' process at a time.
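On a Unix-like system, removing write permissions could look like the sketch below. It assumes the admin account owns the directory and that the read-only accounts reach it through group or other permissions; RDMP_DIR is a hypothetical variable name. On Windows the equivalent would be done through the directory's access control settings.

```shell
#!/bin/sh
# Sketch: make the RDMP YAML directory read-only for everyone except its owner.
# Assumption: the admin (write) account owns RDMP_DIR (hypothetical name).
RDMP_DIR="${RDMP_DIR:-./rdmp-yaml}"
mkdir -p "$RDMP_DIR"

# Remove write permission for group and others, recursively.
# The owner keeps write access, so the admin account can still change configuration.
chmod -R go-w "$RDMP_DIR"
```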

When running with multiple read-only processes, a restart or refresh may be required to pick up configuration changes made by other processes.

As mentioned in the Quick Start, you will still need a relational database for audit, for storing cohorts, etc. This can be any DBMS type (e.g. MySQL) and can even be the same server you use for your data repository.
