Sometimes you may want to run a minimal deployment of RDMP to achieve a specific task (e.g. load data). Or you may want your RDMP configuration to remain largely static (e.g. running a specific configuration repeatedly within virtual containers).
For these use cases, setting up a Microsoft SQL Server backend for the platform databases would be inefficient. You can instead use a file system backend for RDMP.
To run RDMP with a file system backend, pass the --dir option to the command line or Windows GUI client (e.g. via a shortcut to ResearchDataManagementPlatform.exe).
Simple commands will work out of the box, e.g.:
./rdmp CreateNewEmptyCatalogue --dir ./rdmp-yaml
./rdmp ls Catalogue --dir ./rdmp-yaml
The terminal UI is fully compatible with YamlRepository:
./rdmp gui --dir ./rdmp-yaml
If you encounter rendering issues with the TUI, add the --usc flag (particularly if using remoting, e.g. SSH).
For most activities (e.g. data load) you will require a logging (audit) database. This can be any DBMS supported by RDMP (SQL Server, MySQL, etc.).
./rdmp CreateNewExternalDatabaseServer LiveLoggingServer_ID "DatabaseType:MicrosoftSQLServer:Name:MyLoggingDb:Server=(localdb)\MSSQLLocalDB;Trusted_Connection=True;" --dir ./rdmp-yaml
Using a file system backend for RDMP allows use of tools such as git for change tracking.
YAML files generated by RDMP are all human-readable and will result in a sensible diff log over time.
Using a file system backend removes the requirement for having a SQL Server instance available.
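As a rough sketch of the change-tracking workflow described above, the helper below commits the current state of the repository directory with git. The function name snapshot_config is hypothetical and not part of RDMP. The sketch assumes only that git is installed and that the directory is the one passed to --dir.

```shell
# Hypothetical helper (not part of RDMP): commit the current state of an
# RDMP file system repository so configuration changes appear in git history.
snapshot_config() {
  dir="$1"
  message="$2"

  # Create the git repository on first use.
  [ -d "$dir/.git" ] || git -C "$dir" init -q

  # Stage every added, changed or deleted file.
  git -C "$dir" add -A

  # Commit only when something is actually staged.
  git -C "$dir" diff --cached --quiet || git -C "$dir" commit -q -m "$message"
}
```

After a configuration change you could run, for example, snapshot_config ./rdmp-yaml "Add empty catalogue" and then review the history with git -C ./rdmp-yaml log -p.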
YamlRepository does not support multiple concurrent user writes. ID allocations for new objects are based on the current system state. Therefore, if multiple separate processes create objects at once, their ID allocations will be mismatched, which will lead to corruption.
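One way to reduce the risk of accidental concurrent writers is to funnel every write command through an exclusive lock. This is a sketch only: it assumes the flock utility from util-linux is available, and both the rdmp_write name and the lock file path are hypothetical. It serialises only the processes that use the wrapper and does not make YamlRepository itself safe for concurrent writes.

```shell
# Hypothetical wrapper (not part of RDMP): run a command while holding an
# exclusive lock, so that wrapped write commands execute one at a time.
# Assumes the flock utility from util-linux is installed.
rdmp_write() {
  lock_file="$1"
  shift
  # flock waits until the lock is free, runs the command, then releases it.
  flock "$lock_file" "$@"
}
```

For example: rdmp_write /tmp/rdmp-yaml.lock ./rdmp CreateNewEmptyCatalogue --dir ./rdmp-yaml. Keeping the lock file outside the repository directory means it does not end up alongside the YAML files.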
Multiple concurrent read-only processes are permitted. The easiest way to achieve this is to remove write permissions on the directory (e.g. ./rdmp-yaml) for the running accounts. Ensure that configuration changes can only be made by a single admin user account running a single 'write' process at a time.
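On a POSIX system, removing write permissions might look like the sketch below. The helper names are hypothetical and not part of RDMP. This is a blunt variant that removes write access for every user, including the owner, so the admin account has to restore it before making changes. On Windows you would adjust the directory's ACLs instead.

```shell
# Hypothetical helper (not part of RDMP): strip write permission from a
# repository directory and everything beneath it, for all users.
make_readonly() {
  chmod -R a-w "$1"
}

# Hypothetical counterpart: give the owner write access back before the
# admin account makes configuration changes.
make_writable() {
  chmod -R u+w "$1"
}
```

For example, run make_readonly ./rdmp-yaml before starting the read-only processes, and make_writable ./rdmp-yaml when the admin account needs to change the configuration. Note that the root account can write regardless of these permission bits.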
When running with multiple read-only processes, a restart or refresh may be required to pick up configuration changes made by other processes.
As mentioned in the Quick Start, you will still need some kind of relational database for audit, for storing cohorts, etc. However, this can be any DBMS type (e.g. MySQL) and can even be the same server you use for your data repository.