Risks And Technical Debt
The following risks have been identified:
- Performance: support for long-running, demanding computations has not been clearly provisioned yet. Testing is required to determine how well data streaming is supported, so that it does not slow down computational processes.
- Clinical environment: the requirements for handling clinical data have not been clearly defined yet. We hope that the current architecture covers most of these needs.
- Pseudonymisation of individuals: this process is currently left to data providers to implement and support.
- Data exposure: data analysis must not expose sensitive data through logs and files, but given the volume of data, this is difficult to verify.
- Filenames of uploads may contain sensitive information.
- Metadata managed by different organisations may describe the same individuals with overlapping information; merging that information may cause unexpected results.
- Patients (data subjects) cannot view how their data is used. The system is not currently required to support this.
- System monitoring may have blind spots or produce an overload of information.
- Administrative users may not respond to requests from other users. We aim to provide email-based notifications.
- Users may find the secure processing environment too restrictive (they cannot peek into files) and hard to use (they have to develop new scripts and images to use existing tools).
- Docker images need to be continuously monitored for potential security bugs.
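Two of the risks above, pseudonymisation of individuals and sensitive information in upload filenames, can be illustrated with a minimal sketch. It is not part of the current system: the function names, the `ALLOWED_SUFFIXES` set, and the choice of HMAC-SHA256 and random UUIDs are assumptions made for illustration. A data provider could derive stable pseudonyms with a keyed hash, and the system could store uploads under a random name instead of the name supplied by the user.

```python
import hashlib
import hmac
import uuid
from pathlib import PurePath

# Hypothetical allowlist: only these extensions are carried over to storage.
ALLOWED_SUFFIXES = {".vcf", ".csv", ".txt", ".gz"}


def pseudonymise(subject_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier using HMAC-SHA256.

    The same identifier and key always yield the same pseudonym; without
    the key, the pseudonym cannot be recomputed from the identifier.
    """
    digest = hmac.new(secret_key, subject_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def storage_name(original_filename: str) -> str:
    """Return a random storage name, keeping only an allowlisted extension."""
    suffix = PurePath(original_filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        suffix = ""  # drop unknown extensions rather than trusting user input
    return f"{uuid.uuid4().hex}{suffix}"
```

Under these assumptions, an upload named `John_Smith_38001010000.vcf` would be stored as 32 random hexadecimal characters followed by `.vcf`, so the original name never reaches the storage layer or the logs. Key management for `secret_key` is outside the scope of this sketch.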
The following technical debt has been identified:
- The Estonian Biobank has not developed its systems for integrating with public interfaces: pipelines are needed to manually prepare the metadata for the GDI node.
Since most of the system is built from scratch, the list of technical debt is minimal.