Risks And Technical Debt

The following risks have been identified:

  1. Performance: support for long-running, demanding computations has not been clearly provisioned yet. Testing is required to determine how well data streaming is supported, so that it does not slow down computational processes.
  2. Clinical environment: the requirements for handling clinical data have not been clearly defined yet. We hope that the current architecture covers most of the needs.
  3. Pseudonymisation of individuals: this process is currently left to data providers to implement and support.
  4. Data exposure: data analysis must not expose sensitive data through logs and files, but given the amount of data, it is difficult to judge whether any exposure has occurred.
  5. Filenames of uploads may contain sensitive information.
  6. Metadata managed by different organisations may describe the same individuals with overlapping information; merging that information may cause unexpected results.
  7. Patients (data subjects) cannot view how their data is used. The system is not currently required to support this.
  8. System monitoring may have blind spots or an overload of information.
  9. Administrative users may not respond to requests from other users. We aim to provide email-based notifications.
  10. Users may find the secure processing environment too restrictive (they cannot peek into files) and hard to use (they have to develop new scripts and images to use existing tools).
  11. Docker images need to be continuously monitored for potential security bugs.
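
One way to mitigate risk 5 is never to store an upload under the name it was submitted with. This is a minimal sketch in Python, assuming uploads can be stored under generated names. The function `opaque_upload_name` and the `ALLOWED_SUFFIXES` allow-list are hypothetical illustrations, not part of the current system:

```python
import uuid
from pathlib import PurePosixPath

# Hypothetical allow-list of file extensions considered safe to keep.
ALLOWED_SUFFIXES = {".vcf", ".csv", ".txt", ".gz"}


def opaque_upload_name(original_name: str) -> str:
    """Return a storage name that carries none of the original filename's text."""
    # Keep the extension only if it is on the allow-list, so that tools
    # can still detect the file type; everything else is discarded.
    suffix = PurePosixPath(original_name).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        suffix = ""
    # A random UUID replaces the user-supplied part of the name.
    return f"{uuid.uuid4().hex}{suffix}"


if __name__ == "__main__":
    print(opaque_upload_name("john_doe_1985.vcf"))
```

If the original filename has to be kept for the uploader's convenience, it would need to be stored as access-controlled metadata rather than in the file path.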

The following technical debt has been identified:

  1. The Estonian Biobank has not developed its systems for integration with public interfaces: pipelines are needed to prepare the metadata for the GDI node manually.

Since most of the system is built from scratch, the list of technical debt is minimal.