Context And Scope
The current architecture is defined primarily in the context of Estonia. The following sections describe how the solution will be deployed.
Business Context
Estonian Biobank is the center of genomic research in Estonia and hosts a number of domain experts. Since its foundation, Estonian Biobank has collected samples combined with phenotypic data from more than 200,000 individuals. It has conducted or participated in hundreds of studies. Full genomes have been sequenced from samples only recently; before that, the data was mostly genotyped. Estonian Biobank is regulated by the Human Genes Research Act (IGUS), and information linking genetic data with specific individuals is highly classified.
HPC is the IT services provider for Estonian Biobank and the rest of academia. It is specifically designed to support large-scale computations on big data.
The Hospital of the University of Tartu is an example of a hospital that performs sequencing at the request of medical doctors. The genetic data is analysed for specific genetic diseases or rare disorders. In this case, the sequenced data is produced in a clinical context and has clinical value for other doctors. From the government's perspective, sequencing has a cost, and re-using the data could save time and money. Ideally, the health sector would also store genomic data for record-keeping and re-use; however, the health-care data storage (TEHIK) currently accepts only health records. Therefore, hospitals need to store the data within their IT systems.
Genomic data and health-care researcher is a broad term for any individual with a bioinformatics background. The most valuable asset for such researchers is, of course, genomic data, often combined with some phenotypic data.
The following image depicts the relationship between the actors in the business context of the GDI. The hospital and Estonian Biobank are presented as data providers. The GDI node system functions as a data collector and processor that makes the data discoverable and accessible to researchers through the User Portal. The GDI node (including the help desk) is run by Estonian Biobank in partnership with HPC.

Technical Context
The GDI will provide central services:
- User Portal – the user interface for researchers to discover datasets and request access to selected ones.
- Life Science AAI – federated user authentication service.
- Beacon Network – federated network for distributing and collecting Beacon API requests.
- REMS – management service for dataset access-requests.
(These services are not further covered in this architecture document.)
The GDI node will provide its data catalogue and Beacon API as online services. When data access is granted, the GDI node will allocate a secure processing environment (SPE) for the researcher(s).
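The step from an access decision to SPE allocation can be illustrated with a minimal sketch. It is not the GDI node implementation and does not use the REMS or Beacon APIs. All class, method and identifier names (AccessRequest, GdiNode, handle_decision, the "spe-N" naming) are hypothetical and chosen for illustration only. The sketch assumes that the node learns the decision for each request and allocates one SPE per researcher and dataset pair.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Set, Tuple


class AccessStatus(Enum):
    """Hypothetical states of a dataset access request."""
    REQUESTED = "requested"
    GRANTED = "granted"
    REJECTED = "rejected"


@dataclass
class AccessRequest:
    """Hypothetical record of one researcher asking for one dataset."""
    researcher_id: str
    dataset_id: str
    status: AccessStatus = AccessStatus.REQUESTED


@dataclass
class GdiNode:
    """Toy model of a node: a data catalogue plus SPE allocations."""
    catalogue: Set[str]
    spe_by_request: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def handle_decision(self, request: AccessRequest) -> Optional[str]:
        """Return an SPE identifier once access is granted, otherwise None."""
        if request.dataset_id not in self.catalogue:
            raise KeyError(f"unknown dataset: {request.dataset_id}")
        if request.status is not AccessStatus.GRANTED:
            return None
        key = (request.researcher_id, request.dataset_id)
        # Assumption: repeated decisions for the same pair reuse the same SPE.
        if key not in self.spe_by_request:
            self.spe_by_request[key] = f"spe-{len(self.spe_by_request) + 1}"
        return self.spe_by_request[key]


node = GdiNode(catalogue={"dataset-001"})
request = AccessRequest("researcher-42", "dataset-001")
print(node.handle_decision(request))  # None: access not yet granted
request.status = AccessStatus.GRANTED
print(node.handle_decision(request))  # spe-1
```

In this sketch the node allocates nothing until the request is marked as granted, which mirrors the separation between the central access-management service and the node-local processing environment.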
