Clarifications: EoE SNSDE for R&D - Solution to support data harmonisation, cohort discovery, de-identification, and linkage

Questions and Clarifications

Below is a list of the 23 questions and clarifications relating to this opportunity.

1. You are asking suppliers to keep the tender documentation confidential. Can we share the tender documentation with potential technology partners or sub-contractors?

We can confirm that the tender documentation can be shared with technology partners or sub-contractors. However, any partners engaged must keep the documentation confidential and must not share it with any other parties.

Last Updated : 16/09/2024

2. What is the anticipated budget envelope?

We are not able to share the budget envelope for this project. Pricing proposals should be submitted by shortlisted bidders as part of their stage 2 responses.

Last Updated : 16/09/2024

3. Who delivered the Discovery and Alpha stages?

Following a procurement exercise, Kainos Software Limited were awarded the contract to design and build the MVP for the East of England Secure Data Environment. What they have delivered, in close collaboration with AWS Professional Services, is covered in section 1.2 of 00_EoE-SNSDE_Data_harmonisation_Ptr_spec. As this engagement will combine the MVP SDE platform with existing functionality from the chosen partner, we felt it best matched the beta definition in the GOV.UK Service Manual (https://www.gov.uk/service-manual/agile-delivery#phases-of-an-agile-project).

Last Updated : 16/09/2024

4. Which assets are required to be live by March?

By March, we expect regional data assets, particularly secondary care data, from trusts within the East of England region to be live.

Last Updated : 17/09/2024

5. Do you plan to onboard national assets by March?

This work is focussed on supporting cohort discovery across participating NHS Trusts in the region. We may separately bring in national data assets to the EoE SDE in this timeframe, but they are out of scope for this discovery infrastructure work.

Last Updated : 17/09/2024

6. Given that vendors will eventually access PLD after the initial stage of the proof of concept, are you comfortable with using a combination of on-shore and off-shore resources (European and non-European)? If so, then are there any restrictions you have in mind with regards to conducting work off-shore (e.g. only on test data or de-identified data) and locations (e.g. Europe or UK Data Partnership countries only)?

All resources must be within the UK. It is not acceptable for any data to be transferred or accessed outside of the UK for any purpose.

Last Updated : 18/09/2024

7. Is it correct to assume that you will be the data controller and that vendors will act as data sub-processors on your behalf, using your systems and infrastructure?

Cambridge University Hospitals is the data controller for the East of England Secure Data Environment (and lead organisation). Relevant Data Processing Agreements will be put in place where required.

Last Updated : 18/09/2024

8. You are looking for a proven solution: Could you please outline where you would like that solution to sit on the continuum between fully packaged and delivered as a Service versus being more open-source based using commodity components assembled on your behalf?

Any combination would be considered. However, there is a requirement to use existing and proven products that can be "knitted" together in a novel way that meets the specification. Use of open-source tooling is not discouraged; however, the support response available for it should be considered.

Last Updated : 18/09/2024

9. In the data harmonization section, there is a requirement of “Transform and load (ETL) services which can harmonise synthetic EPRs from the NHS data providers into the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) v5.4.” Please could you expand on what synthetic EPRs are?

Synthetic or "dummy" data will be created to mimic the data we will receive from the requested data fields within the Electronic Patient Record (EPR) system that each data provider is using. The data fields are likely to differ between EPRs. Synthetic data will be used to test the data harmonisation between different providers and the OMOP Common Data Model, prior to tests on live data.
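
As an illustration only, the kind of harmonisation described above might be sketched as follows. The two EPR field layouts ("epr_a", "epr_b"), the sample records and the function name are hypothetical and are not taken from the specification; the target keys are a subset of the OMOP CDM v5.4 PERSON table, and race and ethnicity are set to 0 (no matching concept) purely for brevity.

```python
from datetime import date

# OMOP standard gender concept IDs: 8507 = MALE, 8532 = FEMALE.
GENDER_CONCEPTS = {"M": 8507, "MALE": 8507, "F": 8532, "FEMALE": 8532}

# Hypothetical field layouts for two EPR systems; real extracts will differ.
FIELD_MAPS = {
    "epr_a": {"id": "PatientID", "sex": "Sex", "dob": "DateOfBirth"},
    "epr_b": {"id": "mrn", "sex": "gender_code", "dob": "birth_dt"},
}


def to_omop_person(record, source, person_id):
    """Map one synthetic EPR record onto a subset of OMOP PERSON fields."""
    fields = FIELD_MAPS[source]
    sex = str(record[fields["sex"]]).strip().upper()
    dob = date.fromisoformat(record[fields["dob"]])
    return {
        "person_id": person_id,
        # 0 means "no matching concept" when the source value is unrecognised.
        "gender_concept_id": GENDER_CONCEPTS.get(sex, 0),
        "year_of_birth": dob.year,
        "month_of_birth": dob.month,
        "day_of_birth": dob.day,
        "race_concept_id": 0,
        "ethnicity_concept_id": 0,
        "person_source_value": str(record[fields["id"]]),
        "gender_source_value": str(record[fields["sex"]]),
    }


# Synthetic ("dummy") records in the two hypothetical layouts.
synthetic_a = {"PatientID": "A-0001", "Sex": "F", "DateOfBirth": "1980-04-12"}
synthetic_b = {"mrn": 90017, "gender_code": "male", "birth_dt": "1975-11-03"}

person_rows = [
    to_omop_person(synthetic_a, "epr_a", person_id=1),
    to_omop_person(synthetic_b, "epr_b", person_id=2),
]
```

Both records end up with the same set of keys despite the differing source layouts, which is the property the synthetic data is intended to test before live data is used.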

Last Updated : 18/09/2024

10. Please could you provide an indicative budget that will enable us to submit with budget considered.

We are not able to share the budget envelope for this project. Pricing proposals should be submitted by shortlisted bidders as part of their stage 2 responses.

Last Updated : 18/09/2024

11. Can you please confirm whether the 750 character limit is for each question on each section (essential skills and experience section and nice to have skills and experience section), i.e. Q1.1 750 characters, Q1.2 750 characters and so on or whether the 750 character limit is cumulative for each section, i.e. 750 characters in total for the essential skills and experience section and 750 characters in total for the nice to haves skills and experience section.

We can confirm that the 750 character limit is per question and applies to each question in both essential skills and experience and nice to have skills and experience.

Last Updated : 18/09/2024

12. What is the information governance status in relation to Section 251 CAG and REC Approval for the permission to de-identify assets for the purpose of research?

Applications have been made both to the Research Ethics Committee and the Confidentiality Advisory Group to seek ethical approval for the SDE as a database and for the permission to de-identify assets for the purpose of research, respectively.

Last Updated : 18/09/2024

13. What assets will be covered by the CAG and REC approval for the SDE?

Assets covered by the approval, and so made available through the EoE SDE, include all data from patients who have an EPR at participating NHS organisations based in the EoE and who have not opted out. The data available through the EoE SDE will initially include structured clinical data collected as part of routine care, subject always to approval from the EoE SNSDE Data Access Committee.

Last Updated : 18/09/2024

14. What is the information governance status of the local information governance framework for the onboarding of local assets?

The proposed local information governance framework for the SDE is included within the database protocol which is currently under review by REC and CAG.

Last Updated : 18/09/2024

15. The timeframe for submission of stage 1 suggests submission is on 26/09/2024 at 12:00 but in 4.1 of the tender document it specifies 25/09/2024 as the submission date. Please can you confirm that the correct date is the 26/09/2024 for Stage 1 Submission.

We can confirm that the correct date for the Stage 1 Submission is 26/09/2024 at midday. Apologies for the error in 4.1.

Last Updated : 19/09/2024

16. Does the buyer want the supplier to provide the platform and software or to deliver our software on the buyer’s existing platform?

For OMOP transformations and surfacing data for discovery within NHS provider organisations, a platform-agnostic approach suitable for local deployment would be preferable (or requirements made clear if this needs to be standardised). Tooling to aggregate responses from the data providers and link to the national system via API would ideally sit within the existing SDE footprint. However, adjacent infrastructure on AWS that is administratively managed by the SDE is acceptable.
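
As a minimal sketch of the aggregation step mentioned above, the example below combines cohort counts returned by several data providers into one summary. The response shape, provider names and query identifiers are assumptions made for illustration; the specification does not define them, and the link to the national system via API is not modelled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderResponse:
    provider: str                  # hypothetical provider identifier
    query_id: str                  # hypothetical cohort query identifier
    patient_count: Optional[int]   # None if the provider could not answer


def aggregate(query_id, responses):
    """Sum the counts for one query and record which providers answered."""
    matching = [r for r in responses if r.query_id == query_id]
    answered = [r for r in matching if r.patient_count is not None]
    missing = [r for r in matching if r.patient_count is None]
    return {
        "query_id": query_id,
        "total_count": sum(r.patient_count for r in answered),
        "providers_answered": sorted(r.provider for r in answered),
        "providers_missing": sorted(r.provider for r in missing),
    }


responses = [
    ProviderResponse("trust_a", "q1", 120),
    ProviderResponse("trust_b", "q1", None),
    ProviderResponse("trust_c", "q1", 45),
    ProviderResponse("trust_a", "q2", 7),
]
summary = aggregate("q1", responses)
```

Only aggregate counts leave each provider in this sketch, so the aggregator can sit within the SDE footprint or on adjacent infrastructure without handling record-level data.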

Last Updated : 19/09/2024

17. The specification suggests “extracting the required information from their record systems including Electronic Patient Records (EPRs) and pathology/pharmacy records.” Can you confirm what LIMS/PIS systems you require a supplier to integrate with?

We are currently engaging with NHS provider organisations to confirm which will participate in the POC and MVP. These organisations all use different systems, so the solution will need to interact with different types of LIMS and PIS. We expect the proposed solution to be flexible enough to be adapted to new systems as and when they are integrated.

Last Updated : 19/09/2024

18. The specification suggests that the Live Data PoC must integrate with at least 2 different types of EPR data such as EPIC and Cerner. Can you confirm if the PoC/MVP will be required to integrate with others?

The POC and MVP will need to integrate with multiple different EPR systems across the East of England as different trusts use different systems. We expect there to be an expansion of systems as we move from POC to MVP. The precise list of systems is still being defined. Epic and Cerner have been given as examples as these are systems in use across East of England Trusts currently.

Last Updated : 19/09/2024

19. How many users do you expect to undertake MVP testing with?

We expect the MVP testing to be with at least 3 NHS provider organisations and at least 10 researchers using the Cohort Discovery and delivery services. This may change once use cases have been refined.

Last Updated : 19/09/2024

20. Are you able to share the size and complexity of the datasets (Synthetic and Live) you will be using for both the PoC and MVP delivery.

A target dataset for one trust would cover approximately 100,000 patients in one disease area, with longitudinal patient records covering ~10 years. This would be split across half a dozen or so tables and take up a few GB of space uncompressed. This will vary slightly between trusts, as coverage will differ in the number of patient records included, the data fields available, and the length of time for which electronic records are available. The synthetic dataset will be a smaller subset, covering the same breadth of data but likely fewer patients over a shorter time period.
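
For a rough sense of scale, the figures above can be turned into a back-of-envelope estimate. The patient count and time span come from the answer; the events per patient per year and the bytes per row are illustrative assumptions only, chosen to show how a "few GB" figure could arise.

```python
def estimate_size_gb(patients, years, events_per_patient_year, bytes_per_row):
    """Return (row count, uncompressed size in decimal GB) across all tables."""
    rows = patients * years * events_per_patient_year
    return rows, rows * bytes_per_row / 1e9


rows, size_gb = estimate_size_gb(
    patients=100_000,            # from the answer above
    years=10,                    # from the answer above
    events_per_patient_year=20,  # assumption: not stated in the source
    bytes_per_row=150,           # assumption: not stated in the source
)
print(f"{rows:,} rows, about {size_gb:.1f} GB uncompressed")
```

Under these assumptions the estimate is 20,000,000 rows and about 3.0 GB, which is in line with the "few GB" stated; actual volumes will depend on each trust's coverage.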

Last Updated : 19/09/2024

21. Can you clarify your build preference for example, would you expect a supplier to adapt existing open-source tools and frameworks (e.g., OMOP transformation scripts) or would you expect a supplier to adapt a commercial-off-the-shelf solution?

Any combination would be considered. However, there is a requirement to use existing and proven products that can be "knitted" together in a novel way that meets the specification. Use of open-source tooling is not discouraged; however, how actively each tool is supported should be considered.

Last Updated : 19/09/2024

22. Can you clarify your MVP requirement in relation to ‘each NHS data provider’s own infrastructure’. Can each NHS data provider host one or more Kubernetes container images, or a Virtual Machine image, or do you expect new physical hardware and software to be required? Do you have a preferred deployment model agreed?

Each NHS provider will provide infrastructure to support execution of a machine image within their compute & GDPR real-estate. We do not expect new physical hardware to be required, nor software beyond the proposed solution.

Last Updated : 19/09/2024

23. Please could an extension be granted that will take us to midday on the 30th September?

Unfortunately, we are unable to grant an extension due to the tight timescales we have to deliver this project.

Last Updated : 19/09/2024
