Soteria at the University of Arizona is a secure data and analysis enclave for conducting research with Personally Identifiable Information (PII) and Protected Health Information (PHI).
- Working with PII has become commonplace in many disciplines. However, the regulations and compliance requirements that exist to protect this data often create barriers for highly collaborative, interdisciplinary projects that manage multiple data types. The Soteria environment is committed to providing a secure, managed data and analysis enclave that is HIPAA compliant and easily accessible.
- Unified, secure enclaves like Soteria have been established to accelerate research and collaboration. Soteria takes inspiration from efforts such as the NIH N3C (National COVID Cohort Collaborative), which allows healthcare providers, public health experts, and epidemiologists to collaborate and draw conclusions at a speed previously inaccessible to researchers. With access to multiple sources of PII and PHI, University of Arizona researchers will be able to perform similarly groundbreaking research via Soteria.
- With funding support from the University of Arizona Health Sciences 5.3 Strategic Initiatives plan, the University of Arizona Data Science Institute, University of Arizona Information Technology Services, the Center for Biomedical Informatics and Biostatistics, and CyVerse have joined forces to bring big data science techniques and capabilities to biomedical research.
Frequently Asked Questions (FAQs)
Why is it called Soteria?
In Greek mythology, Soteria was the goddess or spirit of safety and salvation, deliverance, and preservation from harm.
Will this satisfy NIH guidelines for data storage and practices?
Yes, UA Soteria is in compliance with the most recent NIH guidelines, including for genomic data.
Will there be HPC capabilities for protected data?
Yes, Soteria at UArizona has a dedicated HPC cluster exclusive to its users with the same compliance standards.
Will I be able to share my analysis apps built in this environment?
Soteria has Posit Connect Health capabilities, which allow for publishing and distribution of fully de-identified data and results.
Will it support Jupyter notebooks?
Yes. Soteria uses CyVerse Health and Posit Connect Health, which support Jupyter notebooks, R Markdown, R Shiny, AWS capabilities, and more.
Will I need significant technical expertise?
No, you won’t need to know any software engineering, only the technical knowledge required to run the programs you already use. Our programmers, engineers and IT managers will handle the rest.
Will I be able to share my data or work with external collaborators?
As long as your collaborator has a UA NetID or the ability to get one, they will be able to access Soteria at UArizona with approval.
Can I import code from GitHub?
Yes, but you will not be able to export code to GitHub given the sensitive nature of information stored in the environment.
Can I use Soteria to publish my dataset?
You can work with the honest brokering service of CB2 (the Center for Biomedical Informatics and Biostatistics) to completely de-identify your data for public storage or publishing. We hope to work with the UA Libraries ReDATA project to curate a collection of de-identified healthcare datasets as the project progresses.
How will this work with RIA data requests and IRB approval?
UAHS RIA, IRB, and HIPAA offices are all aware of and in communication with Soteria operations at UArizona. Both RIA and IRB operating systems lend themselves well to the cloud-based nature of Soteria.