Arc Institute

www.arcinstitute.org

1 Job

265 Employees

About the Company

Arc Institute is a new scientific institution that conducts curiosity-driven basic science and technology development. Headquartered in Palo Alto, California, Arc is a non-profit organization founded on the belief that many important research programs will be enabled by new academic models. Arc operates in partnership with Stanford University, UCSF, and UC Berkeley. As individuals, Arc researchers collaborate across diverse disciplines to study complex diseases, including cancer, neurodegeneration, and immune dysfunction. As an organization, Arc strives to enable long-term research agendas by betting on people rather than projects, and making it easier to invent and deploy new technologies at scale. Together, our mission is to accelerate scientific progress, understand the root causes of disease, and narrow the gap between discoveries and impact on patients.

Listed Jobs

Company Name
Arc Institute
Job Title
Infrastructure Engineer
Job Description
**Job Title**

Infrastructure Engineer

**Role Summary**

Design, implement, and maintain a hybrid cloud infrastructure platform for computational biology. Build scalable data pipelines, database systems, and data discovery tools to support AI and bioinformatics projects.

**Expectations**

- Deliver high‑availability compute, networking, and storage solutions across public, private, and on‑premises environments.
- Enable researchers to access and analyze large scientific datasets efficiently.
- Collaborate cross‑functionally to translate scientific requirements into technical architecture.
- Continuously optimize performance, reliability, and cost of infrastructure.

**Key Responsibilities**

- Architect and deploy scalable data pipelines for single‑cell genomics and other experimental datasets using Nextflow, Prefect, or GCP Cloud Workflows.
- Build ExperimentDB from inception: design schema, implement PostgreSQL storage, and expose REST/GraphQL APIs.
- Develop catalog, metadata, and governance systems to support data discovery and access.
- Automate bioinformatics workflows, reducing manual steps through orchestration and agentic design.
- Optimize query performance and troubleshoot distributed data systems.
- Establish best practices for data quality, versioning, documentation, and reproducibility.
- Partner with scientists and engineers to understand data requirements and deliver tailored solutions.

**Required Skills**

- Experience with workflow orchestration platforms (Nextflow, Prefect, Airflow, or similar).
- Proficiency in Python and SQL for ETL, data transformation, and API development.
- Strong background with distributed data technologies: analytics warehouses, relational databases (PostgreSQL), object storage, parallel/distributed file systems.
- Database design and optimization for large scientific workloads.
- Familiarity with bioinformatics formats and tools (FASTQ, BAM, Cell Ranger, single‑cell analysis) is highly desirable.
- Excellent troubleshooting, problem‑solving, and communication skills.
- Ability to work in a hybrid onsite arrangement.

**Required Education & Certifications**

- Bachelor’s degree in Computer Science, Data Engineering, Bioinformatics, or a related field.
- No specific certifications required.
Palo Alto, United States
On site
08-12-2025