Job Specifications
About Arc Institute
The Arc Institute is a new scientific institution that conducts curiosity-driven basic science and technology development to understand and treat complex human diseases. Headquartered in Palo Alto, California, Arc is an independent research organization founded on the belief that many important research programs will be enabled by new institutional models. Arc operates in partnership with Stanford University, UCSF, and UC Berkeley.
While the prevailing university research model has yielded many tremendous successes, we believe in the importance of institutional experimentation as a way to make progress. Arc's experiments include:
Funding: Arc will fully fund Core Investigators’ (PIs’) research groups, liberating scientists from the typical constraints of project-based external grants.
Technology: Biomedical research has become increasingly dependent on complex tooling. Arc Technology Centers develop, optimize and deploy rapidly advancing experimental and computational technologies in collaboration with Core Investigators.
Support: Arc aims to provide first-class operational, financial, and scientific support that will enable scientists to pursue long-term, high-risk, high-reward research that can meaningfully advance progress toward cures for diseases including neurodegeneration, cancer, and immune dysfunction.
Culture: We believe that culture matters enormously in science and that excellence is difficult to sustain. We aim to create a culture that is focused on scientific curiosity, a deep commitment to truth, broad ambition, and selfless collaboration.
Arc has scaled to nearly 300 people to date. With $650M+ in committed funding and a state-of-the-art new lab facility in Palo Alto, Arc will continue to grow quickly in the coming years.
About the position
We are seeking an Infrastructure Engineer to join our team. In this role, you will be responsible for designing and optimizing our Hybrid Cloud Infrastructure Platform across public, private, and on-premises datacenters. You will work closely with researchers, developers, and IT professionals to ensure the availability, reliability, and performance of our compute, networking, and storage. Your work will fuel the development of AI biological foundation models (e.g., Evo2, Arc’s recently updated DNA foundation model), the Virtual Cell Initiative, and other cutting-edge bioinformatics projects as part of Institute-wide efforts.
About you
You lead with empathy. You know that successful systems are more about the user than the tool. You enjoy building relationships and credibility with your colleagues.
You enjoy solving problems. Any new project is an interesting puzzle. So is a tricky troubleshooting issue. You get satisfaction from helping someone get to resolution.
You’re curious. You like to keep track of the latest developments in your field, and to learn about the substance behind your employer’s mission.
In this position you will
Design and implement scalable data pipelines for single-cell genomics, experimental datasets, and bioinformatics workflows using tools like Nextflow, Prefect, or GCP Cloud Workflows
Build and maintain ExperimentDB from the ground up (0 to 1), establishing the database architecture, data models, and APIs to support experimental data capture and retrieval across the institute
Develop data discovery infrastructure including catalog systems, metadata management, and data governance frameworks to enable scientists to find and access relevant datasets
Automate bioinformatics workflows and create agentic systems (e.g., automated single-cell analysis pipelines) that reduce manual intervention and accelerate research
Collaborate with scientists and engineers to understand data requirements, optimize query performance, and deliver tailored data infrastructure solutions
Establish best practices for data quality, versioning, and documentation to ensure reproducibility and reliability across research projects
Requirements
Bachelor's degree in Computer Science, Data Engineering, Bioinformatics, or a related field
Strong experience with workflow orchestration platforms (Nextflow, Prefect, Airflow, or similar) and building production data pipelines
Extensive experience with distributed systems, specifically data technologies: analytics warehouses, relational databases (PostgreSQL-based), object storage, and parallel/distributed file systems
Advanced skills in Python and SQL for data transformation, ETL processes, and API development
Experience with database design and optimization for scientific or analytical workloads at scale
Familiarity with bioinformatics data formats and tools (FASTQ, BAM, Cell Ranger, single-cell analysis) is a strong plus
Excellent problem-solving abilities and strong communication skills for translating scientific needs into technical solutions
Ability to work a hybrid schedule, onsite 3 days per week, in Palo Alto, CA
The base salary range for this position is $158,500 to $196,000. These amounts reflect the range of base salar