**Company name:** imec
**Job title:** HPC Unix Engineer
**Role Summary:**
Manage, optimize, and support a high‑performance computing (HPC) environment for simulation, electronic design automation (EDA), and numerical modeling workloads. Focus on SLURM cluster administration, user support via Open OnDemand, system monitoring, and automation of cluster configuration and software deployment.
**Expectations:**
- Deliver 2nd‑line Linux/HPC support and incident resolution.
- Ensure cluster performance, scalability, and stability.
- Implement automation, monitoring, and secure configurations.
- Maintain scientific software stacks and container environments.
**Key Responsibilities:**
1. **Operational Support & Incident Management**
- Resolve job scheduling, performance, and software usage issues.
- Provide 2nd‑line support for Linux and HPC environments.
- Support users through Open OnDemand (interactive and batch).
- Follow ITIL processes for incidents, changes, and requests.
- Monitor system and job metrics with Prometheus/Grafana.
2. **HPC System Management**
- Administer SLURM clusters, compute nodes, and GPU resources.
- Perform OS patching, configuration management, and automation (Ansible, Bash, Python).
- Manage parallel/shared filesystems (NFS, Lustre) and object storage (S3).
- Control user access, quotas, and secure network settings.
- Manage FlexLM licensing.
3. **Software, Containers & Build Frameworks**
- Deploy and manage container technologies (Apptainer/Singularity, Podman, Enroot, Pyxis).
- Build and maintain scientific software (Ansys, Cadence, COMSOL, MATLAB, Mentor, Synopsys) via EasyBuild and Lmod.
- Optimize deployments, including Azure CycleCloud integration.
**Required Skills:**
- Proven experience with HPC environments, parallel computing, and job scheduling (SLURM).
- Strong problem‑solving and analytical abilities.
- Scripting and automation expertise (Ansible, Bash, Python).
- Understanding of performance tuning, cluster monitoring, and system hardening.
- Familiarity with Open OnDemand; Azure CycleCloud is a plus.
- Exposure to EDA and engineering software workflows in HPC contexts is advantageous.
**Required Education & Certifications:**
- Bachelor’s or Master’s degree in Computer Science, Information Technology, Systems Engineering, or a related field.
- Relevant certifications (e.g., Red Hat Certified Engineer, Linux Professional Institute, or HPC‑specific credentials) are preferred but not mandatory.