Job Specifications
Bristol, Darlington, London, Manchester, Wolverhampton
Job Summary
Here at the Ministry of Housing, Communities and Local Government (MHCLG), we work on things that make a real difference to people's lives.
Whether it's through the homes we live in, the work of our local councils, or the communities we're all a part of, our work is at the top of the political agenda. We have ambitious and far-reaching outcomes to achieve this year and, if you're thinking of joining us, there's never been a more exciting time.
We have circa 3,500 staff who are based in 20 offices across the UK. The Cloud Operations function is a core component of our Technology team, which operates within the Digital Directorate. As a Cloud Infrastructure Engineer, your role will be an integral part of this team.
The Technology team is responsible for delivering a range of modern technology services to our user base. We work flexibly to deliver excellent user-centred services within Microsoft Azure, AWS, Microsoft 365 and other SaaS and desktop apps. We're now looking to build internal capability and skills so that we can take forward cloud-based services for the department.
We're looking for a Cloud Infrastructure Engineer to join our Cloud Operations team, supporting both infrastructure and cloud operations across Azure, AWS, and Zscaler platforms.
The Cloud Operations team is responsible for monitoring and maintaining cloud services, ensuring performance, reliability, and security. This includes managing systems accepted via the AIS (Accepted into Service) process, executing pre-approved and well-documented operational tasks, and overseeing user lifecycle processes (Joiners, Movers, Leavers) for the department's cloud platforms and services.
A key part of the role involves Financial Operations (FinOps): tracking cloud spend, producing monthly cost reports for senior leadership, and identifying opportunities for cost optimisation.
You'll work in a fast-paced, agile environment with a high degree of autonomy.
Job Description
As a Cloud Infrastructure Engineer, you'll:
Implement and support Infrastructure as Code (IaC) using tools such as Terraform
Manage and maintain cloud infrastructure (e.g., AWS, Azure) to ensure availability, performance, and scalability
Collaborate with Architects and Lead Engineers to maintain and improve CI/CD pipelines for infrastructure deployment
Automate routine operational tasks and identify opportunities to streamline processes
Monitor system performance and availability, and respond to incidents in a timely manner
Support incident and problem management by diagnosing and resolving infrastructure issues across cloud environments
Optimise cloud resource usage and costs, producing monthly reports and identifying cost-saving opportunities (FinOps)
Manage Joiners, Movers, and Leavers (JML) processes, including secure access provisioning aligned with role-based access controls
Maintain and update Standard Operating Procedures (SOPs) and ensure compliance with government security policies in collaboration with the Cyber team. This includes overseeing patch and vulnerability management across cloud environments and driving improvements in Azure Secure Score through proactive security configuration and remediation activities.
Contribute to service transition and operational readiness for new cloud-based services and infrastructure changes
Provide coaching and line management support to team members, fostering professional development and ensuring effective delivery
Identify and improve proactive monitoring coverage, including the use of synthetic testing and modern remediation techniques to enhance system reliability and reduce mean time to recovery