Job Specifications
About the Opportunity:
We’re seeking an experienced, highly collaborative SRE to partner with product teams and tackle our most critical infrastructure challenges. You’ll be hands-on in designing, building, and operating our cloud platform—and driving the reliability, performance, and security that empower our engineering organization.
Responsibilities:
Infrastructure as Code & CI/CD: Automate provisioning and deployments with Terraform and integrate best-practice pipelines (GitHub Actions, ArgoCD, etc.).
Reliability Engineering: Define SLIs/SLOs, manage error budgets, and build dashboards & alerts to proactively measure and improve system health.
Security & Compliance: Enforce least-privilege IAM policies, automate vulnerability scans, and maintain audit logging for compliance.
Monitoring & Observability: Instrument services with metrics, logs, and distributed tracing to enable rapid troubleshooting, and help teams with alerting, custom metrics, and dashboards.
Incident Management: Own on-call rotations, lead real-time incident response, conduct post-mortems, and drive continuous improvements.
Cost Optimization: Implement tagging strategies, right-size resources, and leverage concrete data to decide on optimal methods to control cloud spend at scale.
Documentation & Mentorship: Author runbooks, standards, and best-practice guides—and coach dev teams on implementing modern DevOps, reliability, and security patterns.
Required Qualifications:
5+ years of experience running production-critical systems.
Deep proficiency with the AWS Cloud and Cloud-Native best practices.
Experience with Kubernetes (EKS, GKE) and Container Orchestration at scale.
Skilled in Terraform to declaratively provision and maintain infrastructure services.
Working knowledge of managing and debugging databases like Redis and Postgres.
Strong familiarity with VPC, VPN, Load Balancing, and cloud networking components.
Proficiency with Git workflows, branching strategies, and CI/CD system integrations.
Solid understanding of web and network protocols and standards (HTTP, REST, TLS, DNS, etc.).
Professional proficiency in English (both written and spoken) is required for this role.
Nice to Have Skills:
Bachelor's degree or equivalent in Computer Science, Engineering, or a related field.
Experience with ArgoCD, GitHub Actions, Jenkins, or other CI/CD pipeline solutions.
Working knowledge of Python, Go, and Helm templating.
Node.js experience is a plus, including running scalable, resilient Node.js microservices.
Grasp of foundational security best practices for cloud infrastructure.
Familiarity with Terragrunt, Terraform state management, and sound project structure.
Experience applying production-readiness fundamentals on a fast-moving team.
Avenue Code reinforces its commitment to privacy and to all the principles guaranteed by the most accurate global data protection laws, such as GDPR, LGPD, CCPA, and CPRA. Candidate data shared with Avenue Code will be kept confidential and will not be transmitted to unrelated third parties, nor will it be used for purposes other than the application for open positions. As a consultancy company, Avenue Code may share your information with its clients and with other companies from the CompassUol Group to which Avenue Code’s consultants are allocated to perform their services.
About the Company
What happens when you put together a group of talented engineers, decades of combined experience working in digital transformation, and the determination to deliver an extraordinary experience for clients, partners, and employees? You create the leading software consultancy focused on delivering end-to-end digital transformation solutions for enterprise organizations of all kinds.
Avenue Code is more than just a software development company. When it comes to our practice areas, we are experts in web solutions, infrastructur...