SpryPoint

www.sprypoint.com

1 Job

248 Employees

About the Company

SpryPoint builds amazing cloud-based enterprise software for electric, water, gas, and telecom utilities. We give utilities the tools they need to provide top-tier customer service while improving operational efficiency and enabling a world of cutting-edge capabilities. We make the world a better place for utility operators and customers all over North America. We build Smart Solutions for Smart Utilities.

Listed Jobs

Company Name: SpryPoint
Job Title: Cloud Operations Engineer I
Job Description:

Role Summary: Support and maintain AWS-based infrastructure for a growing utility software platform, ensuring environments are stable, secure and performance-optimized during implementation, testing and production phases. Leverage automation, observability tools and AI-driven troubleshooting to resolve incidents and continuously improve operational processes.

Expectations:
- Rapidly learn and apply runbooks while proactively suggesting enhancements.
- Manage multiple concurrent requests calmly and efficiently, prioritizing urgency and impact.
- Communicate clearly across internal and client-facing teams.
- Use AI tools for faster diagnostics, documentation and workflow automation.

Key Responsibilities:
- Provision, update and decommission AWS environments (Elastic Beanstalk, EC2, ECS, RDS PostgreSQL & Aurora Serverless v2, DynamoDB, Route 53, VPC, S3).
- Perform IP/domain whitelisting, access controls, database refresh coordination and general troubleshooting via Jira Service Management.
- Investigate performance and reliability issues using logs, metrics, CloudWatch, Linux-level debugging and observability platforms; escalate SQL/indexing concerns to developers.
- Support onboarding, testing, mock and production go-lives for project teams.
- Execute scheduled maintenance (patching, scaling, certificate updates, configuration adjustments).
- Tune monitoring and alerting to detect incidents early.
- Conduct incident analysis, root cause investigations and post-mortem documentation.
- Develop and maintain automation scripts with Python or Bash to streamline recurring tasks.
- Analyze NGINX, service and application logs to isolate stack issues.
- Document procedures, runbooks and environment guidelines in Confluence; record change management and time tracking in Jira.
- Enforce security best practices: IAM permissions, patching, access controls, compliance checks.
- Validate backups, perform restores and participate in disaster recovery drills.
- Use AI tools to accelerate troubleshooting and improve operational documentation.

Required Skills:
- Hands-on experience with AWS services: EC2, Elastic Beanstalk, ECS, RDS (PostgreSQL), Aurora Serverless v2, DynamoDB, Route 53, VPC, S3, IAM.
- Linux system administration and shell scripting (Bash).
- Python scripting for automation.
- Familiarity with observability/monitoring tools (e.g., CloudWatch, Datadog, New Relic).
- Logging and diagnostics (NGINX logs, application logs).
- Experience using Jira Service Management and Confluence for change tracking and documentation.
- Strong analytical skills; ability to diagnose performance bottlenecks and troubleshoot complex distributed systems.
- Knowledge of database performance tuning, SQL and indexing.
- Awareness of security and compliance best practices for cloud infrastructure.
- Proficiency in using AI-powered tools (e.g., Copilot, ChatGPT, generative assistants) for troubleshooting and automation.

Required Education & Certifications:
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or equivalent work experience (minimum 2–3 years in cloud operations or DevOps).
- Relevant AWS certification preferred (e.g., AWS Certified SysOps Administrator – Associate, AWS Certified Developer – Associate, or AWS Certified Solutions Architect – Associate).

Canada

Remote

11-02-2026