SpryPoint

www.sprypoint.com

1 Job

248 Employees

About the Company

SpryPoint builds amazing cloud-based enterprise software for electric, water, gas, and telecom utilities. We give utilities the tools they need to provide top-tier customer service while improving operational efficiency and enabling a world of cutting-edge capabilities. We make the world a better place for utility operators and customers all over North America. We build Smart Solutions for Smart Utilities.

Listed Jobs

Company Name
SpryPoint
Job Title
Cloud Operations Engineer I
Job Description
Job title: Cloud Operations Engineer I

Role Summary:
Support and maintain AWS-based infrastructure for a growing utility software platform, ensuring environments are stable, secure and performance-optimized during implementation, testing and production phases. Leverage automation, observability tools and AI-driven troubleshooting to resolve incidents and continuously improve operational processes.

Expectations:
- Rapidly learn and apply runbooks while proactively suggesting enhancements.
- Manage multiple concurrent requests calmly and efficiently, prioritizing by urgency and impact.
- Communicate clearly across internal and client-facing teams.
- Use AI tools for faster diagnostics, documentation and workflow automation.

Key Responsibilities:
- Provision, update and decommission AWS environments (Elastic Beanstalk, EC2, ECS, RDS PostgreSQL & Aurora Serverless v2, DynamoDB, Route 53, VPC, S3).
- Perform IP/domain whitelisting, access controls, database refresh coordination and general troubleshooting via Jira Service Management.
- Investigate performance and reliability issues using logs, metrics, CloudWatch, Linux-level debugging and observability platforms; escalate SQL/indexing concerns to developers.
- Support onboarding, testing, mock and production go-lives for project teams.
- Execute scheduled maintenance (patching, scaling, certificate updates, configuration adjustments).
- Tune monitoring and alerting to detect incidents early.
- Conduct incident analysis, root cause investigations and post-mortem documentation.
- Develop and maintain automation scripts with Python or Bash to streamline recurring tasks.
- Analyze NGINX, service and application logs to isolate stack issues.
- Document procedures, runbooks and environment guidelines in Confluence; record change management and time tracking in Jira.
- Enforce security best practices: IAM permissions, patching, access controls, compliance checks.
- Validate backups, perform restores and participate in disaster recovery drills.
- Use AI tools to accelerate troubleshooting and improve operational documentation.

Required Skills:
- Hands-on experience with AWS services: EC2, Elastic Beanstalk, ECS, RDS (PostgreSQL), Aurora Serverless v2, DynamoDB, Route 53, VPC, S3, IAM.
- Linux system administration and shell scripting (Bash).
- Python scripting for automation.
- Familiarity with observability/monitoring tools (e.g., CloudWatch, Datadog, New Relic).
- Logging and diagnostics (NGINX logs, application logs).
- Experience using Jira Service Management and Confluence for change tracking and documentation.
- Strong analytical skills; ability to diagnose performance bottlenecks and troubleshoot complex distributed systems.
- Knowledge of database performance tuning, SQL and indexing.
- Awareness of security and compliance best practices for cloud infrastructure.
- Proficiency in using AI-powered tools (e.g., Copilot, ChatGPT, generative assistants) for troubleshooting and automation.

Required Education & Certifications:
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or equivalent work experience (minimum 2–3 years in cloud operations or DevOps).
- Relevant AWS certification preferred (e.g., AWS Certified SysOps Administrator – Associate, AWS Certified Developer – Associate, or AWS Certified Solutions Architect – Associate).
Canada
Remote
11-02-2026