DAT Freight & Analytics

Site Reliability Engineer

Hybrid

Denver, United States

Fresher

Full Time

29-10-2025


Skills

Python, Java, JavaScript, Go, Kotlin, Incident Response, GitHub, Kubernetes, Monitoring, Problem-solving, Organization, AWS, cloud platforms, C++, Terraform, GitHub Actions

Job Specifications

About DAT

DAT is an award-winning employer of choice and a next-generation SaaS technology company that has been at the leading edge of innovation in transportation supply chain logistics for 45 years. We continue to transform the industry year over year, by deploying a suite of software solutions to millions of customers every day - customers who depend on DAT for the most relevant data and most accurate insights to help them make smarter business decisions and run their companies more profitably. We operate the largest marketplace of its kind in North America, with 400 million freights posted in 2022, and a database of $150 billion of annual global shipment market transaction data. Our headquarters are in Denver, CO, and Beaverton, OR, with additional offices in Seattle, WA; Springfield, MO; and Bangalore, India. For additional information, see www.DAT.com/company

Job Application Deadline: 11/30/2025

The Opportunity

DAT is looking for a Site Reliability Engineer to join our SRE platform team. This position will work hybrid or remote in Seattle, WA.

Candidate profile

DAT is seeking an experienced Site Reliability Engineer to help grow our SRE practices. In this role, you will be responsible for contributing to technical initiatives and enhancing your skills. You’ll work closely with development teams and platform architects to achieve critical reliability goals and help scale our platform.

DAT is actively seeking a highly skilled and experienced Site Reliability Engineer (SRE) to play a pivotal role in the expansion and maturation of our SRE practices. In this critical position, the successful candidate will be instrumental in driving key technical initiatives, fostering a culture of continuous improvement, and significantly enhancing their own professional expertise.

This role requires close collaboration with various stakeholders, including our dedicated development teams and platform architects. The primary objective of these partnerships is to collectively achieve ambitious reliability goals and strategically scale our platform to keep pace with the company's growth. The SRE will be responsible for ensuring the stability, performance, and scalability of our systems, implementing robust monitoring solutions, automating operational tasks, and proactively identifying and resolving potential issues. This will involve a deep understanding of distributed systems, cloud infrastructure, and a commitment to best practices in site reliability engineering.

What You’ll Do

Contribute to the design, implementation, and maintenance of scalable and reliable systems. Collaborate with engineering teams to ensure reliability targets are met.
Identify and troubleshoot complex issues across distributed systems, ensuring minimal downtime and optimal performance.
Advocate for and implement SRE best practices, including automation, monitoring, and incident response, to enhance system resilience.
Participate in capacity planning and performance tuning to proactively address potential bottlenecks and support future growth.
Leverage new AI tools to assist with coding and observability tasks.
Assist with and respond to critical engineering incidents.
Improve your engineering skills within the SRE team.
Provide technical guidance and best practices for use of cloud infrastructure and tooling. Contribute to Infrastructure-as-Code within the platform. We strive to automate all the things!
Contribute to reliability-focused initiatives and projects.
Help optimize our work to be customer-focused. Continually seek feedback from our customers on how we can improve.
Assist in migrating legacy systems to modern, scalable cloud environments.
Help develop and drive a culture of continuous improvement with the Platform Engineering and Software Engineering groups.
Participate in an on-call rotation.

The Skills And Experience You’ll Bring

Strong collaboration and problem-solving abilities, especially within SRE or Platform Engineering/Infrastructure teams.
A total of 2 to 4+ years of industry experience.
At least 1 year of software engineering experience (JavaScript, Python, Go, Java/Kotlin, C++, etc.).
Experience with modern observability tools (Datadog preferred).
Experience with cloud platforms (preferably AWS).
Demonstrated success in contributing to large technical initiatives and acting as a driving force to complete those initiatives.
Proven experience assisting in modernizing legacy code and infrastructure.
Ability to work closely with peer teams, platform/software architects and management to drive key reliability improvements.
Willingness to share your expertise among team members and others within the engineering organization. We value upleveling our peers however we can.
Understanding of cloud infrastructure, automation, and best practices for reliability.
Experience with our tools (Kubernetes, ArgoCD, Terraform, GitHub Actions) is a plus.

Why DAT?

DAT is an award-winning employer of choice.

For starters, we have a hybrid work envi

About the Company

DAT Freight & Analytics operates DAT One, North America's largest truckload freight marketplace; DAT iQ, the industry's leading freight data analytics service; and Trucker Tools, the leader in load visibility. Shippers, transportation brokers, carriers, news organizations, and industry analysts rely on DAT for market trends and data insights, informed by nearly 700,000 daily load posts and a database exceeding $1 trillion in freight market transactions. Founded in 1978, DAT is a business unit of Roper Technologies (Nasdaq: R...