Lead Site Reliability Engineer
16 hours ago
Location: Cork, Ireland OR Prague, Czech Republic
Hybrid: 3 days per week in the office
As a Lead Site Reliability Engineer, you'll be at the forefront of building scalable, resilient, and observable systems that power Tricentis SaaS products globally. This is a hands-on engineering leadership role—balancing technical delivery, process ownership, and team mentorship.
You will drive initiatives across multiple products, shape SRE standards, and serve as a trusted partner to both engineering and product leaders. You will be responsible for elevating engineering quality and reliability while enabling scale and speed.
Your Impact
- Lead and deliver cross-cutting initiatives to improve platform scalability, resilience, and cost efficiency.
- Architect and implement cloud-native infrastructure that supports multi-region, multi-tenant deployments.
- Improve observability strategy across systems and teams, including SLOs, error budgets, and alerting standards.
- Coach and mentor engineers, guiding technical design reviews and promoting engineering excellence.
- Own post-incident analysis and ensure learning loops are completed with preventive action.
- Influence product reliability from early-stage design to production readiness reviews.
- Establish and evolve standards for deployments, operational readiness, and incident response.
- Serve as a technical advisor for engineering and product managers across the org.
As a valuable member of our SRE team, you'll have the opportunity to:
- Drive architectural discussions and make decisions that influence the SRE org and wider engineering teams.
- Define and evolve technical roadmaps and execution plans aligned with company goals.
- Partner with peers in security, infrastructure, and product to drive platform-wide improvements.
- Lead incident response for high-impact outages and continuously reduce incident recurrence.
- Contribute to SRE hiring through interviews, onboarding, and process refinement.
- Guide the adoption of modern tooling and practices across teams (e.g., GitOps, self-service platforms, chaos engineering).
- Represent SRE in leadership forums, bringing insights, trade-offs, and forward-looking strategies.
About You
- 6+ years of experience in SRE, Infrastructure, or DevOps roles, including technical leadership.
- Expertise in building and operating production systems in public cloud (Azure).
- Deep understanding of observability principles (SLOs, SLIs, metrics, traces, logs).
- Strong experience with infrastructure-as-code, container orchestration, and CI/CD (Terraform, K8s, GitHub Actions).
- Proven track record in leading technical projects, influencing architecture, and mentoring engineers.
- Excellent communication and cross-functional collaboration skills.
- Proactive, ownership-driven mindset with a passion for reliability and continuous improvement.
Our Tech Stack
Azure, AWS, Terraform, GitHub Actions, Kubernetes, Datadog, Prometheus, Grafana, Betterstack, an all-in-one incident management platform, Jira, and more
Our Culture
We don't just preach our values; we embody them in everything we do. We are committed to creating an environment that empowers, supports, and includes individuals, where trust, transparency, creativity, curiosity, and continuous improvement thrive on a daily basis.
Tricentis Core Values:
Knowing what we need to achieve and how to achieve it is important. Tricentis' core values define our ways of working and the behaviors we model that create an enjoyable and successful Tricentis life.
- Demonstrate Self-Awareness: Own your strengths and limitations.
- Finish What We Start: Do what we say we are going to do.
- Move Fast: Create momentum and efficiency.
- Run Towards Change: Challenge the status quo.
- Serve Our Customers & Communities: Create a positive experience with each interaction.
- Solve Problems Together: We win or lose as one team.
- Think Big & Believe: Set extraordinary goals and believe you can achieve them.
-
Site Reliability Engineer
3 days ago
Prague, Hlavní město Praha, Czech Republic | Blackfluo | Full time | €90,000 - €120,000 per year
Job Description
Location: Full remote, EU timezone (CET +/- 2 hours)
Start Date: As soon as possible
Languages: English required
We are looking for a skilled Site Reliability Engineer (SRE) with deep expertise in AWS to help us scale and secure our infrastructure. As an SRE, you will be instrumental in ensuring the reliability, performance, and scalability of...
-
Platform & Site Reliability Engineer
3 days ago
Prague, Hlavní město Praha, Czech Republic | PriceHubble | Full time | 120,000 - 180,000 per year
About PriceHubble
PriceHubble is on a mission to transform how real estate and financial professionals make decisions. We're a fast-growing European B2B SaaS company that leverages the power of AI and big data to bring next-level transparency and insight to the property market. Our digital solutions empower clients across the real estate value chain – from...
-
Site Reliability Engineer
16 hours ago
Prague, Hlavní město Praha, Czech Republic | Cato Networks | Full time | 80,000 - 120,000 per year
Now we're looking for a visionary Site Reliability Engineer to join the R&D team. In this critical role, you will support our growing operation, network, and systems. You will play a pivotal role in administering our internal systems as well as participate in key design decisions. In this position, you can innovate, build best practice processes, and...
-
Site Reliability Engineer
2 weeks ago
Prague, Hlavní město Praha, Czech Republic | Nord Security | Full time | 60,000 - 120,000 per year
The world's most advanced VPN, and a whole lot more. If you're a curious problem-solver who carves their own path, join the team behind Threat Protection Pro, the NordLynx protocol, and the fastest VPN on the planet: tools that put privacy, security, and control back in people's hands. Your impact? Helping millions take back control of their online...
-
Site Reliability Engineer
2 weeks ago
Prague, Hlavní město Praha, Czech Republic | Nord Security | Full time | 90,000 - 120,000 per year
The world's most advanced VPN, and a whole lot more. If you're a curious problem-solver who carves their own path, join the team behind Threat Protection Pro, the NordLynx protocol, and the fastest VPN on the planet: tools that put privacy, security, and control back in people's hands. Your impact? Helping millions take back control of their online...
-
Site Reliability Engineer
16 hours ago
Prague, Hlavní město Praha, Czech Republic | Canonical - Jobs | Full time | 900,000 - 1,200,000 per year
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers, and...
-
Site Reliability Engineer
16 hours ago
Prague, Hlavní město Praha, Czech Republic | Nord Security | Full time | 120,000 - 240,000 per year
The world's most advanced VPN, and a whole lot more. If you're a curious problem-solver who carves their own path, join the team behind Threat Protection Pro, the NordLynx protocol, and the fastest VPN on the planet: tools that put privacy, security, and control back in people's hands. Your impact? Helping millions take back control of their online...
-
Site Reliability Engineer
4 days ago
Prague, Hlavní město Praha, Czech Republic | Thales | Full time | 60,000 - 120,000 per year
Location: Praha, Czechia
Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter and much more....
-
Senior Site Reliability Engineer
16 hours ago
Prague, Hlavní město Praha, Czech Republic | Canonical - Jobs | Full time | 80,000 - 150,000 per year
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers,...
-
Staff Site Reliability Engineer
16 hours ago
Prague, Hlavní město Praha, Czech Republic | Outreach | Full time | 120,000 - 240,000 per year
About Outreach
Outreach, founded in 2014, is the only complete AI Revenue Workflow Platform that helps sales leaders benefit from connected account visibility, performance insights, and higher forecasting accuracy across every GTM team. Outreach infuses agentic AI to power 100s of use cases across sales motions. From new logo prospecting to renewal and...