Staff Site Reliability Engineer

3 days ago

Remote, Palo Alto, Czech Republic Oscilar Full time

At least 6 years of hands-on experience managing critical, high-availability production infrastructure, with demonstrated success in maintaining reliability and maximizing application uptime.
Proven experience in DevOps or SRE roles with a strong background in cloud services, preferably AWS.
Proficiency in IaC tools, such as Pulumi.
Strong programming skills in Go and Java.
Experience with system scalability and building high availability systems.
Knowledge of microservices architecture, containerization technologies (Docker, Kubernetes), and distributed data systems (Kafka, ClickHouse).
Excellent problem-solving abilities and a keen attention to detail.
Familiarity with security compliance frameworks (e.g., OWASP, ISO, CSA, PCI).
Ability to work collaboratively in a distributed team environment.
Extreme ownership mindset, along with a track record of championing system reliability, continuous improvement, and operational excellence throughout an organization.

Oscilar is growing fast, and so is the complexity of our systems. We're looking for an experienced SRE to take ownership of reliability across our multi-region, cloud-native platform. You'll have the mandate and autonomy to design, implement, and evolve systems that stay performant and resilient through traffic spikes, dependency failures, and global deployments. You'll be shaping how we scale, how we build observability, and how we run infrastructure that supports billions of events and large-scale data pipelines.

Design, implement, and maintain a reliable cloud infrastructure using an infrastructure as code (IaC) approach.
Ensure robust performance of the platform under diverse operational conditions, including traffic spikes and external dependency slowdowns.
Continuously optimize and manage the CI/CD pipeline for efficient deployment processes.
Develop and maintain essential alerts and metrics to ensure high observability and reliability of all components.
Collaborate effectively with international colleagues in the US and Europe to align strategies and practices.
Cost monitoring and alerting.

Requirements: Golang, AWS, Docker, DevOps, Kubernetes, Microservices architecture, IaC, Pulumi, Java, GCP, Kafka

Additionally: Flat structure, Small teams, International projects, Startup atmosphere, No dress code.

Staff Site Reliability Engineer

1 day ago

Remote, Palo Alto, Czech Republic Oscilar Full time

At least 8 years of hands-on experience managing critical, high-availability production infrastructure, demonstrating success in maintaining reliability and maximizing application uptime. Proven experience in DevOps or SRE roles with a strong background in cloud services, preferably AWS. Proficiency in IaC tools, such as Pulumi. Strong programming skills...
Site Reliability Engineer Senior @ Akamai

3 days ago

Remote, Czech Republic Akamai Full time

Have in-depth understanding of computer networking concepts, security concepts, Unix/Linux internals, distributed systems, and systems design. Have professional experience in a Site Reliability, Development, or Systems Engineering role, with large scale distributed systems. Demonstrate experience with programming or scripting languages such as Python or...
Sr./Staff Backend Engineer

3 days ago

Remote, Palo Alto, Czech Republic Oscilar Full time

Requirements: Backend Development: 8+ years of experience with Java in large-scale, distributed environments. Kafka Mastery: Extensive experience with Apache Kafka, including Kafka Streams, Kafka Connect, partitioning, replication, and consumer group management. Cloud Infrastructure: Strong experience with AWS services (e.g., MSK, EC2, RDS, DynamoDB, S3,...
Site Reliability Engineer Senior @ Akamai

1 day ago

Remote, Czech Republic Akamai Full time

Have in-depth understanding of computer networking concepts, security concepts, Unix/Linux internals, distributed systems, and systems design. Have professional experience in a Site Reliability, Development, or Systems Engineering role, with large scale distributed systems. Demonstrate experience with programming or scripting languages such as Python or...
Senior Site Reliability Engineering Lead @ Akamai

7 days ago

Remote, Czech Republic Akamai Full time

Have 5 years of relevant experience and a Bachelor's Degree in Computer Science or its equivalent. Possess expert level experience in a DevOps, Development, or SysAdmin role working with large scale distributed systems. Have experience with building tools for automation and infrastructure at scale (Python/Go, Terraform, SaltStack, Jenkins). Be able to work in...
Senior Site Reliability Engineer @ Akamai

3 days ago

Remote, Czech Republic Akamai Full time

Have relevant experience and a Bachelor's diploma in Computer Science or its equivalent. Possess expert level experience in a SysAdmin (Linux/Unix Administration), DevOps or Software engineering role, working with large scale distributed systems. Possess at least one programming language (Python/Golang) and configuration management with...
Sr./Staff Backend Engineer

1 day ago

Remote, Palo Alto, Czech Republic Oscilar Full time

Requirements: Backend Development: 8+ years of experience with Java in large-scale, distributed environments. Kafka Mastery: Extensive experience with Apache Kafka, including Kafka Streams, Kafka Connect, partitioning, replication, and consumer group management. Cloud Infrastructure: Strong experience with AWS services (e.g., MSK, EC2, RDS, DynamoDB, S3,...
Senior Site Reliability Engineering Lead @ Akamai

1 day ago

Remote, Czech Republic Akamai Full time

Have 5 years of relevant experience and a Bachelor's Degree in Computer Science or its equivalent. Possess expert level experience in a DevOps, Development, or SysAdmin role working with large scale distributed systems. Have experience with building tools for automation and infrastructure at scale (Python/Go, Terraform, SaltStack, Jenkins). Be able to work in...
Senior Site Reliability Engineer @ Akamai

1 day ago

Remote, Czech Republic Akamai Full time

Have relevant experience and a Bachelor's diploma in Computer Science or its equivalent. Possess expert level experience in a SysAdmin (Linux/Unix Administration), DevOps or Software engineering role, working with large scale distributed systems. Possess at least one programming language (Python/Golang) and configuration management with...
