MLOps Engineer @ RemoDevs

4 days ago


Warsaw, Poland RemoDevs Full time

Overview

We are a leader in AI-powered business operations. Our goal is to make companies work better and faster by using smart technology. We help improve efficiency, simplify workflows, and create new growth opportunities, especially in private capital markets.

Our ecosystem has three main parts:

  • PaaS (Platform as a Service): Our core AI platform that improves workflows, finds insights, and supports value creation across portfolios.
  • SaaS (Software as a Service): A cloud platform that delivers strong performance, intelligence, and execution at scale.
  • S&C (Solutions and Consulting Suite): Modular technology playbooks that help companies manage, grow, and improve performance.

With more than 10 years of experience supporting fast-growing companies and private equity-backed platforms, we know how to turn technology into a real business advantage.

About the Role

We are looking for an MLOps / AIOps Engineer to own the deployment, operation, and monitoring of AI services in production. This role combines infrastructure engineering with AI systems. You will make sure our AI-powered APIs, RAG pipelines, MCPs, and agent services run safely, reliably, and at scale. You will work closely with ML Engineers, Python Developers, and AI Architects to design robust infrastructure and workflows for distributed AI applications.

Key Responsibilities

  • Create and maintain infrastructure-as-code for AI services (Terraform, Pulumi, AWS CDK).
  • Build and run CI/CD pipelines for AI APIs, RAG pipelines, MCP services, and LLM agent workflows.
  • Set up monitoring and alerting for AI systems and LLM observability.
  • Track metrics such as latency, error rates, model drift, and hallucination rates.
  • Optimize inference workloads and manage distributed AI serving tools (Ray Serve, BentoML, vLLM, Hugging Face TGI).
  • Work with ML Engineers and Python Developers to define safe, scalable, and automated deployment processes.
  • Follow standards for AI system security, data governance, and compliance.
  • Keep up to date with new AIOps and LLM observability tools and best practices.

Required Skills & Experience

  • Good knowledge of cloud infrastructure (AWS, Azure, or GCP) and container orchestration (Docker, Kubernetes, ECS/EKS).
  • Hands-on experience running AI/ML services in production.
  • Experience with CI/CD pipelines for AI, LLM workflows, and model deployments.
  • Knowledge of distributed AI serving frameworks and inference optimization.
  • Understanding of monitoring, observability, and incident response for AI.
  • Experience setting up AI system health metrics, dashboards, and alerts.
  • Awareness of AI security, data protection, and compliance needs.
  • Interest in learning and using new AIOps and AI observability tools.

Why Join Us?

We value creative problem solvers who learn quickly, enjoy teamwork, and always aim higher. We work hard, but we also enjoy what we do and create a fun environment together.

Requirements: Cloud, Container, AI, MLOps, CI/CD
