Join our Journey

At Cato Networks, we are a team of veteran technology and security experts looking to change the world. We believe that while good engineers can create simple solutions for complex problems, great engineers can make complex problems simple.

Engineering - AI Security

AI Security - AI Platform Engineer

Location: Tel Aviv District, Israel

Welcome to the future of cloud networking and security!  

Cato Networks is the first company to converge enterprise networking and security into one centralized, global service delivered from the cloud. It is led by Shlomo Kramer, a networking and security pioneer (Check Point, Imperva) and early investor (Palo Alto Networks, Exabeam, Trusteer and more). Cato’s unique technology inspired a brand-new product category, later named “SASE” by Gartner, in a market expected to reach $28.5 billion by 2028.

This is your opportunity to get on the rocket ship and join a company that is building a cutting-edge enterprise network and secure cloud platform, and is on a fast track to becoming the worldwide market leader – don’t miss it!

 

Cato is building a real-time AI runtime platform for security algorithms running inline across our global cloud and physical PoPs.
We are looking for an AI Platform Engineer to help build the infrastructure that powers high-throughput, low-latency AI security decisions in production.
You will work on a runtime engine that combines GPU-based models, from MMBERT-style models to LLMs, with CPU-based heuristics and security logic, optimized for scale, performance, reliability, and real-time execution. This is a versatile engineering role that spans AI runtime infrastructure, high-performance backend development, GPU inference, model lifecycle, and close collaboration with research teams to bring AI security algorithms into production.


Responsibilities
  • Build Cato’s AI security runtime platform for high-throughput, low-latency production serving.
  • Develop infrastructure for model serving, multi-model orchestration, and inline decision flows.
  • Optimize inference performance: batching, caching, streaming, GPU utilization, memory usage, and runtime acceleration.
  • Build backend orchestration and performance-critical services in Go.
  • Support the model lifecycle: registry integration, packaging, versioning, deployment, monitoring, and operational health.
  • Work closely with research and algorithm teams to productionize AI security models and algorithms at scale.


Requirements
  • 3+ years of hands-on experience in AI inference, production ML infrastructure, model serving, or MLOps.
  • Experience with production inference technologies such as Triton, vLLM, CUDA, Kubernetes, Docker, PyTorch, ONNX, TensorRT, or similar.
  • Strong understanding of low-latency, high-throughput production systems.
  • Experience with model lifecycle concepts: model registry, versioning, deployment, rollout, rollback, monitoring, and observability.
  • 3+ years of experience with Go, or strong experience with a similar high-performance backend language such as C++, Rust, or Java.