infrastructure
Posted Feb 23
Senior Platform Engineer (f/m/d)
at Moss
Berlin, Germany; Remote
Responsibilities
- Own the reliability and scalability of 100+ microservices - including defining and enforcing SLOs, managing autoscaling strategies, and driving resilience patterns like circuit breakers, bulkheads, and graceful degradation.
- Lead safe, continuous deployment practices across a fully automated CD pipeline - including rollout strategies, rollback mechanisms, and deployment observability at scale.
- Drive observability across the platform - metrics, distributed tracing, and structured logging - with a focus on reducing MTTR and enabling engineers to self-serve incident diagnosis.
- Own incident response across networking, load balancing, Kubernetes, and cloud services - and drive post-incident improvements that prevent recurrence.
- Design, build, and operate cloud-native infrastructure (GKE, Kubernetes, networking, databases) supporting a high-availability, low-latency FinTech platform processing real-time payments across Europe.
Requirements
- At least 4 years of experience in platform, infrastructure, or SRE roles in a cloud-native environment.
- Deep Kubernetes expertise - scheduling internals, autoscaling (HPA/VPA/KEDA), pod lifecycle, network policies, PodDisruptionBudgets, and multi-zone topology. Not just operational familiarity: you understand what breaks and why.
- Strong grasp of microservices operational challenges at scale - service mesh, inter-service resilience patterns, connection pool management, graceful shutdown, and database migration safety in a continuous deployment model.
- Solid CI/CD experience - designing pipelines for 100+ services, immutable artefact management, Workload Identity Federation, and automated rollback. GitHub Actions experience.
- Experience building platforms covering metrics, logs, and distributed traces, including across async boundaries (e.g. Kafka).
- Proficiency in infrastructure-as-code - Terraform and Helm as primary tools, with a strong IaC-first mindset.
- Programming proficiency in Golang and/or shell scripting for platform tooling; familiarity with Java/SpringBoot operational characteristics is a plus.
- Experience with dynamic secrets management via HashiCorp Vault, including database credential rotation.
- Familiarity with GCP-specific primitives - Workload Identity, GKE Autopilot vs. Standard tradeoffs, Cloud Armor, VPC-native networking.
- Experience with KEDA or scheduled scaling strategies for predictable traffic spikes.
- Cloud cost optimisation - spot/preemptible node strategies, resource right-sizing, log volume and cardinality management.
- Prior experience in a regulated FinTech or financial services environment.
- Our ambition is bold: to power every SMB’s spend across Europe - fully digital, AI-driven, and seamlessly integrated for complete control.
Experience
- 7+ years of total experience
Benefits
- Moss has raised a total of €180 million in funding and is backed by the most renowned tech investors including Valar Ventures, Tiger Global, Global Founders Capital, Cherry Ventures and A-Star.
- We’re a place where you can fast-track your career. Here's what else to expect:
- Top-of-market compensation package, including equity.
- Our vibrant offices are at the heart of our culture, where in-person time fuels collaboration and connection over weekly breakfasts and Friday demos.
- Additional benefits include: 20 days “work from abroad”, a 600 EUR/GBP Learning & Development budget, and other local benefits. Unless stated otherwise,
Contact
- Recognised by Sifted’s Rising 100 (https://sifted.eu/rankings/b2b-saas-rising-100-2024) and LinkedIn's Top Startups (https://www.linkedin.com/pulse/linkedin-top-startups-2024-20-aufstrebende-unternehmen-bjd0c/), we’re here to help propel your career and, together, make Moss a lasting success.
- To date, over 5000 businesses in Germany, the Netherlands and the UK use Moss’ leading spend management product, with modules such as corporate cards (https://www.getmoss.com/corporate-credit-card), accounts payable (https://www.getmoss.com/accounts-payable), employee cash reimbursements (https://www.getmoss.com/reimbursements) and procurement (https://www.getmoss.com/procurement).
Additional details
- At Moss, we give finance professionals the power to automate their day-to-day and make forward-thinking decisions.
- Our team and culture make us unique — we’re driven by impact and growth, where every one of us strives to learn and excel.
- You will work on critical systems that must be updated without downtime, ensuring our services remain secure, scalable, and resilient.
- You’ll collaborate closely with product, data, and security teams, balancing planned initiatives with incident response, cloud engineering, and regular maintenance.
- Manage and evolve infrastructure-as-code (Terraform, Helm) with a no-ClickOps discipline - every change peer-reviewed, version-controlled, and auditable.
- Champion security and compliance practices including Zero Trust architecture, Workload Identity, dynamic secrets via Vault, network policies, and audit readiness (ISO27001, SOC2).
- Raise the engineering bar - actively contribute to architectural decisions, review platform changes, and help grow the early-senior engineers on the team.
- Able to connect instrumentation to incident workflow, not just tooling setup.
- Proven troubleshooting skills across distributed systems - latency contagion, cascading failures, connection exhaustion, and autoscaling lag under traffic spikes.
- Collaborative, low-ego working style - comfortable in a small, high-trust team where engineers raise PRs instead of tickets.