How a Leading Financial Institution Uses Robusta to Empower Devs and SREs for Independent Troubleshooting in a Complex Cloud Environment

Company Overview

This financial institution is a multinational banking and financial services company with a presence across multiple countries. Its technical infrastructure is modern and complex, consisting of Kubernetes, Backstage, Artifactory, and a mix of cloud and on-premises platforms such as EKS and Rancher/RKE. It also uses several observability tools, including New Relic and AppDynamics, and is exploring Prometheus and Thanos as potential solutions for alert monitoring.

Main Challenges

High Volume of Support Requests: With over 50 development teams, many of which lack deep Kubernetes expertise, support requests flood the platform team. This is unsustainable and results in delays for resolving critical issues.

Large Infrastructure Scale: With over 100 clusters, 1,000 nodes, and 100,000 pods, the sheer volume of alerts and incidents makes it hard to identify root causes quickly, which slows down the incident resolution process.

Information Overload: Developers must sift through data from multiple complex tools to diagnose issues. This creates a steep learning curve and significantly reduces productivity.

Developer Struggles with Troubleshooting: Developers often face challenges diagnosing application issues, leading to delays and over-reliance on the platform team.

High Cognitive Load: Deploying to Kubernetes requires deep expertise across several technologies, placing additional stress on developers who are not Kubernetes experts.

Platform Team Overburdened: Platform engineers spend most of their time supporting developers rather than innovating, as they are constantly dealing with issues that should ideally be solved by developers.

Cloud Cost Management: Developers frequently request more cloud resources, which results in overprovisioning and higher cloud costs.

Robusta has provided a solution that empowers the platform engineering team to share their Kubernetes expertise across development teams and SREs, enabling them to resolve issues independently.

The Impact

  • Simplified Troubleshooting: Robusta simplifies the process of identifying and resolving issues, even for engineers without deep Kubernetes or observability knowledge.
  • Increased Self-Service: Developers and SRE teams can now troubleshoot their applications autonomously, reducing their reliance on the platform team and shortening time-to-resolution.
  • AI-powered Assistance: Robusta’s AI copilot helps development teams ask questions and get immediate answers regarding their Kubernetes applications, without needing platform engineer intervention.
  • Decreased Support Tickets: The platform team has seen a significant drop in support tickets, allowing them to focus on resolving higher-priority issues.
  • Auto-Remediation: Robusta uses auto-remediation and AI-based solutions to resolve issues automatically, leading to improved platform reliability.
  • Cloud Cost Savings: Robusta helps identify when developers are over-allocating resources to applications, recommending optimizations to reduce cloud spend.

The Solution

“Robusta has been on our radar for a while. Its support capabilities have helped our teams solve problems independently. Robusta not only makes problems visible but also helps identify root-cause and resolutions automatically. The results so far are highly promising.” - Head of Platform Engineering

Robusta offers exactly what they need

  • Reduced Platform Team Burden: There is a significant decrease in the number of tickets received, allowing platform engineers to focus on more critical tasks.
  • Increased Self-Service for Developers & SREs: Teams are now able to resolve their own issues more effectively, improving productivity and collaboration.
  • Faster Detection and Resolution (MTTD/MTTR): Incidents are now detected and resolved faster, thanks to Robusta’s AI-driven troubleshooting capabilities.
  • Higher Uptime: Increased reliability due to faster incident resolution and proactive problem-solving.
  • Cost Savings: Reduced cloud costs through better resource allocation recommendations.
  • Improved Engineer Satisfaction: Developers, SREs, and platform engineers report higher satisfaction as they experience reduced cognitive load and a more supportive work environment.

The Outcome

By implementing Robusta, the financial institution significantly reduced the support burden on their platform engineering team, enabling faster and more efficient incident resolution. Developers and SREs gained the ability to troubleshoot Kubernetes issues independently, reducing reliance on the platform team.

With Robusta's AI-powered insights, the organization achieved increased uptime, lower cloud costs, and enhanced developer satisfaction. The overall platform reliability improved, freeing platform engineers to focus on innovation rather than constant troubleshooting.

Functions and Responsibilities

Platform Team: 20+ engineers

Developers: 3,000+ developers across 50+ teams

SREs: Responsible for maintaining system reliability and operational excellence

Trusted By Platform Engineering and DevOps Teams Around The Globe

"I really like the stream of information you get simply by installing Robusta. As an operator, it is a no brainer to add it to my clusters. Gives really good insights without a lot of effort."

Matthias Nguyen, Managing Director, Unbasical GmbH

"It's the easiest monitoring solution there is for k8s, an excellent, feature rich product, with a team of people behind it you could have a beer with."

Andrew Riddell, IT Systems Manager, UGL

“I start mornings by checking production in Robusta. I love how Robusta is opinionated, highlights problems and significant events. After viewing details, I know enough to resolve issues.”

Keir Robinson, Engineering Manager, Navenio

"By adding Robusta to kube-prometheus-stack and enabling alert grouping, we reduced the number of Slack messages by 90% without missing a single important notification."

Yoni Golob, DevOps Engineer, Placer.ai

“I use Robusta for governance of my Kubernetes infrastructure. A major strength is the Prometheus integration (kube-prometheus-stack).”

Roberto Iannone, DevOps Engineer, RiAtlas

“We manage kubernetes clusters for multiple clients. With Robusta, it's far easier to compare deployments across our clusters, and notice discrepancies in deployed versions.”

Asbjørn Dyhrberg Thegler, DevOps Consultant, Deranged

“One of the most satisfying features of Robusta is consolidating monitoring data from dozens of clusters across multiple regions into a unified interface.”

Silviu Iaşcu, Director Infra Operations & Cloud, Jedox

“We adopted Robusta for one of our clients in order to have enriched alerts coming from both in-cluster Kubernetes events and an out-of-cluster Alert Manager installation.”

Diego Ojeda, DevOps Consultant, BinBash

“With Robusta, I don’t need to check my cluster’s health every day. If something needs my attention, I get a message in Teams. I can escalate critical issues immediately.”

Oleg Minaev, Lead Backend Developer, Aureliym GmbH

“We're using Robusta to standardize k8s alerting. Previously, we were using kube-prometheus-stack but the default alerts were too noisy and it was harder to configure.”

James Wu, Space Telescope Science Institute

“I told my devops team to evaluate all the observability tools they want, and to choose the best one for Kubernetes. They chose Robusta.”

Yonatan Itai, VP R&D, Cyera
