infrastructure
Posted 1 week ago
Site Reliability Engineer III
at Candescent
India, Remote
Responsibilities
- Learn, deploy and document newer technologies for the potential deployment of services following a development and release life cycle.
- Support production escalations as needed.
- Identify areas for improvement or gaps in our systems, whether related to scaling, reliability, automation or cost optimization.
- Partner with cloud architects to build, test and revise proposed architectures and solutions.
- Assist in building various tools/automation to streamline existing processes.
- Work with Development, Security and Business Unit teams to deliver a world-class cloud platform.
- Build automation scripts and frameworks to improve operational processes and procedures.
Requirements
- Required skills and experience: building and supporting production-level Kubernetes clusters; optimizing containerized workloads.
- Experience with cloud networking: configuring VPCs, firewalls, ingress/egress, and CDN.
- Experience with AWS services, including hands-on work with EKS, RDS, Lambda, CloudWatch, and storage solutions (EFS, S3, EBS, etc.).
- Experience with Terraform/Terragrunt.
- Experience with cloud migrations.
- BS in Computer Science or related field, or equivalent experience.
- Must have high initiative and be a clear communicator.
- Experience with Prometheus/Dynatrace or other monitoring or logging tools.
- Strong knowledge of and experience with application and infrastructure delivery automation, orchestration and configuration management.
- Experience operating within cloud environments.
- Continued establishment of best-in-class DevOps development, automation and deployment practices, policies and standards.
Desired Skill Set
- Container build/management and Kubernetes
- Amazon Web Services
- Cloud migrations (Google/AWS)
- IaC: Terraform
- Scripting: Python
- CI/CD: GitHub
- Version control: Git, GitOps
Statement to Third Party Agencies
- To ALL recruitment agencies: Candescent only accepts resumes from agencies on the preferred supplier list.
Additional details
- Candescent is a forward-thinking technology company transforming how financial institutions deliver Intelligent Banking experiences.
- We unite digital banking, account opening, and branch solutions, creating seamless engagement across digital, remote, and in-person channels.
- Our Experience-Led, Intelligence-Driven approach combines human-centered design with data, automation, and cloud-based innovation.
- Built on an API-first architecture, our extensible ecosystem enables institutions to adapt quickly, integrate easily, and unlock new opportunities for growth—turning every customer interaction into a moment of clarity, confidence, and connection.
- Experience: 5-8 years
- Location: Hyderabad
- Candescent's Site Reliability Engineering (SRE) mission is to proactively ensure the reliability, availability and performance of our Digital First banking applications.
- As a member of the SRE team, you will focus on building and operating highly reliable application platforms by applying SRE principles such as automation, observability, resilience and continuous improvement.
- You will partner closely with application and platform teams to define reliability standards, implement monitoring, alerting and incident response practices and embed scalability and performance considerations into application design and delivery.
- Through tooling, automation, and best practices, you will help development teams build and operate services that meet agreed reliability objectives.
- As a senior engineer in the organization, you will also provide mentorship within the SRE team and across peer engineering teams, helping elevate operational maturity, drive adoption of SRE practices, and strengthen reliability culture across our core initiatives.
- You will also drive ongoing improvements and efficiencies in operational practices, tools, and processes.