Expert Site Reliability Engineer

Bangalore

About the Role

Finastra enables the financial services world to deliver the future of banking with applications that power financial institutions, marketplaces that accelerate industry & an open innovation platform for banks, fintechs & non-banks to connect and collaborate.

FusionOperate is Finastra’s DevOps self-service PaaS. It transforms how we develop and operate, striving for excellence in software delivery and operations: from commit to production and deployment frequency through to reliability, measured by our change success rate and mean time to repair.

FusionOperate is a multi-cloud DevOps PaaS focused on Container Orchestration, Continuous Delivery, Observability, AIOps, Insights & Data.

As a Site Reliability Engineer, your mission is to protect and advance the software & systems behind Finastra’s cloud-hosted services running on FusionOperate for the biggest financial institutions in the world. Finastra believes in a blameless culture where the primary objective is continuous improvement.

You’ll be treating operations as a software engineering problem aiming to build reactive systems that self-heal, ensuring we keep revenue-critical systems up & running despite natural disasters, unexpected surges in traffic, and configuration errors.

Your day will vary from the fine-grained details of optimizing disk performance and authoring operational code for our applications to the big picture of reliability modelling. You will operate as part of a global, scaled agile SRE team, applying your experience in Continuous Delivery.

Experience & Qualifications (relate to TP4 level)
  • 3/5+ years of experience in Computer Science
  • Proficiency with Infrastructure as Code technologies such as Terraform, CloudFormation, or ARM
  • Experience developing and deploying resources with a cloud provider (e.g., Azure, AWS, Cloudflare, GCP)
  • Networking concepts (load balancing, TCP/IP, HTTP, gRPC, DNS) and troubleshooting tools (Wireshark, command line, BPF)
  • Comfortable with scripting languages like Python, Bash and Go
  • Familiarity with container technologies like Docker and Kubernetes
  • Knowledge of Cloud-native architecture, Cloud tooling and latest trends and practices
  • Appropriate RHEL, Kubernetes & Cloud Certifications a plus

Responsibilities:

  • Work with containers and container orchestration systems such as Kubernetes
  • Capacity Planning to determine resource requirements of your service for it to be scalable, efficient, and reliable
  • Identify and troubleshoot any availability and performance issues at multiple layers of deployment, from hardware to operating environment, network, and application
  • Collaborate with other engineers to implement operational solutions while defining and adhering to industry best practices

Technology Stack
  • Multi-Cloud; Azure, AWS, GCP
  • Programming (Python, Golang, Bash)
  • Kubernetes, Docker, Helm, GitOps, FluxCD (OpenShift a plus)
  • Terraform, Ansible and/or Puppet
  • Prometheus, Grafana & Loki (OpenTelemetry a plus)