Expert Site Reliability Engineer

Location: Bangalore

Responsibilities:

Finastra: Who are we?

At Finastra, our purpose is to unlock the power of finance for everyone. We build and deliver innovative, next-generation technology on our open Fusion software architecture and cloud ecosystem. We’re one of the world’s largest FinTechs, working with over 9,000 customers, including 90 of the top 100 banks globally.

Our scale and reach allow us to build long-lasting relationships that put our customers and their customers first.

Your future at Finastra

We believe that the future of finance is OPEN. By focusing on OPEN Collaboration and OPEN Finance, supported by our OPEN Platform, we can create financial inclusion and open innovation for everybody. Our people are our greatest asset and we provide an environment where you can develop and grow your career.

From graduates to experienced professionals, we’re leaders in our roles and a key part of making Finastra one of the world’s leading FinTechs.

Why join us

Alongside amazing colleagues and engaging work, we want to help you get the best out of your career. We offer continuous learning and development to take your skills to the next level. It’s not just about being the best you can be at work; we also offer a variety of benefits to make your non-work life better, including paid holiday, flexible working, pension, health and well-being initiatives, and many more.

If you’re looking to build your career, work with experts and, most of all, have fun, join us.

About the role

As a Site Reliability Engineer (Cloud), you will join and reinforce the TCM Kondor modernization and cloud enablement team, whose primary objective is to act as a central team that accelerates the Cloud Transformation journey across our core systems.

We are looking for a curious and enthusiastic Site Reliability Engineer to join our team and to optimize, design, implement, observe and maintain our organization’s cloud-based systems.

A Site Reliability Engineer’s responsibilities include designing, deploying and debugging systems, as well as executing new cloud initiatives.

Ultimately, you will work with different IT professionals and teams to ensure our cloud computing systems meet the needs of our organization and customers.

Objectives of this Role
  • Work in tandem with our engineering team to identify and implement optimal cloud-based solutions for the company.
  • Define and document best practices and strategies regarding application deployment and infrastructure maintenance.
  • Provide guidance, thought leadership, and mentorship to development teams to build cloud competencies.
  • Ensure application performance, uptime, and scale, maintaining high standards of code quality and thoughtful design.
  • Manage cloud environments in accordance with company security guidelines.
  • Stay current with industry trends, making recommendations as needed to help the organization innovate and excel.
Responsibilities
  • Develop, deploy and maintain infrastructure on Azure using Docker and Kubernetes.
  • Implement automation tools and frameworks (CI/CD pipelines).
  • Collaborate with team members to improve the company’s engineering tools, systems and procedures, and data security.
  • Optimize the company’s computing architecture.
  • Conduct systems tests for security, performance, and availability.
  • Develop and maintain design and troubleshooting documentation.
  • Collaborate with the engineering teams to enable their applications to run on Cloud infrastructure.
  • Debug technical issues inside a complex stack involving virtualization, containers, microservices, etc.
  • Troubleshoot incidents, identify root cause, fix and document problems, and implement preventive measures.
  • Employ exceptional problem-solving skills, with the ability to see and solve issues before they snowball into problems.
Requirements
  • Bachelor’s degree in computer science, information technology, or mathematics.
  • 8+ years of proven experience as a Site Reliability Engineer or in a similar role in software development and system administration.
  • Experience in Docker for containerization and application deployment.
  • Experience with Kubernetes and Helm for orchestration of Docker containers.
  • Experience with Azure cloud services and understanding of their offerings and architecture.
  • Knowledge of databases and operating systems.
  • Ability to troubleshoot complex software and hardware issues.
  • Knowledge of best practices related to data encryption and cybersecurity.
  • Excellent problem-solving and communication skills.
  • Experience in network, server, and application-status monitoring.
  • Operating systems – any Linux/Unix flavor
  • Monitoring – Prometheus, Grafana
Nice to Have
  • Relevant certifications such as Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), or Azure Certifications (AZ-104, AZ-204, AZ-400, etc.).
  • Experience with other cloud platforms like AWS or GCP.
  • CI/CD – Jenkins (Groovy)
  • Exposure to Azure Pipelines
  • Knowledge of Git version control
  • Scripting