Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group T&O, we manage the majority of the bank’s operational processes and aspire to delight our business partners through our multiple banking delivery channels.
Be part of an exciting change and operations team in an Enterprise Architecture and SRE organization that drives improvements in how the bank delivers site reliability across all of its services.
The SRE Engineer collaborates with development and system engineers to augment solutions that will meet operational goals for high availability, performance, stability, and security.
Support the incident retrospective team in identifying the contributing factors behind incidents.
Implement SRE principles across the CI/CD tools.
Work with engineering and application development teams to improve system performance through environment upgrades and improvements.
Manage the CI/CD tools.
Drive the adoption of SRE principles and apply them to the CI/CD tools.
Deployment, support and monitoring of existing and new services, platforms, and application stacks.
Measurement and optimization of system performance.
Capacity planning and management of the platform.
Evaluate new technologies and solutions to improve system performance.
Onboard applications and automate their CI/CD processes.
Improve system monitoring and alerting to reduce incident resolution time.
Good knowledge of DevOps tools such as Jenkins, JIRA, Bitbucket, Nexus, Ansible and SonarQube.
Experience in managing OpenShift, Kubernetes, Docker, RHEL.
Strong in mobile application build and deployment tools for both iOS and Android.
At least 5 years of general experience with DevOps CI/CD tools and their management.
Able to work in a dynamic, changing environment with 24/7 support, and has the right attitude to learn and implement.
Solid experience in container image deployment and release management with OpenShift and Kubernetes.
Must have strong automation and scripting skills, with proficiency in shell, Groovy and Python.
Good knowledge of monitoring tools such as Prometheus, Grafana and ELK.
Background in large-scale system administration and familiarity with SRE principles and release engineering.
Advanced Linux system administration skills and advanced skills with configuration management systems.
In-depth knowledge of infrastructure areas such as virtual server technologies, networking, firewalls and internet protocols.
Communication: The ability to communicate technical ideas to technical and non-technical stakeholders is critical, as is the ability to document support procedures so that environments are properly maintained and supported.
Requirement Gathering: Collaborating with application teams and understanding their technical & business requirements, especially on Build & Release automation, is critical.
Experience with Agile methodologies such as Scrum and Kanban.