Site Reliability Engineer (Local Candidate)

Job Description

Responsibilities

· Design, manage and optimise Continuous Integration and Continuous Delivery (CI/CD) pipelines.

· Monitor applications and cloud infrastructure to ensure system security and availability, using real-time dashboards and active alerting mechanisms.

· Manage and monitor archival, backup and housekeeping of data and application resources.

· Make logs and other information from production and test environments securely accessible to the people who need them.

· Perform necessary updates to the application, database and infrastructure as required by business, operational and security requirements.

· Run security scans on the system and source code, perform an early assessment of reported security findings, and escalate to the development team if necessary.

· Strive to increase service reliability by establishing guidance and methods of improvement.

· Collaborate and cultivate relationships with Development and Support teams to improve reliability, stability and scalability of services.

· Deliver data and analytics to provide insights for our team from a reliability and resilience perspective.

· Identify and resolve problems relating to critical service operations, and prevent their recurrence using automation.

· Improve the incident management lifecycle to identify, mitigate, and learn from reliability risks.

· Work closely with internal teams to ensure technical and operational compliance with ISO 27001 requirements.

Requirements

Skills and Qualifications

· 3 years' experience in Site Reliability Engineering, DevOps, or related roles.

· Hands-on experience with AWS cloud infrastructure and services (e.g. EC2, RDS, IAM).

· Experience in infrastructure as code (e.g. Terraform).

· Experience with containerization and orchestration (e.g. Docker, AWS EKS).

· Experience in managing and improving processes within CI/CD tools (e.g. Jenkins), Bitbucket repositories and code quality scanners (e.g. SonarQube).

· Experience in application and infrastructure monitoring, and familiarity with application logging and monitoring tools such as Datadog and the Grafana-Loki-Promtail stack.

· Familiarity with scripting for task automation (e.g. Bash, Python, Go).

· Experience in managing Linux servers.

· Awareness of security practices, standards and processes will be an advantage.

· Experience in analyzing and resolving performance, scalability and reliability issues.

· Knowledge of web application environments and technologies, such as TCP/IP, SSL/TLS, HTTP, DNS, routing, load balancing, CDNs, Tomcat and Apache.

Preferred Qualifications

· Experience with ISO 27001 policies and processes.

· Experience in SaaS environments with multi-tenant architectures.

· Experience with PDPA, GDPR, SOC 2 or other compliance frameworks.

· Experience with performance testing using JMeter.

Benefits of Joining our Team include:

· Opportunities to work with both international and high-profile clients.

· Enjoy hybrid work, flexible hours and a results-oriented culture, and collaborate with a mission-driven team invested in your growth.

· Outpatient medical coverage for employees, their spouses, and children, in accordance with company policy.

· Provision for spectacles and dental expenses for employees.

· A range of allowances, including travel and mobile phone expenses, among others.

· Product and technical training is provided from both internal and external sources.

· Accelerate your career through hands-on challenges, mentorship from leadership, and opportunities to lead as the team scales.

· Partner closely with product and operations teams while owning decisions in a flat hierarchy.

· Collaborate with a passionate team using the latest technologies and frameworks.
