Site Reliability Engineer

Protecht group - Sydney
new offer (03/05/2024)

job description

We are on the hunt for a Site Reliability Engineer to join our fun and exciting Infrastructure team! Are you the one?

Who are we?

We are Protecht - a fast-growth Governance, Risk & Compliance (GRC) SaaS business. We provide world-class enterprise risk management, compliance, training, and advisory services to over 350 customers across various industry sectors through our offices across APAC, USA & Europe.

Through our people, we enable smarter risk taking by our customers to drive their resilience and sustainable success.

We do this by:

  • Channelling our passion for risk management into knowledge and expertise that drive every aspect of our training, thought leadership, products and services

  • Building true relationships with our customers as we support their risk management journey.

Our cloud-based SaaS platform - Protecht.ERM - is what makes us really stand out. It's one of the most comprehensive, flexible, and dynamic risk management solutions available today.

Why join us?

At Protecht, a positive and super friendly culture awaits you, where learning is valued and supported. We empower our people through leadership, training, knowledge-sharing, and mentorship. Here are some of the perks of working with Protecht:

  • A modern tech stack and a great opportunity to work within a dynamic team.
  • A highly flexible culture - our way of working lets people work across home and our offices.
  • A strong commitment to your learning and development - fortnightly dedicated L&D afternoons
  • Reward & Recognition programs
  • A strong focus on work / life balance with access to Birthday leave, bonus days, paid parental leave and long service leave.
  • Monthly social events
  • Competitive remuneration and Annual Performance Bonus
  • Generous Employee Referral program

Let's talk about your new role!

Reporting to the Head of Infrastructure, the Site Reliability Engineer is a newly created role within our Infrastructure Operations team. You will be instrumental in driving the execution and development of the SRE function across the Protecht Cloud Platform. Collaborating with software development and Platform teams, you will take ownership of the monitoring and alerting suite while playing a key role in improving the reliability and resilience of Protecht's Cloud infrastructure.

Here are some of your key responsibilities:

  • Implement SRE best practices and continuous improvement initiatives, including the adoption of automation, monitoring and incident management tools and practices.
  • Establish observability, SLOs and error budgets, along with the supporting tools, practices and implementation.
  • Perform incident, change, and project management while carrying out required BAU activities.
  • Participate in incident response, on-call rotations, and blameless retrospectives with the goal of discovering the best approach to prevent future occurrences of the issue.
  • Proactively identify bottlenecks in systems and platforms to prevent incidents.
  • Communicate and interface with the Support team.
  • Design and implement automation solutions to improve the reliability and efficiency of software development and deployment.
  • Contribute to the continuous improvement of software quality, security, and performance standards.

You'll be a great fit if you have:

  • 5+ years in DevOps or Software development. 3+ years' experience operating high-availability, fault-tolerant, scalable, distributed software in production
  • Experience working with AWS.
  • Significant experience running ITIL processes and operations functions such as access control, capacity management, log management, and vulnerability management.
  • Experience with configuration management and automated deployment tools such as Ansible, Terraform and Jenkins.
  • Strong understanding of cloud-native and container-based distributed systems like Kubernetes.
  • Experience with Splunk and Dynatrace: building monitoring, tweaking dashboards, defining alerts, writing runbooks, etc.
  • Familiarity with RDS/PostgreSQL database architecture and working principles.
  • Bachelor of Software Engineering, Computer Science, or Information Technology
  • Industry qualification in cloud technologies

Next steps

With a swift screening and interview process in place, we are happy to invite you to apply. If you think this may be your next opportunity and you want to be part of a Great Place to Work - Certified organization, apply online today!

Visit our website to find out a little more about working with us.
