Site reliability engineers (SREs) are coding and software automation experts who optimize information technology (IT) infrastructure and processes. They do this by building and configuring code, tools, and applications that streamline operations and improve productivity across the entire software development lifecycle (SDLC). Google introduced the SRE role in the early 2000s to operate at the crossroads of software development and IT operations (DevOps), and the role has been growing in popularity ever since.

The SRE role covers software automation, monitoring, troubleshooting, problem solving, documentation, and team collaboration. Specifically, the role requires a high level of expertise in writing code to automate processes such as log analysis and testing, while also responding to new DevOps issues as they arise. Automating these processes lets developers focus on bringing new features to production quickly and reduces the burden on the IT operations team. An SRE applies software engineering principles to ensure reliable, scalable performance of software and IT services. Site reliability engineers regularly work alongside teams of software developers and IT engineers, guiding them throughout the development process.
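
As a rough illustration of the kind of log-analysis automation mentioned above, the following Python sketch counts log lines per severity level and computes an error rate. The log format (`<timestamp> <LEVEL> <message>`), the function names, and the sample lines are all assumptions made for this example; real systems will differ.

```python
import re
from collections import Counter

# Assumed (hypothetical) log format: "<timestamp> <LEVEL> <message>"
LOG_LINE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def count_levels(lines):
    """Count how many log lines were emitted at each severity level."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # lines that do not fit the assumed format are skipped
            counts[match.group("level")] += 1
    return counts


def error_rate(lines):
    """Return the fraction of parsed lines at level ERROR (0.0 if none parsed)."""
    counts = count_levels(lines)
    total = sum(counts.values())
    return counts["ERROR"] / total if total else 0.0


# Made-up sample data, for illustration only
sample = [
    "2024-01-01T10:00:00Z INFO service started",
    "2024-01-01T10:00:05Z ERROR database timeout",
    "2024-01-01T10:00:09Z INFO request served",
    "2024-01-01T10:00:12Z WARN slow response",
]

print(count_levels(sample))
print(error_rate(sample))
```

A script like this could run on a schedule and raise an alert when the error rate crosses a threshold the team has agreed on.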

Site reliability engineering is essential for any organization that needs to continuously improve its people, processes, and technology. SREs help teams transition to a true DevOps culture, which brings benefits in both speed and reliability. Site reliability engineers commonly find job opportunities at major tech and eCommerce companies, and in payments, banking, and medical software development. As technology continues to evolve, so will site reliability engineering, which means opportunities for SREs will only keep growing.

Common site reliability engineer roles and responsibilities

A site reliability engineer is responsible for performing a range of important software engineering tasks. Responsibilities may include:

  • Analyzing DevOps processes and IT architecture to identify areas for optimization and continuous improvement;
  • Monitoring symptoms and documenting every action so that it can be automated through code;
  • Improving operational processes and designing, building, and maintaining core infrastructure for scaling;
  • Being on-call to respond to incidents that impact product or software availability;
  • Troubleshooting, debugging, and fixing issues to keep productivity high;
  • Preventing incidents from happening;
  • Planning and facilitating IT infrastructure growth;
  • Supporting and collaborating with engineers, developers, and specialists to develop and deploy the code, tools, and applications used in software products;
  • Tracking progress and documenting knowledge and processes;
  • Delivering results in line with agreed SRE project timelines and budgets;
  • Delivering software engineering outputs that comply with relevant requirements and meet customer needs and demands;
  • Leading training sessions on software engineering and development as needed.
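
To make the on-call responsibility above more concrete, here is a minimal Python sketch of how availability over a reporting period might be computed from recorded incident durations. The function name, the 30-day period, and the incident durations are hypothetical, and the sketch assumes that outages do not overlap.

```python
from datetime import timedelta


def availability(period: timedelta, outages: list[timedelta]) -> float:
    """Return the fraction of the period during which the service was up.

    Assumes the outages do not overlap, so their durations can be summed.
    """
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    downtime = sum(outages, timedelta(0))
    downtime = min(downtime, period)  # downtime cannot exceed the period
    return 1 - downtime / period


# Made-up example: two incidents in a 30-day month
month = timedelta(days=30)
incidents = [timedelta(minutes=30), timedelta(minutes=42)]
print(f"{availability(month, incidents):.4%}")
```

Tracking a figure like this over time is one way to show whether incident prevention work is paying off.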

Qualifications for site reliability engineers

SREs should have at least a Bachelor’s degree in Software Engineering, Computer Science, or a related field.

Additional supporting skills and experience include:

  • 2-4+ years of software engineering experience;
  • Solid understanding of DevOps and IT infrastructure, with coding experience in programming languages such as Python, Go, or Ruby;
  • Excellent analytical and problem-solving skills;
  • Proficiency with a range of tools, such as Chef, Ansible, Terraform, SaltStack, GitLab CI/CD, Kubernetes, AWS CloudWatch, New Relic, PagerDuty, VictorOps, Jira, and Trello;
  • Proven experience in project and team management;
  • Strong verbal and written communication skills to be able to work easily with developers, engineers, and other diverse team members.