
Site Reliability Engineer – Cloud Cost Optimisation Engineer

The Cloud Enablement team is responsible for accelerating the delivery and improving the operation of our cloud-based software by providing and supporting tools and patterns which reduce the cognitive load on our development teams. We free up our developers to focus on solving problems for our customers rather than spending time on extraneous tasks. Drawing on the shared experience and expertise of our organization and industry, we create, support and evolve the paved path for teams to build, deploy and run secure and reliable software.

What will you do?

  • Design, build, advocate for and support the common tools and delivery platform used by Flexera developers.
  • Improve developer experience and operational excellence.
  • Foster collaboration and knowledge sharing across Flexera.
  • Select and roll out supported defaults and standards for CI/CD tooling, Observability, Security and Runtime Environment.
  • Work with teams across several continents, building relationships with our engineers by listening to and understanding their needs and balancing these with the needs of our business.
  • Research new tools and patterns and continuously measure and evolve our ways of doing things.
  • Drive Cloud Cost Optimization, using a combination of strategies, techniques, best practices and tools to help manage and reduce cloud costs.

You have

  • Developer/DevOps/SRE/Platform experience and a strong interest in software delivery and ongoing operation.
  • Worked on rolling out automation, tools, technologies, patterns and guardrails across an organization or teams.
  • Experience working in a globally distributed team.
  • Extensive public cloud (preferably AWS) knowledge & experience.
  • Deep knowledge of containers (Docker) and orchestration (Kubernetes).
  • Knowledge of tools and patterns around CI/CD (familiar with Travis CI, CircleCI, Buildkite or similar).
  • Observability knowledge (logs, tracing, metrics) and experience with a few of Elastic Stack, X-Ray, Jaeger, Zipkin, Prometheus, Honeycomb or LightStep, or with enterprise observability tools such as New Relic or Datadog.
  • Cloud cost optimization: using automation to keep cloud costs under control and within budget, and enabling individual engineering teams to optimize their own cloud costs.
  • Knowledge of operations, including incident management, immutable infrastructure as code (esp. Terraform or CloudFormation), and problem-solving.
  • Produced robust, well-tested code, preferably in Golang; however, we will also consider Python, JavaScript, Ruby, Java or C# if you are happy to learn Go.
  • Excellent communication skills, including experience in writing good documentation and running workshops.


Critical Skills / Competencies

  • Agile software delivery methodologies
  • Experience managing cloud-based services (e.g. AWS, Azure) at scale
  • Experience with DevOps
  • Experience with Docker containers, Kubernetes, EKS, ECS
  • Infrastructure as code e.g. Terraform, CloudFormation
  • CI/CD pipelines using Jenkins, Travis CI, TeamCity, pipeline as code
  • Automation / Configuration Management at scale e.g. Puppet, Chef, Ansible, Salt, Packer etc.
  • Service mesh such as Istio, Consul or similar
  • Expertise in one or more of the following languages: Python / Go / Java / C# / C / C++
  • Experience with IaaS and Serverless services from a cloud provider
  • A strong understanding of TCP/IP and DNS, and experience designing networks
  • Linux & Windows system administration experience
  • Experience implementing fault detection and automating fixes
  • Experience designing scalable services
  • Experience designing distributed, fault-tolerant systems
  • A good understanding of SQL and NoSQL databases
  • A solid understanding of data structures and algorithms
  • A positive attitude and willingness to learn
  • Strong conflict resolution competence
  • Excellent written and verbal communication skills
  • Detail oriented: the ideal candidate naturally digs as deep as needed to understand the why


Minimum Qualifications

  • Bachelor’s or higher degree in Computer Science, Information Technology, or a related field.
  • At least 4 years of hands-on job experience managing services in a public cloud
  • At least 1 year of experience working as a member of a centralized Cloud Enablement, Platform or similar team


Bonus Skills

The following items are not prerequisites for the role, but they may give you a better idea of what you can expect to come across in your SRE role at Flexera:

  • Python / Golang / Java / C# / C / C++ / Bash experience
  • Big Data, Machine Learning and AI platforms (Databricks, Snowflake etc.)
  • Experience with monitoring systems such as New Relic, ELK, Prometheus, Datadog, X-Ray etc.
  • Security background
  • SQL, NoSQL and graph databases
  • Relevant Certification e.g. AWS, GCP, Azure
  • Experience of Disciplined Agile Delivery (DAD)

  • Company: Flexera
  • Published: 20th January 2024
  • Closing Date: 20th July 2024
  • Country: India
  • Type: Full-time
  • Seniority: Mid-level Contributor
  • FinOps Certifications Required: None