Senior Site Reliability Engineer

Lisbon
Permanent
Full-time

1 month ago

Job Description

Miniclip is seeking a Site Reliability Engineer (SRE) to join our Cloud Engineering Team, in our Lisbon (Portugal) Games Development Studio. You'll play a critical role in ensuring the reliability, scalability, and performance of our infrastructure, while helping product and engineering teams deliver with speed and confidence.

As an SRE, you'll be part of a team responsible for operating and evolving our cloud infrastructure, automating processes, and building systems that detect and prevent issues before they impact our users. You'll bring a strong engineering mindset to systems operations, helping us deliver high availability and rapid iteration without compromising stability.

What will you be doing at Miniclip?

Participate in an on-call rotation with the Cloud Engineering team to respond to production incidents and outages.
Operate and evolve infrastructure using Infrastructure as Code (Terraform), configuration management tools, and containerized platforms on AWS.
Build and maintain observability tooling to detect symptoms before they lead to outages.
Automate repetitive tasks and processes to reduce operational toil.
Collaborate with Engineering and Product teams to design resilient systems that meet performance and reliability goals.
Troubleshoot production issues across application, network, and infrastructure layers.
Document systems, processes, and runbooks to improve team transparency and onboarding.

What are we looking for?

5+ years of hands-on experience with AWS in both development and operations contexts.
Strong Linux system administration skills, including performance tuning and debugging (experience with eBPF tracing is a plus).
Software development background and strong coding skills in one or more of the following: Go, Python, Ruby.
Experience with Infrastructure as Code, particularly Terraform.
Familiarity with CI/CD pipelines, configuration management tools (e.g., Ansible, Puppet, Chef), and artifact management tools (e.g., Artifactory, Nexus).
A mindset for resilient systems design, thinking about edge cases, failure modes, and graceful degradation.
Excellent communication skills in English, both written and spoken.
Comfortable in a fast-paced environment and adaptable to shifting priorities.

Great If You Have (But Not Required)

Experience with EKS or ECS.
Familiarity with chaos engineering practices.
Knowledge of OpenTelemetry or Distributed Tracing Systems.
Knowledge of Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
Experience setting up Error Budgets and conducting Post Incident Reviews.

About Miniclip

Miniclip is a global leader in games and one of the world's biggest developers and publishers of mobile games, with a mission of unleashing the gamer in everyone. It distributes highly engaging games to a global audience of over 400 million monthly and 65 million daily active users across mobile, PC, console, social, and online platforms. Operating in 12 countries, Miniclip develops and launches games in multiple categories across its 20 studios. Founded in 2001 with an internationally recognised brand name, Miniclip has successfully grown a global audience across 195 countries and six continents. It has a unique understanding of the games space, developing and distributing a strong portfolio of over 60 high-quality mobile games globally. To date, Miniclip's studios and companies have generated more than 10 billion downloads, including the following games: 8 Ball Pool™, Subway Surfers™, Golf Battle™, Football Strike™, Carrom Pool™, OSM - Online Soccer Manager™, Football Rivals™, Pure Sniper™, Puzzle Page™, Head Ball 2™, Motorsport Manager™, Darts of Fury™, Ultimate Golf™, Mini Football™, Triple Match 3D™, Agar.io™, and PowerWash Simulator™.

For more information, visit

Miniclip
