
Application Reliability Engineer
- Porto
- Permanent
- Full-time
- Ensure Application Stability - Monitor and maintain containerized applications, quickly responding to incidents and ensuring seamless recovery in production environments
- Investigate and Resolve Issues - Analyze performance and troubleshoot across technologies to diagnose problems and implement effective solutions
- Lead Incident Management - Prioritize and manage incidents from start to resolution, coordinating across teams to minimize downtime and maintain system reliability
- Champion Continuous Improvement - Proactively seek opportunities to enhance processes, implement best practices, and drive operational excellence in a dynamic tech environment
- Experience in administration and service recovery of containerized applications using Kubernetes, OpenShift / MicroShift
- Experience with SQL, .NET, ClickHouse, Kafka - although programming is not the primary responsibility, the ability to interpret, troubleshoot, and mitigate issues in the existing code base is required
- Proficient in managing incidents and setting priorities effectively
- Excellent English skills - spoken and written
- Time management, organization and prioritization
- Attention to detail - thorough in work carried out
- Great interpersonal and communication skills
- Initiative - seeking continuous improvement and implementing best practices in a technology environment
- Manufacturing industry business domain knowledge (e.g. Semiconductor, Electronics, Medical Devices or Industrial Equipment manufacturing)
- Experience with SQL Server
- Knowledge of and/or experience with the ITIL framework
- Experience working with Manufacturing Execution Systems (MES) or production automation platforms
- Exposure to data platforms for operational reporting and analytics
- Understanding of IoT device integration and automation workflows within factory environments