Senior SRE Consultant

Lisbon, Portugal

Job Type

Full Time

Workspace

Hybrid

About the Company

At Sourcing Trust, we are committed to delivering innovative, reliable, and tailored technology solutions that empower businesses to succeed in a rapidly evolving digital landscape. With a focus on excellence, integrity, and collaboration, we build lasting partnerships by understanding our clients' unique needs and providing them with expert support. Our team is dedicated to fostering a positive and inclusive work environment where every employee's contribution is valued, encouraging continuous growth, learning, and shared success. Join us and be part of a passionate organization driven by innovation and excellence.

About the Role

We are looking for a Senior SRE Consultant with strong architectural expertise to design and lead Site Reliability Engineering solutions across complex cloud-native environments. The role requires deep knowledge of cloud architecture, observability, Kubernetes, and SRE best practices to ensure platform reliability, scalability, and performance.

Requirements

Bachelor's degree in Computer Engineering or equivalent qualification.
AWS Certified Solutions Architect - Associate certification (mandatory).
At least one of the following certifications/trainings:
- Kubernetes - Getting Started with Google Kubernetes Engine (certified training)
- AWS Certified Developer – Associate
- AWS Certified Cloud Practitioner
- AI - Certified AI Practitioner
- Grafana Concept and Basic Configuration (certified training)
- Monitoring Key Systems with Prometheus Exporters (certified training)
6+ years of experience in SRE, DevOps, Cloud Architecture, or Platform Engineering roles.
Proven experience designing scalable cloud architectures and highly available systems.
Advanced expertise in Kubernetes orchestration, Helm charts, and container platforms.
Strong experience with observability stacks: Grafana, Prometheus, distributed tracing.
Experience implementing SRE principles: SLOs/SLIs, error budgets, toil reduction, reliability engineering.
Proficiency in Infrastructure as Code (Terraform, CloudFormation) and GitOps practices.
Experience with multi-cloud (AWS, GCP, Azure) and hybrid infrastructure architectures.
Strong programming/scripting skills (Python, Go, Bash) for automation and tooling.
Experience leading technical architecture discussions and mentoring engineering teams.

Main Responsibilities

Architect cloud-native SRE platforms with a focus on scalability, resilience, and observability.
Design and implement comprehensive observability solutions using Grafana/Prometheus stacks.
Define and govern SLOs/SLIs and error budgets.
Lead Kubernetes platform design and container orchestration strategies.
Drive automation and GitOps implementations to streamline platform operations.
Conduct architecture reviews, capacity planning, and disaster recovery planning.
Mentor SRE engineers and development teams on reliability and cloud best practices.
Collaborate with security, development, and operations teams on system design and incident response.

Preferred / Valued

Experience with AIOps and AI-driven observability solutions.
Advanced Prometheus/Grafana dashboard design and alerting strategies.
Chaos engineering and resilience testing implementations.
Experience with service mesh architectures (Istio, Linkerd).

Language Requirements

Fluent English (written and spoken).
