Required Skills
- Must haves: 3 to 5 years of experience with Kubernetes, DataDog, cloud services, large-scale systems, AWS and GCP, and minor Azure
- GKE, EKS, home strung on-prem clusters, and AKS (very small)
- Consistent upgrades across all the clusters and clouds
- Nice to have: gaming experience
Required Qualifications
- 6+ years of demonstrated influence across one or more teams for large scale projects that drive impact and improvement across the organization
- 6+ years of experience in an SRE role for online services in a multi-region, multi-cloud environment, with specific experience in reliability and resiliency
- 6+ years of developing tools to automate processes or to augment off-the-shelf tool functionality
- 6+ years of AWS and/or GCP cloud experience running highly elastic, mission-critical workloads
- 6+ years of coding experience in at least one of Python, Ruby, Java, or Go, and a good understanding of code management
- 6+ years of experience using Infrastructure as Code tools like Terraform, Pulumi, or others
- Extensive knowledge of software build, test, and deploy processes using Git, Jenkins, Puppet, Ansible, Docker/containers, and Kubernetes
- Experience with system analysis and troubleshooting
- Ability to mentor junior engineers and provide technical leadership to the organization
Bonus Points
- Prior hands-on experience running large-scale multiplayer video games
- Experience designing and crafting software for systems and network automation
- Debugging, code optimization, and routine task automation skills
- Demonstrated ability to decompose sophisticated problems and to engage in lateral investigations