Job Requisition ID: 16483
Senior DevOps Engineer (Remote - Quebec)
Imagine your work helping millions of children unlock their learning potential.
HMH is a learning company. Over 53 million students and teachers use our learning platform every day, and that number is growing every year. You can learn more here!
With millions of users, our technology infrastructure must be robust, responsive and highly scalable. That's where your deep Site Reliability/DevOps expertise comes in. You’ll be supporting and scaling the infrastructure needed to help millions of little learners dream big.
At HMH, we take direct actions to attract, hire, and retain more diverse talent, nurture an inclusive workplace, and create opportunities for meaningful conversations about what it means to be antiracist. We believe that it is through learning that people find their voices, connect with others, and create a better world.
We aim to increase the diversity of our employee base by growing our diverse talent pipeline, including partnerships with organizations like Resilient Coders, Girls Write Now, Hacker X, and Editors of Color.
See here for our philosophy on diversity, equity and inclusion.
Technical Infrastructure:
Here’s just some of what we use:
- AWS EC2, Terraform Enterprise, Docker, Aurora, Mesos, Kubernetes, ELK (Elasticsearch, Logstash & Kibana).
- Grafana, Prometheus, Datadog, Telegraf, Runscope, Apollo, GraphQL.
- Microservices Architecture, Spring, Java & Node.js, React, Koa, Express.js.
- Amazon RDS, DynamoDB, Postgres, Oracle, MySQL, InfluxDB, Linux, Jenkins, GitHub.
You can read more on our Engineering Blog here.
More about your role:
This is a role with real impact.
You’ll be constantly asking: what are the most important infrastructure problems we need to solve today to increase the reliability and performance of our applications and infrastructure?
- You will apply your deep technical knowledge, taking a broad look at our technology infrastructure. You’ll help us identify and validate common and systemic issues, prioritizing which to address first.
- We value collaboration. So, you will partner with our SRE/DevOps team, discussing and refining your ideas and preparing proofs of concept.
- You’ll present and validate these across technology teams, figuring out the best solution.
- And you’ll be given ownership to engineer and implement your solutions.
There are lots of interesting technology problems for you to solve, so you’ll constantly be applying the latest thinking.
These include implementing canary deployments, designing a new automated pipeline solution, extending Kubernetes capabilities, applying machine learning to build load testing, and ensuring mutability of containerization.
You’ll help us plan for the future.
You’ll get to evaluate existing technologies and design the future state, without being afraid to challenge the status quo. And you’ll regularly review existing infrastructure, looking for opportunities to improve (e.g. service improvement, cost reduction, security, performance).
You’ll also get to automate everything necessary, combining reliability with a pragmatic approach, doing it right, first time.
We’re continuing our journey of making our code and configuration deployments self-serve for our development teams.
- You’ll help us build and maintain the right tooling.
- And you’ll have ownership to design and implement the infrastructure needed.
- You’ll also be involved in the daily management of our AWS infrastructure. This means working with our Agile development teams to troubleshoot server, application, and performance issues.
Skills and Experience:
This role is for an expert in cloud computing environments. To thrive in this role, you have:
- Significant hands-on SRE/DevOps experience in an Agile environment.
- You’ll be able to collaborate effectively with both engineers and operations, and be comfortable recommending best practices.
- Substantial experience using AWS in a production environment.
- You have the expertise and skills to navigate the AWS ecosystem, and will know when and where to recommend the most appropriate service and/or usage pattern.
- You have experience resolving outages, can quickly diagnose issues, and have been instrumental in restoring normal service levels.
- You have an intellectual curiosity, and an appetite to learn more.
- You’ll also have significant experience and/or an interest in the following:
- Managing cloud infrastructure as code.
- Application container management.
- Expertise with an RDBMS. You’ll know how to tune and scale it, and how performance and reliability are achieved.
- You’re experienced working with Linux.
- You have experience with management of Messaging Queues and event driven systems.
- With security in mind, you have experience working with firewalls, network and application load balancing, and secret management.
- You’re used to working with CI/CD tools.
- You’ve used scripting languages.
- A strong and informed point of view with respect to monitoring tools and how best to use them.
Additional skills:
You have a keen eye for detail and excellent root cause analysis skills.
Experience working as a member of a distributed team, or the ability to do so, is important (as your team will not be co-located).
ABOUT US:
Houghton Mifflin Harcourt (NASDAQ:HMHC) is a global learning company dedicated to changing people’s lives by fostering passionate, curious learners. As a leading provider of pre-K–12 education content, services, and cutting-edge technology solutions across a variety of media, HMH enables learning in a changing landscape. HMH is uniquely positioned to create engaging and effective educational content and experiences from early childhood to beyond the classroom. HMH serves more than 50 million students in over 150 countries worldwide, while its award-winning children's books, novels, non-fiction, and reference titles are enjoyed by readers throughout the world. Follow HMH on Twitter, Facebook and YouTube. For more information, visit http://careers.hmhco.com
PLEASE NOTE:
Houghton Mifflin Harcourt is an equal employment opportunity employer and participates in E-Verify. All qualified applicants will receive consideration for employment and will not be discriminated against on the basis of gender, race/ethnicity, gender identity, sexual orientation, protected veteran status, disability, or other protected group status.