What you will do:
- Design, build, and operate highly performant and scalable batch and stream data processing infrastructure and solutions to support day-to-day ML operations, including training, serving, evaluation, and experimental systems.
- Design and develop Moveworks’ foundational data models, data warehouse, and real-time and offline processing pipelines using AWS EMR Spark, Apache Kafka, AWS Athena, Snowflake, Airflow, Apache Hudi, etc.
- Work closely with machine learning and data science teams to understand their data needs, influence the data team’s roadmap, and lead and execute various projects.
- Build a data governance platform for secure and compliant data management, including services for data cataloging, lineage, audit, deletion and masking.
- Build and operate an orchestration platform that includes Temporal and Airflow, enabling other teams to develop features and workflows.
- Build out platform and data services/APIs to make data available to various stakeholders and for customer-facing data products.
What you bring to the table:
- 6+ years of experience as a software engineer or senior software engineer
- Experience with Python, Golang, Java, or C++
- Experience with cloud infrastructure such as AWS, GCP, or Azure
- Experience with relational or non-relational databases such as Postgres, AWS DataLake/S3, DynamoDB, or Snowflake
- BS or higher in Computer Science or a related field
Compensation Range: $200,000 - $220,000