Position: Senior Software Developer - AWS Glue and ETL
Type: Contract (9–12 months, with potential for extension)
Location: Kitchener/Waterloo, Toronto, or Remote
Start Date: ASAP
About Us:
Messagepoint is a privately owned, PE-funded software company headquartered in Toronto, Ontario. We enable large enterprises to deliver enhanced customer experiences and achieve a better bottom line by optimizing their omni-channel customer communications.
Our award-winning platform gives non-technical business users such as marketers, CX teams and product owners intelligent control over the content in customer communications to achieve unparalleled personalization, relevancy, brand consistency, and compliance. Only Messagepoint harnesses AI-powered Content Intelligence to automate and simplify the process of migrating, optimizing, authoring, and managing business-critical on-demand, interactive and batch communications across all platforms and channels.
Founded in 1998 as Prinova, Messagepoint has grown from its beginnings as a top global Customer Communications Management (CCM) integrator to become a leader in the space, thanks to our unique ability to intelligently manage content.
Summary:
We are seeking an experienced Senior Software Developer specializing in AWS Glue and ETL to lead the integration and adoption of AWS Glue within our technology stack. This pivotal role focuses on designing and implementing scalable, configuration-driven ETL processes for ingesting, wrangling, and transforming heterogeneous data sources into a centralized data lake. You will play a key part in enabling our team to build reusable schema-on-read pipelines and robust data catalogs that support business-critical applications.
The ideal candidate will have deep expertise in AWS Glue, Spark-based data processing, and the development of reusable, modular code for complex ETL workflows.
Responsibilities:
1. AWS Glue Integration & Enablement
• Introduce AWS Glue as the primary ETL tool in our technology stack.
• Configure AWS Glue crawlers, data catalogs, and ETL jobs to automate schema discovery, normalization, and metadata management.
• Establish best practices and guidelines for AWS Glue adoption.
2. Data Catalogs and Schema-on-Read
• Design and implement schema-on-read architectures to enable downstream applications to dynamically interpret and map schemas.
• Create and maintain centralized data catalogs using AWS Glue to manage metadata for all ingested data sources.
3. ETL Process Design
• Build agile workflows for ingesting, wrangling, and transforming diverse data formats, including JSON, CSV, and text files.
• Optimize ETL processes for scalability, performance, and error handling.
4. Spark-Based Reusable Code Development
• Develop reusable, modular Spark code for data processing, transformation, and cleaning.
• Ensure that code components are optimized for performance and scale across multiple data sources.
5. Configuration-Driven Pipelines
• Design ETL workflows that minimize coding requirements by leveraging configuration-based setups.
• Collaborate with non-technical stakeholders to build user-friendly interfaces for configuring data pipelines.
6. Data Quality and Validation
• Implement robust QA mechanisms to validate data integrity at every stage of the ETL pipeline.
• Create automated validation scripts to ensure data accuracy and compliance with defined standards.
7. Collaboration and Leadership
• Work closely with cross-functional teams, including data engineers, software developers, and product managers, to align ETL pipelines with business requirements.
• Mentor junior team members in AWS Glue best practices and advanced ETL development techniques.
8. Monitoring and Maintenance
• Integrate monitoring tools such as AWS CloudWatch to provide visibility into ETL job performance and ensure timely issue resolution.
• Establish documentation for ETL processes, AWS Glue configurations, and reusable code components.
Qualifications:
● Proven experience with AWS Glue, including Data Catalogs, Crawlers, and ETL job configuration.
● Strong expertise in Apache Spark for data processing and transformation.
● Hands-on experience in building schema-on-read architectures and managing evolving data schemas.
● Proficiency with ingesting and processing heterogeneous data sources (e.g., JSON, CSV, text).
● Experience developing modular, reusable code for data wrangling and transformation.
● Deep understanding of data quality and validation strategies in ETL pipelines.
● Proficiency in Python or Scala for Spark and AWS Glue development.
● Familiarity with AWS services such as S3, Athena, and Redshift.
● Strong knowledge of configuration-driven ETL design principles.
● Experience with monitoring and logging tools such as AWS CloudWatch.
● Familiarity with agile methodologies for pipeline development.
● Knowledge of performance tuning in Spark-based ETL workflows.
● Ability to document workflows and provide technical training.
● Excellent problem-solving skills and attention to detail.
● Strong communication and collaboration skills to work effectively with diverse teams.
● Ability to work independently and manage multiple priorities in a fast-paced environment.
Why Join Us?
● Work on cutting-edge ETL solutions using AWS Glue.
● Contribute to the transformation of our data ecosystem, impacting critical applications.
● Collaborate with a dynamic and innovative team passionate about data engineering.
● Opportunity to extend the contract or grow within the organization, based on project success.
Messagepoint is an Equal Opportunity Employer and encourages diversity and inclusion in the workplace.
We thank you for your interest; however, only those who qualify for an interview will be contacted.