Title – AWS Snowflake Data Engineer
Duration – 6+ months, likely to extend
Top Requirements:
• 7+ years of experience in data engineering, data warehousing, or related roles.
• Extensive hands-on experience with Snowflake implementations and optimizations. Experience with real-time data processing, particularly involving AWS S3 data lakes and mainframe data sources (VSAM, DB2, GDG, COBOL).
• Experience with performance tuning and query optimization in Snowflake, and with handling high-volume data pipelines and ensuring their reliability and efficiency.
• Experience with ETL transformations for COBOL data sources, including understanding JCL code and converting it to Python-based ETL processes.
• Experience using AWS Glue ETL, AWS Lake Formation or any ETL tool compatible with Snowflake for data transformation.
JD:
• The Finance and Treasury Data Engineering team builds pipelines that contextualize data and make it easily accessible to the entire enterprise. As a Data Engineer, you will play a key role in growing and transforming our analytics landscape. You will design, build, and deploy data solutions that capture, explore, transform, and utilize data to support business intelligence/insights, Machine Learning, and Artificial Intelligence.
• Develop, implement, and manage auto-ingestion of high-volume data pipelines to integrate Snowflake with AWS S3 data lakes and mainframe data sources (VSAM, DB2, GDG, COBOL).
• Design a data integration solution with pipelines for pushing data to or pulling data from Oracle database applications.
• Build data flows using AWS Glue ETL or any ETL tool compatible with Snowflake for data transformation. Handle ETL transformations to process and transform data from COBOL and other mainframe sources.
• Manage the data lake using the AWS Lake Formation service for data governance and lineage.
• Build and manage data APIs in Python to facilitate data exchange and integration with Snowflake.
• Work closely with clients to understand their data needs and design appropriate Snowflake solutions.
• Provide expert guidance on the implementation and optimization of Snowflake solutions.
• Optimize Snowflake environments for performance, scalability, and cost-efficiency.
• Ensure data security, privacy, and compliance within Snowflake solutions.
• Conduct training sessions and workshops for clients on Snowflake best practices and usage.
• Provide AWS and Snowflake data solutions that align with US Bank's policies for network access and data security by configuring access controls (RBAC), encryption, data protection, and monitoring solutions.
Other Requirements:
• Hands-on experience with Snowflake features such as Snowpipe, Bulk Copy, Tasks, Streams, stored procedures, and UDFs.
• Experience with the Snowflake cloud data warehouse and AWS S3 buckets or Azure Blob Storage containers for integrating data from multiple source systems.
• 4+ years of experience with AWS services (S3, Glue, Lambda) or Azure services (Blob Storage, ADLS Gen2, ADF).
• Good to have: experience deploying code using CI/CD for AWS services and Snowflake solutions, and experience with repositories such as GitLab and GitHub.
• Good to have: experience deploying infrastructure as code (IaC) using Terraform or equivalent tools for AWS services and Snowflake solutions.
• 4+ years of experience building and managing APIs using Python/PySpark integrated with Snowflake and cloud platforms (AWS/Azure).
• Knowledge of Snowpark for advanced data processing and analytics within Snowflake.
• Experience in Finance and Treasury projects is preferred.
Plusses:
• Snowflake certification (e.g., SnowPro Core Certification).
• Experience with other data warehousing tools and technologies (e.g., Azure Synapse, Redshift).
• Experience with data visualization tools (e.g., Power BI, Tableau).
• Familiarity with data governance and data quality frameworks.
Job Description:
- We are looking for a candidate who has an interest in back-end development, DevOps, ETL, and cloud engineering. We are specifically looking for a candidate who has experience with GCP and/or AWS, including ETL experience using the service catalogue offerings of either platform.
- Team Digital Triplet manages the data lake, modeling, and a number of cloud accounts associated with various LIMS instances. Digital Triplet will be creating a pipeline between AWS and a (new to Bayer) data warehouse in Google BigQuery, configuring multiple accounts, and investigating new technologies that could help us manage the lake, ETL, and reporting capabilities more efficiently for our users and our full-stack development teams. We are looking for a candidate who wants to continuously improve this space by introducing new ideas, tools, and solutions.