This notice is being provided as a result of the filing of an Application for Permanent Alien Labor Certification. Any person may provide documentary evidence bearing on the application to the Certifying Officer of the Department of Labor: U.S. Department of Labor, Employment and Training Administration, Office of Foreign Labor Certification, 200 Constitution Avenue, NW, Room N-5311, Washington, DC 20210.
What you'll do...
Position: Data Engineer III
Job Location: 702 SW 8th St, Bentonville, AR 72716
Duties:
Problem Formulation: Identifies possible options to address business problems within one's discipline through analytics, big data analytics, and automation.
Applied Business Acumen: Supports the development of business cases and recommendations. Owns delivery of project activity and tasks assigned by others. Supports process updates and changes. Solves business issues.
Data Governance: Supports the documentation of data governance processes. Supports the implementation of data governance practices.
Data Strategy: Understands, articulates, and applies principles of the defined strategy to routine business problems that involve a single function.
Data Transformation and Integration: Extracts data from identified databases. Creates data pipelines and transforms data to a structure that is relevant to the problem by selecting appropriate techniques (see the illustrative sketch following these duties). Develops knowledge of current data science and analytics trends.
Data Source Identification: Supports the understanding of the priority order of requirements and service level agreements. Helps identify the most suitable source for data that is fit for purpose. Performs initial data quality checks on extracted data.
Data Modeling: Analyzes complex data elements, systems, data flows, dependencies, and relationships to contribute to conceptual, physical, and logical data models. Develops the Logical Data Model and Physical Data Models, including data warehouse and data mart designs. Defines relational tables, primary and foreign keys, and stored procedures to create a data model structure. Evaluates existing data models and physical databases for variances and discrepancies. Develops efficient data flows. Analyzes data-related system integration challenges and proposes appropriate solutions. Creates training documentation and trains end users on data modeling. Oversees the tasks of less experienced programmers and provides system troubleshooting support.
Code Development and Testing: Writes code to develop the required solution and application features by determining the appropriate programming language and leveraging business, technical, and data requirements. Creates test cases to review and validate the proposed solution design. Creates proofs of concept. Tests the code using the appropriate testing approach. Deploys software to production servers. Contributes code documentation, maintains playbooks, and provides timely progress updates.
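The data transformation and integration duties above can be pictured with a short, purely hypothetical PySpark sketch. The connection details, paths, table, and column names (orders, order_ts, status, order_date) are illustrative assumptions and are not part of this notice.

    # Illustrative sketch only: extract rows from a relational source over JDBC,
    # apply a simple transformation, and land the result as Parquet in a data lake.
    # All connection details, table names, and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders_etl_sketch").getOrCreate()

    # Extract: read a source table from a relational database via JDBC.
    orders = (
        spark.read.format("jdbc")
        .option("url", "jdbc:mysql://db-host:3306/sales")  # hypothetical source
        .option("dbtable", "orders")
        .option("user", "etl_user")
        .option("password", "****")
        .load()
    )

    # Transform: keep completed orders and derive a partition-friendly date column.
    completed = (
        orders.filter(F.col("status") == "COMPLETED")
              .withColumn("order_date", F.to_date("order_ts"))
    )

    # Load: write to the data lake, partitioned by the derived date column.
    completed.write.mode("overwrite").partitionBy("order_date").parquet("/data/lake/orders")

In practice a pipeline of this kind would also be scheduled, monitored, and covered by automated data quality checks, as reflected in the skills listed below.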
Minimum education and experience: Bachelor's degree or the equivalent in Computer Science, Information Technology, Engineering, or a related field, plus 2 years of experience in software engineering or related experience; OR Master's degree or the equivalent in Computer Science, Information Technology, Engineering, or a related field.
Skills required: Must have experience with:
- designing and building ETL workflows to load data from a variety of data sources using Spark, SQL, HQL, Triggers, and Apache Sqoop to transfer data from database servers such as SQL Server, MySQL, or Oracle to a data lake;
- building pipelines to ingest data from on-premises clusters to cloud platforms in order to build scalable systems with high performance, reliability, and cost-effectiveness;
- using Spark SQL and DataFrames to write functional programs with Python and Scala for complex data transformations, leveraging the in-memory computing capabilities of Spark for fast processing;
- writing solutions for various scenarios, including file watchers and automated data quality validations, using reusable scripting techniques;
- designing and developing scripts for creating and dropping tables and for extracting data from files with complex structures;
- evaluating the latest technologies with proofs of concept to find optimal solutions for Big Data processing in ETL jobs;
- optimizing SQL queries and fine-tuning data storage using partitioning/bucketing techniques (see the illustrative sketch after this list);
- working with in-memory database tools such as Druid for sub-second query results;
- performing architecture design using data warehouse concepts with logical/physical data modeling and dimensional data modeling involving Big Data tools such as Apache Hadoop, Spark, Sqoop, MapReduce, Hive, or Parquet;
- designing, building, and supporting a platform providing ad hoc access to large datasets through APIs using HDFS, Hive, BigQuery, Spark, Python, shell scripting, and Unix;
- developing analytical insights using SQL, reporting tools, and visualization by understanding the business and working with product owners and data stewards;
- participating in all phases of the product development cycle, from product definition and design through implementation and testing, using JIRA for Agile and Lean methodology;
- performing Continuous Integration and Deployment (CI/CD) using tools such as Git or Jenkins to run test cases and build applications with code coverage using JUnit, and automating an acceptance test framework with Java Spring Boot libraries; and
- monitoring cluster performance, setting up alerts, documenting designs and workflows, and providing production support, troubleshooting, and fixing issues by tracking the status of running applications and performing system administration tasks.
Employer will accept any amount of experience with the required skills.
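As a purely illustrative sketch of two of the skills above, automated data quality validation and storage fine-tuning with partitioning/bucketing, the following hypothetical PySpark fragment is offered for orientation only; the table and column names (stg_orders, order_id, customer_id, order_date) are assumptions and are not taken from this notice.

    # Illustrative sketch only: a simple automated data quality check followed by
    # a partitioned, bucketed write for faster downstream queries.
    # Table and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (
        SparkSession.builder.appName("dq_and_layout_sketch")
        .enableHiveSupport()
        .getOrCreate()
    )

    df = spark.table("stg_orders")  # staged data produced by an upstream pipeline

    # Automated validation: fail fast if required keys are missing or duplicated.
    null_keys = df.filter(F.col("customer_id").isNull()).count()
    dup_keys = df.groupBy("order_id").count().filter(F.col("count") > 1).count()
    if null_keys > 0 or dup_keys > 0:
        raise ValueError(f"data quality check failed: {null_keys} null keys, {dup_keys} duplicate order ids")

    # Storage fine-tuning: partition by date and bucket by customer_id so that
    # date-bounded scans and joins on customer_id read less data.
    (
        df.write.mode("overwrite")
          .partitionBy("order_date")
          .bucketBy(16, "customer_id")
          .sortBy("customer_id")
          .saveAsTable("analytics.orders_curated")
    )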
Wal-Mart is an Equal Opportunity Employer.
Pay Rate...
$122,400.20