Role: Cloud Engineer
Location: Minneapolis, MN
Key Skills: Big Data, Python, PySpark, AWS, Scripting, Git.
Experience: 8+ years
Mode of Hire: Full Time
Skills Required:
- Hands-on experience with Big Data technologies.
- Mandatory: hands-on experience in Python and PySpark. Python is usable for almost anything as a language; here we are specifically looking for application development, Extract/Transform/Load (ETL), and data lake curation experience in Python.
- Able to build PySpark applications with Spark DataFrames in Python, using Jupyter Notebook and PyCharm (IDE); a minimal sketch of such a pipeline follows this list.
- Experience optimizing Spark jobs that process large volumes of data.
- Hands-on experience with version control tools such as Git.
- Experience with AWS analytics services such as Amazon EMR, Amazon Athena, and AWS Glue.
- Experience with AWS compute services such as AWS Lambda and Amazon EC2, storage services such as Amazon S3, and related services such as Amazon SNS; a short boto3 sketch follows this list.
- Experience with or knowledge of Bash/shell scripting, PowerShell, etc.
- Experience designing CloudFormation templates (CFTs) or Terraform templates for deploying infrastructure as code; see the deployment sketch after this list.
- Has built ETL processes that ingest, copy, and structurally transform data across a wide variety of formats, such as CSV, TSV, XML, and JSON.
- Experience working with fixed-width, delimited, and multi-record file formats.
- Good to have: knowledge of data warehousing concepts, such as dimensions, facts, and schemas (star, snowflake, etc.).
- Has worked with columnar storage formats such as Parquet, Avro, and ORC; well versed in compression techniques such as Snappy and Gzip.
- Good to have: knowledge of at least one AWS database service, such as Aurora, RDS, Redshift, ElastiCache, or DynamoDB.
- Understanding of foundational technologies such as IAM and core IaaS services (compute, storage, networking).
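
Illustrative PySpark sketch (referenced above): a minimal DataFrame pipeline of the kind this role involves, reading a delimited raw file, transforming it, and writing curated, partitioned Parquet with Snappy compression. The bucket names, paths, and the orders schema (order_ts, amount) are hypothetical, for illustration only.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

    # Read a delimited file from the raw zone; TSV would just use sep="\t".
    # Bucket, path, and schema are hypothetical.
    orders = (
        spark.read
        .option("header", "true")
        .option("sep", ",")
        .csv("s3://example-raw-bucket/orders/")
    )

    # Structural transform: cast types, derive a partition column, drop bad rows.
    curated = (
        orders
        .withColumn("order_ts", F.to_timestamp("order_ts"))
        .withColumn("amount", F.col("amount").cast("double"))
        .withColumn("order_date", F.to_date("order_ts"))
        .filter(F.col("amount") > 0)
    )

    # Write to the curated zone as partitioned Parquet with Snappy compression,
    # the usual columnar layout for a data lake's query-ready layer.
    (
        curated
        .repartition("order_date")  # co-locate each day's rows before the write
        .write
        .mode("overwrite")
        .partitionBy("order_date")
        .option("compression", "snappy")
        .parquet("s3://example-curated-bucket/orders/")
    )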
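
boto3 sketch (referenced above): a minimal example of the S3/Athena interaction pattern, landing a file in the raw zone and querying a Glue-catalogued table. The bucket, database, table, and file names are hypothetical.

    import boto3

    s3 = boto3.client("s3")
    athena = boto3.client("athena")

    # Land a file in the raw zone (bucket and key are hypothetical).
    s3.upload_file("orders.csv", "example-raw-bucket", "orders/orders.csv")

    # Run an Athena query against a Glue-catalogued table; Athena writes
    # its results back to S3 at the given output location.
    response = athena.start_query_execution(
        QueryString="SELECT order_date, SUM(amount) FROM orders GROUP BY order_date",
        QueryExecutionContext={"Database": "example_db"},
        ResultConfiguration={"OutputLocation": "s3://example-query-results/"},
    )
    print(response["QueryExecutionId"])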
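
Infrastructure-as-code sketch (referenced above): a minimal example of deploying a CloudFormation template from Python with boto3. The stack name and template file are hypothetical; an equivalent Terraform workflow would go through the terraform CLI rather than boto3.

    import boto3

    cfn = boto3.client("cloudformation")

    # Read a CloudFormation template from disk (file name is hypothetical).
    with open("data_lake_stack.yaml") as f:
        template_body = f.read()

    # create_stack is asynchronous, so a waiter polls until the stack is up.
    cfn.create_stack(
        StackName="example-data-lake",
        TemplateBody=template_body,
        Capabilities=["CAPABILITY_NAMED_IAM"],  # needed when the template creates IAM resources
    )
    cfn.get_waiter("stack_create_complete").wait(StackName="example-data-lake")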