Senior Data Engineer
2 months ago
Category: Software Dev
What Will You Do
- You'll work on high-impact projects that improve data availability and quality and provide reliable access to data for the rest of the business.
- Design, architect and support new and existing data and ETL pipelines and recommend improvements and modifications.
- Create optimal data pipeline architecture and systems.
- Assemble large, complex data sets that meet functional and non-functional business requirements.
- Be responsible for ingesting data into our data warehouse and providing frameworks and services for operating on that data including the use of Spark.
- Analyze, debug and correct issues with data pipelines.
- Communicate strategies and processes around data modeling and architecture to multi-functional groups and senior-level management.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Spark and AWS technologies.
- You will build widely used data pipelines and tools making critical business data available to other teams.
Requirements
- You have at least 5 years of experience implementing complex ETL pipelines, preferably with Hadoop or Spark.
- You have extensive experience writing complex SQL and ETL processes.
- You have exceptional coding and design skills, particularly in Java/Scala and Python.
- You've worked with large data volumes, including processing, transforming and transporting large-scale data.
- You have hands-on experience with AWS and services like EC2, SQS, SNS, RDS, Cache, etc.
- You have a BS in Computer Science / Software Engineering or equivalent experience.
- You have knowledge of Apache Hadoop, Apache Spark (including PySpark), Spark Streaming, Kafka, Scala, Python, and similar technology stacks.
- You have a strong understanding of algorithms and data structures and experience applying them.
Nice To Have
- Spark data pipeline and/or streaming experience
- Redshift knowledge and operational experience
- Machine Learning expertise