Senior Data Engineer

SemanticBits


2 weeks ago

03/12/2020 10:21:17

Job type: Full-time

Hiring from: US only

First appeared on StackOverflow

Category: Software Development


SemanticBits is looking for a talented Senior Data Engineer who is eager to apply computer science, software engineering, databases, and distributed/parallel processing frameworks to prepare big data for use by data analysts and data scientists. You will mentor junior engineers and deliver data acquisition, transformation, cleansing, conversion, compression, and loading of data into data and analytics models. You will work in partnership with data scientists and analysts to understand use cases, data needs, and outcome objectives. You are a practitioner of advanced data modeling and optimization of data and analytics solutions at scale. You are an expert in data management, data access (big data, data marts, etc.), programming, and data modeling, and you are familiar with analytic algorithms and applications (such as machine learning).

Requirements

  • Bachelor’s degree in computer science (or related) and eight years of professional experience
  • Strong knowledge of computer science fundamentals: object-oriented design and programming, data structures, algorithms, databases (SQL and relational design), networking
  • Demonstrable experience engineering scalable data processing pipelines
  • Demonstrable expertise with Python, Spark, and wrangling of various data formats: Parquet, CSV, XML, JSON
  • Experience with the following technologies is highly desirable: Redshift (w/Spectrum), Hadoop, Apache NiFi, Airflow, Apache Kafka, Apache Superset, Flask, Node.js, Express, AWS EMR, Scala, Tableau, Looker, Dremio
  • Experience with Agile methodology, using test-driven development
  • Excellent command of written and spoken English
  • Self-driven problem solver

Similar jobs

  • Thorn is a non-profit focused on building technology to defend children from sexual abuse. Working at Thorn gives you the opportunity to apply your skills, expertise, and passions to directly impact the lives of vulnerable and abused children. Our staff solves dynamic, quickly evolving problems with our network of partners from tech companies, NGOs, and law enforcement agencies. If you are able to bring clarity to complexity and lightness to heavy problems, you could be a great fit for our team. 

    Last year, we took the stage at TED and shared our audacious goal of eliminating child sexual abuse material from the internet. 

    About the Role:

    Law enforcement doesn’t have enough time to navigate the online commercial sex market to find children and identify their traffickers. Spotlight takes this massive amount of data and turns it into an asset for law enforcement. The objective of Spotlight is to improve the effectiveness and efficiency of domestic sex trafficking investigations and increase the number of children who are identified and connected with help resources. Learn more about Spotlight.

    What You'll Do:

    • Collaborate with other engineers on your team to build complex client application features built on top of hundreds of terabytes of data.

    • Work closely with the product manager and engineers to define product requirements, and collaborate to devise optimal engineering solutions.

    • Create technical specifications, prototypes, and presentations to communicate your ideas.

    • Play a critical role in day-to-day coding, code reviews, and troubleshooting production issues.

    • Drive technical innovation by researching and incorporating new technologies and tools into our core system.

    Skills We're Seeking:

    • You have a commitment to putting the children we serve at the center of everything you do.

    • You are a proficient software developer with experience building, growing, and maintaining a variety of products, and you love creating elegant applications using modern technologies.

    • You have experience prototyping, implementing, testing, and deploying code to production.

    • You can work with shifting requirements and collaborate with internal stakeholders.

    • You have empathy and can be a strong advocate for our users while balancing the vision and constraints of engineering realities.

    • You communicate clearly, efficiently, and thoughtfully. We’re a highly distributed team, so written communication is crucial, from Slack to pull requests to code reviews.

    Technologies We Use:

    You should have non-trivial experience with React and SQL, but we’re excited about teaching folks that have the desire and ability to learn the rest. 

    • React / TypeScript

    • Node / Express

    • MemSQL (MySQL-compatible relational database)

    • Docker / Kubernetes

    • AWS

    Thorn is a strong and flexible team because of the diverse backgrounds of our staff. This includes professional background, subject matter expertise, culture, race/ethnicity, sexual orientation, gender identity and expression, language, hobbies, etc. We strongly encourage women, minorities, and people from underrepresented backgrounds to apply. Your skills are needed here.

  • Close (American or European timezones)
    2 weeks ago

    About Us

    At Close, we're building the sales communication platform of the future. With our roots as the very first sales CRM to include built-in calling, we're leading the industry toward eliminating manual processes and helping companies to close more deals (faster). Since our founding in 2013, we've grown to become a profitable, 100% globally distributed team of 43 high-performing, happy people that are dedicated to building a product our customers love.

    Our backend tech stack currently consists of Python Flask/Gunicorn web apps, with our TaskTiger scheduler handling much of the backend asynchronous task processing. Our data stores include MongoDB, Postgres, Elasticsearch, and Redis. The underlying infrastructure runs on AWS using a combination of managed services like RDS and ElastiCache and non-managed services running on EC2 instances. All of our compute runs through CI/CD pipelines that build Docker images, run automated tests, and deploy to our Kubernetes clusters. Our backend primarily serves a well-documented public API that our front-end JavaScript app consumes.

    We ❤️ open source: we use dozens of open source projects, contribute to many of them, and have released some of our own, such as ciso8601, LimitLion, SocketShark, TaskTiger, and more at https://github.com/closeio

    About You

    We're looking for an experienced full-time Software Engineer to join our engineering team: someone who has a solid understanding of web technologies and wants to help design, implement, launch, and scale major systems and user-facing features.

    You should have senior-level experience (~5 years) building modern back-end systems, with at least 3 years of that experience using Python.

    You also have around five years of experience using MongoDB, PostgreSQL, Elasticsearch, or similar data stores. You have significant experience designing, scaling, debugging, and optimizing systems to make them fast and reliable. You have experience participating in code reviews and providing overall code quality suggestions to help maintain the structure and quality of the codebase.

    You’re comfortable working in a fast-paced environment with a small and talented team where you're supported in your efforts to grow professionally. You are able to manage your time well, communicate effectively and collaborate in a fully distributed team.

    You are located in an American or European time zone.

    Bonus points if you have...

    • Contributed open source code related to our tech stack

    • Led small project teams building and launching features

    • Built B2B SaaS products

    • Experience with sales or sales tools

    Come help us with projects like...

    • Conceiving, designing, building, and launching new user-facing features

    • Improving the performance and scalability of our API and helping expand our GraphQL implementation

    • Improving how we sync millions of sales emails each month

    • Working with Twilio's API, WebSockets, and WebRTC to improve our calling features

    • Building user-facing analytics features that provide actionable insights based on sales activity data

    • Improving our powerful Elasticsearch-backed search features

    • Improving our internal messaging infrastructure using streaming technologies like Kafka and Redis 

    • Building new integrations and enhancing existing ones with other SaaS platforms like Google’s G Suite, Zapier, and web conferencing providers

    Why work with us?

    • 100% remote (we believe in trust and autonomy)

    • 2 x annual team retreats ✈️

    • Competitive salary

    • 7 weeks PTO (includes company-wide winter holiday break)

    • 1 month paid sabbatical after 5 years

    • $200/month co-working stipend

    • Parental leave (10 wks primary caregiver / 4 wks secondary caregiver)

    • 99% premiums paid for excellent medical and dental coverage, including an HSA option (US residents)

    • 401k matching at 4% (US residents)

    • Dependent care FSA (US residents)

    At Close, everyone has a voice. We encourage transparency and a mature approach to the workplace. In general, we don’t have strict policies; we have guidelines. Work/life harmony is an important part of our organization: we believe you bring your best to work when you practice self-care (whatever that looks like for you).

    We come from 12 countries and 16 states; a collection of talented humans rich in diverse backgrounds, lifestyles and cultures. Twice a year we meet up somewhere around the world to spend time with one another. We see these retreats as an opportunity to strengthen the social fiber of our community.

    This team is growing in more ways than one - we’ve recently launched 11 babies (and counting!). Unanimously, our favorite and most impactful value is “Build a house you want to live in.” We strive to make decisions that are authentic for our organization. At Close, we have a high care factor for one another, in making an awesome product and championing the success of our customers.  

  • The Challenge

    We’re now looking for a Data Engineer or a Senior Backend Software Engineer (sometimes called Data Infrastructure Engineer, Data Platform Engineer, or Machine Learning Platform Engineer) who can lead the charge in developing and maintaining the platform that will support large-scale ML deployments. Imagine that you have cutting-edge machine learning models, but you now have to deploy them behind a bank’s four walls on a system that could be used by over 30,000 companies simultaneously in a database with billions of records. You must have 3+ years of production-level experience working with Kubernetes.

    Overview

    Our mission is to build financial management technologies that enable the world’s most important companies to grow more quickly in a sustainable way that’s good for people, the planet, and business.

    When companies have strong cash flow performance they can shift from short-term acrobatics to long-term growth and innovation. These are the teams that change the world by being freed to optimize for all of their stakeholders, including their employees, business partners, and environment.

    The Opportunity

    Cash flow is the toughest financial statement to understand but it’s fundamental to funding your own growth. We build the most intuitive and actionable tools for companies to optimize cash flow performance. Our platform analyzes billions of dollars of B2B transactions each year, users spend 70% of their workday in Tesorio, and we save finance teams thousands of hours. As a result, they can invest more confidently and anticipate their capital needs further in advance.

    We’re growing quickly and working with the world’s best companies and the largest bank in the US. We recently raised a $10MM Series A led by Madrona Venture Group and are backed by top investors including First Round Capital, Y Combinator, and Floodgate. We’re also backed by tenured finance execs, including the former CFOs of Oracle and NetSuite.

    We’re now looking for a Data Engineer or Senior Backend Software Engineer who can lead the charge in developing and maintaining the platform that will support large-scale ML deployments. The project you are joining is fast-paced and for a large bank, so you must be experienced: you will not have time to simultaneously onboard, gather business context, and deliver on the tight timeline. To give you a sense of the project, imagine that you have cutting-edge machine learning models, but you now have to deploy them behind a bank’s four walls on a system that could be used by over 30,000 companies simultaneously in a database with billions of records.

    The ideal candidate for this role is NOT someone who can build a great model; rather, you are good at building and maintaining a complex piece of infrastructure on Kubernetes and understand its common pitfalls. You should be strong at Python and SQL, a good communicator, and extremely reliable, able to own deliverables without dropping the ball. You must have 6+ years of experience as an engineer with 3+ years of production-level experience working with Kubernetes.

    Our team is based in the San Francisco Bay Area, and we have a diverse, distributed workforce in five countries across the Americas. We don’t believe that people need to sacrifice being close to their families and where they’d prefer to live in order to do their best work.

    Responsibilities
    • You will be responsible for creating and maintaining machine learning infrastructure on Kubernetes
    • Build and own workflow management systems like Airflow, Kubeflow, or Argo. Advise data and ML engineers on how to package and deploy their workflows
    • Implement logging, metrics and monitoring services for your infrastructure and container logs
    • Create Helm charts for versioned deployments of the system on client premises
    • Continuously strive to abstract away infrastructure, high availability, identity and access management concerns from Machine Learning and Software Engineers
    • Understand the product requirements, bring your own opinions, and document best practices for leveraging Kubernetes
    Skills
    • 6+ years of experience in creating and maintaining data and machine learning platforms in production
    • Expert-level knowledge of Kubernetes, including operators, deployments, cert management, security, and binding users to cluster and IAM roles
    • Experience dealing with persistence pitfalls on Kubernetes, and creating and owning a workflow management system (Airflow, Kubeflow, Argo, etc.) on Kubernetes
    • Experience creating Helm charts for versioned deployments on client premises
    • Experience securing the system with proper identity and access management for people and applications
    • Ability to work in a fast-paced, always-changing environment
    • Nice to have: Experience spinning up infrastructure using Terraform and Ansible
    • Nice to have: Experience working with data engineers running workflow management tools on your infrastructure
