Sr Data Engineer

Simon Data


Posted: 09/20/2019 10:21:23

Job type: Full-time

Hiring from: US only

Category: All others


ABOUT US 

Simon Data was founded in 2015 by a team of successful serial entrepreneurs. We're a data-first marketing platform startup, and we approach our work seriously; we tackle problems in a scrappy and disruptive fashion, yet we build for scale to support our clients at big-data volumes.

We are the first and only enterprise customer data platform with a fully-integrated marketing cloud. Moving beyond the limitations of both categories, Simon’s platform empowers businesses to leverage enterprise-scale big data and machine learning to power customer communications in any channel. Simon’s unique approach allows brands to develop incredible personalization capabilities without needing to build and maintain massive bespoke data infrastructure.

Our culture is rooted in organizational transparency, empowering individuals, and an attitude of getting things done. If you want to be a valuable contributor on a team that cultivates these core values, we would love to hear from you.

THE ROLE

Does your perfect job involve working with petabytes of data or constructing data pipelines that serve widely diverse industries? Do you get excited about the chance for your hard work to have a tangible impact on the bottom line of each of our clients?

As a Senior Data Engineer at Simon, you’ll immediately dive into a system of multiple streaming and batch pipelines that process petabytes of data daily. You’ll be busy putting your skills to work: a readiness to tackle complex problems; a passion for building fault-tolerant, highly available systems; and the mindfulness to adapt as our business continues to 10x in scale.

The Simon data pipelines are the backbone of our platform and critical to our clients’ success. You will play a pivotal role in our ability to sustainably and rapidly move the data that powers our platform to engage with hundreds of millions of customers -- sending billions of messages annually.

WHAT YOU'LL DO

  • Create, scale, and own data pipelines using batch and streaming tools such as Python, Spark, Kinesis, Redshift, Snowflake, and Elasticsearch, or by leveraging new technologies (see the sketch after this list)

  • Architect and develop new data products for our clients, such as new reporting capabilities or data transformation tooling

  • Construct the platform and tools for our clients to self-serve their data engineering needs on the Simon system

  • Contribute to the Simon Data ecosystem to address ever-changing requirements of scale

  • Develop an expertise in profiling and debugging AWS services to define performant usage patterns

  • Collaborate every day with a team of your peers, all of whom are passionate about quality, staying ahead of the curve, and continuous improvement

  • Participate in team-wide discussions ranging from architecture to developer productivity to security to the best Gary Oldman movie

  • Be a part of establishing our mark in open-source software, as well as promoting and sharing it at conferences locally and nationwide
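
To make the stack above concrete, here is a minimal sketch of the kind of batch rollup job the role describes, written in PySpark. The S3 paths, column names, and output target are hypothetical stand-ins, not Simon's actual pipeline.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("daily-event-rollup").getOrCreate()

    # Hypothetical raw event dump on S3; a production job might read from
    # Kinesis or a warehouse table instead.
    events = spark.read.json("s3://example-bucket/events/2019-09-20/")

    # Roll raw events up into one row per client, event type, and day.
    daily_counts = (
        events
        .withColumn("event_date", F.to_date("event_timestamp"))
        .groupBy("event_date", "client_id", "event_type")
        .agg(F.count("*").alias("event_count"))
    )

    # Parquet on S3 is shown for simplicity; Redshift or Snowflake targets
    # would go through their respective Spark connectors.
    daily_counts.write.mode("overwrite").parquet("s3://example-bucket/rollups/daily_counts/")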

QUALIFICATIONS

  • Minimum of 4 years of experience designing, deploying, and owning several substantive data engineering or analysis projects with company-wide impact

  • Minimum of 2 years of experience working with various functional owners in your company (spanning product management, program management, and Dev/Tech Ops)

  • Proficient with at least one mainstream programming language (Python, Java, Scala, C#, Ruby, etc.)

Visa sponsorship for this role is not currently available.

Diversity

We’re proud to be an equal opportunity employer open to all qualified applicants regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or expression, Veteran status, or any other legally protected status.

Please mention that you come from Remotive when applying for this job.


Similar jobs

  • Auth0 (US or Argentina)
    1 month ago
    Auth0 is a pre-IPO unicorn. We are growing rapidly and looking for exceptional new team members who will help take us to the next level. One team, one score.

    We never compromise on identity. You should never compromise yours either. We want you to bring your whole self to Auth0. If you’re passionate, practice radical transparency to build trust and respect, and thrive when you’re collaborating, experimenting and learning – this may be your ideal work environment.  We are looking for team members that want to help us build upon what we have accomplished so far and make it better every day.  N+1 > N.

    The Data Engineer will help build, scale, and maintain the enterprise data warehouse. The ideal candidate will have a deep understanding of technical and functional design for databases, data warehousing, and reporting. The candidate should feed on challenges and love being hands-on with recent technologies.

    This job plays a key role in data infrastructure, analytics projects, and systems design and development. You should be passionate about continuous learning and about experimenting with, applying, and contributing to cutting-edge open-source data technologies and software paradigms.

    Responsibilities:

    • Contributing at a senior level to the data warehouse design and data preparation by implementing a solid, robust, extensible design that supports key business flows.
    • Performing all of the necessary data transformations to populate data into a warehouse table structure that is optimized for reporting (see the sketch after this list).
    • Establishing efficient design and programming patterns for engineers as well as for non-technical people.
    • Designing, integrating, and documenting technical components for seamless data extraction and analysis.
    • Ensuring best practices that can be adopted in our data systems and shared across teams.
    • Contributing to innovations and data insights that fuel Auth0’s mission.
    • Working in a team environment and interacting with multiple groups on a daily basis (requires very strong communication skills).
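
    As a rough illustration of that kind of transformation, here is a minimal sketch in Python; sqlite3 stands in for the warehouse, and the table and column names are hypothetical, not Auth0's schema.

        import sqlite3

        conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
        conn.execute("CREATE TABLE raw_logins (user_id TEXT, tenant TEXT, logged_in_at TEXT)")
        conn.executemany(
            "INSERT INTO raw_logins VALUES (?, ?, ?)",
            [("u1", "acme", "2019-09-20T10:00:00"),
             ("u2", "acme", "2019-09-20T11:30:00"),
             ("u1", "acme", "2019-09-21T09:15:00")],
        )

        # Transform raw events into a daily rollup optimized for reporting:
        # one pre-aggregated row per tenant per day, so dashboards avoid
        # scanning the raw table.
        conn.execute("""
            CREATE TABLE fact_daily_logins AS
            SELECT tenant,
                   date(logged_in_at)      AS login_date,
                   COUNT(*)                AS login_count,
                   COUNT(DISTINCT user_id) AS unique_users
            FROM raw_logins
            GROUP BY tenant, date(logged_in_at)
        """)

        for row in conn.execute("SELECT * FROM fact_daily_logins ORDER BY login_date"):
            print(row)  # ('acme', '2019-09-20', 2, 2) then ('acme', '2019-09-21', 1, 1)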

    Skills and Abilities:

    • BA/BS in Computer Science, a related technical field, or equivalent practical experience.
    • At least 4 years of relevant work experience
    • Ability to write, analyze, and debug SQL queries.
    • Exceptional problem-solving and analytical skills.
    • Experience with data warehouse design, ETL (extraction, transformation, and load), and architecting efficient software designs for DW platforms.
    • Knowledge of database modeling and design in a data warehousing context
    • Strong familiarity with data warehouse best practices.
    • Proficiency in Python and/or R.


    Preferred Locations:

    • #AR; #US;

    Auth0’s mission is to help developers innovate faster. Every company is becoming a software company and developers are at the center of this shift. They need better tools and building blocks so they can stay focused on innovating. One of these building blocks is identity: authentication and authorization. That’s what we do. Our platform handles 2.5B logins per month for thousands of customers around the world. From indie makers to Fortune 500 companies, we can handle any use case.

    We like to think that we are helping make the internet safer.  We have raised $210M to date and are growing quickly. Our team is spread across more than 35 countries and we are proud to continually be recognized as a great place to work. Culture is critical to us, and we are transparent about our vision and principles. 

    Join us on this journey to make developers more productive while making the internet safer!
  • Wikimedia Foundation
    1 month ago

    Summary 

    Wikipedia is where the world turns to understand almost any topic; the Wikimedia Foundation is the nonprofit that operates Wikipedia with a small staff. We are looking for a great data architect who wants to modernize the infrastructure underlying Wikipedia with distributed storage, services, and REST interfaces. If this excites you, we welcome you to join us.

    Description

    • Collaborate with Product Owners, Engineers and stakeholders on product discovery and improvements of our existing systems
    • Design and implement effective data storage solutions and models
    • Articulate the flow of data across our diverse range of systems
    • Ensure reusable, clear service design and documentation
    • Define and align the forms and sources of data to facilitate WMF initiatives
    • Monitor system performance and identify, define, and implement internal process improvements and SLOs
    • Work with Site Reliability and Operations Engineers to analyse and determine service discoverability, capacity plans and high availability
    • Recommend solutions to improve new and existing data storage and delivery systems
    • Change the world for more than half a billion people every month ;) 

    Skills and Experience

    • 3+ years experience in a Data Architect role as part of a team
    • You have a track record of leading data architecture initiatives to completion
    • You have experience analysing, reasoning about, optimising and implementing complex data systems
    • You have expertise in data handling approaches and technologies, with a good understanding of system development lifecycles and modern data architectures (data lakes, data warehouses)
    • You are comfortable modeling complex systems using approaches such as Domain Driven Design, eventual consistency, stream processing
    • You have experience with a diverse set of data storage and persistence frameworks and have a strong understanding of core data modelling concepts:
      • Relational & distributed databases (e.g. MySQL, Cassandra, Neo4j, Riak, HBase, DynamoDB, Elasticsearch)
      • Consistency trade-offs and transactional algorithms in distributed systems
      • Principles of fault tolerance and robustness
    • We use the best available tools and languages for each task. Currently we work a lot with Node.js, but we also use other tools and languages like Go, Python, Java, C, C++, and PHP where it makes sense.
    • You have experience working with data streaming and pipelining systems (Hadoop, Kafka, Druid)
    • You have experience working with an engineering team and communicating effectively with other stakeholders.
    • You have a track record of combining a solid long-term architectural strategy with short-term progress.
    • With freedom comes responsibility. You direct your own work and are pro-active in asking for input.
    • You have a scientific mindset and empirically test your hypotheses.
    • BS, MS, or PhD in Computer Science or equivalent work experience

    Pluses

    • Experience working with microservice architectures
    • Experience with open source technology and free culture, and have contributed to open source projects
    • Experience working remotely
    • You know what it means to be a volunteer or to coordinate the work of volunteers
    • Big ups if you are a contributor to Wikipedia
    • Please provide us with information you feel would be useful to us in gaining a better understanding of your technical background and accomplishments

    Show us your stuff! If you have any existing open source software that you've developed (these could be your own software or patches to other packages), please share the URLs for the source. Links to GitHub, etc. are exceptionally useful. 

    The Wikimedia Foundation is... 

    ...the nonprofit organization that hosts and operates Wikipedia and the other Wikimedia free knowledge projects. Our vision is a world in which every single human can freely share in the sum of all knowledge. We believe that everyone has the potential to contribute something to our shared knowledge, and that everyone should be able to access that knowledge, free of interference. We host the Wikimedia projects, build software experiences for reading, contributing, and sharing Wikimedia content, support the volunteer communities and partners who make Wikimedia possible, and advocate for policies that enable Wikimedia and free knowledge to thrive. The Wikimedia Foundation is a charitable, not-for-profit organization that relies on donations. We receive financial support from millions of individuals around the world, with an average donation of about $15. We also receive donations through institutional grants and gifts. The Wikimedia Foundation is a United States 501(c)(3) tax-exempt organization with offices in San Francisco, California, USA.

    The Wikimedia Foundation is an equal opportunity employer, and we encourage people with a diverse range of backgrounds to apply.

    U.S. Benefits & Perks*

    • Fully paid medical, dental and vision coverage for employees and their eligible families (yes, fully paid premiums!)
    • The Wellness Program provides reimbursement for mind, body and soul activities such as fitness memberships, baby sitting, continuing education and much more
    • The 401(k) retirement plan offers matched contributions at 4% of annual salary
    • Flexible and generous time off - vacation, sick and volunteer days, plus 19 paid holidays - including the last week of the year.
    • Family friendly! 100% paid new parent leave for seven weeks plus an additional five weeks for pregnancy, flexible options to phase back in after leave, fully equipped lactation room.
    • For those emergency moments - long and short term disability, life insurance (2x salary) and an employee assistance program
    • Pre-tax savings plans for health care, child care, elder care, public transportation and parking expenses
    • Telecommuting and flexible work schedules available
    • Appropriate fuel for thinking and coding (aka, a pantry full of treats) and monthly massages to help staff relax
    • Great colleagues - diverse staff and contractors speaking dozens of languages from around the world, fantastic intellectual discourse, mission-driven and intensely passionate people

    *Eligible international workers' benefits are specific to their location and dependent on their employer of record

    More information

    Wikimedia Foundation
    Blog
    Wikimedia 2030
    Wikimedia Medium Term Plan
    Diversity and inclusion information for Wikimedia workers, by the numbers
    Wikimania 2019
    Annual Report - 2017 

    This is Wikimedia Foundation 
    Facts Matter
    Our Projects
    Fundraising Report

  • Parse.ly
    2 months ago

    Parse.ly is a real-time content measurement layer for the entire web.

    Our analytics platform helps digital storytellers at some of the web's best sites, such as Ars Technica, The New Yorker, The Wall Street Journal, TechCrunch, The Intercept, Mashable, and many more. In total, our analytics system handles over 65 billion monthly events from over 1 billion monthly unique visitors.

    Our entire stack is in Python and JavaScript, and our team has innovated in areas related to real-time analytics, building some of the best open source tools for working with modern stream processing technologies.

    On the open source front, we maintain streamparse, the most widely used Python binding for the Apache Storm streaming data system. We also maintain pykafka, the most performant and Pythonic binding for Apache Kafka.
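
    As a taste of the pykafka style, here is a minimal produce-and-consume sketch; the broker address, topic name, and payload are hypothetical.

        from pykafka import KafkaClient

        client = KafkaClient(hosts="127.0.0.1:9092")  # hypothetical broker
        topic = client.topics[b"pageviews"]           # hypothetical topic

        # Producers buffer messages; the context manager flushes on exit.
        with topic.get_producer() as producer:
            producer.produce(b'{"url": "/a", "ts": 1568973683}')

        # Read the message back with a simple consumer.
        consumer = topic.get_simple_consumer(consumer_timeout_ms=5000)
        for message in consumer:
            print(message.offset, message.value)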

    Our colleagues are talented: our UX/design team has also built one of the best-looking dashboards on the planet, using AngularJS and D3.js, and our infrastructure engineers have built a scalable, devops-friendly cloud environment.

    As a Python Data Engineer, you will help us expand our reach into the area of petabyte-scale data analysis -- while ensuring consistent uptime, provable reliability, and top-rated performance of our backend streaming data systems.

    We’re the kind of team that does “whatever it takes” to get a project done.

    Parse.ly’s data engineering team already makes use of modern technologies like Python, Storm, Spark, Kafka, and Elasticsearch to analyze large datasets. As a Python Data Engineer at Parse.ly, you will be expected to master these technologies, while also being able to write code against them in Python, and debug issues down to the native C code and native JVM code layers, as necessary.

    This team owns a real-time analytics infrastructure that processes over 2 million pageviews per minute from over 2,000 high-traffic sites. It operates a fleet of cloud servers comprising thousands of cores of live data processing. We have written publicly about mage, our time series analytics engine, which will give you an idea of the kinds of systems we work on.

    What you'll do

    For this role, you should already be a proficient Python programmer who wants to work with data at scale.

    In the role, you’ll...

    • Write Python code using best practices. See The Elements of Python Style, written by our CTO, for an example of our approach to code readability and design.

    • Analyze data at massive scale. You need to be comfortable with the idea of your code running across 3,000 Python cores, thanks to process-level parallelization (see the sketch after this list).

    • Brainstorm new product ideas and directions with team and customers. You need to be a good communicator, especially in written form.

    • Master cloud technologies and systems. You should love UNIX and be able to reason about distributed systems.
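
    As a toy illustration of that kind of process-level parallelization, here is a minimal multiprocessing sketch; the shard data and worker function are hypothetical stand-ins for real per-partition work.

        from multiprocessing import Pool

        def tally_shard(shard):
            """Aggregate one shard; a stand-in for real per-partition work."""
            return sum(event["views"] for event in shard)

        if __name__ == "__main__":
            # A fake dataset split into shards; in production each shard might
            # be an S3 key or a Kafka partition handled by its own process.
            shards = [[{"views": i} for i in range(1000)] for _ in range(8)]
            with Pool(processes=4) as pool:
                per_shard = pool.map(tally_shard, shards)
            print(sum(per_shard))  # 3996000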

    Benefits

    • Our distributed team is best-in-class and we happily skip commutes by working out of our ergonomic home offices. Here's a photograph of our CTO's setup running two full-screen Parse.ly dashboards.

    • Work from home or anywhere else in our industry-leading distributed team.

    • Earn a competitive salary and benefits (health/dental/401k).

    • Splurge with a generous equipment budget.

    • Work with one of the brightest teams in tech.

    • Speak at and attend conferences like PyData on Parse.ly's dime.

    Parse.ly Tech

    • Python for both backend and frontend -- 2.7, some systems in 3.x, and we're going full-on 3.x soon.

    • Amazon Web Services used for most systems.

    • Modern databases like Cassandra, Elasticsearch, Redis, and Postgres.

    • Frameworks like Django, Tornado and the PyData stack (e.g. Pandas).

    • Running Kafka, Storm, Spark in production atop massive data sets.

    • Easy system management with Fabric and Chef.

    Fully distributed team

    • Parse.ly is a fully distributed team, with engineers working from across the world. Candidates with past remote-work experience, and those in US/Eastern time zones, will be prioritized.

    Apply

    • Send a cover letter, CV/resume, and optionally links to projects or code, to [email protected]. Make sure to indicate you are applying for the "Python Data Engineer" role.
