Site Reliability Engineer

Auth0


2 weeks ago

05/11/2019 11:03:27

Job type: Full-time

Category: Software Dev

SRE

Auth0, a global leader in Identity-as-a-Service (IDaaS), provides thousands of enterprise customers with a Universal Identity Platform for their web, mobile, IoT, and internal applications. Its extensible platform seamlessly authenticates and secures more than 2.5B logins per month, making it loved by developers and trusted by global enterprises. Auth0 has raised more than $110 million to date and continues its global growth at a rapid pace. We are consistently recognized as a great place to work based on our outstanding leadership and dedication to company culture, and are looking for the best people to join our incredible team spread across more than 35 countries!


Auth0 gives companies simple, powerful, and developer-friendly building blocks so they can free up resources to focus on innovation. We strive to be the identity platform of choice for developers and enterprises. We take our culture very seriously and are looking for people who are drawn to both our mission and our culture.


The Auth0 platform processes thousands of requests per second (2.5 billion logins per month) for customers all around the world - and we're growing very fast! The Site Reliability team aims to improve reliability and uptime in a data-driven way to support our customers' needs.


We are looking for senior software engineers with a good understanding of how systems fail, a solid background in software engineering, and a desire to learn about reliability and large-scale systems.

You are a good fit if you...

Have initiative and can "unblock" yourself to get things done.

Tend to deliver work incrementally to get feedback and iterate over solutions.

Can mentor junior people and pair with other teams: education is a very important part of this role.

Like to get your hands dirty by debugging and fixing issues in production.

Understand the real problems by reading between the lines and asking good questions.

Are easy to work with: you communicate well, take feedback in a positive way and are OK not always doing the most glamorous tasks.

Responsibilities:

Analyze and optimize our core product by developing and implementing reliability and performance practices.

Scale systems sustainably through automation, and evolve systems by pushing for changes that improve reliability and velocity.

Perform Root Cause Analysis of production issues to identify reliability improvements for our services.

Evangelize and advocate for reliability practices across our organization.

Collaborate with other Engineering teams to support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.

Be on-call for services that the SRE team owns.

Practice sustainable incident response and blameless postmortems.

Requirements:

You have contributed to the design of applications and systems that scale, are resilient to failure, and are observable.

You are interested in designing, analyzing and troubleshooting large-scale distributed systems.

You have a systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.

You have a great ability to debug and optimize code and automate routine tasks.

You have a solid background in software development and architecting resilient and reliable applications.

Timezone: we are giving preference to candidates located in GMT-8 to GMT+2.

Extra Points:

Experience with Amazon Web Services.

Experience with Node.js or any other application development language.

Experience with MongoDB.

Experience working in a remote-friendly, async environment.

Preferred Locations:

(GMT-8); (GMT-7); (GMT-6); (GMT-5); (GMT-4); (GMT-3); (GMT-2); (GMT-1); (GMT); (GMT+1); (GMT+2)

Auth0 is an Equal Employment Opportunity employer. Auth0 conducts all employment-related activities without regard to race, religion, color, national origin, age, sex, marital status, sexual orientation, disability, citizenship status, genetics, or status as a Vietnam-era special disabled and other covered veteran status, or any other characteristic protected by law. Auth0 participates in E-Verify and will confirm work authorization for candidates residing in the United States.

Please mention that you come from Remotive when applying for this job.


similar jobs

  • 3 weeks ago
    Heetch is a mobility app with a simple mission: we want people to enjoy going out.
    Every night and every day, our drivers are doing their best to make their rides unforgettable and friendly!
    We are focused on young people's expectations and are competing within a fast-paced market.
     
    The service was launched in Paris in September 2013 and has been growing since then, with thousands of rides every night in France, Belgium and Morocco.
    With more than 1 million users in Europe, we are proud to be one of the fastest-growing French startups!
     
    Driver Growth @Heetch
     
    We're a thoughtful, talented, full stack and distributed product team of backend, mobile, frontend and QA engineers, as well as product managers and product designers. We're responsible for the acquisition, engagement, and retention of all our drivers.
    Our multi-disciplined team allows us to work autonomously across the realms of our scope. This means we own our roadmap entirely, and we empower each team member to contribute and influence what we work on and how.
     
    Our mission is quite simple: deliver driver happiness and ensure drivers get the optimum experience that they deserve. Drivers use and rely on the products we build every single day to earn a living. This is a responsibility that we hold dear and do not take for granted.
     
    SRE within Driver Growth
     
    Our infrastructure receives 2.5 million events per day and processes 100M API requests. We also serve over a dozen thousand rides, have a Driver signup funnel with 50 separate data fields, and process hundreds of gigabytes of log and interaction data daily. Our team owns upwards of 20 microservices on top of Elixir, Kafka and Docker, and we are focusing our efforts on adding to this number as we extract from our legacy codebase.
     
    To put it simply: the services we support and the code we produce are critical to the business. Whether it is a potential driver going through our acquisition funnel, an active driver entering our marketplace, or a driver viewing their earnings and account details, to name but a few, the impact our backend engineers have on the business as a whole is enormous.
     
    Team Values
    • Transparency: We discuss everything openly within the team. Our speak-up culture is strong.
    • Remote first: Our team is fully distributed, and we work hard at that, but feel free to work from any of our offices in Paris, London, Brussels or Casablanca.
    • The courage to fail: We celebrate the wins, but more importantly we're not afraid to fail; we always learn and go again.
    • Team unity: No one is left behind.
    • Code quality: It's not software without tests.
     
    Your role
    In this role, you'll be in charge of building the tools and systems that every backend engineer in the Driver Growth team uses to develop, scale, understand, and monitor their operations.
    You will dive deep into gnarly operational issues from the software, systems, automation, and process perspectives, and you will work with our production services throughout their entire life cycle, from design and architecture through implementation, deployment, and sustaining operations.
     
    What will you do?
    • Build tools and infrastructure that let the team iterate faster without having to overthink the core infrastructure.
    • Partner with fellow backend engineers to architect and build mission-critical systems that can stand the test of scale and availability, while limiting operational overhead.
    • Perform deep dives into both systemic and latent reliability issues; partner with software and SRE engineers across the organization to produce and roll out fixes.
    • Design, build & support systems to detect, alert and remediate or escalate on the team's platform.
    • Contribute to standardization efforts across multiple disciplines and services in conjunction with the Core SRE team.
    • Handle efficiencies in systems and processes: design, configuration management, performance tuning, monitoring, and root cause analysis.
    • Participate in an on-call rotation and contribute to needed escalation missions.
     
    What do you need?
    • Software Engineer background (5+ years)
    • Practical knowledge of various aspects of service design like messaging protocols & behavior, caching strategies and software design practices
    • Solid understanding of systems and application design, including the operational trade-offs of various designs
    • Excellent programming skills in Go, and an ability to pick up new programming languages
    • Excellent written and social communication, and documentation skills in English
    • Be adaptable and able to focus on the most straightforward, most efficient & reliable solutions
    • Experience in the Linux environment and a deep understanding of its fundamentals and internals: filesystems and modern memory management, threads and processes, the user/kernel-space divide, networking
    • Exposure to the AWS ecosystem
    • Real world experience with Packer/Terraform
    • Customer service skills and empathy to develop solutions that span multiple teams
    • Work well with and be able to influence a myriad of personalities at all levels
    Bonus
    • Experience building highly-available fault-tolerant distributed systems with microservices, including containerized architectures, application security, monitoring, and storage systems
    • Experience with message brokers (such as RabbitMQ or Kafka)
     
    Perks
    • Stocks
    • Paid conference attendance/travel
    • Heetch credits
    • A Spotify subscription
    • Code retreats and company retreats
    • Travel budget (visit your remote co-workers and our offices)
    Hiring process:
    • Non-technical interview with the Engineering Manager of your potential team (1h30)
    • Take-home assignment (~5 days deadline)
    • Interview with your future teammates (1h)
    • Day on site (Paris) to meet your future stakeholders
     
     
    Check out our Engineering Blog and follow us on Twitter :)
    You can also have a look at our open-source projects and contributions here
  • NoRedInk (PST to CET)
    1 week ago

    NoRedInk is using technology to help millions of students become better writers. We’re seeking mission-driven engineers who like to ship code, tackle hard engineering problems, and fundamentally impact how kids learn.


    We’re hiring a site reliability engineer to handle availability and scalability, as well as product development. When students hit our site, you will help make sure there's a site to hit.


    About You

    You have at least 4 years of professional experience as a software developer or equivalent knowledge

    You have professional experience administering Linux servers with configuration management tools

    You have experience scaling with large deployments on AWS or bare metal

    You have experience supporting a production stack for a web application. We use Rails, Redis and MySQL.

    You can be your own DBA, including setup, optimization and troubleshooting.

    You are comfortable either working remotely, or commuting to our office in San Francisco

    Experience with Docker, microservices and/or security is a plus.

    What are we up to?

    To see what our engineering team has been doing lately, check out our blog!

    NoRedInk helps millions of students in grades 5-12 become better writers. Our adaptive curriculum guides learners through a continuous process of skill-building, feedback, and revision and delivers actionable performance data to teachers and administrators. Used in over 50% of school districts, we're on a mission to unlock every writer's potential. Here’s a 2-minute pitch we gave on NBC and articles about us in The Washington Post, Wall Street Journal, and Forbes.

  • 1 month ago

    At Elastic, we have a simple goal: to pursue the world's data problems with products that delight and inspire. We help people around the world do exceptional things with their data. From stock quotes to Twitter streams, Apache logs to WordPress blogs, our products are extending what's possible with data, delivering on the promise that good things come from connecting the dots. Often, what you can do with our products is only limited by what you can dream up. We believe that diversity drives our vibe. We unite employees across 30+ countries into one unified team, while the broader community spans across over 100 countries. Thanks to our ongoing expansion we have the opportunity to grow our Cloud Application Security team.

    We're a part of the Elastic Cloud team with a focus on finding security flaws in complex distributed systems and coming up with creative and approachable solutions that enable developers to ship secure code.

    We’re looking for people who are just as passionate about uncovering an obscure security vulnerability as they are about working with developers to ship more secure code. Would you like to focus on building and maintaining an Application Security program that will be used throughout the industry?


    What you will be doing:

    Take shared ownership in driving the creation and implementation of a best-in-class application security program for Elastic Cloud.

    Take ownership for the offensive security program, including penetration testing, red team activities, and security research.

    Be responsible for manual code analysis, proof of concept exploit code development, and deploying automated solutions to do the same.

    Be a proponent and champion of a DevSecOps culture and environment for a large team of highly talented developers and engineers.

    What you bring along:

    A history of uncovering, exploiting, and remediating application or system security flaws.

    A deep understanding of coding and scripting languages such as Java, Python, and Scala, among others, and the ability to adapt to other languages quickly and efficiently.

    Knowledge of and experience with manipulating protocols and libraries in order to compromise the security of a set of systems or code

    Previous work as a developer for a large code base and collaboration with engineers and developers

    Bonus Points:

    You have hands-on experience in both using and securing Linux-based systems and containers.

    You've worked on open source projects before and are familiar with different styles of source control workflow and continuous integration and management (GitHub, Terraform, Ansible, RunDeck, etc).

    Additional Information:

    Competitive pay

    Equity

    Catered lunches, snacks, and beverages in most offices

    An environment in which you can balance great work with a great life

    Passionate people building excellent products

    Employees with a wide variety of interests

