Senior Site Reliability Engineer

Greenhouse


6 months ago

07/07/2019 10:21:23

Job type: Full-time

Hiring from: US only

Category: Software Dev


About the position

Our Infrastructure team is a small but critical part of our organization, responsible for designing, implementing and maintaining Greenhouse’s platform. We craft the environment that enables our engineers to focus on shipping new features, each of which brings us closer to the goal of delivering the best recruiting software possible.

We’re working on some interesting problems, and we’re searching for a Senior Site Reliability Engineer who’ll help keep our site performant and secure by building scalable and fault-tolerant cloud infrastructure. You’d design and implement features that support our in-house development platform, using technologies like Docker and Kubernetes.

In the coming year, you'd help us build out advanced monitoring and autoscaling systems, support for global scalability and disaster recovery, and continuous integration and deployment for our applications.

If you’re looking for a dynamic environment and excited by what we’re looking to do, this may be the job for you.

Learn more about our engineering culture here!

Who will love this job:

  • A problem solver, who not only thinks about the bigger picture, but can also connect the dots and resolve issues quickly and efficiently

  • A lover of detail with the mentality to analyze and manage large problem domains

  • A doer, who doesn't just tinker but has a strong bias for action

  • A great teammate, who is able to contribute and thrive within a fast-paced environment

What you’ll do:

  • Build new features and improvements to our Kubernetes-based PaaS

  • Enhance system observability by instrumenting applications, adding new data sources, and creating high-impact dashboards and alerts

  • Collaborate with product engineers to design new applications and make them performant, scalable, and reliable

  • Scale the platform to deliver a consistently great user experience worldwide

  • Learn and deploy cutting-edge tools and technologies: Docker, Kubernetes, Container Linux, Terraform, Prometheus, Grafana, and Go

You Should Have:

  • A deep understanding of Linux systems

  • Experience running production cloud infrastructure

  • Proficiency in a high-level programming language

  • A knack for troubleshooting and fixing hard bugs

  • Ability to design and build large distributed systems

  • Your unique talents! If you don’t meet 100% of the qualifications above, tell us in your cover letter why you’d be a great fit for this role.

Pay, Perks & Such:

At Greenhouse, we love to celebrate our diverse group of hardworking employees – and it shows. We’re proud to say that in 2018 we were ranked #2 by Crain’s New York Best Places to Work, #10 Best Company Culture by Comparably, and #37 Best Place to Work by Glassdoor, and were recognized on Inc. Magazine’s Best Workplaces list. We pride ourselves on a collaborative culture that runs through every step of a Greenhouse employee's journey. Starting with our interviews and continuing through our executive “Ask Me Anything” sessions, collaboration is at the heart of working at Greenhouse.

We offer a full slate of benefits including competitive salaries, stock options, medical, dental, vision, life and disability coverages, FSA, HSA, flexible vacation, commuter benefits, a 401(k) plan and a parental leave program. And... we offer some not-so-standard, extra-fun benefits, including learning & development stipends, adoption and fertility benefits, an employee discount platform, and of course, fully stocked fridges and cold brew on tap. :)

We value diversity and believe forming teams in which everyone can be their authentic self is key to our success. We encourage people from underrepresented backgrounds and different industries to apply. Come join us, and find out what the best work of your career could look like here at Greenhouse.

similar jobs

  • ROLE DESCRIPTION

    We are looking to expand our team of Kubernetes Operations engineers. The focus of this role is threefold:

    Work directly with clients and customers to assist in consulting, setting up and operating Kubernetes clusters. Generally, the Kubernetes clusters will be based on our own Lokomotive full-stack Kubernetes distribution.

    Work as a part of the engineering team working to improve Lokomotive, Flatcar Container Linux and our other Kubernetes-related projects, applying your operations experience and direct feedback from customers to help guide the projects.

    Be part of the on-call schedule to support our subscription customers.

    We are working to build a follow-the-sun team so that support times during the week are during work hours.

    In essence this role has elements of both a Site Reliability Engineer (SRE) and a Software Engineer.

    The ideal candidate has operations experience with Kubernetes and beyond. It is a person who has experience being on call and helping to resolve and mitigate issues in production environments. It’s also a person who can clearly communicate to customers about these issues and communicate with the engineering team about the experiences that matter to customers. This role is the interface between the customer and the product engineering teams. It is this role’s positioning as the feedback loop between customers and the product that makes it so crucial.

    To support you, you’ll have at your disposal the renowned Kinvolk engineering team that has completed dozens of challenging projects at every layer of the system. You’ll find that your supporting team can get you the answers you need, help you find short- and long-term solutions to issues, and help you grow as an engineer.

    Responsibilities

    • Work directly with customers to assist in consulting, setting up and operating Kubernetes clusters

    • Interface with clients and advise on best practices for managing Kubernetes clusters

    • Be on call during reasonable hours on a rotating basis (follow-the-sun rotation)

    • Provide first-line support to customers

    • Work to improve our open-source cloud products

    • Be a liaison between customers and the product engineering team

    • Participate in product engineering

    • Review and document changes

    • Stay current on the cloud infrastructure technology landscape

    • Work closely with the rest of the Kinvolk team, communicating across projects

    • Represent Kinvolk at community events

    Multiple openings are available.

    REQUIREMENTS

    This role requires experience in setting up and operating Kubernetes clusters at a senior level. One is expected to be able to interface directly with customers to provide authoritative responses and advice.

    To get the job done, you’re going to need these.

    Required

    • Experience operating Kubernetes in production

    • Deep understanding of how Kubernetes works

    • Ability to listen to customers and distill that input into actionable tasks and recommendations

    • Good knowledge of distributed systems

    • Good knowledge of Linux systems

    • Good networking know-how

    • Experience in scripting languages

    • Ability to interface directly with clients and customers

    • Ability to work independently

    • Good at communicating technical issues and requirements

    • Good written and spoken English

    Desired, not-required

    If these items apply to you, awesome! If not, expect to add these while at Kinvolk.

    • Passed the Certified Kubernetes Administrator exam. If not, we will support you in attaining this within 6 months of joining

    • Experience with the Go programming language

    • Low-level knowledge of container and process isolation technologies

    • Comfortable giving talks at conferences

    • If in Berlin, good written and spoken German is a plus

    WHY KINVOLK?

    • We’re always looking for ways to make Kinvolk a friendly and motivating work environment. Here are some of the things we already offer.

    • You would be working on the cutting edge of technology, with a world-class team from whom you will be able to learn - just as we hope to learn from you!

    • We offer a competitive salary (reviewed annually), with equity participation (virtual share options) for all employees

    • Flexible working hours policy, and generous holiday allowance

    • An open, non-hierarchical, multi-cultural environment, with nearly as many nationalities represented as we have people

    • And many others like:

    • Work exclusively on Linux technologies

    • Work closely with open-source communities

    • Lunch paid once/twice weekly (Berlin)

    • Assistance with public transport ticket and home Internet bill (Berlin)

    • Company mobile phone plan (Germany)

    • German language classes 2 times weekly, if needed (Berlin)

    • Generous hardware allowance for laptop, monitor, phone and/or tablet of your choice

    • Represent Kinvolk at conferences

    • Free drinks and snacks if you're working out of the office

    • Need a book? We’ll order it for you and add it to our tech bookshelf

    HOW TO APPLY

    Apply using the button below. If you have other questions, please send those to [email protected]

    ABOUT US

    Kinvolk is a rapidly growing tech company building Linux & Kubernetes-based open-source software products, and offering related engineering and technical support services. Our customers are amongst the largest and most influential in the space: Microsoft, SAP, CoreOS, and many more.

    While founded in Berlin, Kinvolk is quickly expanding and has recently opened an India subsidiary.

  • 1 month ago

    About HashiCorp

    At HashiCorp, we operate according to a strong set of company principles, many of which are described in The Tao of HashiCorp. We value top-notch collaboration and communication skills, both among internal teams and in how we interact with our users. We take care to balance and be responsive to the needs of our open source community as well as our enterprise level customers.

    Engineering at HashiCorp is largely a remote team, and this role is no exception. We are looking for a Full-time Remote Employee within the US or Canada. While prior experience working remotely isn't required, we are looking for team members who perform well given a high level of independence and autonomy.

    Our Products

    We build Consul, Nomad, Vault, Terraform, Packer, and Vagrant. Alongside that, we deploy enterprise products for each in a variety of different ways: licensed and unlicensed binaries, appliances to public cloud platforms, and hosted SaaS platforms. Our products help organizations of all sizes run any infrastructure for any application.

    The Cloud Services team is an organization focused on delivering HashiCorp’s software as a Cloud service. This effort will enable a distribution model wherein customers can use a fully managed service with an API contract.

    In your cover letter, please describe why you're interested in working at HashiCorp, and what draws you to this role in particular!  Specifics of your past experiences that are relevant to this role are great to include, too.

    In this role, you can expect to:

    • Design, implement, and maintain a secure and scalable infrastructure platform for delivering Cloud Services’ applications
    • Own internal and external SLAs and ensure they meet and exceed expectations, and that system-centric KPIs are continuously monitored and improved
    • Create tools for automating deployment, monitoring and operations of the overall platform
    • Participate in on-call rotation to provide application support, incident management, and troubleshooting
    • Provide ongoing maintenance and support of internal tools, improve system health and reliability
    • Program mostly in Golang, learning from and contributing to a team committed to continually improving their skills

    You may be a good fit if:

    • You are familiar with infrastructure management and operations lifecycle concepts and ecosystem
    • You have experience operating and maintaining production systems in a Linux and public cloud environment
    • You have prior experience working in high performance or distributed systems; while we strive to hire at a variety of experience levels, this particular opening is not well-suited for recent graduates
    • You have a working knowledge of industry best practices with regard to information security
    • You have built or operated a large scale Cloud service
    • You are comfortable with Go or another low-level programming language

    HashiCorp embraces diversity and equal opportunity. We are committed to building a team that represents a variety of backgrounds, perspectives, and skills. We believe the more inclusive we are, the better our company will be.

  • Our Organization

    At HashiCorp, we operate according to a strong set of company principles, many of which are described in The Tao of HashiCorp. We value top-notch collaboration and communication skills, both among internal teams and in how we interact with our users. We take care to balance and be responsive to the needs of our open source community as well as our enterprise level customers.

    Engineering at HashiCorp is largely a remote team, and this role is no exception. We are looking for a Full-time Remote Employee within the US, UK, Canada, or the Netherlands. While prior experience working remotely isn't required, we are looking for team members who perform well given a high level of independence and autonomy.

    Our Products

    We build Consul, Nomad, Vault, Terraform, Packer, and Vagrant. Alongside that, we deploy enterprise products for each in a variety of different ways: licensed and unlicensed binaries, appliances to public cloud platforms, and hosted SaaS platforms. Our products help organizations of all sizes run any infrastructure for any application.

    Our Team

    HashiCorp is evolving its Terraform Cloud platform and needs help solving problems in the infrastructure management and site-reliability space. We're looking for an experienced software or operations engineer who is motivated to help deliver a better Terraform Cloud experience.

    Join us as a Site Reliability Engineer to help us maintain and evolve the infrastructure that supports Terraform Enterprise.

    Responsibilities

    • Participate in a 24/7 on-call rotation that supports our production infrastructure.
    • Work to constantly improve our resiliency by developing self-healing, self-assembling infrastructure.
    • Collaborate across teams to improve our open source tools based on experiences found from running our own software in production.
    • Dive into problems with an eye to both immediate remediation as well as the follow-through changes and automation that will prevent future occurrences.
    • Maintain day-to-day vigilance with regard to security while helping to enhance the intrinsic security of the overall production system.

    Requirements

    • Familiarity with infrastructure management and operations lifecycle concepts and ecosystem
    • Experience operating and maintaining production systems in a Linux and public cloud environment
    • Experience building and scaling distributed, highly available systems
    • Working knowledge of industry best practices with regard to information security
    • Comfortable with Go or another low-level programming language

    About the Application Process

    All work requires excellent written communication skills, remote work doubly so. For this reason, we require a cover letter for your application to be considered complete.

    In your cover letter, please describe what draws you to working at HashiCorp and to this role in particular. Specifics of your past experience are great to include, too.

    At HashiCorp, we are committed to hiring and cultivating a diverse team. If you are on the fence about whether you meet our requirements, please apply anyway!

    We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
