Senior Site Reliability Engineer

Pear Deck


1 month ago

Job type: Full-time

Remote (USA Only)

Hiring from: USA Only

Category: DevOps / Sysadmin


Senior Site Reliability Engineer (Remote)

Pear Deck is an Ed Tech startup headquartered in Iowa City, Iowa, with remote team members around the country. We're driven by a mission to help teachers deliver powerful learning moments to every student, every day.

We are building a team of individuals who value inclusion, work according to our core values of truth, brilliance, humility, and determination, and are excited to apply their talents to creating something meaningful together. If you like the idea of being part of a mission-driven company working on big problems in education, join us!

We embrace diversity and invite applications from people of all walks of life. We don't discriminate against employees or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to make accommodations.

As a Senior Site Reliability Engineer, you will contribute to Pear Deck's mission by helping us focus our expectations around availability, correctness, and performance while building tools and sharing expertise with the team to ensure our service continues to meet those expectations as it scales. The work will cover a wide area, from directly improving our core services to on-call and incident analysis, education around scaling and resilience, and feedback into the product itself.

 

Responsibilities:

  • Demonstrate truth, humility, brilliance, and determination in your work
  • Forecast demand and plan capacity for continued and/or improved site reliability
  • Implement and provision necessary infrastructure changes for continued and/or improved site reliability
  • Plan and implement changes to reduce toil
  • Read, understand, and review application code to support software development efforts from a reliability / infrastructure perspective
  • Monitor health of production infrastructure and investigate/analyse any issues and abnormalities to identify problems or bottlenecks
  • Communicate uptime and quality of service issues effectively
  • Participate in on-call rotations and respond to incidents during off-hours
  • Implement and deploy hotfixes as necessary
  • Plan, track and perform routine system maintenance and software updates to infrastructure
  • Track and document reliability related issues and incidents
  • Map business goals to architectural and infrastructure decisions

 

Requirements:

  • Software development experience and understanding of programming languages, data structures and algorithms
  • Experience operating Kubernetes clusters
  • Experience in large-scale cloud environments  
  • Excellent troubleshooting/debugging skills 
  • Willingness to learn about, work with and understand existing systems
  • Comfort with a blameless post-mortem culture
  • Ability to remain calm and collected under pressure
  • Significant experience with the following technologies:
    • Google Cloud Platform
    • Amazon Web Services
    • Kubernetes (CKA or similar certification preferred)
    • Docker
    • Node.js / JavaScript / TypeScript
  • Experience with the following or related technologies is a plus:
    • Terraform
    • Python
    • Prometheus
    • MongoDB
    • Redis
    • Firebase Realtime Database
    • BigQuery

 

Benefits:

  • 401(k) with company match
  • Health, Dental, Vision Insurance
  • Paid Holidays and Unlimited PTO

Before you apply, please check if any restrictions apply in terms of time zone or country.

This job has a geo-restriction in place: USA Only.
