Remote site reliability Jobs in April 2020

9 Remote site reliability Jobs in April 2020

  • Hottest Remote Jobs

    • Chainlink (Some overlap with EST)
      2 weeks ago

      Smart contracts are on track to revolutionize how all agreements work, through an entirely new system of technologically enforced contract guarantees. Chainlink enables next-generation smart contracts that can be written about any/all events in the real world; the details of our approach can be found in our whitepaper. We are well recognized for providing highly secure and reliable blockchain connectivity to the world's largest enterprises such as Google, Oracle, SWIFT, and many more. This is a unique opportunity to join one of the top companies developing cutting-edge blockchain technology while working closely together with a team of experienced senior developers.

       

      About this Role

      As a site reliability engineer, you’ll work directly with the company’s CTO, CEO and a technical team of other senior engineers. You’ll develop and build highly scalable, secure, and reliable software that will change the way smart contracts function at a fundamental level. You’ll have the opportunity to learn and master the latest research concerning cryptography, blockchains, game theory, consensus algorithms, and decentralized applications. We live by an open-source ethos and believe in giving back to the community. You'll join us in enabling the future architecture of Chainlink, including the following:


      • Work directly with AWS in an expert capacity using Terraform
      • Maintain reliable application and network infrastructure focusing on time to recovery, monitoring, reduced downtime during upgrades, and disaster recovery
      • Apply the 12-factor app methodology appropriately to blockchain infrastructure
      • Use data to understand the availability, reliability, and sustainability of our service
      • Build tools and systems for a great developer user experience

      Requirements

      • 5+ years of professional software development
      • B.S. or higher in computer science or a similarly technical field
      • Experience with test driven development and the use of testing frameworks
      • Knowledge of system design concepts
      • Experience with distributed systems and/or container orchestration
      • Strong communication skills, specifically giving/receiving constructive feedback in a collaborative setting
      • Excitement about building, operating and maintaining resilient, scalable services

      Preferred Qualifications

      • Demonstrated understanding of container networking and security 
      • Comfort working with network protocols, proxies and load balancers
      • Experience building highly available services at scale
      • Professional experience with Golang, TypeScript, Solidity, Rust
      • Experience with distributed systems
      • Ability to optimize and refactor for scaling and/or testability
      • Experience defining security strategies and securing high value systems
      • Excitement for blockchain, Web 3.0, and similar decentralized technologies
      • Comfort with pair programming
      • Comfort working remotely in a distributed team
      • Experience with Continuous Integration and Continuous Delivery
      • Passion for open source

      This role is location-agnostic, though we ask that you overlap some working hours with Eastern Standard Time (EST).

       

      *Chainlink is an Equal Opportunity Employer.*

  • Software Development (9)

    • Pitch, a new company from the makers of Wunderlist, is looking for a backend engineer with deep experience working with Amazon Web Services (AWS). If you’re excited about planning and rolling out reliable and simple Cloud infrastructure, you’re in the right place. Pitch is a product-minded company structured in cross-functional teams, where DevOps experts are focused on delivering a great product to the user.

      What you will be doing 
      • Lead initiatives for new features with a large infrastructure component
      • Plan and build out our cloud infrastructure
      • Help expand our Continuous Deployment and Cloud Deployment systems
      • Play a key role in handling incidents
      What we are looking for
      • Excellent communication skills and willingness to share and teach
      • Ability to wear multiple hats in a cross-functional team
      • Ability to focus on the product even when building infrastructure
      • 2+ years of working experience with Amazon Web Services (AWS) and key AWS services (EC2, Route53, S3)
      • Deep understanding of how the Web works: HTTP, CORS, how to build a REST API
      • Deep understanding of IAM and other cloud security mechanisms
      • Experience in a backend programming language such as Clojure (or equivalents like Golang, Python, etc)
      • Understanding of key application metrics and using them to make decisions
      • Familiarity with CloudFormation (or similar cloud provisioning systems like Terraform or Pulumi)
      Would be nice to have
      • Excellent writing skills. Please submit a sample of any kind of technical writing if there’s anything you can share
      • Experience with Site Reliability Engineering or Production Engineering

      If you tick all these boxes, we'd love to get to know you.

      We value diversity of perspective and seek to build an inclusive workplace that welcomes people from all different backgrounds (including dogs).

    • Wikimedia Foundation, Inc.
      2 weeks ago

      The Wikimedia Foundation is hiring two Site Reliability Engineers to support and maintain (1) the data and statistics infrastructure that powers a big part of decision making in the Foundation and in the Wiki community, and (2) the search infrastructure that underpins all search on Wikipedia and its sister projects. This includes everything from eliminating boring things from your daily workflow by automating them, to upgrading a multi-petabyte Hadoop or multi-terabyte Search cluster to the next upstream version without impacting uptime and users.

      We're looking for an experienced candidate who's excited about working with big data systems. Ideally you will already have some experience working with software like Hadoop, Kafka, Elasticsearch, Spark and other members of the distributed computing world. Since you'll be joining an existing team of SREs, you'll have plenty of space and opportunities to get familiar with our tech (Analytics, Search, WDQS), so there's no need to immediately have the answer to every question.

      We are a full-time distributed team with no one working out of the actual Wikimedia office, so we are all together in the same remote boat. Part of the team is in Europe and part in the United States. We see each other in person two or three times a year, either during one of our off-sites (most recently in Europe), the Wikimedia All Hands (once a year), or Wikimania, the annual international conference for the Wiki community.

      Here are some examples of projects we've been tackling lately that you might be involved with:

      •  Integrating an open-source GPU software platform like AMD ROCm in Hadoop and in the TensorFlow-related ecosystem
      •  Improving the security of our data by adding Kerberos authentication to the analytics Hadoop cluster and its satellite systems
      •  Scaling the Wikidata query service, a semantic query endpoint for graph databases
      •  Building the Foundation's new event data platform infrastructure
      •  Implementing alarms that alert the team of possible data loss or data corruption
      •  Building a new and improved Jupyter notebooks ecosystem for the Foundation and the community to use
      •  Building and deploying services in Kubernetes with Helm
      •  Upgrading the cluster to Hadoop 3
      •  Replacing Oozie with Airflow as a workflow scheduler

      And these are our more formal requirements:

      •    A couple of years of experience in an SRE/Operations/DevOps role as part of a team
      •    Experience in supporting complex web applications running highly available and high traffic infrastructure based on Linux
      •    Comfortable with configuration management and orchestration tools (Puppet, Ansible, Chef, SaltStack, etc.), and modern observability infrastructure (monitoring, metrics and logging)
      •    An appetite for the automation and streamlining of tasks
      •    Willingness to work with JVM-based systems  
      •    Comfortable with shell and scripting languages used in an SRE/Operations engineering context (e.g. Python, Go, Bash, Ruby, etc.)
      •    Good understanding of Linux/Unix fundamentals and debugging skills
      •    Strong English language skills and ability to work independently, as an effective part of a globally distributed team
      •    B.S. or M.S. in Computer Science, related field or equivalent in related work experience. Do not feel you need a degree to apply; we value hands-on experience most of all.

      The Wikimedia Foundation is... 

      ...the nonprofit organization that hosts and operates Wikipedia and the other Wikimedia free knowledge projects. Our vision is a world in which every single human can freely share in the sum of all knowledge. We believe that everyone has the potential to contribute something to our shared knowledge, and that everyone should be able to access that knowledge, free of interference. We host the Wikimedia projects, build software experiences for reading, contributing, and sharing Wikimedia content, support the volunteer communities and partners who make Wikimedia possible, and advocate for policies that enable Wikimedia and free knowledge to thrive. The Wikimedia Foundation is a charitable, not-for-profit organization that relies on donations. We receive financial support from millions of individuals around the world, with an average donation of about $15. We also receive donations through institutional grants and gifts. The Wikimedia Foundation is a United States 501(c)(3) tax-exempt organization with offices in San Francisco, California, USA.

      The Wikimedia Foundation is an equal opportunity employer, and we encourage people with a diverse range of backgrounds to apply.

      U.S. Benefits & Perks*

      • Fully paid medical, dental and vision coverage for employees and their eligible families (yes, fully paid premiums!)
      • The Wellness Program provides reimbursement for mind, body and soul activities such as fitness memberships, baby sitting, continuing education and much more
      • The 401(k) retirement plan offers matched contributions at 4% of annual salary
      • Flexible and generous time off - vacation, sick and volunteer days, plus 19 paid holidays - including the last week of the year.
      • Family friendly! 100% paid new parent leave for seven weeks plus an additional five weeks for pregnancy, flexible options to phase back in after leave, fully equipped lactation room.
      • For those emergency moments - long and short term disability, life insurance (2x salary) and an employee assistance program
      • Pre-tax savings plans for health care, child care, elder care, public transportation and parking expenses
      • Telecommuting and flexible work schedules available
      • Appropriate fuel for thinking and coding (aka, a pantry full of treats) and monthly massages to help staff relax
      • Great colleagues - diverse staff and contractors speaking dozens of languages from around the world, fantastic intellectual discourse, mission-driven and intensely passionate people

      *Eligible international workers' benefits are specific to their location and dependent on their employer of record

    • Aptible (North America)
      3 weeks ago
      About Aptible

      Our Vision

      We see a future where it’s easy to bring a great idea into the world using the internet, while respecting data security and privacy. The next generation of businesses will design security and privacy into their operating processes. If every business is going to be a software business, every business will need to be a security business.

      We’re working to make information security a core competency of every startup. We envision a world in which startups have access to great information security, are empowered to focus on their businesses instead of on compliance, can scale faster and more efficiently, and are confident that they're creating quality products.

      Our Team
      We wrote the Aptible Owner's Manual to help members of the company get a clear sense of what this team is — what we mean by “us.” We've now made this open to the world and invite you to read it, as a prospective member of the Aptible Team.

      Our Commitment to Diversity and Inclusion
      We prioritize diversity within our team and value different perspectives, educational backgrounds, and life experiences. We encourage people from underrepresented backgrounds to apply.

      About this Role

      We're looking for a Site Reliability Engineer to improve the infrastructure, reliability and security of our PaaS product, Aptible Deploy.

      Our next SRE will be an early member of the Aptible team. Reporting to our Customer Reliability Engineering Manager, you will be responsible for reducing the overall amount of Site Reliability work and determining an SRE roadmap.

      Your Impact
      • You will own and manage both internal and external tooling like PagerDuty
      • You will develop tools and processes to make monitoring, detection and issue resolution easier
      • You will prioritize and perform proactive maintenance and improvements of the entire system
      • You will help assess and remediate vulnerabilities and risks as a member of the security team
      • You will be a key member of our 24/7 oncall rotation
      Your Competencies
      • You have some familiarity with one or more of the technologies that we use including: Ruby, Docker, Postgres, MySQL or Redis
      • You have experience running production environments on AWS
      • You have 3-5 years of experience as a software engineer or SRE, or equivalent experience
      Our Interview Process

      We seek to make the experience of interviewing with us as delightful, efficient, fair, respectful, and transparent as possible.

      A typical process at Aptible might include the following steps and takes approximately 3 weeks to complete. We try to move as quickly as possible, but if you have any time constraints, please let us know and we'll do our best to accommodate.
      1) An Introduction to Aptible with the Hiring Manager (30 Minutes via Zoom)
      2) A Discussion-Based Interview with an Aptible Team Member (45-60 Minutes via Zoom)
      3) A Take-Home Work Sample Exercise (NB: You will be compensated for completing this.)
      4) A Discussion-Based Interview with an Aptible Team Member (45-60 Minutes via Zoom)

      We believe that the Work Sample Exercise is an important part of the process, in that it gives you the opportunity to demonstrate your skills in a concrete way. We take the time to design these exercises such that they: a) give you a view into the actual work you'd do at Aptible, and b) are standardized, so every candidate is evaluated using the same criteria.

      Lastly, Aptible conducts calls with 3-4 References, ideally managers who have directly supervised you in the past and/or colleagues who can speak to your work.

      If you have a disability or special need that requires accommodation, please let us know by completing this form, and we will reach out soon to see how we may be able to assist.

    • Thought Industries (US - East Coast)
      1 month ago

      As our US East Coast-based Site Reliability Engineer with solid coding skills, you will be working with our Development team to ensure the availability, reliability, scalability, and performance of our platform’s automated cloud infrastructure. You will be part of a larger, distributed team that is focused on improving the business of learning in the cloud environment.

      As part of our SRE team, you will:

      • Work with SRE team and other developers to code, build, maintain, and monitor core pieces of infrastructure.

      • Take part in migrating data and other platform-related tasks (via automation when possible).

      • Work with our wider product team to meet new platform needs.

      • Take part in on-call rotation, responding to alerts and handling platform outages (particularly during EST hours).

      As an SRE, you:

      • Understand the requirements and challenges of hosting applications in the cloud

      • Understand the flow of a web request through a cloud application stack

      • Are mindful of risk management and of testing new production changes thoroughly

      • Feel the need to automate your problems away

      As an Engineer, you:

      • Communicate and collaborate well in a distributed team

      • Take a pragmatic and thoughtful approach to solving problems

      • Are a self-starter who can take a challenging task and run with it

      • Care about the quality of your work

      • Have empathy for your users and team

      • Enjoy learning new skills and building solutions to difficult problems

      Our Ideal Candidate:

      • 2+ years of engineering experience

      • Experienced in building, managing, monitoring, testing and optimizing a production cloud application.

      • Confident in their overall coding & application development skills

      • Fluent with one scripting language (ideally Python or Bash)

      • Has working experience with Node.js

      • Experienced with container-based deployment (e.g. K8s)

      • Experienced with AWS and its various offerings

      • Experienced with at least one flavor of Linux and its setup and maintenance

      • Experienced with maintaining a production application across multiple regions

      The company

      Thought Industries is a startup in the Online Learning space. We enable training and software companies to launch and monetize external learning programs — think Shopify meets Udemy/Coursera.

      We are a growing, well-funded technology company, with a talented team and a clear vision. This is a unique opportunity to take a lead role at an exciting SaaS software company with a robust cloud-based platform. We hire talented people who are self-motivated and team orientated. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or veteran status.

      To apply: Please submit your cover letter explaining what kind of role you are looking for and why Thought Industries specifically interests you along with your resume.

    • Bold Penguin (Eastern Time +/- 2 hours)
      1 month ago

      We didn’t create Bold Penguin because commercial insurance is broken. It isn’t. But as the world has gotten more connected and digitized, commercial insurance lags behind—creating a fragmented landscape where businesses, agents, and insurance companies struggle to interact in a smooth and easy way. That’s why we’ve built a highly efficient exchange that cuts the friction out of commercial insurance by connecting everyone to the right quote in record time.

      Powering the world of insurance is no small feat, so we’ve brought on a team that's not only incredibly talented but also passionate about our potential to upgrade the entire industry. As more and more companies big and small depend on our technology to operate in the commercial insurance space, we’ll need the best talent all around to support our growth. That’s why we’re looking at you (yes, you!) to make a bold move and join our adventure.

      Your Role

      As a Cloud & Site Reliability Engineer, you will be a subject matter expert in building highly reliable, highly scalable features and infrastructure. You’ll use DevOps principles to ensure that Bold Penguin’s software systems are always available and ready to scale to meet growing demands. 

      What You’ll Do

      • Ensure the reliability, performance, and availability of our platform by working as part of a cross-functional product team
      • Participate in agile ceremonies such as iteration planning, retrospective, and daily standups
      • Be part of the shared on-call rotation and proactively research possible issues affecting the availability of our platform
      • Understand and clearly articulate tradeoffs in architecture decisions with regards to cost, security, operational efficiencies, performance, and availability
      • Build and maintain infrastructure with executable code (IaC) and automated delivery pipelines
      • Be passionate about Cloud/DevOps/SRE concepts such as Immutable Infrastructure, Cattle vs Pets, Infrastructure as Code, Delivery Pipelines

      Skills & Qualifications

      • Deep, hands-on expertise with AWS CloudFormation and other Infrastructure as Code tools
      • Experience with Amazon Web Services; specifically EC2, ECS, ELB, CodePipeline, RDS, Redshift, S3, IAM, and Lambda
      • Ability to articulate Cloud & DevOps concepts to a variety of technical & non-technical team members
      • Bonus points for expertise in implementing security & compliance frameworks such as SOC 2, NIST 800-53, and NIST 800-171, especially in Amazon Web Services
      • Bonus points for AWS Certifications 
      • Bonus points for familiarity with microservices architectures, Ruby on Rails and/or ETL tools such as Fivetran.
      • Experience working at technology companies and startups desirable
      • 2-4+ years of working remotely, full time, and/or with full-time co-located teams across different time zones.

      BONUS POINTS

      • Full-stack expertise in multiple tiers of modern web applications (e.g. front end, back end, infrastructure, etc.)
      • Open-source contributions and/or speaking experience.
      • Previous work experience in insurance and/or experience with policy rating very desirable.
      • You love Penguins! ;P

      TRAVEL TO THE "GLACIER" (please read)

      • We are firm proponents of "seeing eye to eye by meeting face to face". As such, our remote team travels in once a quarter for a full day of collaboration, goal setting, team building, etc.  Are you able to make this work?  In addition to this we also ask that, if hired, you are able to make the first week onsite for onboarding/training. 

      PENGUIN PERKS

      • For a healthy colony.
        • Our plan covers 50% of your Medical Premiums – Health - HRA, Dental, Vision, and Life Insurance, as well as Short & Long Term Disability (Trust us, the benefits are great!)
      • Penguins plan for the future.
        • 401k Match program, up to 4%! 
      • Parental Leave
        • 16 weeks of parental leave (your kids need you there!)
      • Need a vacation?
        • Unlimited PTO. Please take a vacation: you need it, we applaud it, and in fact we require you to take 10 days off!
      • Hungry? Thirsty?
        • We offer free snacks and drinks, as well as catered lunch every Monday (even to our remote employees...nomb nomb nomb)
      • Penguins need to learn!
        • We support your professional growth. Certifications, training, memberships, and conferences are actively encouraged—and often covered.
      • Penguins are social creatures and love to play!
        • We have frequent happy hours, company events, and outings. What kind of company would we be if we didn't have some fun!?!? 
      • Penguins give back.
        • We offer volunteer opportunities every month!  There is no better feeling than giving back =)
      • Don’t want to move to Columbus?
        • We offer up to 100% remote engineers!
        • You must be OK visiting the office for a day or two every quarter - we are all about that camaraderie! 

      Penguins believe in inclusion. That’s why we’re proud to be an equal opportunity employer that considers all qualified applicants regardless of race, color, religion, gender identity or expression, sexual orientation, national origin, genetics, disability, age, veteran status, beak size, or inability to fly.

    • InVision is the digital product design platform used to make the world’s best customer experiences. We provide design tools and educational resources for teams to navigate every stage of the product design process, from ideation to development. Today, more than 5 million people use InVision to create a repeatable and streamlined design workflow, rapidly design and prototype products before writing code, and collaborate across their entire organization. That includes 100% of the Fortune 100, and organizations like Airbnb, Amazon, HBO, Netflix, Slack, Starbucks and Uber, who are now able to design better products, faster.

      Our team is in search of a Lead Software Engineer - SRE to help us change the way digital products are designed.

      This role will help ensure uninterrupted service for InVision customers and act as a force multiplier for product teams to deliver better software faster. This role will have ownership of foundational reliability services and a big impact on our product.

      About the team:

      The reliability team is dedicated to ensuring resiliency at scale. You will lead the design, development and delivery of solutions that enhance the scalability, availability, and efficiency of microservices. This role will have a direct impact on platform and product teams by identifying problems, anti-patterns, and opportunities to add resilience to applications. Our tech stack includes but is not limited to Kubernetes, AWS, Kafka, Kinesis, Go and Java based microservices.

      What you’ll do:

      • Provide leadership and guidance in addition to participating in hiring efforts
      • Uncover and advocate reliability, performance and upstream solutions with internal stakeholders
      • Create tools for monitoring and self-healing infrastructure
      • Code in Golang!
      • Develop solutions for circuit breaking, chaos testing, load shedding, rate limiting, server side and event bus resiliency
      • Identify performance bottlenecks and troubleshoot performance issues
      • Collaborate on problem solving and design
      • Engage in service capacity planning and demand forecasting, software performance analysis and system tuning
      • Mentor other developers and site reliability engineers in new technologies being implemented

      What you’ll bring (we encourage you to apply even if you don’t meet every single one):

      • Demonstrated Leadership experience
      • Experience finding anti-patterns and engineering reliability at scale
      • 1+ years of experience with Golang
      • Good communication skills and experience leading projects
      • A degree in computer science, software engineering, or a related field, or equivalent experience
      • Systematic problem solving approach, coupled with a strong sense of ownership and drive
      • A passion for creating performant and reliable applications

      About InVision:

      InVision offers an incredibly unique work environment. The company employs a diverse team all over the world. Each InVision team member is given the freedom and tools to do their best work from wherever they choose.

      The benefits we offer in the United States and Canada include competitive health plans and retirement plans. Some InVision-wide benefits offered to all employees across the globe include a flexible vacation policy, monthly coffee shop stipends, annual allowances for books related to your profession, and home office setup & wellness reimbursements. InVision is an international employer so some benefit offerings will vary from country to country.

      InVision is proud to be an equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. If you have a disability or special need that requires accommodation, please let us know.

    • 1 month ago
      Do you want to be part of a team that helps over one million designers create amazing products every day? We're looking for a full-time Site Reliability Engineer to join us at Sketch.

      We are building a cloud platform that helps teams to collaborate on Sketch designs in every possible, efficient, and beautiful way.
      Your mission will be to shape this cloud infrastructure, defining and building every piece, from development environments to metrics processing and observability, including security policies, network design, deployment strategies, high availability, and more.

      Our stack is currently based on a mix of serverless and traditional server applications. You will propose new projects to make sure this platform has the best technology for our product goals and our team. You are proactive and have a "get the job done" attitude. You are also not afraid of digging deeper and deeper to debug a problem, especially in production.

      There are always many things to do at Sketch. You need to be an organized and communicative person. You are used to prioritizing infrastructure tasks and projects, and you like to back your decisions and proposals with arguments. As part of a team of very skilled people, being an excellent team player is essential.

      As a remote organization
      There are three keys for us. Remote work requires excellent communication skills as well as good written and spoken English. You need to be self-motivated and comfortable working in a remote position. It also requires high-quality documentation: you need to have an eye for detail in general, and especially for documentation.

      We believe in
      Automated, simple, and quality-tested infrastructure. It's essential that you have experience developing infrastructure as code and that you enjoy coding. You are very critical of your own work and you always try to find the cleanest way to do it. You understand the right balance between adopting new technology, current stability, maintainability, and simplicity. Like us, you also believe that speed and reliability are two of the most important features of a web platform. You like to design and build processes and platforms that run flawlessly and fast.

      The ideal candidate
      • Has experience with different stacks (mainly Linux-based), technologies, and production models, and has actively participated in building important pieces of a cloud platform.
      • We would like to know as much about you as possible. Contact us, tell us about your experience and your motivations for this job, and send us a link to anything that represents you or your experience.
      Even if you feel you are not exactly the person described, or you can't tick all of these boxes, we would still love to hear from you. We value anything that makes you different from the description.
    • 1 month ago
      To join our growing team, SugarCRM is currently seeking an experienced Site Reliability Engineer. This role can be based in one of our U.S.-based offices or remote.

      Impact you will make in the role:
      • Manage applications in a CentOS Linux-based environment
      • Build repeatable infrastructures with Ansible
      • Develop and execute plans for rolling out new technologies rapidly
      • Improve monitoring infrastructure, build out data aggregation and alerting rules
      • Work closely with engineering to build scalable solutions
      • Triage tickets raised by our support organization and implement fixes
      • Support our private and public cloud environments and customers
      • Mentor other members of the Operations team
      • Participate in an on-call rotation

      Expertise you will bring in:
      • BA/BS in Computer Science with Network Engineering or Information Systems emphasis, or equivalent work experience
      • Extensive knowledge of container orchestration technologies including Docker and Kubernetes
      • 6+ years experience in an Operations or Systems Administration role
      • Superior Unix administration skills
      • Extensive knowledge of common Internet Protocols
      • Extensive knowledge of TCP/IP
      • Experience with virtualization and cloud technologies
      • Hardware management, network switch and router administration experience
      • Experience with Apache, MySQL, and PHP in a production environment at scale
      • Strong knowledge of version control systems and hands-on experience with Git
      • Experience with writing code around infrastructure automation
      • Understanding of how to architect and implement highly available, scalable, and secure networks in multiple cloud environments
      • Strong affinity for, and experience working with, continuous deployment and continuous integration environments
      • An understanding of microservice architectures and the complexities of their deployments
      • Extensive programming experience in PHP, Ruby, Python, and Shell
      • Full stack troubleshooting and instrumentation experience
      • Extensive experience with IT automation technologies like Puppet, Salt, Chef, or Ansible
      • Experience with data aggregation, alerting, and reporting and supporting technologies such as Sensu and Graphite

      Nice to haves:
      • Experience in an on-call rotation
      • Experience with Elasticsearch or Apache Solr
      • Experience with Spinnaker and/or other CI/CD tools
      • Previous experience as a mentor or advisor
      • Current contributor to open source projects (a GitHub account you can link us to would be ideal)
      We are an Equal Opportunity, Affirmative Action employer. Minorities, women, veterans and individuals with disabilities are encouraged to apply.

      Benefits and Perks:

      Beyond a stellar work environment, friendly people, and inspiring, innovative work, we have some great benefits and perks:
      • Competitive salaries
      • Excellent medical, dental and vision coverage for you and your family, along with other benefit plans like 401(k) match
      • Unlimited Paid Time Off
      • Wellness Reimbursement Program
      • Onsite Programs, depending on location, such as Dry Cleaning, Car Washes, Massage, Yoga, and more
      • Career & Personal Development Program – multi-platform
      • Regular social events
      • Ownership is the greatest self-identity at SugarCRM - you are making an impact now
      • We are a merit-based company - many opportunities to learn, excel and grow your career