Site Reliability Engineer
2 weeks ago
Job type: Full-time
Hiring from: Anywhere
We are looking for a Site Reliability Engineer (SRE) to join the Data Platforms team at Autumn Compass. In this role you will work with the distributed cloud compute infrastructure that powers our trading research. This platform is a foundational component of our business, so it is important that we keep it performant, reliable and flexible.
Autumn Compass is a proprietary trading company that uses modern techniques from software engineering and machine learning to develop intelligent algorithms and strategies for trading. With no external clients and a culture of DevOps, you will work with the team through the entire lifecycle of our platform’s new features, from design to release and post-rollout maintenance.
Hours: You need to be available between at least 00:00 and 07:00 UTC to overlap with the Sydney time zone.
Salary: Package of up to AUD$170k, depending on location and experience.
Click 'Apply' to begin your journey with us, and take this opportunity to raise your career to new heights.
Site Reliability Engineer Responsibilities:
- Ensuring our compute infrastructure is reliable, fault-tolerant and can scale across thousands of compute systems.
- Building flexibility into our systems so we can try new ideas at minimal cost in time and technical load (i.e. having the ability to fail fast when experimenting).
- Evolving our systems to improve robustness and correctness through the use of new technology, automation and simplification.
- Leading our team’s approach to DevOps by developing our philosophy and bringing our approach in line with best practices.
- Challenging the status quo to bring new ideas and innovation to the team.
Site Reliability Engineer Profile:
- 3+ years of experience working in a DevOps/SRE team.
- Development experience in Python (or a similar language) and a willingness to work with Python infrastructure.
- Knowledge of DevOps best practices and an interest in staying on top of changes in the space.
- Expertise in designing, architecting, and troubleshooting large-scale distributed systems.
- Systematic problem-solving approach, coupled with effective communication skills and a sense of drive.
- Understanding of Unix/Linux operating systems.
- Experience with the AWS stack is desirable.
Site Reliability Engineer Benefits:
- No on-call required: the infrastructure is important but non-critical, with a large error budget.
- Flexible work hours.
- Flat management structure, reporting to the CTO.