Rate: $55/hr on C2C
Job Title: Site Reliability Engineer (Python)
Job Description:
- Infra/Service: Build and maintain components based on large-scale infrastructure (e.g., database systems, distributed queues, deployment platforms)
- Comfortable making changes in mid/large-scale distributed systems (on the order of 10k servers) and handling the end-to-end life cycle
- Proactive maintenance through extensive use of monitoring, logging, and metrics dashboards; work with core infra to diagnose and resolve issues
- Proficiency in Python (70%)
- Familiarity with data systems, ML pipelines, and distributed databases
- Binary packaging and distribution
- Build and CI/CD
- Miscellaneous operational tasks related to build and third-party modules
- Monitor and maintain health of various infra services, data pipelines and build streams
- Provide on-call support during business hours
- Assess level of urgency and escalate to core team, if necessary
- Work on infra tasks/bugs
- Work on items related to infra health and efficiency; these are most likely follow-ups from monitoring/alerts