Candidates must be located in the Eastern Time Zone and be a US citizen or US green card holder to apply. No contracts and no third parties. This is a full-time position with a comprehensive benefits package at an amazing, well-known technology company.
Our client, a leading national technology company, is seeking a Senior System Reliability Engineer who is passionate about delivering reliable, scalable, efficient, and highly available platforms.
The ideal candidate is an insatiably curious tinkerer with technical chops, who has done this type of role before. Working under the general supervision of the Director Enterprise Technology Services, you will rely on experience and judgment to plan and accomplish goals.
In this role, you will constantly optimize and automate our processes and systems to improve reliability and scalability and to reduce toil. You will also participate in systems design and deployment and take on real-time responsibilities such as monitoring, incident management, and recovery.
Where applicable, this position assures departmental processes are performed in compliance with applicable Sarbanes-Oxley controls.
Essential Job Functions
- Experience running high availability cloud deployments with a major provider, namely Microsoft Azure
- Experience automating systems administration tasks using tools like Ansible and Terraform and languages such as Python, Bash, or Go
- Experience with cloud monitoring and observability
- Comfortable with Git and Infrastructure as Code workflows
- General familiarity with Data Center workflows
- Curious, collaborative, and humble team player
- Solid verbal and written communication skills
- Experience with Red Hat Enterprise Linux
- Experience with Docker or Kubernetes to create and manage portable, extensible, containerized workloads and services a plus
- Experience in an agile environment a plus
Primary Job Duties
- Collaborate with and across teams to design, develop, test, implement, and support technical solutions for container orchestration platforms
- Build standard processes and procedures to automate the deployment, troubleshooting, monitoring, and recovery of infrastructure in the cloud leveraging infrastructure as code practices
- Architect and execute migration of existing workloads from on-prem, traditional infrastructure, to the cloud
- Share your passion for staying on top of tech trends, experimenting with and learning new technologies, participating in internal and external technology communities, and mentoring other members of the team.
- Build tools to monitor systems and automate processes around the core network, storage and network infrastructure
- Serve as a core contributor to our Architecture Review Board, change management, and blameless postmortem processes
- Collaborate with teams and assist in troubleshooting issues across the whole stack – hardware, software, applications, and network
- Take part in capacity planning and performance engineering projects
- Collaborate with other business functions to bring best-of-breed products and solutions to fruition, with automation, reliability, scalability, and observability as core tenets
- Build for resilience. Our goal is that nobody gets called off-hours, ever! While we work on that, participate in a weekly on-call rotation.
- Improve our infrastructure capabilities, optimizing for cost, simplicity, and maintainability
Qualifications
- Bachelor’s Degree or the equivalent combination of education and work experience
- Minimum of 7-10 years of professional systems engineering experience
- At least 3 years of Linux system administration experience. In-depth experience with RHEL, CentOS, and Windows Server, with strong debugging, troubleshooting, and problem-solving skills.
- At least 4 years of experience in DevOps Engineering – Internship experience will be considered
- At least 4 years of experience with Microsoft Azure
- 4+ years of experience with scripting and coding (Bash, Python, SQL, Golang, or comparable languages)
- 2+ years of experience with Terraform, Docker, or Ansible, as well as Git and Jenkins
- 2+ years of experience with multi-tenant container orchestration platforms and services including Docker or Kubernetes
- 2+ years of experience working with Agile Development Practices
- Nice to have: experience with Kubernetes-based cloud-native technologies such as Argo, Kubeflow, Istio, Linkerd, and Dex