
Site Reliability Engineer managing high availability SaaS applications at DrFirst, Inc. (Rockville, MD) (allows remote)

The SRE team is the in-house expert on building reliable and maintainable systems. The team plans infrastructure capacity to meet high availability and uptime goals for all DrFirst products.
The DevOps/Site Reliability team eliminates inefficiencies and incompatibilities that jeopardize service availability, so that DrFirst’s clients receive a reliable and scalable software service. Key aspects of this role include automation, configuration management, and tools development, along with collaborating with the engineering team on projects and products as an expert on reliability, performance, and efficiency.

As a part of the Systems team, you will:
• Periodically assess all monitoring requirements and implement necessary enhancements to meet changing/growing business needs
• Enhance current automation processes for managing capacity, safely deploying software, and mitigating failures
• Tune and troubleshoot full-stack software applications using OOP, Java, web services, Oracle DB, MongoDB, networking concepts, and virtualization techniques
• Proactively review, recommend, and implement changes to the live infrastructure once the appropriate validation has been carried out
• Assist in the rollout and deployment of new product features and installations to facilitate rapid iteration
• Confidently make informed, data-driven decisions in a fast-paced environment with competing priorities
• Create and maintain Chef recipes for instance configuration management
• Participate in a 24/7 on-call rotation and after-hours deployments