Cloud Ops Engineer

Job Post Information: Posted Date 7/15/2025 4:29 PM
ID
2025-1862
# of Openings
2
Category
Engineering

Overview

This position will assist in performing implementation, operation, monitoring, recovery, and performance tuning for infrastructure and application services at symplr.

The CloudOps team augments the symplr Development and IT teams by applying a software engineering approach to application deployment on production systems and by resolving application failures within service level agreements (SLAs).

CloudOps goals include improving system performance, increasing operational observability, enhancing system stability, and reducing time for software delivery.

Duties & Responsibilities

  • Be a champion for department initiatives and values by ensuring all actions promote the department’s mission statement
  • Participate in product release cycles by working closely with Engineering Managers, Architects, and Developers.
  • Work towards automating the product deployment to various environments by integrating with continuous integration (CI) and continuous delivery (CD) tools, monitoring, and change management practices.
  • Create and maintain standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes, and remediating problems in the environment.
  • Implement monitoring, alerting, notification, and metrics collection for:
    • Infrastructure and application performance
    • System uptime
    • Error rate
  • Monitor and continually improve the capacity and reliability of our production environment infrastructure.
  • Investigate and fix performance and scalability bottlenecks; proactively identify issues and create work items to improve stability and performance.
  • Respond to alerts from production systems, and identify and resolve root causes in a timely fashion.
  • Identify single points of failure and other high-risk architecture issues, and propose resilient solutions that mitigate the risk and improve system reliability.
  • Identify opportunities for automation to reduce operational workload; build scripts and introduce new tools and practices as needed.
  • Work with other Cloud Infrastructure Engineers and developers to ensure maximum performance, reliability, and automation of our deployments and infrastructure.
  • Work with, consult, and influence developers on new features and software architecture to ensure scalability.
  • Communicate with stakeholders and handle deployment, maintenance, and support efficiently.
  • Ticket Handling and Support
    • Tickets should be handled with clear communication and with the correct stakeholders involved.
    • Tickets should be completed within the SLA; any delay or improper ticket should be clearly communicated and documented.
    • Tickets should be closed with proper comments, including resolution steps and screenshots.
    • Repetitive tickets should be discussed in the standup call for brainstorming and should eventually be resolved through automation if necessary.

Skills Required

  • 4+ years of experience with a public cloud provider such as Amazon Web Services (AWS) or Microsoft Azure, and with on-prem servers
  • Solid understanding of standard TCP/IP networking, Windows IIS, load balancing, and common protocols such as DNS and HTTPS
  • Good knowledge of CI/CD tools such as Octopus Deploy, Azure DevOps (ADO), GitHub Actions, and Jenkins
  • Monitoring and Logging: Experience with application monitoring and logging tools (e.g. Datadog, New Relic, AppDynamics, Application Insights, ELK, Prometheus).
  • Good understanding of web servers and databases
  • [Optional] Good understanding of Docker and Kubernetes.
  • Good scripting knowledge and understanding of software development life cycle models.
  • Good understanding of DevOps practices.
  • Experience working on high-traffic, highly scalable systems
  • Knowledge of the fundamental aspects of release automation (packaging, dependencies, promotion, deployment, compliance)
  • A passion for collecting, evaluating, and improving performance metrics.
  • Excellent time management, resource organization, and prioritization skills, and the ability to multi-task in a fast-paced environment
  • Ability to work quickly and efficiently with minimal supervision
  • Excellent written and verbal communication skills
  • Able to handle 12-hour on-call shifts on a weekly rotation for symplr products.
  • Able to work the US daytime shift and coordinate with team members in the US and India to complete day-to-day tasks.

Qualifications:

  • Have HEART. To work here, you must be:
    • Humble – self-aware and respectful
    • Effective – measurably move the needle & immeasurably add value
    • Adaptable – innately curious and constantly changing
    • Remarkable – stand out in some way
    • Transparent – openly and honestly sharing knowledge
  • 3+ years of Systems Engineering experience in the following areas:
    • Cloud platforms (Azure, AWS) and On-Prem Servers
    • Windows and Linux Servers
    • Application Monitoring Tools (Datadog, New Relic, AppDynamics, Application Insights)
    • Log Aggregation Tools (Datadog, ELK, etc.)
    • PowerShell, Bash, or Python scripting
    • CI/CD tools (Azure Pipelines, GitHub Actions, Jenkins, Octopus, etc.)
    • Infrastructure management tools (Terraform, Ansible, etc.)
    • Application Hosting (IIS, Apache, Tomcat)
    • Alerting (PagerDuty)
    • Ticketing (ADO Boards and Ivanti)
    • Documentation (Confluence)
  • Bachelor’s degree or equivalent experience
