Sr. Software Engineer - IT Operations Management (ITOM) Job
Pune, IN
YASH Technologies is a leading technology integrator specializing in helping clients reimagine operating models, enhance competitiveness, optimize costs, foster exceptional stakeholder experiences, and drive business transformation.
At YASH, we’re a cluster of the brightest stars working with cutting-edge technologies. Our purpose is anchored in a single truth: bringing real positive change to an increasingly virtual world. That purpose drives us beyond generational gaps and the disruptions of the future.
We are looking to hire IT Operations Management (ITOM) professionals in the following areas:
Years of Experience - 4 to 6 years
Job Profile – AIOps / Observability Engineer
Role Title
AIOps & Observability Engineer (Event Management & Intelligent Monitoring)
Role Summary
We are looking for a highly skilled AIOps & Observability Engineer to drive the next evolution of enterprise monitoring by integrating infrastructure observability, ServiceNow Event Management, and AI-driven analytics.
This role focuses on reducing alert noise, predicting outages, correlating events across tools, and enabling intelligent incident management using modern monitoring platforms (SCOM, AppDynamics, Datadog, New Relic) and AI/ML capabilities.
The ideal candidate will bring a holistic understanding of infrastructure, monitoring ecosystems, integrations, and AI-driven operations (AIOps).
Key Responsibilities
1. Observability & Monitoring Integration
- Integrate multiple monitoring tools:
  - SCOM / AppDynamics / Datadog / New Relic / Zabbix / AWS CloudWatch, etc.
- Standardize and normalize alert data across tools
- Build unified event pipelines into ServiceNow / data platforms
2. Event Management & Noise Reduction
- Implement event correlation, deduplication, and suppression strategies
- Reduce alert noise using:
  - Rule-based logic
  - Pattern-based clustering
- Improve signal-to-noise ratio across monitoring ecosystem
3. AIOps & Predictive Analytics
- Apply ML/AI techniques for:
  - Anomaly detection
  - Event correlation
  - Root cause analysis
  - Outage prediction
- Explore LLMs / AI models to enhance operational insights
- Build intelligent alerting mechanisms based on historical patterns
4. ServiceNow & ITOM Integration
- Implement and enhance ServiceNow Event Management & ITOM modules
- Enable:
  - Auto incident creation
  - Alert enrichment using CMDB
  - Automated remediation workflows
- Integrate external monitoring tools with ServiceNow
5. Data & Search Platforms
- Use tools like OpenSearch / Elastic for:
  - Log aggregation
  - Event analytics
  - Pattern detection
- Build dashboards and insights for operations teams
6. Infrastructure & Cloud Awareness
- Understand infrastructure dependencies:
  - Servers, networks, applications
- Work with:
  - AWS / Azure environments (good to have)
  - Kubernetes / containerized workloads
7. Automation & Innovation
- Automate repetitive operational tasks
- Continuously improve monitoring efficiency
- Drive adoption of AI-driven operations
Required Skills
Core Skills
- Strong understanding of monitoring & observability concepts
- Hands-on experience with at least 2 tools:
  - Preferred: SCOM / AppDynamics / Datadog / New Relic / Zabbix / AWS CloudWatch, etc.
- Experience with event management & alert handling
- Strong integration experience (APIs, REST, data pipelines)
ServiceNow & ITOM
- Experience with:
  - ServiceNow Event Management
  - ITOM (preferred but not mandatory)
- Understanding of CMDB and CI relationships
Data & AIOps
- Basic to intermediate knowledge of:
  - AI/ML concepts (anomaly detection, clustering, prediction)
- Exposure to:
  - OpenSearch / Elasticsearch
Infrastructure
- Understanding of:
  - Linux / Windows servers
  - Networking basics
  - Application architectures
Good to Have
- AWS / Azure exposure
- Kubernetes knowledge
- Experience with AIOps platforms
- Exposure to LLM / GenAI use cases in operations
At YASH, you are empowered to create a career that will take you where you want to go while working in an inclusive team environment. We leverage career-oriented skilling models and optimize our collective intelligence, aided by technology, for continuous learning, unlearning, and relearning at a rapid pace and scale.
Our Hyperlearning workplace is grounded upon four principles:
- Flexible work arrangements, free spirit, and emotional positivity
- Agile self-determination, trust, transparency, and open collaboration
- All support needed for the realization of business goals
- Stable employment with a great atmosphere and ethical corporate culture