

Related Resumes & Cover Letters
1
site reliability engineer
- Escalate issues as needed to the product development or service engineering team per documented procedures, while establishing a contingency plan to prevent intermittent service disruption.
- Document and detail areas of improvement to bolster architecture, design, technical requirements and service specifications.
- Present architecture, design, and technical choices to internal audiences.
- Design and deploy metrics, monitoring, and logging systems on AWS / Infra systems to understand system performance and isolate bottlenecks.
- Help drive efforts to improve triage time and bring down MTTR (Mean Time to Repair), and provide follow-up support to mitigate similar issues in the future.
- Proactively monitor availability and performance of the SAP ARIBA cloud products using the required toolset
- Respond effectively to monitoring alerts, incident tickets, email requests, and requests from other channels coming in to the Site Reliability Engineering team.
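
For readers unfamiliar with the MTTR metric named in the sample above, it is the average time between detecting an incident and resolving it. A minimal sketch of the arithmetic, assuming a hypothetical incident log of (detected, resolved) timestamp pairs; the function name and sample data are illustrative and not taken from the resume:

```python
from datetime import datetime, timedelta


def mean_time_to_repair(incidents):
    """Average (resolved - detected) over a list of incident timestamp pairs."""
    durations = [resolved - detected for detected, resolved in incidents]
    if not durations:
        raise ValueError("need at least one resolved incident")
    # Summing timedeltas needs an explicit timedelta() start value.
    return sum(durations, timedelta()) / len(durations)


# Hypothetical incident log: (detected_at, resolved_at) pairs.
incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),  # 30 minutes
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 15, 30)),  # 90 minutes
]
mttr = mean_time_to_repair(incidents)
print(mttr)
```

Tracking this value per week or per service is one way to show whether triage improvements are actually shortening repair time.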
2
senior site reliability engineer
- Design and DevOps implementation of a multi-tenant Kubernetes cluster running a set of open ecosystem tools (calico, nginx-ingress, fluentd, Prometheus, kube2iam, LDAP auth, etc.).
- Authoring of configuration management procedures, workflows, and playbooks.
- Design and execution of management procedures and configuration standards.
- Implementations based on DevOps tools (Ansible, Terraform, AWS API).
- Implementation of e2e tests for infrastructure CI and CD processes based on the pytest framework.
3
site reliability engineer
- Ensured production service availability with maximum uptime for Adobe Campaign. Developed tools to facilitate production system uptime and achieve the product SLA.
- Automated production deployment using Ansible.
- Infrastructure automation and orchestration.
- Automation of daily ad-hoc manual processes.
- Experience in developing, deploying, and scaling systems across DC and cloud infrastructure.
- Production system monitoring, incident management, and server capacity management.
- Troubleshoot operational and application issues and fix them within the SLA.
4
site reliability engineer
- Query Analyzer: sniffs packets (using pcap) on the Ethernet interface, decodes each packet using the MySQL client-server protocol, calculates the checksum of the query, and sends aggregated data to the centralized server.
- Built a UI on top of the above data to present meaningful information.
- Wrote control scripts for new processes.
- CVE tracking and security-related fixes in infrastructure.
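
The checksum-and-aggregate step of the Query Analyzer bullet above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool from the resume: packet capture and MySQL protocol decoding are left out, and the literal normalization, the choice of CRC32, and all function names are hypothetical.

```python
import re
import zlib
from collections import Counter


def fingerprint(query):
    """Normalize a SQL string so queries differing only in literals match.

    Assumption: the resume does not say how queries are normalized.
    """
    q = query.strip().lower()
    q = re.sub(r"'[^']*'", "?", q)  # replace quoted string literals
    q = re.sub(r"\b\d+\b", "?", q)  # replace standalone numeric literals
    q = re.sub(r"\s+", " ", q)      # collapse runs of whitespace
    return q


def checksum(query):
    """CRC32 of the normalized query (CRC32 is an assumed choice)."""
    return zlib.crc32(fingerprint(query).encode("utf-8"))


def aggregate(queries):
    """Count how many times each query shape (by checksum) was seen."""
    counts = Counter()
    for q in queries:
        counts[checksum(q)] += 1
    return counts


# Hypothetical decoded queries, standing in for sniffed traffic.
seen = [
    "SELECT * FROM users WHERE id = 1",
    "select * from users  where id = 42",
    "SELECT name FROM users WHERE name = 'bob'",
]
counts = aggregate(seen)
```

Aggregating by checksum rather than by raw query text keeps the payload sent to a central server small, since many queries collapse into a few query shapes.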
5
site reliability engineer
- Built from scratch a web application for infrastructure inventory management using the LAMP stack.
- Developed microservices in Golang for ETL jobs and data collection.
- Created a web application using Django and JavaScript/jQuery, with D3.js for graphs and ag-grid for the reports.
- Performed data analyses and reported key insights using Python modules like NumPy, Pandas, Matplotlib, …
- Maintained the project's Service Level Agreements (SLAs) with respect to key service level indicators (SLIs).
- Created CI/CD pipeline setup and followed Test Driven Development (TDD).