Automation Engineer
Job Summary: Our client is seeking an Automation Engineer to join their team!

Duties:
- Design, develop, and execute performance, load, stress, and scalability tests to assess system responsiveness, stability, and resource utilization
- Identify performance bottlenecks across application, database, and infrastructure layers, and provide actionable optimization recommendations
- Analyze test results and produce clear, data-driven performance reports for technical and non-technical stakeholders
- Design and execute resiliency and chaos test scenarios to validate system behavior under failure conditions
- Simulate real-world failure events such as service outages, network latency, and resource exhaustion; assess recovery mechanisms
- Recommend, implement, and validate resiliency patterns like circuit breakers, retries, bulkheads, rate limiting, and fallbacks
- Configure and maintain performance monitoring and observability tools for continuous system health tracking
- Analyze real-time metrics, logs, and traces to proactively detect, diagnose, and resolve performance and stability issues
- Partner with engineering teams to define performance SLIs/SLOs and alerting thresholds
- Conduct capacity planning exercises to ensure systems can handle expected growth and peak workloads
- Provide data-backed recommendations for horizontal and vertical scaling strategies
- Support performance readiness for production releases and high-traffic events
- Collaborate with development, DevOps, and infrastructure teams to optimize application code, database queries, APIs, and system configurations
- Recommend and enforce best practices for performance tuning within microservices architectures
- Define and configure Kubernetes performance parameters, including resource requests/limits, autoscaling policies, and pod distribution strategies
- Ensure optimal performance and stability of containerized applications in cloud-native environments
- Document performance testing frameworks, tools, methodologies, and best practices
- Provide guidance, training, and support to development and operations teams on performance and resiliency engineering
- Evaluate and improve testing, monitoring, and resiliency processes
- Stay current with emerging performance engineering tools, cloud-native technologies, and industry best practices

Desired Skills / Experience:
- 5+ years of experience in performance engineering, automation engineering, or site reliability engineering (SRE)
- 3+ years of pr...