Lead Data QA Engineer (Spark Data Integrity & Performance Testing)

Pune, Maharashtra, India | Full-time | Partially remote

Must-Have Skills:

  • Apache Spark expertise: design, configure, and optimize Spark clusters (or clusters for a comparable engine such as Dremio)

  • Data Integrity QA: create and execute test cases to validate accuracy, consistency, and completeness of data; implement and maintain automated test scripts

  • Performance Testing: architect and run benchmark tests; analyze the impact of Spark cluster configurations on query/workflow performance

  • Leadership: mentor and guide other testers on best practices in both data integrity and performance testing

Good-to-Have Skills:

  • Familiarity with performance testing tools (e.g., JMeter, Gatling)

  • Experience integrating tests into CI/CD pipelines (e.g., Jenkins, GitLab CI)

  • Exposure to cloud-based Spark services (AWS EMR, Azure Synapse)

Who You Are

  • A data-driven QA leader passionate about ensuring both correctness and speed in big-data pipelines

  • Comfortable translating complex requirements into repeatable, automated test suites

  • Skilled at troubleshooting anomalous results, performing root-cause analysis, and optimizing system configurations

  • A collaborative mentor who elevates team practices and drives continuous improvement

What You’ll Do & Learn

  • Design & run comprehensive data integrity tests for Spark queries, investigating failures and ensuring zero data discrepancies

  • Implement automated validation scripts that integrate with our CI/CD workflows

  • Define & execute performance benchmarks across varied Spark cluster setups; report on metrics like throughput, latency, and resource utilization

  • Tune Spark configurations to meet SLAs for data freshness and query response times

  • Lead test planning sessions, coach junior testers, and document best practices for reproducible, scalable testing

  • Collaborate with data engineering, DevOps, and product teams to embed quality gates into the development lifecycle

Why Join Us?

  • Own the quality and performance of our core big-data platform powering mission-critical analytics

  • Work alongside experienced data engineers and architects on cutting-edge Spark deployments

  • Drive automation and efficiency in a growing, innovation-focused environment

  • Enjoy opportunities for professional growth, training, and attendance at industry conferences