Data Engineer (3–6 Years)
Full-time | Fully remote
We are looking for a Data Engineer who can work across modern data platforms and streaming frameworks to build scalable and reliable pipelines. If you enjoy working with Spark on Databricks, Kafka, Snowflake, and MongoDB — and want to solve real-world data integration challenges — this role is for you.
What you’ll do:
- Develop ETL/ELT pipelines in Databricks (PySpark notebooks) or Snowflake (SQL/Snowpark), ingesting from sources such as Confluent Kafka.
- Optimize data storage using Delta Lake/Iceberg formats, ensuring reliability (e.g., time travel for auditing in fintech pipelines).
- Integrate with Azure ecosystems (e.g., Fabric for warehousing, Event Hubs for streaming), supporting BI/ML teams, for example by preparing features for demand forecasting models.
- Contribute to real-world use cases, such as building dashboards for healthcare outcomes or optimizing logistics routes with aggregated IoT data.
- Write clean, maintainable code in Python or Scala.
- Collaborate with analysts, engineers, and product teams to translate data needs into scalable solutions.
- Ensure data quality, reliability, and observability across pipelines.
What we’re looking for:
- 3–6 years of hands-on experience in data engineering
- Experience with Databricks / Apache Spark for large-scale data processing
- Familiarity with Kafka, Kafka Connect, and streaming data use cases
- Proficiency in Snowflake, including ELT design, performance tuning, and query optimization
- Exposure to MongoDB and working with flexible document-based schemas
- Strong programming skills in Python or Scala
- Comfort with CI/CD pipelines, data testing, and monitoring tools
Good to have:
- Experience with Airflow, dbt, or similar orchestration tools
- Experience with cloud-native stacks (AWS, GCP, or Azure)
- Contributions to data governance and access control practices