Overview:

Are you a data enthusiast who thrives on building scalable and efficient data pipelines? Do you have a passion for transforming raw data into actionable insights? If so, we want you to join our fast-growing startup and help shape the future of data-driven retail!

Who We Are:

We're a small but ambitious startup revolutionizing the retail experience through innovative technology. We're passionate about creating user-friendly, engaging experiences that make shopping effortless and enjoyable. We work collaboratively, value open communication, and believe in fostering a culture of continuous learning and growth.

What You’ll Do:

  • Design, build, and maintain scalable data pipelines and ETL processes to support business analytics and operational needs.
  • Collaborate with cross-functional teams to integrate and transform data and make it accessible for analysis and decision-making.
  • Implement and optimize data ingestion processes, ensuring efficient data movement across systems.
  • Develop and manage Spark-based data processing workflows for real-time and batch processing.
  • Write clean, efficient, and well-documented Python code to support data engineering workflows.
  • Monitor and troubleshoot data pipelines, ensuring high availability and reliability.
  • Drive best practices for data governance, security, and quality to ensure accuracy and consistency.
  • Stay up to date with industry trends, tools, and best practices to continuously improve our data architecture and processes.
  • Mentor team members or lead projects if you're interested in growing into a leadership role.

What You Bring:

  • 3+ years of experience as a Data Engineer, working with large-scale data infrastructure.
  • Strong proficiency in Python for data processing and automation.
  • Deep understanding of distributed data processing architectures and tools such as Spark, Kubernetes, and Kafka.
  • Experience working with data integration tools.
  • Deep understanding of ETL processes and data pipeline orchestration.
  • Self-motivated and proactive with a strong ability to learn new technologies quickly.
  • A humble yet confident mindset, open to feedback and collaboration.
  • Excellent problem-solving skills with a keen attention to detail.

Bonus Points:

  • Experience with cloud platforms such as GCP, AWS, or Azure.
  • Familiarity with data warehouse solutions like Snowflake, BigQuery, or Redshift.
  • Knowledge of workflow orchestration tools such as Apache Airflow.
  • Experience deploying and maintaining complex machine learning models in production environments.
  • Prior experience working in a startup environment.
  • Knowledge of CI/CD pipelines and test-driven development.