Data Engineer


At phData, we build platforms exclusively for data and machine learning. Our services and software are used by the world's largest companies to solve their hardest problems.

Our work is challenging and our standards are high, but you'll be set up for success with tooling, training, and colleagues who are among the brightest minds in the field. You will be working with the latest cloud-native and distributed data platforms on the market. Since the data and machine learning industry changes quickly, you have the opportunity to continuously learn. Our strategy is to remain innovative and cutting edge, while also ensuring our work is practical and unlocks real business value for our customers.

While we're growing extremely fast, we maintain a casual, small-business work environment. We hire top performers and give them the autonomy to deliver results. Our award-winning workplace fosters learning, creativity, and teamwork.

  • Best Places to Work (2017, 2018, 2019, 2020, 2021)
  • Inc. 5000 Fastest Growing US Companies (2019, 2020)
  • Snowflake Emerging Partner of the Year 2020
  • Databricks Rising Star Partner of the Year 2020
  • Cloudera Partner of the Year 2020

Technical Requirements:

  • Hands-on experience as a Software Engineer or Data Engineer
  • Solid proficiency with at least one programming language (Java, Scala, Python)
  • Very strong understanding of SQL and conventional data warehousing design patterns.
  • Hands-on experience with tools such as Spark (Scala or PySpark), MapReduce (Java), YARN, HDFS, Hive, Impala, Sqoop, Oozie, HBase, Kudu.
  • Good understanding of UNIX shell scripting.
  • Ability to develop analytical functionality and complex transformations that will ultimately be deployed to production data platforms.
  • Good understanding of big data design patterns.
  • Hands-on experience troubleshooting, optimizing, and improving big data pipelines.
  • Has built and deployed at least one data engineering application using a cloud-native data platform and services (for example, EMR or Redshift on AWS, Data Factory on Azure, or BigQuery on GCP).
  • Strong troubleshooting and performance tuning skills.
  • Well versed in continuous integration and deployment procedures for the big data stack, whether on-premises or in the cloud.
  • Ability to analyze business requirements and user stories and translate them into system requirement specifications.
  • Experience working with agile delivery methodology.

Behavioral Requirements:

  • Demonstrated ability to work independently
  • Good communication and documentation skills
  • Good, collaborative team player
  • Good organizational and time management skills
  • An ability to work to deadlines
  • A good eye for detail
  • Must be ready to learn and adapt to new technologies

Qualification Requirements:

  • BE/BTech in Computer Science or MCA, with sound industry experience
  • Experience in a cloud-based environment with PaaS & IaaS
  • Work iteratively in a team with continuous collaboration

Perks and Benefits:

  • Medical Insurance for Self & Family
  • Medical Insurance for Parents
  • Term Life & Personal Accident
  • Wellness Allowance
  • Broadband Reimbursement
  • Professional Development Allowance
  • Reimbursement of Skill Upgrade Certifications
  • Certification Bonus

Please let phData know you found this position on Remotely We Code as a way to support us.


  • September 30


  • Remote - India

Job Type

  • Full-Time


