For job seekers, BONAPOLIA offers a gateway to exciting career prospects and the chance to thrive in a fulfilling work environment. We believe that the right job can transform lives, and we are committed to making that happen for you.
We are looking for a certified Data Engineer who will turn data into information, information into insight, and insight into business decisions. This is a unique opportunity to be one of the key drivers of change in our expanding company.
About the Customer
The customer is a leading provider of software solutions and services for the healthcare industry. They serve hospitals and health systems, helping them generate better data and insights that enable better healthcare.
The customer offers a robust portfolio of solutions that can be fully integrated and custom-configured to help healthcare organizations efficiently capture information for assessing and improving the quality and safety of patient care.
Requirements
- Background in Data Engineering with hands-on work using Databricks
- Strong expertise in ETL processes and data pipeline development
- Proficiency with Spark and Python for large-scale data processing in Databricks (a brief illustrative sketch of this kind of work follows this list)
- Experience with data extraction from web-based sources (APIs, web scraping, or similar approaches)
- Familiarity with handling structured and unstructured data and ensuring efficient data storage and access
- Competency in Azure or other cloud platforms
- Knowledge of SQL and database management for data validation and storage
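
For candidates who want a concrete picture of the day-to-day work, below is a minimal, illustrative sketch of the kind of Spark/Python task the role involves: extracting records from a web API and cleaning them in a Databricks-style environment. The endpoint URL, field names, and table name are hypothetical placeholders, not details of the customer's systems.

```python
# Illustrative only: a minimal API-to-table ETL step in PySpark.
# The URL, columns, and table name below are hypothetical examples.
import requests
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("api-ingest-sketch").getOrCreate()

# Extract: pull a JSON payload from a (hypothetical) public API.
records = requests.get("https://example.com/api/facilities", timeout=30).json()

# Load the records into a DataFrame; Spark infers the schema.
df = spark.createDataFrame(records)

# Transform: the kind of cleaning and validation the role calls for --
# deduplicate on a key, drop rows missing it, normalize a text column.
clean = (
    df.dropDuplicates(["facility_id"])
      .filter(F.col("facility_id").isNotNull())
      .withColumn("state", F.upper(F.trim(F.col("state"))))
)

# Load: persist to a managed table for downstream SQL analysis.
clean.write.mode("overwrite").saveAsTable("staging_facilities")
```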
English level
Upper-Intermediate
Responsibilities
- Design, develop, and implement data ingestion and transformation pipelines using Azure Databricks
- Manage and orchestrate data extraction from a public information website, handling bulk data downloads efficiently
- Develop solutions for periodic (monthly) data updates, optimizing refresh cycles to ensure accuracy and efficiency (see the sketch after this list)
- Clean, aggregate, and transform data for analysis, ensuring its quality and completeness
- Collaborate with stakeholders to understand data requirements and propose solutions for efficient data management
- Implement best practices for ETL processes in a Databricks environment, ensuring scalability and performance
- Monitor and troubleshoot data pipelines, ensuring smooth operation and timely updates
- Document data engineering processes, workflows, and architecture
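
As an illustration of the bulk-download and monthly-refresh pattern described above, here is a minimal sketch. The source URL, paths, and table name are hypothetical, and a production pipeline would add retries, schema validation, and monitoring on top of this.

```python
# Illustrative only: a minimal monthly bulk-refresh job. The URL, file
# paths, and table name are hypothetical placeholders.
import datetime
import urllib.request
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("monthly-refresh-sketch").getOrCreate()

# Download this month's bulk extract from the (hypothetical) public site.
snapshot = datetime.date.today().strftime("%Y-%m")
local_path = f"/tmp/bulk_{snapshot}.csv"
urllib.request.urlretrieve("https://example.com/downloads/bulk.csv", local_path)

# Read the file and stamp each row with its snapshot month.
df = (
    spark.read.option("header", True).csv(f"file://{local_path}")
         .withColumn("snapshot_month", F.lit(snapshot))
)

# Overwriting only the current month's partition keeps the job idempotent:
# re-running it replaces that month's data instead of duplicating rows.
(df.write.format("delta")
   .mode("overwrite")
   .option("replaceWhere", f"snapshot_month = '{snapshot}'")
   .partitionBy("snapshot_month")
   .saveAsTable("raw_public_site_bulk"))
```

Partitioning by snapshot month with a `replaceWhere` overwrite is one common way to make a scheduled refresh safe to re-run; other designs (full overwrites, merge/upsert) trade off differently depending on data volume and downstream needs.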