[Remote] Senior Data Engineer
Note: This is a remote position open to candidates in the USA. CitiusTech is a healthcare technology company that aims to solve the industry's greatest challenges through innovation and collaboration. The company is seeking a Senior Data Engineer to design, build, and operate ETL/ELT pipelines on cloud platforms for healthcare data while upholding data governance and quality practices.
Responsibilities
- Design, build, and operate scalable ETL/ELT pipelines across GCP, Azure, BigQuery, and SQL Server
- Model data for analytical workloads including dimensional modeling, SCDs, normalization, and schema design
- Orchestrate pipelines using Airflow, Cloud Composer, Azure Data Factory, or similar frameworks
- Ensure secure handling of PHI in alignment with HIPAA—covering data movement, de-identification, access controls, and audit readiness
- Implement and enforce data governance practices, including metadata management, data lineage, cataloging, and stewardship workflows
- Integrate with enterprise data governance platforms such as Microsoft Purview, Collibra, or Alation for:
  - Data cataloging and classification
  - Lineage tracking (end-to-end pipeline visibility)
  - Glossary and business metadata management
- Define and implement data quality frameworks including validation rules, anomaly detection, and monitoring
- Enable data discoverability and trust through proper tagging, classification, and governance standards
- Deploy pipelines using Git-based workflows and CI/CD; monitor, troubleshoot, and optimize production pipelines
- Collaborate with stakeholders (business, analytics, governance teams) to translate requirements into scalable technical solutions
- Communicate technical tradeoffs, risks, and governance implications early in the design lifecycle
Skills
- 7+ years of experience in data engineering with strong exposure to senior-level ownership of production systems
- Strong proficiency in SQL and either Python or Java for pipeline development
- Hands-on experience across cloud platforms such as GCP and Azure, including BigQuery and SQL Server
- Deep experience designing scalable and reliable ETL/ELT pipelines with performance optimization
- Hands-on experience with orchestration tools such as Airflow, Cloud Composer, ADF, Dagster, or Prefect
- Strong data modeling skills — dimensional modeling, normalization, and slowly changing dimensions
- Data Governance Expertise: Experience working with tools like Microsoft Purview, Collibra, Alation, or Informatica EDC
- Understanding of data cataloging, lineage, metadata management, and business glossaries
- Exposure to data classification, data stewardship workflows, and governance frameworks
- Experience implementing data quality frameworks (DQ rules, profiling, validation pipelines)
- Working knowledge of HIPAA and PHI compliance requirements
- Experience operating within enterprise security, governance, and compliance frameworks
- Proficiency with Git, CI/CD pipelines, and production deployment practices
- Experience integrating governance tools with cloud-native ecosystems (e.g., Purview with Azure data services, Collibra with multi-cloud pipelines)
- Exposure to Master Data Management (MDM) and reference data systems
- Familiarity with semantic layers, data mesh, or data fabric architectures
- Experience with LLM-assisted development, data observability tools, or modern ELT frameworks (dbt, Dataform)
- Knowledge of healthcare data standards (HL7, FHIR, OMOP, etc.)
Benefits
- Medical, dental, and vision insurance
- Paid time off
- Parental leave
- Comprehensive benefits package
Company Overview