Duties and Responsibilities:
- Build, refactor, and maintain pipeline components of data products in a cloud environment.
- Own the ingestion, storage, cleansing, profiling, transformation, and presentation of data products.
- Analyze large and complex datasets for quality, accuracy, anomalies, and performance.
- Own pre-UAT testing for every change using the standard ABC (audit, balance, control) framework.
- Consult with analysts and stakeholders on new data requests, enhancements, recommendations, data quality, and awareness items.
- Assess project-specific integration restrictions/options and recommend load strategy, storage targets, data access layer model, and solution architecture.
- Participate in modeling star-schema data marts, denormalized performant layers, presentation layers, and Data Vault (or other immutable storage).
- Organize source code using a version control framework.
- Create and maintain project-specific reference architecture diagrams and engineering-oriented documentation.
- Triage emergent data concerns and coordinate with stakeholders through resolution.
Skills:
- Demonstrated ability to write advanced SQL and Python.
- Demonstrated ability to write and maintain scripts using Bash (Unix shell), PowerShell, or DOS batch scripting.
- Demonstrated ability to use ELT applications such as WhereScape or IDMC.
- Demonstrated ability to present at team-wide code reviews.
- Demonstrated ability to proactively analyze and resolve data issues in an agile, collaborative environment.
Knowledge:
- Knowledge of ingesting unstructured, semi-structured, and structured data.
- Knowledge of data quality concepts and the ABC framework.
- Familiar with source code versioning tools such as Git.
- Comprehensive understanding of both Star Schema modeling and Normalized Relational modeling. Understanding of Data Vault 2.1 is a plus.
- Foundational knowledge of data preparation needs for Large Language Models.
- Familiar with metadata capture, row access policies, data masking policies, and data governance principles.
Experience:
- 3+ years of experience with database technologies and data engineering.
- 3+ years of experience modeling, building, and testing data products.
- 2+ years of experience with cloud data infrastructure delivered as SaaS. Azure and Snowflake experience is a plus.
- Healthcare experience is preferred.
Education and Certifications:
- Bachelor’s Degree in Computer Science, Software Engineering, Healthcare Informatics, or related studies; or 7 years of relevant experience in lieu of a degree.
Work Environment:
- Ability to provide off-hours assistance to support critical, time-sensitive data products.
- Ability to handle multiple assignments concurrently.
- Ability to adapt to the changing priorities of multiple customers.
- Ability to work independently or as a member of a project team.