Senior Data Engineer
2/25/2026
The role covers end-to-end engineering delivery for data pipelines (design, development, testing, and deployment) plus Day-2 accountability for production stability, data quality remediation, and operational health.
Working Hours
40 hours/week
Company Size
11-50 employees
Language
English
Visa Sponsorship
No
Role Mission:
Deliver and operate reliable, high-quality data pipelines and curated datasets on the DXP data platform. This role owns end-to-end engineering delivery for assigned pipelines/data products and takes Day-2 accountability for DataOps stability, observability, cost-efficiency, and data quality remediation in production.
Accountabilities (areas this person needs to own):
1. End-to-end pipeline delivery (Build): Independently design, develop, test, and deploy ingestion and transformation pipelines from source to curated layers.
2. Production reliability (Run): Own operational health for assigned pipelines - monitoring, incident response, recovery, and continuous improvement to meet SLA and freshness expectations.
3. Data quality management (Govern): Implement and run data quality controls (validation, reconciliation, anomaly detection), drive root-cause analysis, and coordinate remediation with data stewards and source owners.
4. Engineering standards & observability: Apply engineering standards for CI/CD, version control, pipeline instrumentation, documentation, RBAC alignment, and cost/performance guardrails. Contribute to the continuous improvement of engineering standards, including optimizing workflows with AI.
5. Stakeholder collaboration: Work directly with architects, platform engineers, data stewards, application domain teams and analytics users to clarify requirements, manage trade-offs, and deliver trusted datasets for self-serve analytics.
Responsibilities (tasks this person performs to deliver the accountabilities):
1. Data Engineering Delivery
a. Build/extend ingestion pipelines using Datapipe (Airbyte/Airflow), Snowflake (Snowpark, Snowpipe, Openflow) and AWS integration patterns; implement robust retry, idempotency, and backfill strategies.
b. Implement data model & develop transformation logic in Snowflake (SQL/Python where relevant) across Bronze/Silver/Gold (or equivalent) layers; optimize for maintainability and cost.
c. Deliver well-tested changes via CI/CD across DEV/SIT/PROD with clear release notes and rollback plans.
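To illustrate the retry, idempotency, and backfill strategies named above, here is a minimal sketch in plain Python. The table and column names are hypothetical, and SQLite stands in for the warehouse; a real pipeline would typically use Snowflake's MERGE (or an equivalent upsert pattern) keyed on a natural key so that re-running a failed or backfilled batch is safe.

```python
import sqlite3

# Idempotent load sketch: running the same batch twice yields the same
# end state, so retries and backfills cannot duplicate rows.
# Table/column names are illustrative, not from the job description.

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bronze_orders (
        order_id  TEXT PRIMARY KEY,  -- natural key makes the upsert idempotent
        amount    REAL,
        loaded_at TEXT
    )
""")

def load_batch(rows):
    # INSERT OR REPLACE keyed on order_id: a re-run overwrites rather
    # than appends, which is the core of a safe retry/backfill strategy.
    conn.executemany(
        "INSERT OR REPLACE INTO bronze_orders VALUES (?, ?, ?)", rows
    )
    conn.commit()

batch = [("o-1", 10.0, "2026-02-25"), ("o-2", 20.0, "2026-02-25")]
load_batch(batch)
load_batch(batch)  # simulated retry of the same batch

count = conn.execute("SELECT COUNT(*) FROM bronze_orders").fetchone()[0]
print(count)  # 2, not 4: the second run changed nothing
```

The same property is what makes Airflow task retries and historical backfills safe: each run is a pure function of its logical date and input batch.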
2. DataOps / Production Support
a. Monitor pipelines and data SLAs; triage failures, recover production runs, and perform RCA with preventive actions (not just quick fixes).
b. Create and maintain runbooks/playbooks, on-call handover notes, and operational dashboards for owned pipelines.
c. Collaborate with Platform Engineering on observability, alert tuning, operational readiness, and automation improvements.
3. Data Quality & Stewardship – Controls & Remediation
a. Implement automated DQ checks (completeness, uniqueness, referential integrity, schema drift, reconciliation) and publish outcomes to stakeholders.
b. Partner with Data Quality Stewards to track, prioritize, and remediate DQ issues; clearly separate source-system defects vs pipeline defects and drive the right owner actions.
c. Enable stewardship tooling: help domain stewards operationalize governance artifacts (e.g., turning glossary/CDE definitions into checks/scorecards; integrating with ticketing/knowledge base), without turning the engineer into the “governance admin.”
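As a sketch of the automated DQ checks listed above (completeness, uniqueness, referential integrity), the following uses in-memory rows with hypothetical field names; in production these checks would run as SQL against Snowflake tables and publish results to stakeholders.

```python
# Minimal data-quality check sketch. Field names are illustrative.

orders = [
    {"order_id": "o-1", "customer_id": "c-1", "amount": 10.0},
    {"order_id": "o-2", "customer_id": "c-2", "amount": None},  # incomplete
    {"order_id": "o-2", "customer_id": "c-9", "amount": 5.0},   # dup + orphan
]
customers = {"c-1", "c-2"}

def run_dq_checks(rows, customer_keys):
    ids = [r["order_id"] for r in rows]
    return {
        # completeness: no NULL amounts
        "completeness": all(r["amount"] is not None for r in rows),
        # uniqueness: order_id should be a candidate key
        "uniqueness": len(ids) == len(set(ids)),
        # referential integrity: every customer_id must resolve
        "referential_integrity": all(
            r["customer_id"] in customer_keys for r in rows
        ),
    }

results = run_dq_checks(orders, customers)
print(results)
# {'completeness': False, 'uniqueness': False, 'referential_integrity': False}
```

Separating the check outcomes this way also supports the defect triage described above: a referential-integrity failure usually points at a source-system defect, while a uniqueness failure often indicates a pipeline defect such as a non-idempotent load.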
4. Collaboration & Enablement
a. Participate in requirement refinement with architects/analysts, shaping requirements into implementable data contracts and acceptance criteria.
b. Produce engineering documentation (data lineage notes, assumptions, operational procedures) and contribute to team knowledge base and onboarding materials.
c. Co-own domain data contracts (with steward sign-off): translate business definitions (KPIs, allowable values, timeliness, “gold” dataset expectations) into implementable data contracts, acceptance criteria, and change control notes.
Team Scope / Stakeholders:
1. Scope: Assigned pipelines, datasets, and operational ownership within the DXP Data Platform (Datapipe/Airflow/Airbyte, Snowflake, AWS).
2. Key stakeholders: Data Engineering Lead(s), Data Architects, Platform Engineering, Data Quality Stewards, BI/Analytics users, and source system owners.
3. Decision rights (within owned area): pipeline design approach, implementation choices (aligned to paved road/patterns), testing strategy, and operational guardrails (aligned to standards); incident triage actions and recovery steps; recommendations on prioritization for fixes vs enhancements.
Requirements:
1. 3-8+ years in data engineering with proven hands-on delivery and production operations ownership (not project-only work).
2. Strong practical skills in Snowflake (or equivalent modern data platforms): data loading/transforms, performance tuning basics, role-aware designs, and cost awareness.
3. Orchestration experience: Airflow (or equivalent) DAG design, scheduling, dependency control, retries, and observability.
4. Python + SQL proficiency for transformation, validation, and operational tooling/scripts.
5. AWS fundamentals: S3 data structures/lifecycle, IAM-aware integrations, and monitoring basics (e.g., CloudWatch patterns).
6. Applied AI/agentic approaches to Data DevOps: hands-on exposure building or integrating AI-assisted operational workflows (e.g., incident triage summarization, log/query analysis helpers, automated runbook suggestion, anomaly detection for freshness/volume/schema drift, or LLM-based knowledge retrieval for pipeline support), with clear guardrails (RBAC, auditability, and “human-in-the-loop” approval for production actions).
7. Demonstrated ability to handle production incidents with structured RCA and preventative improvements.
8. Learning agility and problem-solving: picks up modern stack components fast and applies them pragmatically.
Core Traits (Non-negotiables)
1. "Build it, run it" ownership: doesn’t outsource ops thinking to others.
2. Production-first mindset: prioritizes reliability, data correctness, and recoverability.
3. Structured problem solving: hypothesis-driven debugging, evidence-based RCA, and tight feedback loops.
4. Collaboration maturity: works with stakeholders without over-promising; escalates early with options and trade-offs.
Please let Cygnify know you found this job on InterviewPal. This helps us grow!