ABX.
Summer cohort · 3 spots remaining
Track 2 · Roadmap

Data Platform

Batch, stream, and lakehouse infrastructure.

This track covers the batch, stream, and lakehouse infrastructure that a data-platform engineer owns end to end. Kafka is the shared backbone; every other component has to justify itself on scale, latency, and correctness.

Who this track is for

Same curriculum. Two different interview loops.

New grad
L3 / L4 · first data-platform role
Targeting a first data or platform engineer role at FAANG, or a specialist role at a data-infra company (Snowflake, Databricks, Confluent, dbt Labs). The capstone is the same; the interview loop leans on SQL depth, DSA, and one focused system-design round.
Experienced
L4 / L5 / L6 · platform lateral
Targeting a platform or staff-level move — often from application-side data work into real infrastructure ownership. The capstone becomes the artifact that proves you can own multi-workload tenancy, cost, and freshness SLAs.

How the 12 months line up

Recruiting is seasonal. Your pace is yours.

US tech hiring runs hot in September–November and January–March, and is quiet the rest of the year. Your offer loop is timed to whichever window lands inside your program — we don't walk students into a dead market.

The 12-month arc is a default, not a contract. Students arriving with solid production fundamentals compress Phases 01 and 02 into weeks; students new to distributed systems take the full arc. Either way, Phase 04 starts the moment you're interview-ready — often halfway through the capstone, not after it.

Active recruiting window · Quiet (build phase)
Four focus areas · 12-month default

What you build, at your pace.

Durations below describe a default arc. Students with stronger foundations move through Phases 01 and 02 faster, and Phase 04 runs in parallel with the capstone once onsites start landing.

  1. Phase 01
    ~3 months

    Storage and compute primitives

    • File formats, partitioning, and lakehouse table primitives.
    • Query engines: planning, push-down, vectorized execution.
    • Schema evolution, backfills, and data contracts that don't rot.
    • Lineage and catalog fundamentals.
  2. Phase 02
    ~4 months

    Batch and stream at production scale

    • Batch compute engines and the shuffle-heavy patterns they handle.
    • Stream processing: exactly-once semantics, event time, watermarks, stateful operators.
    • Orchestration: DAGs, retries, idempotency, and real SLAs.
    • Platform observability: freshness, volume, schema drift, cost attribution.
  3. Phase 03
    ~3 months

    Capstone

    • Ship a platform-grade pipeline that handles multiple workloads cleanly.
    • Own the cost story, the freshness SLA, and the failure playbook.
    • Defend schema and architectural decisions at the level you're interviewing for.
  4. Phase 04
    Parallel once you're ready · timed to the next window

    Offer loop

    • Platform system design rehearsed at the level you're targeting.
    • Behavioral loop grounded in real ownership stories from the capstone.
    • Online-assessment (OA) and virtual-onsite (VO) drill sets for data-infra-specific question patterns.
    • Offer negotiation for data-platform roles, which are leveled and paid on their own curve.
    New grad
    Target Sep–Nov with data-infra-first companies (Snowflake, Databricks, Confluent) where the funnel is less crowded than general SWE.
    Experienced
    Align onsites inside Sep–Nov or Jan–Mar. Platform roles move slower than product SWE, so starting recruiter outreach 2–3 weeks earlier than you would for a product role is standard.

Outcome

You walk in with a platform you built, not a set of tutorials you followed.

Stack recap

Every technology you'll touch.

Core
  • Apache Spark
  • Apache Flink
  • Kafka
  • Airflow
  • dbt
  • Iceberg
  • Snowflake
Exposure
  • Databricks
  • ClickHouse
  • Druid
  • Trino
  • Airbyte

Ready for this track?

Apply to the summer cohort.
