
Analytics Engineering Projects

Project ideas for showing SQL modeling, metric ownership, dbt tests, documentation, BI readiness, and stakeholder judgment.


An analytics engineer portfolio should show how a candidate turns messy source data into reusable models, shared metric definitions, and a trusted analytical surface.

Strong projects go beyond SQL or a dashboard. They explain table grain and modeled layers, add tests, and show how BI consumers use the definitions. They also show the business question behind the model.

These project ideas focus on reusable models and handoff. The broader role is covered in Analytics Engineering and Analytics Engineering Roadmap. The role boundary is covered in Data Analyst vs Analytics Engineer. Dashboard implementation connects to Dashboard and Metric Layer Project Checklist. Ingestion, orchestration, and platform-heavy work belong with Data Engineering Portfolio Projects.

Reviewable Analytics Project

A good analytics engineering portfolio project starts with a repeated business question and ends with a trusted analytical surface. The repository should make source assumptions and staging models easy to review. It should also show intermediate logic, marts, tests, and docs. The final analytical surface should be a dashboard or query layer that consumes shared models.

Victoria Perez Mola’s role discussion implies that a dashboard-only project is weak unless the dashboard sits on reusable models. A dbt-only project is also weak unless the models answer a business question and expose definitions to consumers ^[1]. Juan Manuel Perafan’s modeling discussion makes the writeup part of the evidence: the project should explain why the model represents the business correctly. It should also state what one row means, which joins preserve the grain, and which caveats stakeholders should know ^[2].

The same work sits inside ETL and ELT. In the ELT pattern, data lands in the warehouse first, and analysts or analytics engineers then transform it with SQL and dbt and publish data marts or consumption tables ^[3]. That favors projects that show source assumptions and warehouse-side transformations, even when the portfolio isn’t a full data-engineering project.

Reviewer Signals

Reviewers should be able to see the model, the business reason for it, and the handoff. Victoria Perez Mola grounds that in modeling and quality. Looker, dbt, and collaboration with analysts also belong in the evidence ^[1]. Juan Manuel Perafan adds that the project should make business reality explicit and safer through testing, documentation, and rigor ^[2].

Nikola Maksimovic shows a transition version of the portfolio, and the proof didn’t start as a public repository. It started with marketing reporting and BI-team conversations. Looker work, SQL practice, and BI projects happened alongside marketing work. The later role included dbt migration, LookML, product analytics, and A/B testing ^[4]. That supports portfolios that turn domain knowledge into modeled metrics instead of treating domain context as background.

Analysts can use the Data Analyst to Analytics Engineer page to turn dashboard and KPI work into this kind of portfolio.

Arpit Choudhury widens the project boundary toward Data Activation. His episode connects tracking plans and event collection with warehouse transformations and BI. Reverse ETL then sends modeled data to support, sales, and engagement tools ^[5]. For portfolio builders, this makes activation projects legitimate analytics-engineering evidence when the work documents event ownership, data meaning, and downstream consequences.

Use Analytics Engineering Roadmap for the learning order, and choose proof-of-work projects that reviewers can look at.

Metric Mart and Dashboard Project

A metric mart and dashboard project is the clearest portfolio option because it connects Metrics, modeled tables, and a visible consumption surface. Pick a domain with repeated decisions. Useful domains include marketing funnels, product usage, business-health reporting, and finance reporting. Declare the sources first, then build staging models, facts, dimensions, and metric definitions. Then add tests and one documented dashboard that uses only the modeled layer.
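The layering described above can be sketched with Python’s built-in sqlite3 module standing in for a warehouse. This is a minimal illustration, not a dbt project: the table raw_orders, the views stg_orders and fct_daily_revenue, and all column names and rows are hypothetical. The point is that the metric (paid revenue) is defined once in the mart, and the dashboard query only selects from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical raw source, loaded as text exactly as it arrived.
CREATE TABLE raw_orders (id TEXT, customer TEXT, amount_cents TEXT,
                         status TEXT, ordered_at TEXT);
INSERT INTO raw_orders VALUES
  ('1', 'acme',   '1200', 'PAID',     '2024-01-03'),
  ('2', 'acme',   '800',  'refunded', '2024-01-05'),
  ('3', 'globex', '500',  'paid',     '2024-01-05');

-- Staging: rename, cast, normalize. Grain: one row per order.
CREATE VIEW stg_orders AS
SELECT CAST(id AS INTEGER)                   AS order_id,
       customer                              AS customer_name,
       CAST(amount_cents AS INTEGER) / 100.0 AS amount,
       LOWER(status)                         AS status,
       DATE(ordered_at)                      AS order_date
FROM raw_orders;

-- Mart: the metric is defined once here. Grain: one row per order_date.
CREATE VIEW fct_daily_revenue AS
SELECT order_date,
       SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END) AS paid_revenue,
       COUNT(*)                                              AS order_count
FROM stg_orders
GROUP BY order_date;
""")

# The "dashboard" reads only the modeled layer and adds no metric logic.
dashboard_rows = conn.execute(
    "SELECT order_date, paid_revenue, order_count "
    "FROM fct_daily_revenue ORDER BY order_date"
).fetchall()

for row in dashboard_rows:
    print(row)
```

In a real project each view would be its own model file with documentation. The sketch only shows where the casting, the status normalization, and the metric definition each live.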

Victoria Perez Mola ties analytics engineering to modeling and Looker exposure ^[1]. Tammy Liang adds business-health monitoring and streamlined reporting. Her team used documentation, testing, and adoption workshops ^[6].

For a portfolio, the README should show who uses the dashboard. It should explain what changed from the old spreadsheet or duplicated query. It should also show how another analyst finds the definitions.

The project should answer these review questions:

Business question: name the decision and metric owner. This follows Nikola Maksimovic’s move from performance marketing into BI and product analytics, where funnels, retention, RFM analysis, and A/B testing gave modeling work a target ^[4].
Row grain: state what one row represents and which joins preserve or change that grain, because Juan Manuel Perafan ties this modeling question to representing business reality ^[2].
Modeled layers: separate sources, staging logic, intermediate joins, and marts, since Natalie Kwong distinguishes warehouses, transformations, and data marts inside the modern stack ^[3].
Consumption: make the dashboard use shared models instead of embedded duplicate metric logic because Nikola Maksimovic connects Looker, LookML, dbt migration, and product analytics in the same BI stack ^[4].
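The row-grain question can be made concrete with a small check. The sketch below assumes hypothetical orders, customers, and order_items tables in an in-memory SQLite database. It shows one join that keeps the one-row-per-order grain and one that fans out to one row per order item, which silently inflates a revenue sum. The helper names grain_violations and total_amount are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
CREATE TABLE customers (customer_id INTEGER, region TEXT);
CREATE TABLE order_items (order_id INTEGER, sku TEXT);
INSERT INTO orders VALUES (1, 10, 50.0), (2, 11, 20.0);
INSERT INTO customers VALUES (10, 'EU'), (11, 'US');
INSERT INTO order_items VALUES (1, 'A'), (1, 'B'), (2, 'A');
""")

# Many-to-one join: each order matches exactly one customer, grain preserved.
PRESERVING_JOIN = """
SELECT o.order_id, o.amount, c.region
FROM orders o JOIN customers c ON c.customer_id = o.customer_id
"""

# One-to-many join: each order repeats once per item, grain changes.
FANNING_JOIN = """
SELECT o.order_id, o.amount, i.sku
FROM orders o JOIN order_items i ON i.order_id = o.order_id
"""

def grain_violations(conn, query, key_columns):
    """Return (key..., row_count) for every key that appears more than once."""
    keys = ", ".join(key_columns)
    sql = (f"SELECT {keys}, COUNT(*) FROM ({query}) "
           f"GROUP BY {keys} HAVING COUNT(*) > 1")
    return conn.execute(sql).fetchall()

def total_amount(conn, query):
    """Sum the amount column of a query result."""
    return conn.execute(f"SELECT SUM(amount) FROM ({query})").fetchone()[0]

print(grain_violations(conn, PRESERVING_JOIN, ["order_id"]))
print(grain_violations(conn, FANNING_JOIN, ["order_id"]))
print(total_amount(conn, PRESERVING_JOIN), total_amount(conn, FANNING_JOIN))
```

A writeup can state the intended grain for each model and include a check like this, so a reviewer sees which joins were verified instead of assumed.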

dbt Migration or Refactor Project

A dbt migration or refactor project works when the starting point is messy SQL, duplicated dashboard logic, or spreadsheet-defined metrics. Refactor the logic into model layers and add tests, docs, lineage, and a deployment note. Use reusable macros only where they remove duplication.

Nikola Maksimovic grounds this in a real dbt migration and LookML reporting. He also discusses wide-versus-narrow tables and incrementalization tradeoffs ^[4]. Christopher Bergh adds the DataOps standard for version control and tests. He also covers CI/CD, runbooks, documentation, and end-to-end versioning ^[7].

This project is strongest when it shows before-and-after behavior. Include the old query or dashboard calculation, the new dbt model structure, the tests that catch broken assumptions, and a reconciliation note for stakeholders. That reconciliation belongs in the portfolio because Barr Moses connects schema changes, lineage, ownership, and SLAs to data reliability ^[8].
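A reconciliation note is easier to trust when a script backs it. The following sketch assumes a hypothetical raw_orders table with one duplicated load, a legacy dashboard query, and a refactored query that deduplicates first. The reconcile helper and its tolerance parameter are illustrative, not part of dbt or any other tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_orders (order_id INTEGER, status TEXT,
                         amount REAL, order_month TEXT);
INSERT INTO raw_orders VALUES
  (1, 'paid',     100.0, '2024-01'),
  (2, 'refunded',  40.0, '2024-01'),
  (3, 'paid',      60.0, '2024-02'),
  (3, 'paid',      60.0, '2024-02');  -- same order loaded twice
""")

# Old dashboard calculation: sums raw rows, so duplicates are counted.
LEGACY_SQL = """
SELECT order_month, SUM(amount) FROM raw_orders
WHERE status = 'paid' GROUP BY order_month
"""

# Refactored logic: deduplicate first, then apply the same definition.
REFACTORED_SQL = """
WITH deduped AS (
  SELECT DISTINCT order_id, status, amount, order_month FROM raw_orders
)
SELECT order_month, SUM(amount) FROM deduped
WHERE status = 'paid' GROUP BY order_month
"""

def reconcile(conn, old_sql, new_sql, tolerance=0.005):
    """Return (key, old_value, new_value) for every key whose values differ."""
    old = dict(conn.execute(old_sql).fetchall())
    new = dict(conn.execute(new_sql).fetchall())
    diffs = []
    for key in sorted(set(old) | set(new)):
        o, n = old.get(key), new.get(key)
        if o is None or n is None or abs(o - n) > tolerance:
            diffs.append((key, o, n))
    return diffs

differences = reconcile(conn, LEGACY_SQL, REFACTORED_SQL)
print(differences)
```

Each reported difference then gets a one-line explanation in the stakeholder note. Here, February drops because the legacy query counted a duplicated order.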

Product Analytics and Event Model Project

A product analytics project should start with events, not charts. Write a tracking plan, then simulate or instrument events. Model user journeys and publish activation, retention, funnel, or experiment metrics.

Arpit Choudhury names signup and project-created events as SaaS examples. Invite and invoice events fit there too. He then connects collection and storage with transformation, analysis, and activation ^[5]. Nikola Maksimovic shows why marketing and product domain knowledge matter for funnels, retention, RFM analysis, and A/B testing ^[4].

This project should connect Event Tracking, Product Analytics, and A/B Testing through modeled tables. Document event owners and required properties. Also explain late-arriving events, user identity rules, and which modeled metrics feed the dashboard or experiment readout. That source-semantics work follows Arpit Choudhury on tracking plans with events, properties, and ownership ^[5].
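A tracking plan can be kept as data and checked before events reach a metric. The sketch below is a simplified, hypothetical plan for the signup and project-created events mentioned above. The owners, the property names, and the activation definition (a signed-up user who created a project) are assumptions chosen for illustration.

```python
# Hypothetical tracking plan: event name -> owner and required properties.
TRACKING_PLAN = {
    "signup": {"owner": "growth", "required": {"user_id", "plan"}},
    "project_created": {"owner": "product",
                        "required": {"user_id", "project_id"}},
}

def validate_event(event):
    """Return a list of problems for one event; empty means it fits the plan."""
    spec = TRACKING_PLAN.get(event.get("name"))
    if spec is None:
        return [f"unknown event: {event.get('name')}"]
    missing = spec["required"] - set(event.get("properties", {}))
    return [f"missing property: {prop}" for prop in sorted(missing)]

def activation_rate(events):
    """Share of signed-up users with at least one valid project_created event."""
    valid = [e for e in events if not validate_event(e)]
    signed_up = {e["properties"]["user_id"]
                 for e in valid if e["name"] == "signup"}
    created = {e["properties"]["user_id"]
               for e in valid if e["name"] == "project_created"}
    if not signed_up:
        return 0.0
    return len(signed_up & created) / len(signed_up)

events = [
    {"name": "signup", "properties": {"user_id": "u1", "plan": "free"}},
    {"name": "signup", "properties": {"user_id": "u2", "plan": "pro"}},
    {"name": "project_created",
     "properties": {"user_id": "u1", "project_id": "p1"}},
    # Invalid: required property project_id is missing, so it is excluded.
    {"name": "project_created", "properties": {"user_id": "u2"}},
]

print(activation_rate(events))
print(validate_event(events[3]))
```

Excluding invalid events is one possible rule. A project could instead quarantine them or fail the run. What matters is that the writeup states the rule and who owns each event.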

Reverse ETL Project As An Activation Example

A reverse ETL or activation project is useful when the portfolio needs to show operational consequences. Model a customer or account segment in the warehouse. Then push it to a mock CRM, support tool, or marketing destination. Document ownership, refresh cadence, and privacy assumptions. Also explain the consequence of a wrong segment.

Arpit Choudhury covers reverse ETL and product-led activation ^[5]. Natalie Kwong covers warehouse tables flowing back into operational systems ^[3].

For analytics engineering, the important proof isn’t the connector. It’s that a trusted modeled segment can safely leave the warehouse. Link the segment to Data Activation, Reverse ETL, and the metric or event definitions that produced it. Include one failure example, such as a stale trial-status field or a duplicated account. Operational activation makes wrong analytical definitions visible to sales, support, or lifecycle marketing.
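The failure examples above (a stale trial-status field, a duplicated account) can be turned into a guard that runs before anything leaves the warehouse. This is a sketch under stated assumptions: the segment rows, the field names account_id and trial_status_updated_at, the 24-hour threshold, and the list standing in for a mock CRM are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

def check_segment(rows, now, max_age=timedelta(hours=24)):
    """Return a list of problems that should block the sync."""
    problems = []
    seen = set()
    for row in rows:
        account_id = row["account_id"]
        if account_id in seen:
            problems.append(f"duplicate account: {account_id}")
        seen.add(account_id)
        if now - row["trial_status_updated_at"] > max_age:
            problems.append(f"stale trial status: {account_id}")
    return problems

def sync_segment(rows, destination, now):
    """Push rows to the mock destination only if every check passes."""
    problems = check_segment(rows, now)
    if problems:
        return problems  # destination is left untouched
    destination.extend(rows)
    return []

now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
fresh = datetime(2024, 1, 10, 6, 0, tzinfo=timezone.utc)
stale = datetime(2024, 1, 7, 6, 0, tzinfo=timezone.utc)

good_segment = [
    {"account_id": "a1", "trial_status_updated_at": fresh},
    {"account_id": "a2", "trial_status_updated_at": fresh},
]
bad_segment = [
    {"account_id": "a1", "trial_status_updated_at": stale},
    {"account_id": "a1", "trial_status_updated_at": fresh},
]

mock_crm = []
print(sync_segment(good_segment, mock_crm, now))
print(sync_segment(bad_segment, mock_crm, now))
print(len(mock_crm))
```

Blocking the whole sync on any problem is a deliberately conservative choice for the sketch. A writeup could justify a softer rule, as long as it names the consequence of a wrong segment reaching sales or support.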

Hiring-Focused Fundamentals Project

A hiring-focused fundamentals project should go deep on SQL and modeling before adding tools. Jeff Katz places an analytics-engineering module around dbt, Snowflake, Mode, and Fivetran. He also emphasizes SQL mastery, window functions, OLTP versus OLAP, and sample database modeling practice ^[9].

The concrete artifact can be a small warehouse model over a sample transactional database. Show OLTP-to-OLAP modeling and window functions. Then add dimensional choices and a concise dashboard or query notebook. The Data Analysis guide covers the analyst-facing side of this work, while this page keeps the focus on reusable modeling and handoff.
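Window functions are easy to demonstrate on a small transactional table. The sketch below uses a hypothetical orders table in SQLite and assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added. It numbers each customer’s orders and keeps a running total per customer, which is a typical step toward an analytical model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, amount REAL);
INSERT INTO orders VALUES
  (1, 10, '2024-01-02', 30.0),
  (2, 10, '2024-01-09', 20.0),
  (3, 11, '2024-01-05', 50.0);
""")

# Grain stays one row per order; the window columns add customer context.
CUSTOMER_ORDER_SQL = """
SELECT customer_id,
       order_id,
       ROW_NUMBER() OVER (
         PARTITION BY customer_id ORDER BY order_date
       ) AS order_seq,
       SUM(amount) OVER (
         PARTITION BY customer_id ORDER BY order_date
       ) AS running_amount
FROM orders
ORDER BY customer_id, order_date
"""

rows = conn.execute(CUSTOMER_ORDER_SQL).fetchall()
for row in rows:
    print(row)
```

Unlike a GROUP BY, the window version keeps every order row, so the writeup can state that the grain is unchanged while the new columns describe each order’s position in the customer’s history.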

Junior candidates can win with a smaller project when the grain definitions and tests are strong. Docs and SQL explanations can matter more than a broad stack that hides the modeling decisions. Connect the writeup to Data Analyst vs Analytics Engineer when the project explains the move from interpreting a dashboard to owning reusable analytical models.

Quality, Documentation, and Handoff

Every portfolio project should be reviewable as if another analyst had to maintain it next month. That means the repository and dashboard docs should make quality rules and handoff assumptions visible.

Tests and docs aren’t ornamental. Add non-null checks, unique checks, and accepted-values checks where they match the data rules. Add relationship checks and freshness checks where they protect consumers. Use custom tests when the business rule is specific.
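The checks listed above can be expressed as queries that count failing rows. The function names below echo dbt’s generic tests (not_null, unique, accepted_values, relationships), but this is a plain Python and SQLite sketch of the idea, not dbt’s implementation. The tables fct_orders and dim_customers and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customers (customer_id INTEGER);
CREATE TABLE fct_orders (order_id INTEGER, customer_id INTEGER, status TEXT);
INSERT INTO dim_customers VALUES (10), (11);
INSERT INTO fct_orders VALUES
  (1, 10,   'paid'),
  (2, 11,   'refunded'),
  (2, 11,   'refunded'),   -- duplicate order_id
  (3, 99,   'unknown'),    -- unexpected status, unknown customer
  (4, NULL, 'paid');       -- missing customer
""")

def _count(conn, sql, params=()):
    return conn.execute(sql, params).fetchone()[0]

def not_null(conn, table, column):
    """Number of rows where the column is NULL."""
    return _count(conn, f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL")

def unique(conn, table, column):
    """Number of distinct values that appear more than once."""
    return _count(conn, f"""
        SELECT COUNT(*) FROM (
          SELECT {column} FROM {table} WHERE {column} IS NOT NULL
          GROUP BY {column} HAVING COUNT(*) > 1)""")

def accepted_values(conn, table, column, values):
    """Number of non-NULL rows whose value is outside the allowed set."""
    marks = ", ".join("?" for _ in values)
    return _count(conn, f"""
        SELECT COUNT(*) FROM {table}
        WHERE {column} IS NOT NULL AND {column} NOT IN ({marks})""",
        tuple(values))

def relationships(conn, child, column, parent, parent_column):
    """Number of child rows whose key has no match in the parent table."""
    return _count(conn, f"""
        SELECT COUNT(*) FROM {child} c
        LEFT JOIN {parent} p ON p.{parent_column} = c.{column}
        WHERE c.{column} IS NOT NULL AND p.{parent_column} IS NULL""")

results = {
    "not_null(customer_id)": not_null(conn, "fct_orders", "customer_id"),
    "unique(order_id)": unique(conn, "fct_orders", "order_id"),
    "accepted_values(status)": accepted_values(
        conn, "fct_orders", "status", ["paid", "refunded"]),
    "relationships(customer_id)": relationships(
        conn, "fct_orders", "customer_id", "dim_customers", "customer_id"),
}
failures = {name: n for name, n in results.items() if n > 0}
print(failures)
```

Table and column names are interpolated directly into the SQL, which is acceptable for a fixed list of checks in a portfolio script but not for untrusted input. A freshness check would follow the same pattern by comparing a maximum load timestamp against an agreed threshold.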

Victoria Perez Mola discusses dbt tests and upstream checks. She also covers warnings, errors, docs, and profiling tools ^[1]. Barr Moses frames freshness, volume, distribution, and schema as reliability signals. She then ties lineage, ownership, and SLAs to data trust ^[8].

Documentation should make owners and purpose visible. It should also make caveats, columns, dependencies, and example queries findable. Tammy Liang uses a Notion wiki plus dashboard checks. She also connects workshops to data adoption outside the data team ^[6]. That links portfolio quality to Data Quality and Observability and DataOps, not just to model count.

Anti-Patterns

Avoid a dashboard built directly from raw tables with metric logic hidden in charts. Victoria Perez Mola places analytics-engineering value in modeled data, dbt transformations, and Looker exposure, not in isolated charts ^[1].

Avoid a dbt repository with many models but no business definitions, tests, owners, or BI consumer. Juan Manuel Perafan argues that the work should map business reality and make the data safer ^[2]. Tammy Liang shows that adoption, documentation, and trust matter after the models exist ^[6].

Avoid copying a public template without explaining grain, joins, slowly changing attributes, or incremental logic. Nikola Maksimovic grounds the role in practical data-modeling tradeoffs during a dbt migration, including wide versus narrow tables and incrementalization ^[4].

Avoid final KPI screenshots without source caveats, data-quality checks, or reconciliation notes. Barr Moses shows how silent failures, schema changes, stale data, and missing lineage or ownership break trust when teams only look at the final output ^[8].

Avoid treating analytics engineering as “SQL plus dashboard.” Strong projects also show software practices: version control, tests, docs, and lineage, along with warehouse transformations and adoption ^[1] ^[7]. The broader workflow context belongs with Analytics Engineering.

DataTalks.Club