Building a housing market data platform as a solo engineer

Aug 2, 2024 · 7 min read

This project started as a personal attempt to better understand the Norwegian housing market. It gradually evolved into a long-running data platform that collected, processed, and analyzed real-world listings over time.

Working solo forced a focus on correctness, automation, and operational simplicity. Every manual step eventually became a liability.

Data collection in the real world

Scraping production websites is less about clever selectors and more about resilience. Pages change, requests fail, and partial data is unavoidable.

The scraper was designed to be idempotent and restartable, with clear checkpoints and persistent state to avoid duplication and silent corruption.
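
The checkpointing idea can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the table names, the page-based crawl, and the `fetch_page` callable are all assumptions made for the example. Each page's rows and the checkpoint are written in one transaction, so a crash leaves either both or neither, and re-running a page overwrites rows instead of duplicating them.

```python
import sqlite3
from typing import Callable, Iterable, Tuple

Row = Tuple[str, str]  # (listing_id, raw_payload); hypothetical shape


def open_state(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the persistent scraper state."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings ("
        "listing_id TEXT PRIMARY KEY, raw TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoint ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), last_page INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def last_completed_page(conn: sqlite3.Connection) -> int:
    """Return the last fully stored page, or 0 if nothing has run yet."""
    row = conn.execute("SELECT last_page FROM checkpoint WHERE id = 1").fetchone()
    return row[0] if row else 0


def run(
    conn: sqlite3.Connection,
    fetch_page: Callable[[int], Iterable[Row]],
    max_page: int,
) -> None:
    """Resume after the last checkpoint and process pages up to max_page."""
    for page in range(last_completed_page(conn) + 1, max_page + 1):
        # If fetching raises, nothing has been written for this page yet.
        rows = list(fetch_page(page))
        # One transaction per page: rows and checkpoint commit together.
        with conn:
            # Keyed on listing_id, so replaying a page cannot duplicate rows.
            conn.executemany(
                "INSERT OR REPLACE INTO listings (listing_id, raw) VALUES (?, ?)",
                rows,
            )
            conn.execute(
                "INSERT OR REPLACE INTO checkpoint (id, last_page) VALUES (1, ?)",
                (page,),
            )
```

Calling `run` again after a failure picks up at the first page that was not committed, which is what makes the job safe to restart from a scheduler without manual cleanup.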

Cleaning and normalization are the real work

Raw listings contained inconsistent formatting, missing values, and implicit assumptions. Normalization and validation accounted for a significant portion of the total effort.
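
As a rough sketch of what that looks like in practice, the Python below normalizes one raw record and then validates it. The field names, the example input formats, and the plausibility bounds are assumptions chosen for illustration, not the project's real schema or rules.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Listing:
    listing_id: str
    price_nok: Optional[int]
    area_m2: Optional[float]


def parse_price(text: Optional[str]) -> Optional[int]:
    """Keep ASCII digits only, so '3 450 000 kr' and '3.450.000,-' both parse.

    Assumes whole-krone prices; a decimal part would be misread.
    """
    digits = re.sub(r"[^0-9]", "", text or "")
    return int(digits) if digits else None


def parse_area(text: Optional[str]) -> Optional[float]:
    """Take the first number in the text, accepting ',' or '.' as decimal mark."""
    match = re.search(r"[0-9]+(?:[.,][0-9]+)?", text or "")
    return float(match.group().replace(",", ".")) if match else None


def normalize(raw: dict) -> Listing:
    """Map a raw scraped record onto the typed Listing shape."""
    return Listing(
        listing_id=str(raw.get("id") or "").strip(),
        price_nok=parse_price(raw.get("price")),
        area_m2=parse_area(raw.get("area")),
    )


def validate(listing: Listing) -> List[str]:
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    if not listing.listing_id:
        problems.append("missing listing_id")
    if listing.price_nok is None:
        problems.append("missing price")
    elif listing.price_nok <= 0:
        problems.append("non-positive price")
    # Hypothetical plausibility bounds; missing area is allowed.
    if listing.area_m2 is not None and not (5 <= listing.area_m2 <= 2000):
        problems.append("implausible area")
    return problems
```

Returning a list of problems rather than raising on the first one makes it possible to log every rejected record with its reasons, so missing values are counted instead of silently dropped.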

Serving data is an operational concern

Exposing the data for analysis required predictable schemas and versioned transformations. Changes were treated as migrations rather than ad-hoc fixes.
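
One way to treat schema changes as migrations is an ordered list of statements plus a stored version number. The sketch below assumes SQLite and uses its `PRAGMA user_version` as the counter. The table, the columns, and the choice of SQLite are illustrative assumptions, since the post does not say what the platform actually used.

```python
import sqlite3

# Hypothetical migrations. Entries are only ever appended, never edited,
# so position N in the list always means schema version N.
MIGRATIONS = [
    "CREATE TABLE listings (listing_id TEXT PRIMARY KEY, price_nok INTEGER)",
    "ALTER TABLE listings ADD COLUMN area_m2 REAL",
]


def connect(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None leaves transaction control to the explicit
    # BEGIN/COMMIT statements issued in migrate().
    return sqlite3.connect(path, isolation_level=None)


def schema_version(conn: sqlite3.Connection) -> int:
    """Read the version counter stored in the database file (0 when new)."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply every migration newer than the stored version, in order."""
    current = schema_version(conn)
    for version, statement in enumerate(MIGRATIONS, start=1):
        if version <= current:
            continue  # already applied on an earlier run
        conn.execute("BEGIN")
        try:
            # The change and its version bump are issued in one transaction.
            conn.execute(statement)
            # PRAGMA values cannot be bound as parameters; version is an int.
            conn.execute(f"PRAGMA user_version = {version:d}")
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
    return schema_version(conn)
```

Running `migrate` at startup means a fresh database and a long-lived one end up with the same schema, and running it twice is a no-op.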

Over time, the project became less about housing and more about building systems that can run unattended without surprises.
