SYSTEM LIVE
// Historical Real Estate Intelligence Platform

We don't collect listings.
We build market memory.

NARMYX AI is a full-cycle data engineering and spatial AI platform that observes the entire real estate market, captures every change, and stores intelligence for years.

Total Ads Tracked
503K listings — live right now
Raw HTML Pages Stored
1.1M HTML snapshots in storage
Parse Runs Executed
1.13M successful parse runs
Fetch Tasks Completed
1.2M queued tasks — completed
Active Listings (live)
+ 90,934 removed · 7,393 missing
Database Size
9.69 GiB
PostgreSQL production — and growing
Parse Success Rate
99.96%
1,135,737 success / 441 failed
HTTP 200 Fetches
out of 1.2M fetch tasks completed
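The parse success rate shown above follows directly from the two run counts on the same card. A minimal check of that arithmetic, where the function name is illustrative only:

```python
def success_rate(success: int, failed: int) -> float:
    """Return the share of successful runs as a percentage."""
    total = success + failed
    if total == 0:
        raise ValueError("no runs recorded")
    return 100.0 * success / total


# Counts taken from the Parse Success Rate card.
rate = success_rate(1_135_737, 441)
print(f"{rate:.2f}%")  # prints 99.96%
```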
01

Data Lifecycle Pipeline

🔭
DISCOVERY
Market scanning, URL discovery, segment indexing
⬇️
FETCH
50 workers, 18K pages/hr, raw HTML capture
⚙️
PARSE
8K pages/sec at 20 workers, attribute extraction, no data loss
📍
GEO
Coordinates, district, metro, address resolution
🧠
ENRICHMENT
Spatial features, POI distances, urban morphology
📈
LIFECYCLE
Status tracking, time-on-market, change detection
🗄️
ARCHIVE
S3 HTML storage, historical records, full versioning
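The seven stages above can be read as an ordered chain that every listing record passes through. A minimal sketch under that assumption; the `Stage` enum, `run_pipeline`, and the handler signature are hypothetical names for illustration, not the platform's actual API:

```python
from enum import IntEnum
from typing import Callable, Dict


class Stage(IntEnum):
    DISCOVERY = 1
    FETCH = 2
    PARSE = 3
    GEO = 4
    ENRICHMENT = 5
    LIFECYCLE = 6
    ARCHIVE = 7


Handler = Callable[[dict], dict]


def run_pipeline(record: dict, handlers: Dict[Stage, Handler]) -> dict:
    """Pass a record through every stage in order and log the stages visited."""
    for stage in Stage:  # enums iterate in definition order
        handler = handlers.get(stage, lambda r: r)  # no-op if nothing is registered
        record = handler(record)
        record.setdefault("trace", []).append(stage.name)
    return record


# Hypothetical usage: only the PARSE stage does real work here.
result = run_pipeline(
    {"url": "https://example.com/listing/1"},
    {Stage.PARSE: lambda r: {**r, "parsed": True}},
)
print(result["trace"])
```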
02

Performance Benchmarks

Parser Performance
REAL-TIME
1 worker: 400 pages/sec
10 workers: 4,000 pages/sec
20 workers: 8,000 pages/sec
Fetch Performance
50 WORKERS
HTML pages / hour: 18,000
GraphQL pages (1 segment): 3,500
Spatial query / listing: < 1.5 sec
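The benchmark figures imply linear parser scaling of 400 pages/sec per worker. A rough capacity sketch under that assumption; scaling beyond 20 workers is not stated on the page, and the 1.2M backlog used below is the rounded task count from the stats above:

```python
PARSE_PAGES_PER_SEC_PER_WORKER = 400  # from the 1-worker benchmark
FETCH_PAGES_PER_HOUR = 18_000         # 50 fetch workers combined


def parse_throughput(workers: int) -> int:
    """Pages per second, assuming the linear scaling the benchmarks show."""
    return workers * PARSE_PAGES_PER_SEC_PER_WORKER


def fetch_hours(pages: int, pages_per_hour: int = FETCH_PAGES_PER_HOUR) -> float:
    """Hours needed to fetch a backlog at a constant hourly rate."""
    return pages / pages_per_hour


print(parse_throughput(20))                # 8000
print(round(fetch_hours(1_200_000), 1))    # 66.7
```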
03

Spatial Intelligence Layer

POI Coverage
6 Spatial Categories Mapped
Retail points: 6,000+
Food locations: 5,000+
Healthcare: 2,000+
Transit nodes: 1,800+
Education: 1,600+
Park polygons: 6,500+
Spatial Features
106 Geo Columns per Listing
Distance to metro
Distance to school
Urban compactness
Building footprint density
Road network exposure
100% spatial coverage · Building footprints · Planning layers · Coastline geometry
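A feature such as "Distance to metro" can be sketched as a nearest-point search over great-circle distances. The page does not describe how the platform computes it, so the haversine formula and the function names below are assumptions for illustration; a production system would more likely use a spatial index.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in metres

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance between two points, in metres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def distance_to_nearest(listing: Point, pois: Iterable[Point]) -> float:
    """Distance in metres to the closest POI; raises ValueError if pois is empty."""
    return min(haversine_m(listing, p) for p in pois)
```

Calling `distance_to_nearest(listing_coords, metro_coords)` once per POI category would yield one distance column per category.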
04

Listing Lifecycle Tracking

ACTIVE
Listing visible on market. Price, attributes, and metadata tracked in real time.
MISSING
Listing temporarily unavailable. System monitors for reappearance.
REMOVED
Listing confirmed deleted. Time-on-market recorded. History preserved.
ARCHIVED
Full historical record stored. HTML snapshot, price history, all events retained.
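The four statuses above behave like a small state machine. The page does not list the allowed moves between them, so the transition table below is an assumption, as are all names in this sketch:

```python
from datetime import datetime
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    MISSING = "missing"
    REMOVED = "removed"
    ARCHIVED = "archived"


# Assumed transition table; not confirmed by the source.
ALLOWED = {
    Status.ACTIVE: {Status.MISSING, Status.REMOVED},
    Status.MISSING: {Status.ACTIVE, Status.REMOVED},  # a missing listing may reappear
    Status.REMOVED: {Status.ARCHIVED},
    Status.ARCHIVED: set(),
}


def transition(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new


def time_on_market_days(first_seen: datetime, removed_at: datetime) -> int:
    """Whole days between first observation and confirmed removal."""
    return (removed_at - first_seen).days
```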
05

Database Architecture — 20+ Tables, 8 Schemas

core
Listings, properties, canonical records
discovery
URL registry, segment index, crawl state
fetch
Task queues, retry logic, HTML receipts
parse
Extracted attributes, raw fields, unknown attrs
geo
Coordinates, districts, address resolution
enrich
Spatial features, POI distances, morphology
archive
Price history, status events, HTML snapshots
ops
System health, worker logs, job metrics
06

Machine Learning — Valuation Engine

Core Model
CatBoost Regressor
Handles categorical and numerical features natively and is robust to noisy data. A production-grade choice for tabular ML. The model predicts price_per_m² rather than total price, which lowers target variance and improves generalization.
Versioned · SHAP-ready · Segmented
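One plausible reading of the lower-variance claim is that dividing price by area removes most of the spread caused by property size. The sketch below illustrates this with three synthetic listings (not real data) and shows how a per-m² prediction converts back to a total price; all names are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation relative to the mean (unitless spread)."""
    return pstdev(values) / mean(values)


# Synthetic listings: (area in m², total price). Illustrative only.
listings = [(30.0, 90_000.0), (60.0, 186_000.0), (120.0, 348_000.0)]

total_prices = [price for _, price in listings]
price_per_m2 = [price / area for area, price in listings]


def to_total_price(predicted_per_m2: float, area_m2: float) -> float:
    """Convert a per-m² prediction back to a total price."""
    return predicted_per_m2 * area_m2


# The per-m² target is far less spread out than the raw price.
print(coefficient_of_variation(total_prices) > coefficient_of_variation(price_per_m2))
```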
Training Scale
503,480 Training Records
Full training pipeline: listing snapshots → spatial enrichment → parquet export → model training → model registry → prediction API. Not a prototype — a production structure.
Dataset versioning · Feature store
Execution DAG
6-Stage Pipeline
Deterministic directed graph: geometry → spatial keys → distances → density → accessibility → market. Each stage depends on the output of the previous one, so the stages always run in this order.
As-of valuation · Historical recompute
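The six-stage dependency chain can be expressed as a directed acyclic graph and resolved with a topological sort. This is a sketch using Python's standard `graphlib` module (Python 3.9+); the stage identifiers mirror the names above, and the dictionary layout is an assumption, not the platform's actual configuration.

```python
from graphlib import TopologicalSorter

STAGES = ["geometry", "spatial_keys", "distances", "density", "accessibility", "market"]

# Map each stage to the set of stages it depends on.
dependencies = {STAGES[0]: set()}
for previous, current in zip(STAGES, STAGES[1:]):
    dependencies[current] = {previous}

# A linear chain has exactly one valid order, so the result is deterministic.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

Because every stage has a single predecessor, the resolved order is the same as the order in which the stages are listed.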
08

Reliability & Infrastructure

Stateless Workers
Spot-Instance Ready
All workers are stateless: they store no data locally and can be destroyed at any moment, and unfinished work is resumed automatically. Designed for AWS Spot instances. A server failure causes no data loss.
AWS Spot-ready · Auto-resume · Zero data loss · Idempotent
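Statelessness and idempotency together mean a task can be safely re-run after a worker is killed mid-job. A minimal sketch of that idea, with an in-memory class standing in for durable shared storage; `ResultStore` and `handle_task` are hypothetical names:

```python
from typing import Dict


class ResultStore:
    """Stand-in for durable shared storage (a database or object store in practice)."""

    def __init__(self) -> None:
        self._results: Dict[str, str] = {}

    def put_if_absent(self, task_id: str, value: str) -> bool:
        """Write the result once; repeated writes for the same task are ignored."""
        if task_id in self._results:
            return False
        self._results[task_id] = value
        return True

    def __len__(self) -> int:
        return len(self._results)


def handle_task(task_id: str, payload: str, store: ResultStore) -> None:
    """Stateless handler: all inputs arrive as arguments, all output goes to the store."""
    store.put_if_absent(task_id, payload.upper())


store = ResultStore()
handle_task("t1", "raw html", store)
handle_task("t1", "raw html", store)  # retry after a simulated worker loss
print(len(store))  # 1
```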
Task Queue Engine
PostgreSQL + SKIP LOCKED
All jobs are routed through PostgreSQL task queues using SELECT ... FOR UPDATE SKIP LOCKED, which lets many workers claim tasks in parallel without blocking on or double-claiming the same row. The queues include full retry logic and lease-based ownership.
Parallel-safe · Retry logic · Lease queues
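A claim query in this style might look like the sketch below. The table `fetch.tasks` and its columns (`status`, `locked_by`, `lease_expires_at`, `attempts`, `url`) are hypothetical, as is the `claim_task` helper; `conn` is assumed to be any DB-API connection to PostgreSQL that supports named `%(name)s` parameters, such as one from psycopg.

```python
# Hypothetical schema: table and column names are assumptions for illustration.
CLAIM_SQL = """
UPDATE fetch.tasks
SET status = 'running',
    locked_by = %(worker_id)s,
    lease_expires_at = now() + %(lease_seconds)s * interval '1 second',
    attempts = attempts + 1
WHERE id = (
    SELECT id
    FROM fetch.tasks
    WHERE status = 'queued'
       OR (status = 'running' AND lease_expires_at < now())  -- expired lease
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED  -- skip rows another worker has already locked
)
RETURNING id, url
"""


def claim_task(conn, worker_id: str, lease_seconds: int = 300):
    """Claim one task for this worker; returns the row, or None if the queue is empty."""
    with conn.cursor() as cur:
        cur.execute(CLAIM_SQL, {"worker_id": worker_id, "lease_seconds": lease_seconds})
        row = cur.fetchone()
    conn.commit()
    return row
```

If a worker dies while holding a task, its lease expires and the same query lets another worker reclaim the row, which is one way to combine retries with lease-based ownership.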
"We don't just collect listings —
we create a historical model of the market."
AI Valuation · Market Intelligence · Investment Analytics · Urban Research · ML Training Data