Case study

FlowCast: human-in-the-loop demand forecasting for an energy retailer

Client: A regional Japanese energy retailer
Scale: ~1,800 end-customer accounts · 150M+ slot-level demand records
Buyer: Taka Takizawa, CEO
Industry: Energy / utilities (electricity retail)
Geography: Japan
Status: in production

Executive summary

Client. A regional Japanese electricity retailer serving roughly 1,800 end-customer accounts across residential and industrial voltage tiers. CEO Taka Takizawa was the buyer; the system is run by the operations team.

Solution. FlowCast: a human-in-the-loop forecasting system that replaces the daily Excel workflow the operations team had used to predict next-day electricity demand in 30-minute slots. A Python worker runs Similar Day Search overnight on five years of consolidated actuals and weather data. The operator reviews the forecast on a web dashboard, adjusts where needed, then submits it to the retailer's existing JEPX-facing demand management system through that system's CSV submission API.

Outcome. The daily cycle moved from a multi-hour gather-and-build workflow to a focused review-and-submit step. The system runs the full production book, syncs actuals daily and weekly, ingests weather actuals and forecasts continuously, and alerts Slack when a cron job fails, MAPE spikes, or a sync window is missed. Every forecast is reproducible from a (strategy_version, snapshot_date, worker_image) lineage stamp.

Why it landed.

The pain was concrete and daily. Every weekday, before market close, the operator had to produce 48 half-hourly slots per client group from a fragile spreadsheet, with separate browser tabs for weather lookup and submission.
The hard part wasn't the forecasting math. It was the plumbing around it: three actuals sources (legacy CSV, kakuhou confirmed, nippou rolling), two weather locations matching the client's service area, partial-failure-tolerant sync jobs, and a write-back path to the existing CSV submission API.
Human-in-the-loop framing matched the buyer's risk tolerance. Bad market-close submissions are expensive. The algorithm proposes, the operator approves.
The system fit the operator's existing workflow instead of replacing it. Single dashboard, one login, inline edits on the predicted slots. No second vendor portal.

The brief

The operations team ran daily next-day demand forecasting out of a single Excel workbook (BG実績日量ver3.xlsm). Every weekday afternoon, before market close, the cycle was:

Pull yesterday's actuals from the existing demand management system
Look up the day's weather actuals from WeatherNews in a second tab
Find similar historical days by hand
Build next-day predictions for 48 30-minute slots, per client group
Re-enter the result into the submission UI

Every step compounded errors. Weather lookups happened in a separate browser tab. Client-group breakouts depended on someone re-keying numbers. Forecast quality was usable, but the cycle time and the failure surface were the actual cost.

The brief was small and specific: take the spreadsheet workflow, give it a real system, keep the operator in the loop, and make sure nothing submitted was wrong.

What we built

FlowCast is a three-tier system on top of a Postgres system of record:

Review UI (Vercel) → Edge Functions (Supabase) → Prediction Worker (Fly.io Tokyo)
                                ↓
                       Supabase Postgres (Tokyo)
                                ↑
            external APIs: existing demand management system + WeatherNews

The operator's day:

Overnight, a worker pulls confirmed actuals from yesterday and ingests today's weather actuals plus tomorrow's forecast
The worker runs Similar Day Search across five years of consolidated actuals weighted by weather and seasonality, producing 48 slot predictions per client group
The operator opens the dashboard mid-morning, reviews the forecast against the similar days the engine leaned on, adjusts slots that look off, and submits the reviewed forecast to the existing demand management system through its CSV submission API

Behind the scenes, the system carries the load:

Three actuals sources (legacy CSV 2020-2025, confirmed kakuhou 2026+, rolling 2-month nippou) reconciled by a consolidated view so the operator never sees source-switching logic
Yearly-partitioned tables keep queries against five years of 30-minute data fast
Cron-triggered sync jobs (daily, weekly, continuous) with partial-failure guards so a single bad day doesn't poison the pipeline
Bearer-token authentication on every edge function (constant-time compare, fails closed) so the API surface rejects unauthenticated callers
Lineage stamps on every forecast run so any prediction can be reproduced from its inputs
Slack watchdogs that page when cron jobs fail, MAPE spikes, or sync windows are missed
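
As one illustration of the partial-failure guards in the list above, a sync job can process each day independently and record failures instead of aborting the run. This is a minimal sketch under stated assumptions, not the production job: `sync_days`, `fetch_day`, `store_day`, `SyncReport`, and the retry budget are all hypothetical names and values.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Callable


@dataclass
class SyncReport:
    """Outcome of one sync run: which days landed and which did not."""
    succeeded: list = field(default_factory=list)
    failed: dict = field(default_factory=dict)  # day -> last error message


def sync_days(
    start: date,
    end: date,
    fetch_day: Callable[[date], list],         # hypothetical: returns 48 slot values
    store_day: Callable[[date, list], None],   # hypothetical: writes one day
    max_attempts: int = 3,                     # assumed retry budget
) -> SyncReport:
    """Sync each day in [start, end] on its own so one bad day cannot abort the rest."""
    report = SyncReport()
    day = start
    while day <= end:
        last_error = ""
        for _ in range(max_attempts):
            try:
                slots = fetch_day(day)
                if len(slots) != 48:
                    raise ValueError(f"expected 48 slots, got {len(slots)}")
                store_day(day, slots)
                report.succeeded.append(day)
                break
            except Exception as exc:  # guard: remember the error, retry or move on
                last_error = str(exc)
        else:
            # Every attempt failed: record the day and continue with the next one.
            report.failed[day] = last_error
        day += timedelta(days=1)
    return report
```

Days that end up in `report.failed` are the kind of signal a watchdog could forward to Slack, while the days that did sync remain usable.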

How it runs in production

The Similar Day Search algorithm itself is intentionally simple: find historical days that resemble the target day by weather, day-of-week, season, and client events, weight them, take their weighted average, and apply per-group adjustments.
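
To make the shape of that algorithm concrete, here is a minimal sketch, not FlowCast's actual implementation. The distance function, its coefficients, the weekday/weekend simplification of day-of-week, the inverse-distance weighting, and every name (`DayRecord`, `distance`, `similar_day_forecast`) are assumptions for illustration. Client events are omitted.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DayRecord:
    day: date
    mean_temp_c: float   # assumed weather feature
    slots: tuple         # 48 half-hourly demand values


def _season_gap(a: date, b: date) -> int:
    """Approximate circular day-of-year gap, so late December sits near early January."""
    diff = abs(a.timetuple().tm_yday - b.timetuple().tm_yday)
    return min(diff, 365 - diff)


def distance(target_day: date, target_temp_c: float, cand: DayRecord) -> float:
    """Illustrative distance: smaller means more similar. Coefficients are assumptions."""
    temp_term = abs(target_temp_c - cand.mean_temp_c)
    same_day_type = (target_day.weekday() >= 5) == (cand.day.weekday() >= 5)
    weekday_term = 0.0 if same_day_type else 5.0
    season_term = _season_gap(target_day, cand.day) / 30.0
    return temp_term + weekday_term + season_term


def similar_day_forecast(target_day, target_temp_c, history, k=3, group_adjustment=1.0):
    """Return (forecast slots, similar days used) from the k nearest historical days."""
    ranked = sorted(history, key=lambda c: distance(target_day, target_temp_c, c))[:k]
    if not ranked:
        raise ValueError("no historical days available")
    # Inverse-distance weights: closer days count for more.
    weights = [1.0 / (1e-6 + distance(target_day, target_temp_c, c)) for c in ranked]
    total = sum(weights)
    n_slots = len(ranked[0].slots)
    forecast = [
        group_adjustment * sum(w * c.slots[i] for w, c in zip(weights, ranked)) / total
        for i in range(n_slots)
    ]
    return forecast, [c.day for c in ranked]
```

Returning the matched days alongside the forecast is what lets a review UI show the operator which days each prediction leaned on.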

The work that matters is everything around the algorithm:

Reconciling three actuals sources into one trustworthy read path
Two-location weather ingestion to match the client's service geography
A write path to the existing demand management system that matches its high/low CSV split format exactly
A review UI that surfaces the similar days the engine used so the operator can sanity-check the basis for each slot

The operator never sees the reconciliation, the partitioning, or the failure-recovery logic. They see a dashboard, a forecast, an evidence trail, and a submit button.
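
The reconciliation step can be pictured as a precedence rule applied per day and slot. In production this lives in a consolidated Postgres view. The Python below is only a sketch of the selection logic, and the precedence order (confirmed kakuhou over rolling nippou over legacy CSV), the row shape, and the names `SOURCE_PRIORITY` and `consolidate` are assumptions.

```python
from datetime import date

# Assumed precedence: lower number wins when several sources cover the same slot.
SOURCE_PRIORITY = {"kakuhou": 0, "nippou": 1, "legacy_csv": 2}


def consolidate(rows):
    """Pick one value per (day, slot) by source priority.

    rows: iterable of (source, day, slot, value) tuples.
    Returns {(day, slot): (value, source)}.
    """
    best = {}
    for source, day, slot, value in rows:
        key = (day, slot)
        current = best.get(key)
        # Keep the new row only if nothing is stored yet or it outranks the stored one.
        if current is None or SOURCE_PRIORITY[source] < SOURCE_PRIORITY[current[1]]:
            best[key] = (value, source)
    return best
```

Keeping the winning source next to each value makes it possible to audit which feed a given slot came from without exposing the switching logic to the operator.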

Outcome

After production cutover, the system is:

Driving daily forecasts for the client's full ~1,800-account book
Submitting reviewed forecasts to the existing demand management system through its CSV API
Self-healing on the common failure modes (partial sync failures, transient API errors, weather data gaps)
Producing forecast history that's reproducible from its lineage tags rather than from screenshots and spreadsheet copies

The headline result is cycle-time compression: the operator's time now goes to focused review, not data gathering. The second-order result is auditability: a prediction made three weeks ago can be re-derived from its inputs without guesswork.
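
A lineage stamp of this kind can be modelled as a small immutable record. The three field names come from the system itself. The `LineageStamp` class, the `run_key` helper, and the hashing scheme are assumptions added for illustration, as are the example values.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class LineageStamp:
    strategy_version: str
    snapshot_date: date
    worker_image: str

    def run_key(self) -> str:
        """Deterministic key: identical stamps always produce the same hash."""
        payload = asdict(self)
        payload["snapshot_date"] = self.snapshot_date.isoformat()
        canonical = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical values, for illustration only.
stamp = LineageStamp("sds-v1", date(2026, 1, 5), "worker:abc123")
print(stamp.run_key()[:12])
```

Two runs with the same stamp share a key, so a re-derived forecast can be compared against the stored one for that key.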

What we deliberately did not do

We did not promise FlowCast would beat the spreadsheet's accuracy out of the gate. We promised it would replace the workflow, leave the operator in control, and stop being a source of fragility. That's what shipped.

A second engagement would focus on:

Temperature-band clustering for sharper similar-day weighting
Per-client-group strategy selection (residential and industrial baselines behave differently enough that one model underfits both)
Tighter integration of scheduled client events (planned outages, industrial-customer event calendars)

These were scoped out of the initial build deliberately. The bar for production cutover was a stable system on a simple algorithm. Tuning surface area gets added after the foundation is trusted.

Stack

Layer	Technology
Review UI	Vite + React 19 + Tailwind 4 + Tremor (Vercel)
API gateway	Supabase Edge Functions (Deno / TypeScript), bearer-token authentication
Prediction worker	Python 3.12 on Fly.io, Tokyo region
Database	Supabase Postgres Pro (Tokyo)
External APIs	Existing demand management system (CSV high/low submission), WeatherNews (multi-location actuals + forecast)
Observability	pg_cron + Slack webhook watchdogs
Lineage	`strategy_version` + `worker_image` + `snapshot_date` stamps on every `forecast_run`
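
The bearer-token check named in the API gateway row runs in Deno / TypeScript in production. The same idea, a constant-time comparison that fails closed, is sketched here in Python to stay consistent with the worker examples. The function name `is_authorized` and its parameters are hypothetical.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True only for an exact 'Bearer <token>' match; everything else is rejected."""
    # Fail closed: if the server-side secret is missing, deny every request.
    if not expected_token:
        return False
    if not authorization_header or not authorization_header.startswith("Bearer "):
        return False
    presented = authorization_header[len("Bearer "):]
    # compare_digest is designed so timing does not reveal where the inputs differ.
    return hmac.compare_digest(
        presented.encode("utf-8"), expected_token.encode("utf-8")
    )
```

Checking for a missing secret first matters: a plain equality test against an unset token could otherwise accept an empty header value.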

What this means if you're considering something similar

If you have a daily or weekly forecasting, submission, or approval workflow running out of a spreadsheet today, FlowCast is the pattern.

The work is rarely the algorithm. It's the surrounding plumbing: actuals sync, external API integration, operator review UX, lineage for auditability, and alerts for when things break. Most of these are tractable in a 4-6 week build. The operator stays in the loop the whole time, which makes the cutover a workflow upgrade rather than a vendor switch.

If that fits your situation, the diagnostic is the right next step.