Deployment Guide
Overview
Energy Copilot is deployed using Databricks Asset Bundles (DAB), which manages all resources declaratively via YAML. The deployment creates jobs, pipelines, MLflow experiments, and the Databricks App in a single command.
The confirmed working deployment workflow is:
- Build the React frontend with Vite
- Import the built files to the Databricks workspace
- Deploy the app with `databricks apps deploy`
Prerequisites
Databricks Workspace Requirements
| Requirement | Notes |
|---|---|
| Databricks Workspace | Serverless-enabled (AWS, Azure, or GCP) |
| Unity Catalog | Enabled on the metastore |
| SQL Warehouse | At least one warehouse (Serverless Starter is sufficient) |
| Databricks Apps | Enabled on the workspace |
| Lakebase | Provisioned (optional — falls back to SQL Warehouse if absent) |
Local Tools
| Tool | Install Command | Required Version |
|---|---|---|
| Databricks CLI | `brew tap databricks/tap && brew install databricks` | v0.209.0+ |
| Node.js + npm | `brew install node` | v18+ |
| Python | `brew install python@3.11` | 3.10+ |
Verify your CLI version:
```sh
databricks version
# Databricks CLI v0.209.0
```

Environment Variables
The backend reads these environment variables at runtime. Set them via the Databricks Apps configuration or the bundle variables.
| Variable | Description | Required |
|---|---|---|
| `DATABRICKS_HOST` | Workspace URL (e.g. `https://myworkspace.cloud.databricks.com`) | Yes |
| `DATABRICKS_TOKEN` | Personal access token or M2M OAuth token | Yes |
| `DATABRICKS_WAREHOUSE_ID` | SQL Warehouse ID for fallback queries | Yes |
| `DATABRICKS_CATALOG` | Unity Catalog name (default: `energy_copilot`) | Yes |
| `LAKEBASE_INSTANCE_NAME` | Lakebase instance name | Optional |
| `LAKEBASE_HOST` | Lakebase Postgres hostname | Optional |
| `LAKEBASE_PASSWORD` | Lakebase Postgres password | Optional |
| `ANTHROPIC_API_KEY` | Only needed if NOT using Databricks FMAPI routing | Optional |
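As a sketch of how these variables might drive the Lakebase-to-warehouse fallback (the function name and return shape are illustrative, not the app's actual code):

```python
import os

def resolve_datasource() -> dict:
    """Pick Lakebase Postgres when fully configured, otherwise fall
    back to the SQL Warehouse. Illustrative sketch only."""
    required = ["DATABRICKS_HOST", "DATABRICKS_TOKEN",
                "DATABRICKS_WAREHOUSE_ID", "DATABRICKS_CATALOG"]
    missing = [v for v in required if not os.environ.get(v)]
    if missing:
        raise RuntimeError(f"Missing required variables: {missing}")

    lakebase = ["LAKEBASE_INSTANCE_NAME", "LAKEBASE_HOST", "LAKEBASE_PASSWORD"]
    if all(os.environ.get(v) for v in lakebase):
        return {"backend": "lakebase", "host": os.environ["LAKEBASE_HOST"]}
    # Lakebase not provisioned: use the warehouse for all queries
    return {"backend": "sql_warehouse",
            "warehouse_id": os.environ["DATABRICKS_WAREHOUSE_ID"]}
```

This mirrors the table: the four required variables are validated up front, and the three Lakebase variables only change behaviour when all of them are present.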
Step-by-Step Deployment
Step 1 — Authenticate
```sh
databricks auth login https://your-workspace.cloud.databricks.com --profile=my-profile
```

Verify authentication:
```sh
databricks current-user me --profile=my-profile
# {
#   "user_name": "you@company.com",
#   "display_name": "Your Name"
# }
```

Step 2 — Clone the Repository
```sh
git clone https://github.com/sourabhghose/databricks-energy-copilot.git
cd databricks-energy-copilot
```

Step 3 — Validate the Bundle
Before deploying, validate that all YAML is correct and all variable references resolve:
```sh
databricks bundle validate --target dev --profile=my-profile
```

Expected output: no errors. Warnings about missing experiments or models are normal on first deploy.
Step 4 — Deploy All Resources
```sh
databricks bundle deploy --target dev --profile=my-profile
```

This command creates or updates:
| Resource Type | What Is Created |
|---|---|
| Jobs | 14 serverless jobs: setup, ingest (NEMWEB, OpenElec, Weather, Solar, AER, CER LGC, ISP), ML forecast, market summary, data quality, simulator, snapshots |
| DLT Pipelines | 4 serverless Delta Live Tables pipelines |
| MLflow Experiments | 5 experiments: price, demand, wind, solar, anomaly |
Deployment typically takes 2–5 minutes.
Step 5 — Run the Setup Job
```sh
databricks bundle run job_00_setup --target dev --profile=my-profile
```

This job runs 8 tasks:
| Task | Description | Duration |
|---|---|---|
| `create_schemas` | Creates bronze, silver, gold, ml, tools schemas | ~30s |
| `create_tables` | Creates all Delta tables + grants SP permissions | ~2 min |
| `seed_isp_data` | AEMO ISP 2024 projects and capacity outlook | ~1 min |
| `seed_lgc_data` | CER LGC registry + spot prices | ~1 min |
| `seed_tariff_data` | AER CDR retail tariffs + components | ~1 min |
| `seed_facility_data` | OpenNEM facility generation timeseries | ~1 min |
| `seed_deal_data` | 5 counterparties, 3 portfolios, 50 sample trades | ~1 min |
| `backfill` | 90 days of synthetic NEM data (~130K rows) | 15–30 min |
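The ~130K-row backfill figure is consistent with 5-minute NEM dispatch intervals across the five NEM regions. That interval/region breakdown is an assumption about the synthetic schema, but the arithmetic checks out:

```python
days = 90
intervals_per_day = 24 * 60 // 5   # 288 five-minute dispatch intervals per day
nem_regions = 5                    # NSW1, QLD1, SA1, TAS1, VIC1
rows = days * intervals_per_day * nem_regions
print(rows)  # 129600, roughly the ~130K rows the backfill task reports
```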
Monitor progress:
```sh
databricks jobs get-run <RUN_ID> --profile=my-profile
```

Step 6 — Build the Frontend
```sh
cd app/frontend
npx vite build
# ✓ built in 45.2s
# dist/ created with index.html + assets/
```

The build output goes to `app/frontend/dist/`.
Step 7 — Import Frontend to Workspace
```sh
# Import the built frontend to the Databricks workspace
databricks workspace import-dir app/frontend/dist \
  /Workspace/Users/you@company.com/energy-copilot/frontend \
  --profile=my-profile \
  --overwrite
```

Step 8 — Deploy the App
```sh
databricks apps deploy energy-copilot \
  --source-code-path /Workspace/Users/you@company.com/energy-copilot \
  --profile=my-profile
```

The app URL is shown in the output:
```
App deployed: https://energy-copilot-<workspace-id>.aws.databricksapps.com
```

Step 9 — Run Post-Deploy Setup
The post-deploy script handles resources that require the App to be deployed first:
```sh
./post_deploy.sh my-profile dev
```

This script:
| Step | Description | Duration |
|---|---|---|
| Get App SP UUID | Looks up the Service Principal assigned to the App | ~5s |
| Grant Lakebase Perms | Grants the App SP SELECT on Lakebase gold schema | ~2 min |
| Create Synced Tables | Sets up continuous Delta-to-Postgres sync | ~3 min |
| Create Genie Spaces | Creates 12 AI/BI Genie spaces | ~2 min |
Step 10 — Start the Simulator
```sh
databricks bundle run job_09_simulator --target dev --profile=my-profile
```

The simulator writes fresh NEM market data every 30 seconds to all Gold tables. It runs continuously until cancelled.
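The simulator's write cadence can be sketched as a fixed-interval loop that aligns each tick to the schedule rather than sleeping a flat 30 seconds (which would drift). `write_fn` is a placeholder, not the actual job code:

```python
import time

def run_simulator(write_fn, interval_s=30, max_ticks=None):
    """Call write_fn(tick) every interval_s seconds until cancelled
    (or until max_ticks, useful for testing). Ticks are aligned to
    the schedule so slow writes don't accumulate drift."""
    tick = 0
    next_t = time.monotonic()
    while max_ticks is None or tick < max_ticks:
        write_fn(tick)
        tick += 1
        next_t += interval_s
        delay = next_t - time.monotonic()
        if delay > 0:
            time.sleep(delay)
```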
Step 11 — Verify the Deployment
```sh
# Check app health
curl https://<app-url>/health

# Check data source
curl https://<app-url>/api/health/datasource

# Check latest prices
curl https://<app-url>/api/prices/latest
```

Expected health response:
```json
{
  "status": "ok",
  "sql_connected": true,
  "lakebase_connected": true,
  "lakebase_fresh": true
}
```

Production Deployment
For production:
```sh
databricks bundle deploy --target prod --profile=prod-profile \
  --var="notification_email=ops@mycompany.com"
```

The prod target uses:
- `mode: production` (no `[dev]` prefixes on resource names)
- `catalog: energy_copilot` (production catalog name)
Managing the Deployment
Section titled “Managing the Deployment”Check Deployed Resources
```sh
# List all bundle resources
databricks bundle summary --target dev --profile=my-profile

# Check app status
databricks apps get energy-copilot --profile=my-profile

# List jobs
databricks jobs list --profile=my-profile | grep energy-copilot
```

Update After Code Changes
```sh
# Rebuild frontend (if UI changed)
cd app/frontend && npx vite build && cd ../..

# Redeploy (idempotent)
databricks bundle deploy --target dev --profile=my-profile
```

Stop the Simulator
```sh
databricks jobs list --profile=my-profile | grep "Simulator"
databricks jobs cancel-all-runs <JOB_ID> --profile=my-profile
```

Destroy All Resources
```sh
databricks bundle destroy --target dev --profile=my-profile
```

This removes all bundle-managed resources. Data in Unity Catalog tables is preserved.
Troubleshooting
`bundle validate` fails with “warehouse not found”
The `warehouse_id` variable uses a name lookup: "Serverless Starter Warehouse". If your warehouse has a different name, override it:
```sh
databricks bundle deploy --target dev --var="warehouse_id=<your-warehouse-id>"
```

Find your warehouse ID:
```sh
databricks warehouses list --profile=my-profile
```

App shows mock data instead of live data
- Check `/health` — verify `sql_connected: true`
- Check `/api/health/datasource` — verify queries hit real backends
- Verify the SQL warehouse is running: `databricks warehouses list --profile=my-profile`
- Check app logs in the Databricks Apps UI → Logs tab
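If you script these checks (for example in CI), a small helper can assert the health payload shown in Step 11. The helper name is illustrative:

```python
import json

def is_healthy(payload: str) -> bool:
    """Return True when the /health payload reports a live SQL
    connection, i.e. the app is serving real data rather than mocks."""
    data = json.loads(payload)
    return data.get("status") == "ok" and data.get("sql_connected") is True
```

Feed it the response body, e.g. `is_healthy(urlopen(f"{app_url}/health").read())` after fetching with `urllib.request`.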
Lakebase returns stale data
- Check synced table status: Unity Catalog > Tables > [table name] > Sync Status
- Re-run synced table creation: `./post_deploy.sh my-profile dev`
- Verify the App SP has `SELECT` on the Lakebase gold schema
Post-deploy script can’t find App SP
The `service_principal_id` field is only populated after the App is fully deployed and active. Wait 2–3 minutes after `databricks apps deploy` completes before running `post_deploy.sh`.
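If you want to avoid a fixed wait, a small polling helper works instead. Here `lookup` is a placeholder for however you fetch the SP UUID (for example by parsing `databricks apps get energy-copilot` output):

```python
import time

def wait_for_app_sp(lookup, attempts=10, delay_s=30):
    """Poll until the App's Service Principal UUID is populated.
    `lookup` is a callable returning the UUID string, or None/empty
    while the App is still coming up."""
    for _ in range(attempts):
        sp_id = lookup()
        if sp_id:
            return sp_id
        time.sleep(delay_s)
    raise TimeoutError("App Service Principal not populated yet")
```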
Genie spaces not appearing
Genie Spaces API is in preview. The `setup/04_create_genie_spaces.py` script includes a REST API fallback. If the SDK method fails:
```sh
python setup/04_create_genie_spaces.py --use-rest-api --profile=my-profile
```

Notebook import fails with “already exists”
```sh
databricks workspace import /path/to/notebook.py \
  --file=local.py \
  --format=SOURCE \
  --language=PYTHON \
  --overwrite \
  --profile=my-profile
```