v0.3  ·  Draft  ·  2026-03-05  ·  Circle Solutions Group / Whalesolution
01 Confirmed Facts 28 confirmed
02 Open Items 13 unresolved
Every item here is a decision that has not been made. Sections marked PENDING depend on these items and must not be treated as final until resolved.
03 Current State — Circle 1

Deployment topology

Azure-Hosted Venue
  SQL  Azure SQL Logical Server · ~200 servers
  DB   Circle DB · one per venue
  DB   WSS DB · one per venue
  APP  API · .NET · talks to Circle DB
  APP  WSS · .NET · talks to WSS DB

Local MSSQL Venue
  SQL  On-premises SQL Server · on-prem
  DB   Circle DB
  DB   WSS DB
  APP  API · location TBC
  APP  WSS · location TBC

Known problems with Circle 1 structure

  • ~200 Azure SQL logical servers is expensive, operationally unmanageable, and does not scale
  • No elastic pool — resources cannot be shared across venues
  • No standardised resource group structure or naming convention
  • No IaC — deployments are manual
  • "Allow Azure Services" is likely enabled on SQL — broad attack surface, credential-only protection
  • No automated migration or decommission tooling
04 Current VPN / Networking Infrastructure

Architecture

Venue PCs  400+ endpoints
Windows 10/11 · WireGuard client runs as Windows service · Full tunnel 0.0.0.0/0
WireGuard UDP · encrypted · all traffic
Azure VM — Netmaker / WireGuard Server
Network
VNet 10.0.0.0/16 · Subnet 10.0.1.0/24
Public IP on NIC · IP forwarding ON
NSG Inbound
WireGuard UDP · SSH · Netmaker API
MQTT · Caddy 80/443
iptables (Netmaker-managed)
NETMAKER-ACL-IN · NETMAKER-ACL-FWD
netmakernat → MASQUERADE → eth0
Live issues
UFW forward: deny (routed) — DRIFTED
CoreDNS: CRASH-LOOPING
netmaker-caddy: running (not in repo)
NAT → internet via single public IP
Azure SQL & API
Public endpoints · whitelisted to VM public IP · "Allow Azure Services" likely enabled on SQL

Critical live VM state issues

Issue | Current state | Impact on Circle 2
iptables managed by Netmaker | Chains rebuilt on every netclient restart or peer update | Manual iptables rules will not survive; rules must go in /etc/ufw/before.rules
UFW forward policy: deny (routed) | Drifted from cloud-init setting of ACCEPT | Must be corrected before VNet forwarding works
CoreDNS crash-looping | Unknown root cause | Blocks private DNS resolution for venue PCs — hard blocker
netmaker-caddy running | Not in repo, added manually | Must be understood before VM changes
cloud-init not authoritative | Live VM has drifted | All changes must be made against live state, not the repo
Key insight: Venue PCs use a full tunnel (0.0.0.0/0) — all traffic already goes through the WireGuard tunnel. Once the VM is configured to forward VNet-bound traffic into the VNet, no changes are needed on the 400+ venue PCs or their WireGuard client configs.
05 Target Architecture — Circle 2

Deployment topology

Shared Infrastructure — one, permanent

SQL
  SQL   sql-circle2-prod · Azure SQL Logical Server · no public endpoint
  POOL  Elastic Pool(s) · all venue databases
  DB    db-c2-[venue] · Circle 2 app DB · one per venue
  DB    db-wss-[venue] · WSS DB · one per venue
Networking
  VNET  Azure VNet · 10.0.0.0/16
  SUB   snet-netmaker · 10.0.1.0/24 · existing
  SUB   snet-pe · 10.0.2.0/24 · SQL private endpoint NIC · new
  SUB   snet-appservice · 10.0.3.0/23 · App Service VNet Integration · new
  PE    Private Endpoint → sql-circle2-prod
  DNS   Private DNS Zone · privatelink.database.windows.net
Secrets
  KV    Key Vault · pending OI-07

Per-Venue Resources — ~400+ sets, one per venue
  APP   circle2-api-[venue] · .NET · VNet Integration → db-c2-[venue]
  APP   circle2-wss-[venue] · .NET · VNet Integration → db-wss-[venue]

How venue PCs reach Circle 2 after the VM change

Venue PC
WireGuard tunnel unchanged · no client config changes needed
encrypted tunnel · all traffic
Netmaker VM (modified)
/etc/ufw/before.rules — forwards traffic destined for 10.0.0.0/16 to eth0 (VNet) instead of NATting to internet
VNet private routing
10.0.2.x
SQL private endpoint
Windows app direct SQL connection
10.0.3.x
App Service VNet subnet
Windows app API calls
No public endpoints. No IP whitelisting. No changes to venue PCs.
06 Networking Design

Subnet plan (within existing 10.0.0.0/16 VNet)

Subnet name | CIDR | Purpose | Status
snet-netmaker | 10.0.1.0/24 | Netmaker VM | Existing
snet-pe | 10.0.2.0/24 | Private endpoint NICs (SQL, future) | New
snet-appservice | 10.0.3.0/23 | App Service VNet Integration | New
These CIDRs assume 10.0.2.x and 10.0.3.x/10.0.4.x are currently unallocated. Must be confirmed during Azure portal verification (Section 14).

/etc/ufw/before.rules change on Netmaker VM

Added to the *filter table section of before.rules, above the COMMIT line. Outside Netmaker's chain management entirely — survives reboots and UFW reloads.

# Circle 2 -- forward VPN peers to VNet private IP space (10.0.0.0/16)
-A ufw-before-forward -i wg0 -o eth0 -d 10.0.0.0/16 -j ACCEPT
-A ufw-before-forward -i eth0 -o wg0 -m state --state RELATED,ESTABLISHED -j ACCEPT

Pre-requisite

/etc/default/ufw must have DEFAULT_FORWARD_POLICY="ACCEPT". Verify on the live VM — this setting drifted once already.

Test procedure (single venue first)

  1. Apply before.rules change and verify DEFAULT_FORWARD_POLICY
  2. Run sudo ufw reload
  3. Confirm: sudo ufw status verbose — should show allow (routed)
  4. From one test venue PC: ping 10.0.2.x (SQL private endpoint IP)
  5. Test sqlcmd connection to SQL private endpoint from venue PC
  6. Confirm internet browsing from same venue PC is unaffected
  7. If all pass, the change applies to every venue automatically — no client changes needed
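The policy check in step 3 can be scripted so drift is caught mechanically rather than by eye. A minimal sketch — the helper name is hypothetical, not part of the runbook:

```shell
# Hypothetical helper: given the captured output of `sudo ufw status verbose`,
# confirm the routed forward policy is ACCEPT.
check_forward_policy() {
  # $1: text of `ufw status verbose`
  case "$1" in
    *"allow (routed)"*) echo "forward policy OK" ;;
    *)                  echo "forward policy DRIFTED - fix /etc/default/ufw and reload" ;;
  esac
}

# Example against a typical status line:
check_forward_policy "Default: deny (incoming), allow (outgoing), allow (routed)"
# -> forward policy OK
```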

Private DNS resolution path for venue PCs

Venue PC
DNS query for sql-circle2-prod.database.windows.net
all DNS traffic goes through WireGuard tunnel
Netmaker VM
Forwards DNS query to Azure DNS 168.63.129.16 — reachable inside VNet
Depends on CoreDNS being healthy (OI-12)
forwarded to Azure DNS inside VNet
Azure Private DNS Zone
privatelink.database.windows.net — returns A record: 10.0.2.x
private IP returned to venue PC
Venue PC sends TCP to 10.0.2.x
Traffic to 10.0.2.x routes through tunnel → before.rules FORWARD rule → into VNet
VNet private routing
SQL Private Endpoint answers
Connection established · no public internet involved · no IP whitelist needed
Depends on CoreDNS being healthy (OI-12). The CoreDNS crash-loop must be diagnosed and resolved before this resolution path can be tested. If CoreDNS is not in the DNS path for venue PCs, a different forwarding approach may be used — this must be determined as part of the Section 15 investigation.
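Once CoreDNS is healthy, the resolution path can be verified from a venue PC by checking that the FQDN resolves into snet-pe rather than to a public address. A small sketch of that check — the helper name is hypothetical; the 10.0.2.x range comes from the subnet plan above:

```shell
# Hypothetical check: the resolved address must sit inside snet-pe
# (10.0.2.0/24). A public IP here means the query bypassed the private zone.
resolves_to_private_endpoint() {
  # $1: IP address returned by the venue PC's resolver
  case "$1" in
    10.0.2.*) echo "private endpoint IP - DNS path OK" ;;
    *)        echo "public or unexpected IP - still resolving via public DNS" ;;
  esac
}

# On a venue PC the input would come from e.g.
#   nslookup sql-circle2-prod.database.windows.net
resolves_to_private_endpoint "10.0.2.5"
# -> private endpoint IP - DNS path OK
```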
07 Resource Group Structure Pending OI-06
PENDING — blocked on OI-06

Current leading option based on constraints (elastic pool cannot be in a per-venue group):

  • rg-circle2-shared — SQL logical server, elastic pools, VNet subnets, private endpoints, private DNS zones, Key Vault
  • rg-circle2-venues — all per-venue App Services
All resources tagged with venue:[slug] and product:circle2 for cost filtering and access scoping. Tags replace per-venue resource groups for cost allocation purposes.
08 Naming Convention Pending OI-05 OI-06
PENDING — slug format (OI-05) must be confirmed before any resources are named
Resource Type | Pattern | Example
Resource Group (shared) | rg-circle2-shared | rg-circle2-shared
Resource Group (venues) | rg-circle2-venues | rg-circle2-venues
SQL Logical Server | sql-circle2-prod | sql-circle2-prod
Elastic Pool (app DBs) | ep-circle2-app | ep-circle2-app
Elastic Pool (WSS DBs) | ep-circle2-wss | ep-circle2-wss
Database (Circle 2 app) | db-c2-[slug] | db-c2-[slug]
Database (WSS) | db-wss-[slug] | db-wss-[slug]
App Service (API) | circle2-api-[slug] | circle2-api-[slug]
App Service (WSS) | circle2-wss-[slug] | circle2-wss-[slug]
App Service Plan | asp-circle2-prod-[nn] | asp-circle2-prod-01
Key Vault | kv-circle2-prod | kv-circle2-prod
Private Endpoint | pe-sql-circle2 | pe-sql-circle2
Private DNS Zone | privatelink.database.windows.net | Azure fixed format
VNet subnet (PE) | snet-pe | snet-pe
VNet subnet (App Service) | snet-appservice | snet-appservice
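Since every per-venue resource name embeds the slug, whatever format OI-05 lands on should be validated before anything is created. A sketch under assumed rules — lowercase letters, digits and hyphens, no leading/trailing hyphen, max 20 characters so circle2-api-[slug] stays well inside App Service name limits. The actual format is still undecided:

```shell
# Sketch only -- slug rules here are assumptions pending OI-05.
valid_slug() {
  case "$1" in
    ''|*[!a-z0-9-]*) return 1 ;;  # empty, or contains illegal characters
    -*|*-)           return 1 ;;  # no leading or trailing hyphen
  esac
  [ "${#1}" -le 20 ]              # length cap to keep derived names short
}

valid_slug "north-sydney-rsl" && echo "ok"
# -> ok
```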
09 Database Layer Pending OI-04
PENDING — pool tier, DTU vs vCore, one pool vs two — all depend on OI-04

What is confirmed

  • One Azure SQL logical server: sql-circle2-prod, Australia East (Sydney)
  • Public network access: disabled
  • Private endpoint NIC in snet-pe, IP in 10.0.2.x
  • Private DNS Zone: privatelink.database.windows.net
  • Each venue gets two databases: db-c2-[slug] and db-wss-[slug]
  • Both databases live inside the elastic pool(s), not as standalone databases

What is pending (OI-04)

  • Whether one pool or two (app DBs and WSS DBs separate or combined)
  • Pool pricing model: DTU (Standard/Premium) or vCore (General Purpose)
  • Pool tier and size (e.g. Standard 400 eDTU, or General Purpose 4 vCores)
  • Per-database min/max DTU or vCore allocation within the pool
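For orientation only, pool creation under the two-pool option might look like the following. Every capacity value is a placeholder pending OI-04, and the command-building helper is purely illustrative — only the server and pool names follow Section 08:

```shell
# Illustrative only -- edition and eDTU figures are placeholders pending
# OI-04; this just shapes the eventual `az sql elastic-pool create` call.
pool_create_cmd() {
  # $1: pool name, $2: eDTU capacity (placeholder)
  echo "az sql elastic-pool create" \
       "--resource-group rg-circle2-shared" \
       "--server sql-circle2-prod" \
       "--name $1" \
       "--edition Standard" \
       "--capacity $2" \
       "--db-min-capacity 0 --db-max-capacity 100"
}

# If OI-04 lands on two pools:
pool_create_cmd ep-circle2-app 400
pool_create_cmd ep-circle2-wss 200
```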
10 Compute Layer Pending OI-02 OI-03
PENDING — App Service Plan OS depends on OI-02. Plan sharing strategy depends on OI-03.

What is confirmed

  • One App Service per venue for Circle 2 API: circle2-api-[slug]
  • One App Service per venue for WSS: circle2-wss-[slug]
  • Both use VNet Integration into snet-appservice to reach SQL via private endpoint
  • Neither is publicly accessible — no inbound public endpoint required
  • Support and admin staff access WSS via VPN only — no public FQDN needed for WSS

What is pending

  • App Service Plan OS (Windows or Linux) — blocked on OI-02 (.NET version)
  • Number of App Service Plans and venues-per-plan grouping — blocked on OI-03
  • App Service Plan SKU tier — blocked on OI-02 and OI-03
11 Secret Management Pending OI-07
PENDING — approach depends on OI-07

What is confirmed

  • Each App Service needs a connection string to its venue's database
  • Connection strings must not be committed to source control or the IaC repo
  • Support team accesses WSS over VPN — their access is network-level, not portal-level

What is pending (OI-07)

  • Whether Azure Key Vault is used — one central vault vs one per venue
  • Whether App Service Application Settings alone are sufficient
  • Who on the support/admin team requires portal-level visibility to secrets
12 Automation Strategy Pending OI-02 OI-03 OI-05 OI-06 OI-07
PENDING — Bicep module design and pipeline structure cannot be finalised until OI-02, OI-03, OI-05, OI-06, OI-07 are resolved

Confirmed tooling

  • IaC: Bicep — native ARM, no state file, first-class Azure support, works natively with Azure DevOps
  • Pipelines: Azure DevOps — extend existing org, not introduce a new tool
  • Service connection: must be created from scratch — nothing exists currently
  • Venue manifest: venues.json — single source of truth for all venue parameters, drives all pipeline loops

Repository structure

circle2-infra/
|-- bicep/
|   |-- modules/
|   |   |-- sql-database.bicep       # creates one venue DB in the elastic pool
|   |   |-- app-service.bicep        # creates one venue App Service
|   |   |-- app-service-plan.bicep   # creates or references a shared App Service Plan
|   |   `-- private-endpoint.bicep   # SQL private endpoint (one-time, shared)
|   |-- shared/
|   |   `-- main-shared.bicep        # VNet subnets, SQL server, pool, PE, DNS, KV
|   `-- venue/
|       `-- main-venue.bicep         # per-venue: databases + App Services
|-- pipelines/
|   |-- deploy-shared.yml            # one-time shared infrastructure pipeline
|   |-- deploy-venue.yml             # per-venue deploy (looped from venues.json)
|   `-- migrate-venue.yml            # per-venue DB migration pipeline
`-- venues/
    `-- venues.json                  # venue manifest -- source of truth

venues.json structure (draft)

[
  {
    "slug": "TBD",
    "displayName": "Venue Full Name",
    "sourceType": "azure",
    "sourceServer": "original-logical-server.database.windows.net",
    "sourceCircleDb": "CircleDB",
    "sourceWssDb": "WSSDB",
    "appServicePlan": "asp-circle2-prod-01",
    "environment": "prod"
  }
]
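The manifest-driven loop can be sketched as follows (jq assumed available). Each echoed line stands in for an actual deployment; in deploy-venue.yml the pipeline would execute it against bicep/venue/main-venue.bicep:

```shell
# Sketch only: expands the venue manifest into one deployment command per
# venue. The real pipeline would run each command, not echo it.
venue_deploy_cmd() {
  # $1: venue slug, $2: App Service Plan name
  echo "az deployment group create" \
       "--resource-group rg-circle2-venues" \
       "--template-file bicep/venue/main-venue.bicep" \
       "--parameters slug=$1 appServicePlan=$2"
}

# Stand-in manifest so the sketch runs anywhere; the pipeline would read
# venues/venues.json from the repo instead.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
[ { "slug": "demo-venue", "appServicePlan": "asp-circle2-prod-01" } ]
EOF

jq -r '.[] | "\(.slug) \(.appServicePlan)"' "$manifest" | while read -r slug plan; do
  venue_deploy_cmd "$slug" "$plan"
done
rm -f "$manifest"
```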

What can be fully automated

  • Shared infrastructure provisioning — one-time Bicep deploy
  • Per-venue database creation in elastic pool — loop over venues.json
  • Per-venue App Service creation and App Service Plan assignment
  • Connection string injection into Key Vault or App Settings
  • App artifact deployment via Azure DevOps zip deploy
  • Database copy from source Azure SQL logical server to elastic pool (az sql db copy)
  • Post-deployment health check on API endpoint
  • Resource tagging with venue and product tags
  • Decommission of Circle 1 SQL logical server after rollback window (OI-11)
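As one example of the automatable steps above, connection string injection could look like this. The secret name format (sqlconn-[slug]) and the connection string shape are assumptions, since OI-07 is unresolved:

```shell
# Hypothetical helper -- string shape and secret naming are assumptions
# pending OI-07; only the server name follows Section 09.
venue_conn_string() {
  # $1: database name (e.g. db-c2-[slug])
  printf 'Server=tcp:sql-circle2-prod.database.windows.net,1433;Database=%s;Encrypt=True;' "$1"
}

# Pipeline step (commented out -- requires a live Key Vault):
# az keyvault secret set --vault-name kv-circle2-prod \
#   --name "sqlconn-$slug" --value "$(venue_conn_string "db-c2-$slug")"
venue_conn_string "db-c2-demo"
```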

What requires manual steps

  • Netmaker VM iptables/UFW change — one-time, test on one venue first
  • CoreDNS investigation and fix (Section 15) — prerequisite before DNS testing
  • First-time Azure DevOps service connection setup
  • Local MSSQL venue migrations if remote sqlpackage access is not available (OI-09)
  • Cutover coordination communication per venue (maintenance window notification)
13 Migration Strategy
The Circle 1 and Circle 2 schemas are identical. Migration is a copy operation — no ETL, no transformation, no schema upgrade. Each venue migration covers two databases: the Circle 2 app database and the WSS database.

Type A — Azure SQL to Elastic Pool (~200 Azure-hosted venues)

Recommended method: az sql db copy — copies cross-logical-server directly into the elastic pool. No intermediate file, no staging storage. Fully automated in migrate-venue.yml.

# Circle 2 app database
az sql db copy \
  --resource-group <source-rg> \
  --server <source-logical-server> \
  --name CircleDB \
  --dest-resource-group rg-circle2-shared \
  --dest-server sql-circle2-prod \
  --dest-name db-c2-[slug] \
  --elastic-pool ep-circle2-app

# WSS database
az sql db copy \
  --resource-group <source-rg> \
  --server <source-logical-server> \
  --name WSSDB \
  --dest-resource-group rg-circle2-shared \
  --dest-server sql-circle2-prod \
  --dest-name db-wss-[slug] \
  --elastic-pool ep-circle2-wss

Type B — Local MSSQL to Elastic Pool (on-premises venues)

Resolution depends on OI-09 (remote access via Netmaker).

Path B1 — Remote access available (automated)

sqlpackage /Action:Export \
  /SourceServerName:<venue-local-sql-ip> \
  /SourceDatabaseName:CircleDB \
  /TargetFile:venue-[slug]-circle.bacpac

az storage blob upload \
  --account-name st-circle2-migration \
  --file venue-[slug]-circle.bacpac \
  --container-name migration-staging \
  --name [slug]/circle.bacpac

sqlpackage /Action:Import \
  /TargetServerName:sql-circle2-prod.database.windows.net \
  /TargetDatabaseName:db-c2-[slug] \
  /SourceFile:venue-[slug]-circle.bacpac

# sqlpackage cannot import directly into an elastic pool -- move the
# imported database into the pool afterwards:
az sql db update \
  --resource-group rg-circle2-shared \
  --server sql-circle2-prod \
  --name db-c2-[slug] \
  --elastic-pool ep-circle2-app

Path B2 — No remote access (manual export required)

  1. Provide the venue with a script that exports the database to a .bacpac and uploads it to the migration-staging container in the st-circle2-migration storage account
  2. Pipeline monitors the container and triggers the import automatically on upload detection
  3. Import into elastic pool proceeds identically to Path B1 step 3
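The upload-detection step in Path B2 reduces to checking blob existence on a schedule. A sketch — the scheduling and trigger mechanism are assumed; only the existence check is shown:

```shell
# Sketch of the upload-detection check. `az storage blob exists` returns JSON
# like {"exists": true}; this helper interprets that output.
blob_ready() {
  # $1: JSON output of `az storage blob exists`
  case "$1" in
    *'"exists": true'*) return 0 ;;
    *)                  return 1 ;;
  esac
}

# Scheduled pipeline step (commented out -- needs the live storage account):
# status=$(az storage blob exists --account-name st-circle2-migration \
#            --container-name migration-staging --name "$slug/circle.bacpac")
# blob_ready "$status" && echo "trigger import for $slug"
```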

WSS database migration (recommended early track)

Pending OI-10: if WSS can be migrated while Circle 1 is still live, it should be done as a separate earlier track. This validates the entire Azure stack before the higher-risk Circle 2 database cutover.

  1. Migrate all WSS databases into the elastic pool
  2. Deploy all WSS App Services and confirm connection to WSS DBs via private endpoint
  3. Confirm support/admin staff can reach WSS over VPN
  4. Only then begin Circle 2 app database migrations and venue cutovers

Cutover process (per venue)

  1. Pre-cutover — days in advance, venue still on Circle 1
  2. Cutover window — venue in maintenance
  3. Post-cutover
  4. Rollback

The Circle 1 database is never modified during migration — only copied. Rollback is a configuration change: repoint the Circle 2 Windows app back to the Circle 1 server. Circle 1 remains live and unchanged throughout.
14 Required Azure Portal Verification

Must be confirmed by connecting to the Azure portal before PENDING sections can be finalised.

15 Pre-Work — VM Issues to Resolve Before Circle 2 4 blockers
These must be addressed before Circle 2 networking can be tested. They are prerequisite work on the existing VM — not part of the Circle 2 deployment pipeline.

15.1 Capture live VM state (do this first)

sudo iptables-save > /tmp/iptables-live-$(date +%Y%m%d).txt
sudo ufw status verbose > /tmp/ufw-live-$(date +%Y%m%d).txt
docker ps -a > /tmp/docker-live-$(date +%Y%m%d).txt
docker compose ls > /tmp/compose-live-$(date +%Y%m%d).txt
docker exec netmaker-caddy cat /etc/caddy/Caddyfile > /tmp/caddyfile-live-$(date +%Y%m%d).txt

15.2 Audit netmaker-caddy

  • Inspect: docker inspect netmaker-caddy
  • Identify what ports it is binding (likely 80/443 and the Netmaker API port)
  • Confirm what it is proxying: dashboard, API, or both
  • Confirm before.rules changes do not interfere with its port bindings
  • Document the Caddyfile and add it to the repo

15.3 CoreDNS crash-loop (OI-12) — hard blocker

Private endpoints require venue PCs to resolve sql-circle2-prod.database.windows.net to the private IP (10.0.2.x). If CoreDNS is in the DNS path for venue PCs, this crash-loop is a hard blocker for Circle 2.

docker logs --tail 100 netmaker-coredns
docker inspect netmaker-coredns
sudo ss -ulnp | grep :53
systemctl status systemd-resolved
resolvectl status

15.4 UFW forward policy

grep DEFAULT_FORWARD_POLICY /etc/default/ufw

# If not ACCEPT:
sudo sed -i 's/DEFAULT_FORWARD_POLICY="DROP"/DEFAULT_FORWARD_POLICY="ACCEPT"/' /etc/default/ufw
sudo ufw reload
sudo ufw status verbose   # confirm 'allow (routed)'