Deployment topology
Known problems with Circle 1 structure
- ~200 Azure SQL logical servers is expensive, operationally unmanageable, and does not scale
- No elastic pool — resources cannot be shared across venues
- No standardised resource group structure or naming convention
- No IaC — deployments are manual
- "Allow Azure Services" is likely enabled on SQL — broad attack surface, credential-only protection
- No automated migration or decommission tooling
Architecture
Current Netmaker VM (diagram summary):
- Internet (0.0.0.0/0) → VNet 10.0.0.0/16 · Subnet 10.0.1.0/24
- Public IP on NIC · IP forwarding ON
- Inbound: WireGuard UDP · SSH · Netmaker API · MQTT · Caddy 80/443
- Netmaker-managed iptables chains: NETMAKER-ACL-IN · NETMAKER-ACL-FWD
- NAT: netmakernat → MASQUERADE → eth0
- UFW forward policy: deny (routed) — DRIFTED
- CoreDNS: CRASH-LOOPING
- netmaker-caddy: running (not in repo)
Critical live VM state issues
| Issue | Current State | Impact on Circle 2 |
|---|---|---|
| iptables managed by Netmaker | Chains rebuilt on every netclient restart or peer update | Manual iptables rules will not survive. Rules must go in /etc/ufw/before.rules |
| UFW forward policy: deny (routed) | Drifted from cloud-init setting of ACCEPT | Must be corrected before VNet forwarding works |
| CoreDNS crash-looping | Unknown root cause | Blocks private DNS resolution for venue PCs — hard blocker |
| netmaker-caddy running | Not in repo, added manually | Must be understood before VM changes |
| cloud-init not authoritative | Live VM has drifted | All changes must be made against live state, not the repo |
Deployment topology
How venue PCs reach Circle 2 after the VM change
- `/etc/ufw/before.rules` — forwards traffic destined for 10.0.0.0/16 to eth0 (the VNet) instead of NATting it to the internet
- Windows app direct SQL connection
- Windows app API calls
Subnet plan (within existing 10.0.0.0/16 VNet)
| Subnet name | CIDR | Purpose | Status |
|---|---|---|---|
| snet-netmaker | 10.0.1.0/24 | Netmaker VM | Existing |
| snet-pe | 10.0.2.0/24 | Private endpoint NICs (SQL, future) | New |
| snet-appservice | 10.0.3.0/23 | App Service VNet Integration | New |
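The two new subnets can be created with the Azure CLI. A sketch only — the VNet name `vnet-circle2` and resource group are assumptions, not confirmed names:

```shell
# snet-pe: private endpoint NICs (PE network policies typically disabled)
az network vnet subnet create \
  --resource-group rg-circle2-shared \
  --vnet-name vnet-circle2 \
  --name snet-pe \
  --address-prefixes 10.0.2.0/24 \
  --private-endpoint-network-policies Disabled

# snet-appservice: must be delegated to App Service for VNet Integration
az network vnet subnet create \
  --resource-group rg-circle2-shared \
  --vnet-name vnet-circle2 \
  --name snet-appservice \
  --address-prefixes 10.0.3.0/23 \
  --delegations Microsoft.Web/serverFarms
```

In practice these would live in `main-shared.bicep` rather than ad-hoc CLI; the commands show the required subnet properties.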
/etc/ufw/before.rules change on Netmaker VM
Added to the `*filter` table section of `before.rules`, above the COMMIT line. Outside Netmaker's chain management entirely — survives reboots and UFW reloads.

```
# Circle 2 -- forward VPN peers to VNet private IP space (10.0.0.0/16)
-A ufw-before-forward -i wg0 -o eth0 -d 10.0.0.0/16 -j ACCEPT
-A ufw-before-forward -i eth0 -o wg0 -m state --state RELATED,ESTABLISHED -j ACCEPT
```
Pre-requisite
/etc/default/ufw must have DEFAULT_FORWARD_POLICY="ACCEPT". Verify on the live VM — this setting drifted once already.
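Because this setting has drifted once already, the check can be scripted so drift is caught before each change window. A minimal sketch — the helper name and the file-path argument are illustrative; on the live VM the file is `/etc/default/ufw`:

```shell
# Hypothetical drift check: extract the UFW default forward policy
# from a ufw defaults file (normally /etc/default/ufw).
check_forward_policy() {
  grep '^DEFAULT_FORWARD_POLICY=' "$1" | cut -d'"' -f2
}

# Example gate in a deploy script:
# [ "$(check_forward_policy /etc/default/ufw)" = "ACCEPT" ] || exit 1
```

Running the gate before `ufw reload` turns a silent drift into a hard failure.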
Test procedure (single venue first)
- Apply the `before.rules` change and verify `DEFAULT_FORWARD_POLICY`
- Run `sudo ufw reload`
- Confirm: `sudo ufw status verbose` — should show `allow (routed)`
- From one test venue PC: `ping 10.0.2.x` (SQL private endpoint IP)
- Test a `sqlcmd` connection to the SQL private endpoint from the venue PC
- Confirm internet browsing from the same venue PC is unaffected
- If all pass: the change rolls out to every venue automatically — no client changes needed
Private DNS resolution path for venue PCs
- `sql-circle2-prod.database.windows.net` is resolved via Azure DNS at `168.63.129.16` — reachable only from inside the VNet; depends on CoreDNS being healthy (OI-12)
- The name CNAMEs to `privatelink.database.windows.net`, which returns an A record in `10.0.2.x`
- Traffic to that private IP reaches the VNet via the `before.rules` FORWARD rule

Resource group structure

Current leading option, based on constraints (an elastic pool cannot live in a per-venue resource group):
- rg-circle2-shared — SQL logical server, elastic pools, VNet subnets, private endpoints, private DNS zones, Key Vault
- rg-circle2-venues — all per-venue App Services
Resources are tagged `venue:[slug]` and `product:circle2` for cost filtering and access scoping. Tags replace per-venue resource groups for cost allocation purposes.

Naming convention

| Resource Type | Pattern | Example |
|---|---|---|
| Resource Group (shared) | rg-circle2-shared | rg-circle2-shared |
| Resource Group (venues) | rg-circle2-venues | rg-circle2-venues |
| SQL Logical Server | sql-circle2-prod | sql-circle2-prod |
| Elastic Pool (app DBs) | ep-circle2-app | ep-circle2-app |
| Elastic Pool (WSS DBs) | ep-circle2-wss | ep-circle2-wss |
| Database (Circle 2 app) | db-c2-[slug] | db-c2-[slug] |
| Database (WSS) | db-wss-[slug] | db-wss-[slug] |
| App Service (API) | circle2-api-[slug] | circle2-api-[slug] |
| App Service (WSS) | circle2-wss-[slug] | circle2-wss-[slug] |
| App Service Plan | asp-circle2-prod-[nn] | asp-circle2-prod-01 |
| Key Vault | kv-circle2-prod | kv-circle2-prod |
| Private Endpoint | pe-sql-circle2 | pe-sql-circle2 |
| Private DNS Zone | privatelink.database.windows.net | Azure fixed format |
| VNet subnet (PE) | snet-pe | snet-pe |
| VNet subnet (App Service) | snet-appservice | snet-appservice |
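As a sketch of the tagging scheme (resource names follow the table above; the command assumes the App Service already exists):

```shell
# Illustrative only: apply the cost-allocation tags to one venue's API App Service.
az resource tag \
  --resource-group rg-circle2-venues \
  --name circle2-api-<slug> \
  --resource-type "Microsoft.Web/sites" \
  --tags venue=<slug> product=circle2
```

Cost Management can then filter spend by the `venue` tag, replacing per-venue resource groups for allocation.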
What is confirmed
- One Azure SQL logical server: sql-circle2-prod, Australia East (Sydney)
- Public network access: disabled
- Private endpoint NIC in `snet-pe`, IP in `10.0.2.x`
- Private DNS Zone: `privatelink.database.windows.net`
- Each venue gets two databases: `db-c2-[slug]` and `db-wss-[slug]`
- Both databases live inside the elastic pool(s), not as standalone databases
What is pending (OI-04)
- Whether one pool or two (app DBs and WSS DBs separate or combined)
- Pool pricing model: DTU (Standard/Premium) or vCore (General Purpose)
- Pool tier and size (e.g. Standard 400 eDTU, or General Purpose 4 vCores)
- Per-database min/max DTU or vCore allocation within the pool
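Whichever option OI-04 lands on, pool creation is a single CLI call. A sketch for the Standard 400 eDTU example above — all sizing values are placeholders pending OI-04:

```shell
az sql elastic-pool create \
  --resource-group rg-circle2-shared \
  --server sql-circle2-prod \
  --name ep-circle2-app \
  --edition Standard \
  --capacity 400 \
  --db-min-capacity 0 \
  --db-max-capacity 100
```

Switching to vCore means changing `--edition` (e.g. GeneralPurpose) plus a `--family` value, so the OI-04 decision does not affect the surrounding automation.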
What is confirmed
- One App Service per venue for the Circle 2 API: `circle2-api-[slug]`
- One App Service per venue for WSS: `circle2-wss-[slug]`
- Both use VNet Integration into `snet-appservice` to reach SQL via the private endpoint
- Neither is publicly accessible — no inbound public endpoint required
- Support and admin staff access WSS via VPN only — no public FQDN needed for WSS
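The VNet Integration step can be sketched as follows (the VNet name `vnet-circle2` is an assumption):

```shell
az webapp vnet-integration add \
  --resource-group rg-circle2-venues \
  --name circle2-api-<slug> \
  --vnet vnet-circle2 \
  --subnet snet-appservice
```

The same call applies to `circle2-wss-<slug>`; in the final design this belongs in `app-service.bicep`, not ad-hoc CLI.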
What is pending
- App Service Plan OS (Windows or Linux) — blocked on OI-02 (.NET version)
- Number of App Service Plans and venues-per-plan grouping — blocked on OI-03
- App Service Plan SKU tier — blocked on OI-02 and OI-03
What is confirmed
- Each App Service needs a connection string to its venue's database
- Connection strings must not be committed to source control or the IaC repo
- Support team accesses WSS over VPN — their access is network-level, not portal-level
What is pending (OI-07)
- Whether Azure Key Vault is used — one central vault vs one per venue
- Whether App Service Application Settings alone are sufficient
- Who on the support/admin team requires portal-level visibility to secrets
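If Key Vault is chosen under OI-07, one common pattern — shown here as an illustrative sketch, not a decision — is a Key Vault reference in the App Service application settings, so the connection string never appears in source control or pipeline variables:

```shell
# Store the venue connection string (setting name and secret name are illustrative).
az keyvault secret set --vault-name kv-circle2-prod \
  --name "sqlconn-<slug>" --value "<connection-string>"

# Reference it from App Settings; App Service resolves it at runtime.
az webapp config appsettings set \
  --resource-group rg-circle2-venues --name circle2-api-<slug> \
  --settings "SqlConnection=@Microsoft.KeyVault(VaultName=kv-circle2-prod;SecretName=sqlconn-<slug>)"
```

This requires the App Service's managed identity to have get-secret access on the vault, which also answers part of the portal-visibility question: staff need vault access, not App Service config access.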
Confirmed tooling
- IaC: Bicep — native ARM, no state file, first-class Azure support, works natively with Azure DevOps
- Pipelines: Azure DevOps — extend existing org, not introduce a new tool
- Service connection: must be created from scratch — nothing exists currently
- Venue manifest: venues.json — single source of truth for all venue parameters, drives all pipeline loops
Repository structure
```
circle2-infra/
|-- bicep/
|   |-- modules/
|   |   |-- sql-database.bicep       # creates one venue DB in the elastic pool
|   |   |-- app-service.bicep        # creates one venue App Service
|   |   |-- app-service-plan.bicep   # creates or references a shared App Service Plan
|   |   `-- private-endpoint.bicep   # SQL private endpoint (one-time, shared)
|   |-- shared/
|   |   `-- main-shared.bicep        # VNet subnets, SQL server, pool, PE, DNS, KV
|   `-- venue/
|       `-- main-venue.bicep         # per-venue: databases + App Services
|-- pipelines/
|   |-- deploy-shared.yml            # one-time shared infrastructure pipeline
|   |-- deploy-venue.yml             # per-venue deploy (looped from venues.json)
|   `-- migrate-venue.yml            # per-venue DB migration pipeline
`-- venues/
    `-- venues.json                  # venue manifest -- source of truth
```
venues.json structure (draft)
```json
[
  {
    "slug": "TBD",
    "displayName": "Venue Full Name",
    "sourceType": "azure",
    "sourceServer": "original-logical-server.database.windows.net",
    "sourceCircleDb": "CircleDB",
    "sourceWssDb": "WSSDB",
    "appServicePlan": "asp-circle2-prod-01",
    "environment": "prod"
  }
]
```
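The manifest's `slug` drives every derived resource name. A minimal sketch of that derivation, following the naming convention table above (the helper itself is hypothetical):

```shell
# Hypothetical helper: derive the per-venue resource names from a manifest slug.
venue_resources() {
  slug="$1"
  echo "db-c2-${slug} db-wss-${slug} circle2-api-${slug} circle2-wss-${slug}"
}
```

Keeping this derivation in one place means a venue is defined entirely by its manifest entry — no per-venue names hard-coded in pipelines.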
What can be fully automated
- Shared infrastructure provisioning — one-time Bicep deploy
- Per-venue database creation in elastic pool — loop over venues.json
- Per-venue App Service creation and App Service Plan assignment
- Connection string injection into Key Vault or App Settings
- App artifact deployment via Azure DevOps zip deploy
- Database copy from source Azure SQL logical server to elastic pool (az sql db copy)
- Post-deployment health check on API endpoint
- Resource tagging with venue and product tags
- Decommission of Circle 1 SQL logical server after rollback window (OI-11)
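The per-venue loop at the heart of these pipelines can be sketched as follows (assumes `jq` is available on the agent; the parameter name is illustrative):

```shell
# Loop every venue in the manifest through the per-venue Bicep deployment.
for slug in $(jq -r '.[].slug' venues/venues.json); do
  az deployment group create \
    --resource-group rg-circle2-venues \
    --template-file bicep/venue/main-venue.bicep \
    --parameters slug="$slug"
done
```

In Azure DevOps the same loop would typically be expressed as a matrix or `${{ each }}` template expansion rather than a shell `for`, but the shape is identical.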
What requires manual steps
- Netmaker VM iptables/UFW change — one-time, test on one venue first
- CoreDNS investigation and fix (Section 15) — prerequisite before DNS testing
- First-time Azure DevOps service connection setup
- Local MSSQL venue migrations if remote sqlpackage access is not available (OI-09)
- Cutover coordination communication per venue (maintenance window notification)
Type A — Azure SQL to Elastic Pool (~200 Azure-hosted venues)
Recommended method: `az sql db copy` — copies cross-logical-server directly into the elastic pool. No intermediate file, no staging storage. Fully automated in `migrate-venue.yml`.
```shell
# Circle 2 app database
az sql db copy \
  --resource-group <source-rg> \
  --server <source-logical-server> \
  --name CircleDB \
  --dest-resource-group rg-circle2-shared \
  --dest-server sql-circle2-prod \
  --dest-name db-c2-[slug] \
  --elastic-pool ep-circle2-app

# WSS database
az sql db copy \
  --resource-group <source-rg> \
  --server <source-logical-server> \
  --name WSSDB \
  --dest-resource-group rg-circle2-shared \
  --dest-server sql-circle2-prod \
  --dest-name db-wss-[slug] \
  --elastic-pool ep-circle2-wss
```
Type B — Local MSSQL to Elastic Pool (on-premises venues)
Resolution depends on OI-09 (remote access via Netmaker).
Path B1 — Remote access available (automated)
```shell
sqlpackage /Action:Export \
  /SourceServerName:<venue-local-sql-ip> \
  /SourceDatabaseName:CircleDB \
  /TargetFile:venue-[slug]-circle.bacpac

az storage blob upload \
  --file venue-[slug]-circle.bacpac \
  --container-name migration-staging \
  --name [slug]/circle.bacpac

sqlpackage /Action:Import \
  /TargetServerName:sql-circle2-prod.database.windows.net \
  /TargetDatabaseName:db-c2-[slug] \
  /SourceFile:venue-[slug]-circle.bacpac \
  /TargetElasticPoolName:ep-circle2-app
```
Path B2 — No remote access (manual export required)
- Provide venue with a script that exports the database to a .bacpac and uploads to Azure Blob Storage container (st-circle2-migration)
- Pipeline monitors the container and triggers the import automatically on upload detection
- Import into elastic pool proceeds identically to Path B1 step 3
WSS database migration (recommended early track)
Pending OI-10: if WSS can be migrated while Circle 1 is still live, it should be done as a separate earlier track. This validates the entire Azure stack before the higher-risk Circle 2 database cutover.
- Migrate all WSS databases into the elastic pool
- Deploy all WSS App Services and confirm connection to WSS DBs via private endpoint
- Confirm support/admin staff can reach WSS over VPN
- Only then begin Circle 2 app database migrations and venue cutovers
Cutover process (per venue)
Pre-cutover — days in advance, venue still on Circle 1
Cutover window — venue in maintenance
Post-cutover
Rollback
The remaining unknowns must be confirmed by connecting to the Azure portal before the PENDING sections above can be finalised.
15.1 Capture live VM state (do this first)
```shell
sudo iptables-save > /tmp/iptables-live-$(date +%Y%m%d).txt
sudo ufw status verbose > /tmp/ufw-live-$(date +%Y%m%d).txt
docker ps -a > /tmp/docker-live-$(date +%Y%m%d).txt
docker compose ls > /tmp/compose-live-$(date +%Y%m%d).txt
docker exec netmaker-caddy cat /etc/caddy/Caddyfile > /tmp/caddyfile-live-$(date +%Y%m%d).txt
```
15.2 Audit netmaker-caddy
- Inspect: `docker inspect netmaker-caddy`
- Identify what ports it is binding (likely 80/443 and the Netmaker API port)
- Confirm what it is proxying: dashboard, API, or both
- Confirm before.rules changes do not interfere with its port bindings
- Document the Caddyfile and add it to the repo
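The port-binding check in step 2 can use a `docker inspect` format string rather than scanning the full JSON:

```shell
# Ports published on the host (empty if the container uses host networking):
docker inspect -f '{{json .HostConfig.PortBindings}}' netmaker-caddy

# Runtime port mappings as Docker sees them:
docker inspect -f '{{json .NetworkSettings.Ports}}' netmaker-caddy
```

Comparing both outputs against the `before.rules` change confirms there is no overlap with the forwarded VNet traffic.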
15.3 CoreDNS crash-loop (OI-12) — hard blocker
```shell
docker logs --tail 100 netmaker-coredns
docker inspect netmaker-coredns
sudo ss -ulnp | grep :53
systemctl status systemd-resolved
resolvectl status
```
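A common cause of CoreDNS crash-loops is systemd-resolved's stub listener already holding port 53. If the `ss` output above confirms that, one possible remediation — an assumption to verify against the actual logs, not a confirmed diagnosis for this VM — is to disable the stub listener:

```shell
# Assumption: port 53 is held by systemd-resolved's stub listener.
sudo sed -i 's/^#\?DNSStubListener=yes/DNSStubListener=no/' /etc/systemd/resolved.conf
sudo systemctl restart systemd-resolved

# Then restart CoreDNS and re-check:
docker restart netmaker-coredns
docker logs --tail 20 netmaker-coredns
```

If the stub listener was not the conflict, revert the change and continue with the log and inspect output.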
15.4 UFW forward policy
```shell
cat /etc/default/ufw | grep DEFAULT_FORWARD_POLICY

# If not ACCEPT:
sudo sed -i 's/DEFAULT_FORWARD_POLICY="DROP"/DEFAULT_FORWARD_POLICY="ACCEPT"/' /etc/default/ufw
sudo ufw reload
sudo ufw status verbose   # confirm 'allow (routed)'
```