New instance

Instructions in this section document the process of deploying a new instance or region.

Note

In order to deploy a new region, you need the following utilities: az (Azure CLI), kubectl, helm, psql, openssl, and envsubst.

If you are working with Cloud Shell, these utilities are already available.

Create database

Edit and use the snippet below to deploy the database using az-cli:

LOCATION=centralus
PSQL_NAME=megforms-${LOCATION:?}

PSQL_PASSWORD=$(openssl rand -base64 20)
echo "${PSQL_NAME:?} database password: ${PSQL_PASSWORD:?}"

az postgres flexible-server create \
    --resource-group meg \
    --name ${PSQL_NAME:?} \
    --location ${LOCATION:?} \
    --database-name megforms \
    --admin-user megforms \
    --admin-password "${PSQL_PASSWORD:?}" \
    --storage-size 128 \
    --backup-retention 30 \
    --geo-redundant-backup Enabled \
    --tier Burstable \
    --sku-name Standard_B2s \
    --tags "purpose=production" "region=${LOCATION:?}" \
    --version 14

See also

To get a full list of locations, run az account list-locations -o table.

To learn more about az postgres flexible-server create, visit az-cli documentation

Test connection

You can use the output connection string to try connecting to the database:

psql "postgresql://megforms:${PSQL_PASSWORD}@${PSQL_NAME}.postgres.database.azure.com/postgres?sslmode=require"
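Note that `openssl rand -base64` can emit `+`, `/`, and `=`, which are not valid unescaped inside a PostgreSQL connection URI. The sketch below (the variable values are hypothetical placeholders; substitute the real `PSQL_NAME` and `PSQL_PASSWORD` from Create database) percent-encodes the password with python3 before building the connection string:

```shell
# Hypothetical example values -- substitute the real PSQL_NAME and
# PSQL_PASSWORD from the "Create database" step.
PSQL_NAME=megforms-centralus
PSQL_PASSWORD='Zm9vYmFy+abc/def='

# Percent-encode the password: base64 output may contain + / = which
# are not valid unescaped inside a PostgreSQL connection URI.
ENCODED_PASSWORD=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' "${PSQL_PASSWORD:?}")
CONNECTION_STRING="postgresql://megforms:${ENCODED_PASSWORD}@${PSQL_NAME}.postgres.database.azure.com/postgres?sslmode=require"
echo "${CONNECTION_STRING}"
```

Pass `"${CONNECTION_STRING}"` to psql as shown above.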

Upgrading PostgreSQL Flexible Server

You can upgrade your Azure Database for PostgreSQL Flexible Server to a new major version using the following command:

VERSION=16

az postgres flexible-server upgrade \
  --resource-group meg \
  --name ${PSQL_NAME:?} \
  --version ${VERSION:?}

Important

After completing the major version upgrade, it is mandatory to run the ANALYZE command in the database.

postgres=> ANALYZE;

See the official Azure PostgreSQL Post upgrade documentation for details.

Performance optimization and tweaking

The new database will appear in Azure PostgreSQL flexible servers. You can edit the created database to tune the following settings as needed:

See also

Scaling database

Networking

Tick “Allow public access from any Azure service within Azure to this server”. You can later un-tick it and instead allow-list the Kubernetes cluster’s outbound IP address.

To view the list of IP addresses, use az network public-ip list -o table.
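Allow-listing a single IP can be done with a firewall rule. A minimal sketch, assuming `PSQL_NAME` from Create database is still set; `KUBERNETES_IP` is a hypothetical placeholder taken from the command above:

```shell
# KUBERNETES_IP is hypothetical -- take the real value from
# `az network public-ip list -o table`.
KUBERNETES_IP=20.40.0.10

# Allow only this address through the server firewall.
az postgres flexible-server firewall-rule create \
    --resource-group meg \
    --name ${PSQL_NAME:?} \
    --rule-name allow-kubernetes \
    --start-ip-address ${KUBERNETES_IP:?} \
    --end-ip-address ${KUBERNETES_IP:?}
```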

Compute + storage

The above command creates a database in the Burstable tier, which is suitable for testing and demoing. For production workloads, consider upgrading to GeneralPurpose once clients in the region start using it full time.
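The tier change can also be made from the CLI. A sketch, assuming `PSQL_NAME` is still set; the SKU name here is only an example, pick one sized for the workload:

```shell
# Move the server to the GeneralPurpose tier.
# Standard_D2ds_v4 is an example SKU -- size it for the workload.
az postgres flexible-server update \
    --resource-group meg \
    --name ${PSQL_NAME:?} \
    --tier GeneralPurpose \
    --sku-name Standard_D2ds_v4
```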

Maintenance

Set desired maintenance window based on expected usage times. For example, schedule maintenance for Sunday morning.
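The maintenance window can also be set from the CLI. A sketch scheduling Sunday 06:00 UTC; the `Day:Hour:Minute` format is assumed, check `az postgres flexible-server update --help` for the exact syntax:

```shell
# Schedule maintenance for Sunday 06:00 UTC.
# Format assumed to be Day:Hour:Minute (e.g. Sun:6:00).
az postgres flexible-server update \
    --resource-group meg \
    --name ${PSQL_NAME:?} \
    --maintenance-window Sun:6:00
```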

Reservations

Create a reservation to commit to using the resource for a minimum time period and reduce the cost of the database.

Monitoring

Add the database to the relevant widgets in the monitoring dashboard.

Create Kubernetes cluster

Edit and use the following snippet to deploy a new kubernetes cluster:

LOCATION=centralus
NAME=${LOCATION}

kubernetes_version=$(az aks get-versions --location ${LOCATION:?} --query 'orchestrators[-1].orchestratorVersion' -o tsv)

# Create the cluster
az aks create \
    --resource-group meg \
    --name ${NAME} \
    --location ${LOCATION} \
    --load-balancer-sku standard \
    --tier standard \
    --enable-cluster-autoscaler \
    --min-count 2 \
    --max-count 5 \
    --network-plugin kubenet \
    --tags "purpose=production" "region=${LOCATION}" \
    --attach-acr MegForms \
    --auto-upgrade-channel patch \
    --node-vm-size Standard_D2as_v6 \
    --kubernetes-version ${kubernetes_version:?} \
    --nodepool-name meg

# Make it available to the kubectl command
az aks get-credentials \
    --name ${NAME} \
    --resource-group meg

The new kubernetes cluster will appear in Azure Kubernetes services.
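After fetching credentials, a quick sanity check (a minimal sketch, assuming the kubectl context name matches the cluster name) confirms kubectl is pointed at the new cluster and the autoscaled nodes have joined:

```shell
# Should print the new cluster's context name.
kubectl config current-context

# Nodes should appear and reach the Ready state.
kubectl get nodes -o wide
```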

See also

To learn more about az aks create, visit documentation

Create a static IP address & domain name

Create a public static IP address for the cluster

# Creates a static IP address and adds it to DNS
export DOMAIN_NAME="${LOCATION}.qms.megit.com"
export RESOURCE_GROUP=$(az aks show --resource-group meg --name ${NAME:?} --query 'nodeResourceGroup' -o tsv)
# Create the IP address
export IP_ADDRESS=$(az network public-ip create -g ${RESOURCE_GROUP:?} --name kubernetes-prod --sku Standard --allocation-method static --query publicIp.ipAddress -o tsv)
echo "Created IP ${IP_ADDRESS} in ${RESOURCE_GROUP}"
az network dns record-set a add-record --resource-group meg --zone-name qms.megit.com --record-set-name "${LOCATION}" --ipv4-address ${IP_ADDRESS}
echo "Record set added: ${DOMAIN_NAME}"
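To confirm the record landed, you can read it back. A sketch, assuming the same zone and `LOCATION` as above:

```shell
# Show the A record that was just created in the qms.megit.com zone.
az network dns record-set a show \
    --resource-group meg \
    --zone-name qms.megit.com \
    --name "${LOCATION:?}" -o table
```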

Install Helm charts

Run the following snippet to install the ingress controller and cert manager. Note that it relies on IP_ADDRESS and DOMAIN_NAME set in Create a static IP address & domain name, and on NAME containing the cluster name.

# Ensure you're installing charts on the correct cluster
kubectl config use-context ${NAME:?}

# Add helm repos
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add jetstack https://charts.jetstack.io

# Install charts
helm install ingress-nginx ingress-nginx/ingress-nginx --set controller.replicaCount=3 --set controller.service.loadBalancerIP="${IP_ADDRESS:?}" --set controller.service.externalTrafficPolicy=Local -n ingress
helm install cert-manager jetstack/cert-manager --set installCRDs=true -n cert-manager

# Deploy Ingress - it is important that DOMAIN_NAME variable is exported so that envsubst can access it
cat kubernetes/deployment/template.yaml | envsubst | kubectl apply -f -
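Before moving on, it is worth checking that both releases deployed and their pods are running. A minimal sketch using the release names and namespaces from the snippet above:

```shell
# List the installed releases in each namespace.
helm list -n ingress
helm list -n cert-manager

# Controller pods should reach the Running state.
kubectl get pods -n ingress
kubectl get pods -n cert-manager
```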

Note

In order to upgrade Helm charts, use the following command:

CERT_MANAGER_VERSION=updated-cert-manager-version
NGINX_VERSION=updated-nginx-version
helm upgrade --version ${CERT_MANAGER_VERSION:?} cert-manager jetstack/cert-manager --set installCRDs=true -n cert-manager
helm upgrade --version ${NGINX_VERSION:?} ingress-nginx ingress-nginx/ingress-nginx --set controller.replicaCount=3 --set controller.service.loadBalancerIP="${IP_ADDRESS:?}" --set controller.service.externalTrafficPolicy=Local -n ingress

NGINX Telemetry

To enable telemetry for the NGINX ingress controller using the OpenTelemetry collector, apply the following manifest. This allows exporting metrics and traces from NGINX to your observability backend.

kubectl apply -f ./kubernetes/nginx-telemetry/nginx-telemetry.yaml

Note

If the OpenTelemetry collector is not deployed when telemetry is enabled in NGINX, metrics and traces will be lost, and observability will be unavailable until the collector is running. NGINX itself is unlikely to crash or become unstable.

Enabling telemetry

By default, OpenTelemetry is disabled. To enable it, you can patch the nginx-ingress-controller ConfigMap using the following command:

kubectl patch configmap nginx-ingress-controller --patch '{"data": {"enable-opentelemetry": "true"}}'

Ensure the Ingress resource includes the nginx.ingress.kubernetes.io/enable-opentelemetry: "true" annotation, and the nginx-ingress-controller ConfigMap is configured with OpenTelemetry settings.

Sampling rate

To adjust sampling rate, set otel-sampler-ratio on the ingress config map:

kubectl patch configmap nginx-ingress-controller --patch '{"data": {"otel-sampler-ratio": "0.01"}}'
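To confirm the patches took effect, the two keys can be read back from the ConfigMap. A sketch, assuming the same `nginx-ingress-controller` ConfigMap name used above:

```shell
# Print the current telemetry settings from the ConfigMap.
kubectl get configmap nginx-ingress-controller \
    -o jsonpath='{.data.enable-opentelemetry}{"\n"}{.data.otel-sampler-ratio}{"\n"}'
```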

Deploy the project into the cluster

Deploy base resources & configuration

Deploy yaml files located in /kubernetes/setup directory, and update secret containing database password. Note that PSQL_PASSWORD variable created in Create database is being used here as well as DOMAIN_NAME from Create a static IP address & domain name.

# Create base resources
kubectl apply -f kubernetes/setup/providers/azure.yaml -f kubernetes/setup/
# Update configuration
POSTGRES_HOST=$(az postgres flexible-server show --resource-group meg --name ${PSQL_NAME:?} --query 'fullyQualifiedDomainName' -o tsv)
kubectl create secret generic database-password --from-literal="POSTGRES_PASSWORD=${PSQL_PASSWORD:?}" --dry-run=client -o yaml | kubectl apply -f -
kubectl patch configmap megforms --patch "{\"data\":{\"POSTGRES_HOST\":\"${POSTGRES_HOST:?}\", \"SITE_DOMAIN\": \"${DOMAIN_NAME:?}\", \"SAML_ENTITY_ID\": \"https://${DOMAIN_NAME:?}/\"}}"

Additional changes can be made to the ConfigMap by using kubectl edit configmap megforms.

Add new deployment target to CI & Deploy

Important

Before proceeding with the following changes, you must assign the Contributor role to the Service Principal in Azure to grant GitLab the necessary permissions to make modifications in the new cluster. Run the following command:

az role assignment create --assignee 5ed4ddae-1272-478c-971e-e3a9a97716c3 --role "Contributor" --scope subscriptions/0f14d0e0-e21b-4d40-912a-3111c3a445e2/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.ContainerService/managedClusters/${NAME}

Add the following section to .gitlab-ci.deploy.yml:

# Remember to replace CLUSTER_NAME with the actual k8s cluster name
k8s:CLUSTER_NAME:deploy:production:
  extends: .k8s:deploy:production
  variables:
    cluster: CLUSTER_NAME
    DOMAIN_NAME: CLUSTER_NAME.qms.megit.com
  # Remove these after testing
  when: manual
  only: []

Commit and push this change, and trigger the job to deploy the current version to the new cluster. Once deployed, you should start seeing the new pods in kubectl get pods.

Important

Remove the when and only keys after the initial deployment so that the job runs only when new tags are released.

Wait for database migrations to complete. Run kubectl logs -f job/migration to observe progress.

Add new server to the Status Page

To ensure the new region is monitored on the status page, update the site list in index.js within the status-page repository.

Set up Azure Backup Media Files

The following instructions will guide you through the process of setting up a backup solution for media files stored in a persistent volume within an Azure Kubernetes Service (AKS) cluster.

Edit and use the snippets below to deploy the backup vault, create a backup policy, and register a file share with the backup vault using az-cli:

# Set variables
RESOURCE_GROUP=meg
LOCATION=westeurope
BACKUP_VAULT_NAME=west-backups
FILE_SHARE_NAME=file-share-name
STORAGE_ACCOUNT_NAME=storage-account-name
POLICY_NAME=west-daily-policy
SCHEDULE_RUN_TIME="2024-07-09T02:00:00+00:00"
RETENTION_DAYS=90

# Create a Backup Vault
az backup vault create \
    --resource-group ${RESOURCE_GROUP:?} \
    --name ${BACKUP_VAULT_NAME:?} \
    --location ${LOCATION:?}

# Create a Daily Backup Policy with 90 Days Retention
az backup policy create \
    --backup-management-type AzureStorage \
    --resource-group ${RESOURCE_GROUP:?} \
    --vault-name ${BACKUP_VAULT_NAME:?} \
    --name ${POLICY_NAME:?} \
    --policy '{
        "name": "'${POLICY_NAME:?}'",
        "properties": {
            "backupManagementType": "AzureStorage",
            "workLoadType": "AzureFileShare",
            "schedulePolicy": {
                "schedulePolicyType": "SimpleSchedulePolicy",
                "scheduleRunFrequency": "Daily",
                "scheduleRunTimes": ["'${SCHEDULE_RUN_TIME:?}'"]
            },
            "retentionPolicy": {
                "retentionPolicyType": "LongTermRetentionPolicy",
                "dailySchedule": {
                    "retentionTimes": ["'${SCHEDULE_RUN_TIME:?}'"],
                    "retentionDuration": {
                        "count": '${RETENTION_DAYS:?}',
                        "durationType": "Days"
                    }
                }
            }
        }
    }'

# Register the File Share with the Backup Vault Using the Daily Backup Policy
az backup protection enable-for-azurefileshare \
    --resource-group ${RESOURCE_GROUP:?} \
    --vault-name ${BACKUP_VAULT_NAME:?} \
    --storage-account ${STORAGE_ACCOUNT_NAME:?} \
    --azure-file-share ${FILE_SHARE_NAME:?} \
    --policy-name ${POLICY_NAME:?}
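Once registration finishes, you can confirm the file share is protected by the vault. A sketch, reusing the variables set above:

```shell
# List the items protected by the backup vault.
az backup item list \
    --resource-group ${RESOURCE_GROUP:?} \
    --vault-name ${BACKUP_VAULT_NAME:?} \
    --backup-management-type AzureStorage \
    -o table
```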

See also

To learn more about az backup, visit az-cli documentation for az backup

Restore Azure File Share Backup

This section provides instructions on how to restore a backup of an Azure File Share.

Edit and use the snippets below to restore a backup of an Azure File Share using az-cli:

# Set variables
RESOURCE_GROUP=meg
VAULT_NAME=west-backups
CONTAINER_NAME=StorageAccountName
ITEM_NAME=BackupItem
RECOVERY_POINT_NAME=RecoveryPoint
RESTORE_MODE=OriginalLocation
CONFLICT_RESOLUTION=Overwrite

# Restore Azure File Share from backup
az backup restore restore-azurefileshare \
    --resource-group ${RESOURCE_GROUP:?} \
    --vault-name ${VAULT_NAME:?} \
    --container-name ${CONTAINER_NAME:?} \
    --item-name ${ITEM_NAME:?} \
    --rp-name ${RECOVERY_POINT_NAME:?} \
    --resolve-conflict ${CONFLICT_RESOLUTION:?} \
    --restore-mode ${RESTORE_MODE:?}
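The `RECOVERY_POINT_NAME` placeholder above can be discovered by listing the available recovery points for the item. A sketch, reusing the variables set above:

```shell
# List recovery points to find a value for RECOVERY_POINT_NAME.
az backup recoverypoint list \
    --resource-group ${RESOURCE_GROUP:?} \
    --vault-name ${VAULT_NAME:?} \
    --container-name ${CONTAINER_NAME:?} \
    --item-name ${ITEM_NAME:?} \
    --backup-management-type AzureStorage \
    -o table
```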

See also

To learn more about az backup restore restore-azurefileshare, visit az-cli documentation for restore-azurefileshare

Site set-up

Visit the site created in Create a static IP address & domain name. Once migration is completed, you should be able to log in with credentials listed in Test user accounts.

Set-up incoming e-mail hook

Task #26421

  1. Create a new MX record in GoDaddy DNS pointing at mx.sendgrid.net

    See also

    SendGrid documentation for adding MX record

  2. Create a new TXT entry for the subdomain with the following value to authorize Sendgrid to send mail from this domain using SPF:

    v=spf1 include:sendgrid.net -all
    
  3. Add the domain to SendGrid Sender Authentication to enable DKIM

  4. Add the new domain to SendGrid. Enter the webhook URL https://audits.megsupporttools.com/emails/receive, replacing the domain name with the new domain.
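Once DNS has propagated, the MX and SPF records can be verified from any machine with dig. A sketch; `MAIL_DOMAIN` is a hypothetical placeholder for the subdomain the records were added to:

```shell
# MAIL_DOMAIN is hypothetical -- use the subdomain the MX record was added for.
MAIL_DOMAIN=${DOMAIN_NAME:?}

# MX should point at mx.sendgrid.net; TXT should contain the SPF record.
dig +short MX "${MAIL_DOMAIN}"
dig +short TXT "${MAIL_DOMAIN}"
```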

Set-up SSO with Google for meg staff

  1. Add a new app in Google Admin in Apps ‣ Overview ‣ Web and mobile apps

  2. Click Add app ‣ Add custom SAML app

  3. Add a new SAML identity provider in Django admin

    Settings:
    • Link with existing user account: by e-mail

    • wantNameId: true

    • wantAttributeStatement: false

    • Client-side Entity ID: must match the domain name (this comes from the SAML_ENTITY_ID environment variable)