.. _new instance:
=============
New instance
=============
.. highlight:: shell
Instructions in this section document the process of deploying a new instance or :term:`region`.
.. note::
In order to deploy a new region, you need the following utilities:
* `Azure CLI `_
* `Kubernetes client `_
* `Helm `_
* CLI utilities: :command:`base64`, :command:`openssl`
If you are working with `Cloud Shell `_, these utilities are already available.
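Before starting, a quick sanity check can confirm the tooling is present. A minimal sketch (the list mirrors the utilities above):

.. code-block:: shell

   # Report any utility that is missing before starting the deployment
   for tool in az kubectl helm base64 openssl; do
       command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
   done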
.. _create database:
Create database
==================
Edit and use the snippet below to deploy the database using :program:`az-cli`:
.. code-block:: bash
LOCATION=centralus
PSQL_NAME=megforms-${LOCATION:?}
PSQL_PASSWORD=$(openssl rand -base64 20)
echo "${PSQL_NAME:?} database password: ${PSQL_PASSWORD:?}"
az postgres flexible-server create \
--resource-group meg \
--name ${PSQL_NAME:?} \
--location ${LOCATION:?} \
--database-name megforms \
--admin-user megforms \
--admin-password "${PSQL_PASSWORD:?}" \
--storage-size 128 \
--backup-retention 30 \
--geo-redundant-backup Enabled \
--tier Burstable \
--sku-name Standard_B2s \
--tags "purpose=production" "region=${LOCATION:?}" \
--version 14
.. seealso::
To get a full list of locations, run ``az account list-locations -o table``.
To learn more about :command:`az postgres flexible-server create`, visit `az-cli documentation `_
Test connection
-------------------
You can use the output connection string to try connecting to the database::
psql "postgresql://megforms:${PSQL_PASSWORD}@${PSQL_NAME}.postgres.database.azure.com/postgres?sslmode=require"
Upgrading PostgreSQL Flexible Server
--------------------------------------
You can upgrade your Azure Database for PostgreSQL Flexible Server to a new major version using the following command:
.. code-block:: bash
VERSION=16
az postgres flexible-server upgrade \
--resource-group meg \
--name ${PSQL_NAME:?} \
--version ${VERSION:?}
.. important:: After completing the major version upgrade, it is **mandatory** to run the ``ANALYZE`` command in the database.
.. code-block:: sql
postgres=> ANALYZE;
See the official `Azure PostgreSQL post-upgrade documentation `_ for details.
Performance optimization and tweaking
----------------------------------------
The new database will appear in `Azure PostgreSQL flexible servers `_.
You can edit the created database to optimize requirements as needed:
.. seealso:: :ref:`scaling-database`
Networking
Tick "Allow public access from any Azure service within Azure to this server".
You can later un-tick it and instead whitelist the Kubernetes egress IP address.
To view the list of IP addresses, use :command:`az network public-ip list -o table`.
Compute + storage
The above command creates a *Burstable* database tier, which is suitable for testing and demoing. For production workloads, consider upgrading to *GeneralPurpose* once clients in the region start using it full time.
Maintenance
Set desired maintenance window based on expected usage times. For example, schedule maintenance for Sunday morning.
Reservations
Create a `reservation `_ to commit to using the resource for a minimum time period and reduce the cost of the database.
Monitoring
Add the database to the relevant widgets in the `monitoring `_ dashboard.
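If public access is later un-ticked, the cluster's egress IP can be whitelisted explicitly. A hedged sketch (``KUBE_EGRESS_IP`` and the rule name are placeholders; take the address from :command:`az network public-ip list -o table`):

.. code-block:: bash

   # Placeholder IP -- replace with the cluster's actual egress address
   KUBE_EGRESS_IP=203.0.113.10
   az postgres flexible-server firewall-rule create \
       --resource-group meg \
       --name ${PSQL_NAME:?} \
       --rule-name allow-kubernetes \
       --start-ip-address ${KUBE_EGRESS_IP:?} \
       --end-ip-address ${KUBE_EGRESS_IP:?}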
Create Kubernetes cluster
============================
Edit and use the following snippet to deploy a new Kubernetes cluster:
.. code-block:: bash
LOCATION=centralus
NAME=${LOCATION}
kubernetes_version=$(az aks get-versions --location ${LOCATION:?} --query 'orchestrators[-1].orchestratorVersion' -o tsv)
# Create the cluster
az aks create \
--resource-group meg \
--name ${NAME} \
--location ${LOCATION} \
--load-balancer-sku standard \
--tier standard \
--enable-cluster-autoscaler \
--min-count 2 \
--max-count 5 \
--network-plugin kubenet \
--tags "purpose=production" "region=${LOCATION}" \
--attach-acr MegForms \
--auto-upgrade-channel patch \
--node-vm-size Standard_D2as_v6 \
--kubernetes-version ${kubernetes_version:?} \
--nodepool-name meg
# Make it available to the kubectl command
az aks get-credentials \
--name ${NAME} \
--resource-group meg
The new Kubernetes cluster will appear in `Azure Kubernetes services `_.
.. seealso:: To learn more about :command:`az aks create`, visit `documentation `_
.. _create ip address:
Create a static IP address & domain name
----------------------------------------------
Create a public static IP address for the cluster:
.. code-block::
:caption: Creates a static IP address and adds it to DNS
export DOMAIN_NAME="${LOCATION}.qms.megit.com"
export RESOURCE_GROUP=$(az aks show --resource-group meg --name ${NAME:?} --query 'nodeResourceGroup' -o tsv)
# Create the IP address
export IP_ADDRESS=$(az network public-ip create -g ${RESOURCE_GROUP:?} --name kubernetes-prod --sku Standard --allocation-method static --query publicIp.ipAddress -o tsv)
echo "Created IP ${IP_ADDRESS} in ${RESOURCE_GROUP}"
az network dns record-set a add-record --resource-group meg --zone-name qms.megit.com --record-set-name "${LOCATION}" --ipv4-address ${IP_ADDRESS}
echo "Record set added: ${DOMAIN_NAME}"
Install Helm charts
---------------------
Run the following snippet to install the ingress controller and cert-manager.
Note that it relies on :const:`IP_ADDRESS` and :const:`DOMAIN_NAME` set in :ref:`create ip address`,
and :const:`NAME` containing cluster name.
.. code-block:: shell
# Ensure you're installing charts on the correct cluster
kubectl config use-context ${NAME:?}
# Add helm repos
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add jetstack https://charts.jetstack.io
# Install charts
helm install ingress-nginx ingress-nginx/ingress-nginx --set controller.replicaCount=3 --set controller.service.loadBalancerIP="${IP_ADDRESS:?}" --set controller.service.externalTrafficPolicy=Local -n ingress
helm install cert-manager jetstack/cert-manager --set installCRDs=true -n cert-manager
# Deploy Ingress - it is important that DOMAIN_NAME variable is exported so that envsubst can access it
cat kubernetes/deployment/template.yaml | envsubst | kubectl apply -f -
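The ``envsubst`` rendering step can be sanity-checked in isolation. A minimal sketch using an inline template instead of :file:`kubernetes/deployment/template.yaml`:

.. code-block:: shell

   # envsubst replaces ${VAR} references with the exported environment variables;
   # single quotes keep the shell from expanding the template prematurely
   export DOMAIN_NAME="centralus.qms.megit.com"
   printf 'host: ${DOMAIN_NAME}\n' | envsubst
   # prints: host: centralus.qms.megit.com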
.. note::
In order to upgrade Helm charts, use the following command:
.. code-block:: shell
CERT_MANAGER_VERSION=updated-cert-manager-version
NGINX_VERSION=updated-nginx-version
helm upgrade --version ${CERT_MANAGER_VERSION:?} cert-manager jetstack/cert-manager --set installCRDs=true -n cert-manager
helm upgrade --version ${NGINX_VERSION:?} ingress-nginx ingress-nginx/ingress-nginx --set controller.replicaCount=3 --set controller.service.loadBalancerIP="${IP_ADDRESS:?}" --set controller.service.externalTrafficPolicy=Local -n ingress
NGINX Telemetry
----------------
To enable telemetry for the NGINX ingress controller using the OpenTelemetry collector, apply the following manifest.
This allows exporting metrics and traces from NGINX to your observability backend.
.. code-block:: shell
kubectl apply -f ./kubernetes/nginx-telemetry/nginx-telemetry.yaml
.. note::
If the OpenTelemetry collector is not deployed when telemetry is enabled in NGINX, **metrics and traces will be lost**, and observability will be unavailable until the collector is running. NGINX itself is unlikely to crash or become unstable.
Enabling telemetry
^^^^^^^^^^^^^^^^^^^
By default, OpenTelemetry is disabled. To enable it, you can patch the ``nginx-ingress-controller`` ConfigMap using the following command:
.. code-block:: shell
kubectl patch configmap nginx-ingress-controller --patch '{"data": {"enable-opentelemetry": "true"}}'
Ensure the Ingress resource includes the ``nginx.ingress.kubernetes.io/enable-opentelemetry: "true"`` annotation,
and the ``nginx-ingress-controller`` ConfigMap is configured with OpenTelemetry settings.
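A hedged sketch of both steps (the Ingress resource name ``megforms`` is an assumption; use the name shown by :command:`kubectl get ingress`):

.. code-block:: shell

   # Annotate the Ingress resource (the resource name is a placeholder)
   kubectl annotate ingress megforms nginx.ingress.kubernetes.io/enable-opentelemetry="true" --overwrite
   # Confirm the ConfigMap flag is set
   kubectl get configmap nginx-ingress-controller -o jsonpath='{.data.enable-opentelemetry}'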
Sampling rate
^^^^^^^^^^^^^^
To adjust sampling rate, set ``otel-sampler-ratio`` on the ingress config map:
.. code-block:: shell
kubectl patch configmap nginx-ingress-controller --patch '{"data": {"otel-sampler-ratio": "0.01"}}'
Deploy the project into the cluster
=======================================
Deploy base resources & configuration
--------------------------------------
Deploy the YAML files located in the :file:`/kubernetes/setup` directory, and update the secret containing the database password.
Note that the :const:`PSQL_PASSWORD` variable created in :ref:`create database` is used here,
as well as :const:`DOMAIN_NAME` from :ref:`create ip address`.
.. code-block:: shell
:caption: create base resources
kubectl apply -f kubernetes/setup/providers/azure.yaml -f kubernetes/setup/
.. code-block:: shell
:caption: Update configuration
POSTGRES_HOST=$(az postgres flexible-server show --resource-group meg --name ${PSQL_NAME:?} --query 'fullyQualifiedDomainName' -o tsv)
kubectl create secret generic database-password --from-literal="POSTGRES_PASSWORD=${PSQL_PASSWORD:?}" --dry-run=client -o yaml | kubectl apply -f -
kubectl patch configmap megforms --patch "{\"data\":{\"POSTGRES_HOST\":\"${POSTGRES_HOST:?}\", \"SITE_DOMAIN\": \"${DOMAIN_NAME:?}\", \"SAML_ENTITY_ID\": \"https://${DOMAIN_NAME:?}/\"}}"
Additional changes can be made to the ConfigMap by using ``kubectl edit configmap megforms``
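For scripted changes, a single key can also be patched non-interactively. A sketch (``SOME_KEY`` and ``some-value`` are placeholders for the setting being changed):

.. code-block:: shell

   # Non-interactive equivalent of `kubectl edit` for a single key
   kubectl patch configmap megforms --patch '{"data": {"SOME_KEY": "some-value"}}'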
Add new deployment target to CI & Deploy
-------------------------------------------
.. important:: Before proceeding with the following changes, you must assign the **Contributor** role to the Service Principal in Azure to grant GitLab the necessary permissions to make modifications in the new cluster. Run the following command:
.. code-block:: shell
az role assignment create --assignee 5ed4ddae-1272-478c-971e-e3a9a97716c3 --role "Contributor" --scope subscriptions/0f14d0e0-e21b-4d40-912a-3111c3a445e2/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.ContainerService/managedClusters/${NAME}
Add the following section to :file:`.gitlab-ci.deploy.yml`:
.. code-block:: yaml
:caption: Remember to replace :const:`CLUSTER_NAME` with actual k8s cluster name
:emphasize-lines: 1-5
k8s:CLUSTER_NAME:deploy:production:
extends: .k8s:deploy:production
variables:
cluster: CLUSTER_NAME
DOMAIN_NAME: CLUSTER_NAME.qms.megit.com
# Remove these after testing
when: manual
only: []
Commit and push this change, and trigger the job to deploy the current version to the new cluster.
Once deployed, you should start seeing the new pods in :command:`kubectl get pods`.
.. important:: Remove the ``when`` and ``only`` keys after the initial deployment
so that the job runs only when new tags are released.
Wait for database migrations to complete. Run :command:`kubectl logs -f job/migration` to observe progress.
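To block until the migration job finishes rather than watching logs, :command:`kubectl wait` can be used. A sketch (the timeout is an arbitrary choice):

.. code-block:: shell

   # Exits once the job reports the Complete condition, or fails after 15 minutes
   kubectl wait --for=condition=complete job/migration --timeout=15m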
Add new server to the Status Page
----------------------------------
To ensure the new region is monitored on the status page, update the site list in ``index.js`` within the `status-page `_ repository.
Set up Azure Backup Media Files
================================
The following instructions will guide you through the process of setting up a backup solution for media files stored in a persistent volume within an Azure Kubernetes Service (AKS) cluster.
Edit and use the snippets below to deploy the backup vault, create a backup policy, and register a file share with the backup vault using :program:`az-cli`:
.. code-block:: bash
# Set variables
RESOURCE_GROUP=meg
LOCATION=westeurope
BACKUP_VAULT_NAME=west-backups
FILE_SHARE_NAME=file-share-name
STORAGE_ACCOUNT_NAME=storage-account-name
POLICY_NAME=west-daily-policy
SCHEDULE_RUN_TIME="2024-07-09T02:00:00+00:00"
RETENTION_DAYS=90
# Create a Backup Vault
az backup vault create \
--resource-group ${RESOURCE_GROUP:?} \
--name ${BACKUP_VAULT_NAME:?} \
--location ${LOCATION:?}
# Create a Daily Backup Policy with 90 Days Retention
az backup policy create \
--backup-management-type AzureStorage \
--resource-group ${RESOURCE_GROUP:?} \
--vault-name ${BACKUP_VAULT_NAME:?} \
--name ${POLICY_NAME:?} \
--policy '{
"name": "'${POLICY_NAME:?}'",
"properties": {
"backupManagementType": "AzureStorage",
"workLoadType": "AzureFileShare",
"schedulePolicy": {
"schedulePolicyType": "SimpleSchedulePolicy",
"scheduleRunFrequency": "Daily",
"scheduleRunTimes": ["'${SCHEDULE_RUN_TIME:?}'"]
},
"retentionPolicy": {
"retentionPolicyType": "LongTermRetentionPolicy",
"dailySchedule": {
"retentionTimes": ["'${SCHEDULE_RUN_TIME:?}'"],
"retentionDuration": {
"count": '${RETENTION_DAYS:?}',
"durationType": "Days"
}
}
}
}
}'
# Register the File Share with the Backup Vault Using the Daily Backup Policy
az backup protection enable-for-azurefileshare \
--resource-group ${RESOURCE_GROUP:?} \
--vault-name ${BACKUP_VAULT_NAME:?} \
--storage-account ${STORAGE_ACCOUNT_NAME:?} \
--azure-file-share ${FILE_SHARE_NAME:?} \
--policy-name ${POLICY_NAME:?}
.. seealso::
To learn more about :command:`az backup`, visit `az-cli documentation for az backup `_
Restore Azure File Share Backup
===============================
This section provides instructions on how to restore a backup of an Azure File Share.
Edit and use the snippets below to restore a backup of an Azure File Share using :program:`az-cli`:
.. code-block:: bash
# Set variables
RESOURCE_GROUP=meg
VAULT_NAME=west-backups
CONTAINER_NAME=StorageAccountName
ITEM_NAME=BackupItem
RECOVERY_POINT_NAME=RecoveryPoint
RESTORE_MODE=OriginalLocation
CONFLICT_RESOLUTION=Overwrite
# Restore Azure File Share from backup
az backup restore restore-azurefileshare \
--resource-group ${RESOURCE_GROUP:?} \
--vault-name ${VAULT_NAME:?} \
--container-name ${CONTAINER_NAME:?} \
--item-name ${ITEM_NAME:?} \
--rp-name ${RECOVERY_POINT_NAME:?} \
--resolve-conflict ${CONFLICT_RESOLUTION:?} \
--restore-mode ${RESTORE_MODE:?}
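The container, item, and recovery point names are not always known up front. A hedged sketch for discovering them (exact flags may vary between :program:`az-cli` versions):

.. code-block:: bash

   # List registered storage containers, protected items, and recovery points
   az backup container list \
       --resource-group ${RESOURCE_GROUP:?} --vault-name ${VAULT_NAME:?} \
       --backup-management-type AzureStorage -o table
   az backup item list \
       --resource-group ${RESOURCE_GROUP:?} --vault-name ${VAULT_NAME:?} \
       --backup-management-type AzureStorage -o table
   az backup recoverypoint list \
       --resource-group ${RESOURCE_GROUP:?} --vault-name ${VAULT_NAME:?} \
       --container-name ${CONTAINER_NAME:?} --item-name ${ITEM_NAME:?} \
       --backup-management-type AzureStorage -o table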
.. seealso::
To learn more about :command:`az backup restore restore-azurefileshare`, visit `az-cli documentation for restore-azurefileshare `_
Site set-up
=============
Visit the site created in :ref:`create ip address`.
Once the migration is complete, you should be able to log in with the credentials listed in :ref:`test accounts`.
Link to the EU site as a :term:`region`
------------------------------------------
Go to Region admin and `add `_ the new site.
Link to the EU permission groups
------------------------------------------
Go to the Region group description link admin page and `add a link `_ for each permission group you want to sync to the new server.
You can then use the Django admin action to sync the newly created links with the downstream server.
Each time the permission group is updated in the EU server it will sync with the downstream permission groups it's linked with.
The group description slug is used as the unique identifier to match groups across regions.
When a permission group is synced from the EU site and no matching group description slug exists on the downstream server, a new permission group will automatically be created.
.. _setup-inbound-parse:
Set-up incoming e-mail hook
--------------------------------
:task:`26421`
#. Create a new MX record in `GoDaddy DNS `_ pointing at ``mx.sendgrid.net``
.. seealso:: `SendGrid documentation `_ for adding MX record
#. Create a new TXT entry for the subdomain with the following value to authorize Sendgrid to send mail from this domain using :term:`SPF`::
v=spf1 include:sendgrid.net -all
#. Add the domain to `SendGrid Sender Authentication `_ to enable :term:`DKIM`
#. Add the new domain to `SendGrid `_. Enter the webhook URL ``https://audits.megsupporttools.com/emails/receive``, replacing the domain name with the new domain.
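Once DNS has propagated, the new records can be verified with :command:`dig`. A sketch (``SUBDOMAIN`` is a placeholder for the mail subdomain created above):

.. code-block:: shell

   SUBDOMAIN=region.example.com      # placeholder for the new mail subdomain
   dig +short MX "${SUBDOMAIN:?}"    # should list mx.sendgrid.net
   dig +short TXT "${SUBDOMAIN:?}"   # should include the v=spf1 record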
Set-up :term:`SSO` with Google for meg staff
-----------------------------------------------
#. Add a new app in `Google Admin `_ in :menuselection:`Apps --> Overview --> Web and mobile apps`
#. Click :menuselection:`Add app --> Add custom SAML app`
#. Add a new SAML identity provider in the :term:`django admin`
Settings:
* Link with existing user account: by e-mail
* ``wantNameId``: true
* ``wantAttributeStatement``: false
* Client-side Entity ID: must match domain name (this is coming from env var :envvar:`SAML_ENTITY_ID`)