Business Intelligence
To better understand client behaviour, we've built a business intelligence (BI) analytics system. The key components are as follows:
MongoDB database
MEG celery tasks that push aggregated data to the MongoDB database
MongoDB Charts reports that query the MongoDB database
MongoDB database
The MongoDB database is hosted in Azure and can be managed via the MongoDB Atlas website. IP restrictions allow connections to the database only from our production servers. New indexes should be added via the MongoDB Atlas website.
Development Usage Guide
For development, there is a docker compose setup that simulates the production MongoDB database. Run the following command to launch the MEG dev container and the MongoDB containers:
BI_AGGREGATE=1 docker compose --profile bi up
Once the containers are up and running, you can view the mongo database in mongo express. You will need to enter the username (admin) and password (password) to log in.
Scheduled Tasks
The following tasks run on a schedule on each server to aggregate data to the MongoDB database.
The tasks only function if the environment variable BI_AGGREGATE is enabled.
These tasks can also be run manually via the periodic tasks page in the Django admin.
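As an illustration of the BI_AGGREGATE gate described above, the following is a minimal sketch and not the actual MEG implementation. The helper names `bi_aggregate_enabled` and `run_if_enabled` are hypothetical, and the set of values treated as "enabled" is an assumption (the dev command above only shows `BI_AGGREGATE=1`).

```python
import os

# Assumption: these values count as "enabled"; the real check may differ.
TRUTHY_VALUES = {"1", "true", "yes", "on"}


def bi_aggregate_enabled(environ=None) -> bool:
    """Return True when BI_AGGREGATE is set to a value we treat as enabled."""
    env = os.environ if environ is None else environ
    return env.get("BI_AGGREGATE", "").strip().lower() in TRUTHY_VALUES


def run_if_enabled(task, environ=None):
    """Call task() only when aggregation is enabled; otherwise do nothing."""
    if not bi_aggregate_enabled(environ):
        return None
    return task()
```

Passing the environment in as a dictionary keeps the check easy to exercise without touching the real process environment.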
Note
Some tasks wipe the existing data in the MongoDB collection, while others append to it. Tasks that append usually add data for the prior month to the existing collection.
When running these tasks manually, it's advisable to first wipe the data for the collection. You can do this in MongoDB.
- (celery task) client_management.tasks.aggregate_monthly_submissions() -> None
Monthly task to aggregate submission data for all forms. Divides forms into chunks of 10 to be processed in parallel.
This appends data for the prior month to the existing collection.
Scheduled to run on the first day of each month at 00:00.
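The "chunks of 10" behaviour mentioned above can be sketched as follows. This is an illustrative helper only: `chunked` and the sample `form_ids` are hypothetical names, and the real task may split forms differently before dispatching the chunks in parallel.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int = 10) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical example: 25 form ids are split into chunks of 10, 10 and 5.
form_ids = list(range(1, 26))
chunks = list(chunked(form_ids))
```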
- (celery task) client_management.tasks.aggregate_monthly_sessions() -> None
Monthly task to aggregate session data for all forms. Divides forms into chunks of 10 to be processed in parallel.
This appends data for the prior month to the existing collection.
Scheduled to run on the first day of each month at 01:00.
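Both monthly tasks append data for the prior month. A minimal sketch of how that date window could be computed is shown below. `prior_month_range` is a hypothetical helper, not a function from `client_management.tasks`, and the actual tasks may define the window differently (for example with timezone-aware datetimes).

```python
from datetime import date, timedelta
from typing import Tuple


def prior_month_range(today: date) -> Tuple[date, date]:
    """Return the first and last day of the month before `today`."""
    first_of_this_month = today.replace(day=1)
    # Stepping back one day from the 1st always lands in the prior month.
    last_of_prior_month = first_of_this_month - timedelta(days=1)
    return last_of_prior_month.replace(day=1), last_of_prior_month
```

Computing the window from the run date means a run on the first day of a month always covers the full month that has just ended, including across a year boundary.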
- (celery task) client_management.tasks.aggregate_quarterly_form_churn() -> None
Quarterly task to aggregate churn data for all forms.
This task wipes existing data in the collection before syncing.
Scheduled to run on the first day of each quarter at 00:00.
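The quarterly schedule can be illustrated with a small date check. This is only a sketch: it assumes calendar quarters starting in January, April, July and October, which the description above does not state, and `is_first_day_of_quarter` is a hypothetical helper.

```python
from datetime import date

# Assumption: quarters follow the calendar year (Jan, Apr, Jul, Oct).
QUARTER_START_MONTHS = (1, 4, 7, 10)


def is_first_day_of_quarter(day: date) -> bool:
    """Return True when `day` is the 1st of a quarter-starting month."""
    return day.day == 1 and day.month in QUARTER_START_MONTHS
```

Under the same assumption, the equivalent cron expression would be `0 0 1 1,4,7,10 *`.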
- (celery task) client_management.tasks.sync_institutions() -> None
Syncs institutions with the BI database.
This task wipes existing data in the collection before syncing.
Scheduled to run every Saturday at 00:00.
- (celery task) client_management.tasks.sync_forms() -> None
Syncs forms with the BI database.
This task wipes existing data in the collection before syncing.
Scheduled to run every Saturday at 00:00.
- (celery task) client_management.tasks.sync_security_risk_clients() -> None
Syncs security details for institutions which handle patient data.
This task wipes existing data in the collection before syncing.
Scheduled to run every Saturday at 00:00.
- (celery task) client_management.tasks.sync_top_users() -> None
Syncs the top 10 users for the following actions:
document creators
audit schedule creators
issue creators
issue editors
review screen editors
audit submitters
user creators
ward creators
The lookback period is the past 365 days.
This task wipes existing data in the collection before syncing.
Scheduled to run on the first day of every month at 01:00.
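The "top 10 users over the past 365 days" selection can be sketched in plain Python as below. This is an assumption-laden illustration rather than the task's actual query: `top_users` is a hypothetical helper, and events are modelled as simple `(user, timestamp)` pairs instead of database records.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_users(events, now, limit=10, lookback_days=365):
    """Count events per user within the lookback window and return the top `limit`.

    `events` is an iterable of (user, timestamp) pairs.
    The result is a list of (user, count) pairs, highest count first.
    """
    cutoff = now - timedelta(days=lookback_days)
    counts = Counter(user for user, timestamp in events if timestamp >= cutoff)
    return counts.most_common(limit)
```

In this sketch the same function would be applied once per action type (document creation, issue creation, and so on), each with its own list of events.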