Import document repository

When onboarding a new client, you may need to import hundreds or thousands of documents into MEG Docs. In that case, the standard bulk upload functionality won’t suffice. MEG Docs provides an API for uploading documents programmatically.

Creating the metadata spreadsheet

The first step is to download the import template in Excel format and fill in one row for every document version you wish to upload. Each row in the spreadsheet can represent one of the following:

  • a new document and its first version

  • a new version of an existing document

  • a new source Word file for a version

Preparing the files

Once you have created the metadata spreadsheet, you need to prepare your files for uploading.

  • create a folder and put all the PDF documents in it

  • create a separate folder and put the original Word files in it

Important

You must keep the Word files in a separate folder from the PDF files. This is because the upload script loops through the files in a folder in sequence, and if a Word file and a PDF have the same name, .docx comes before .pdf alphabetically. A Word file cannot be uploaded until its document has been created, which requires the PDF to be uploaded first. Upload the PDF folder first, then the Word folder.
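If your PDFs and Word files currently sit together in one folder, the split described above could be scripted roughly as follows. This is only a sketch: SOURCE_DIR, PDF_DIR, WORD_DIR and the split_documents helper are hypothetical names, and it assumes the Word files use the .docx extension.

```shell
# Sketch: copy a mixed folder into separate PDF and Word folders.
# SOURCE_DIR, PDF_DIR and WORD_DIR are hypothetical paths; adjust them.
SOURCE_DIR="${SOURCE_DIR:-/path/to/your/mixed}"
PDF_DIR="${PDF_DIR:-/path/to/your/pdf}"
WORD_DIR="${WORD_DIR:-/path/to/your/word}"

split_documents() {
    local src="$1"
    local pdf_dir="$2"
    local word_dir="$3"
    local file
    mkdir -p "$pdf_dir" "$word_dir"
    for file in "$src"/*; do
        # Skip sub-folders and the unexpanded pattern of an empty folder
        [ -f "$file" ] || continue
        case "$file" in
            *.pdf)  cp "$file" "$pdf_dir"/ ;;
            *.docx) cp "$file" "$word_dir"/ ;;
            *)      echo "Skipping unrecognised file: $file" >&2 ;;
        esac
    done
}

# Only run when the source folder actually exists
if [ -d "$SOURCE_DIR" ]; then
    split_documents "$SOURCE_DIR" "$PDF_DIR" "$WORD_DIR"
fi
```

The files are copied rather than moved, so the original folder is left untouched in case something needs to be redone.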

Using the API

Before using the API, you need an authentication token. The token is the access key that allows you to use the API. You can obtain the AUTH_TOKEN by logging into the API via your web browser: https://audits.megsupporttools.com/api/v2/login. Alternatively, if you access MEG via SSO, a MEG staff member can look up the AUTH_TOKEN in Django admin.

Once you have the token and have created the metadata spreadsheet, you can then upload the spreadsheet to the API:

TOKEN='<AUTH_TOKEN>'
FILEPATH='/path/to/your/upload_metadata.xlsx'
FILENAME='upload_metadata.xlsx'  # You can change this to the desired file name
INSTITUTION_ID='<INSTITUTION_ID>'
URL='https://audits.megsupporttools.com/api/v2/document-metadata/'

# POST the file to the API
curl -X POST -H "Authorization: Token ${TOKEN}" -F "data=@${FILEPATH};type=application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;filename=${FILENAME}" -F "institution=${INSTITUTION_ID}" "$URL"

Note

Please replace AUTH_TOKEN and INSTITUTION_ID with values relevant to your own user account.

If the spreadsheet was uploaded successfully, you will receive a response like this:

{"data":null,"institution":1,"pk":2}

Here the pk is your UPLOAD_ID. Take note of it as you will need it in the next step. If you realize you have made a mistake in the spreadsheet you will need to upload it again and use the new UPLOAD_ID in the following steps. You can verify the uploaded metadata by visiting the admin site.
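To avoid copying the pk by hand, you could capture the response in a variable and extract the value in the shell. The sketch below uses the sample response shown above in place of a live call, and assumes the response is a single-line JSON object with an integer pk, as in that sample. The RESPONSE variable name is an illustration only.

```shell
# Sample response copied from the example above. In practice you would
# capture the output of the metadata upload, e.g. RESPONSE=$(curl ...).
RESPONSE='{"data":null,"institution":1,"pk":2}'

# Pull out the integer that follows "pk": using sed (no JSON parser needed)
UPLOAD_ID=$(printf '%s' "$RESPONSE" | sed -n 's/.*"pk":[[:space:]]*\([0-9][0-9]*\).*/\1/p')

if [ -z "$UPLOAD_ID" ]; then
    echo "Could not find pk in response: $RESPONSE" >&2
else
    echo "UPLOAD_ID=$UPLOAD_ID"   # prints UPLOAD_ID=2 for the sample response
fi
```

If a JSON-aware tool is available on your machine, it would be a more robust choice than a sed pattern.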

You can then POST new documents or version files to the API:

DIRECTORY="/path/to/your"
TOKEN='<AUTH_TOKEN>'
UPLOAD_ID='<UPLOAD_ID>'
URL="https://audits.megsupporttools.com/api/v2/document-metadata/${UPLOAD_ID:?}/upload/"

for FILE in "$DIRECTORY"/*; do
    if [ -f "$FILE" ]; then
        echo "Uploading file: $FILE"
        curl -X POST -H "Authorization: Token ${TOKEN:?}" -F "file=@$FILE" "$URL"
    fi
done

Note

Please replace AUTH_TOKEN and UPLOAD_ID with values relevant to your own user account.

The API will use the metadata previously loaded to handle the file.
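Because a Word file can only be uploaded after its PDF has created the document, the loop has to run over the PDF folder first and the Word folder second. One way to sketch that ordering is shown below. PDF_DIR, WORD_DIR, the upload_dir helper and the DRY_RUN switch are hypothetical names introduced for illustration, and the use of curl's --fail option to stop on an HTTP error is an addition rather than part of the documented procedure.

```shell
TOKEN="${TOKEN:-<AUTH_TOKEN>}"
UPLOAD_ID="${UPLOAD_ID:-<UPLOAD_ID>}"
URL="https://audits.megsupporttools.com/api/v2/document-metadata/${UPLOAD_ID}/upload/"
PDF_DIR="${PDF_DIR:-/path/to/your/pdf}"      # hypothetical folder name
WORD_DIR="${WORD_DIR:-/path/to/your/word}"   # hypothetical folder name
DRY_RUN="${DRY_RUN:-1}"                      # hypothetical switch: 1 = print only

upload_dir() {
    local dir="$1"
    local file
    for file in "$dir"/*; do
        # Skip sub-folders and the unexpanded pattern of an empty folder
        [ -f "$file" ] || continue
        echo "Uploading file: $file"
        if [ "$DRY_RUN" != "1" ]; then
            # --fail makes curl exit non-zero on an HTTP error response,
            # so the run stops instead of continuing past a failed upload
            curl --fail -X POST -H "Authorization: Token ${TOKEN}" \
                -F "file=@$file" "$URL" || return 1
        fi
    done
    return 0
}

# PDFs first, so every document exists before its Word source is sent
upload_dir "$PDF_DIR" && upload_dir "$WORD_DIR"
```

With DRY_RUN left at 1 the script only lists the files in the order they would be sent, which is a cheap way to check the folders before a large import. Set DRY_RUN=0 to perform the uploads.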