Working with CosmosDB JavaScript triggers and stored procedures can be problematic, especially when your application consists of three or four separate development environments. The solution we need is found in the effortless deployment of Azure CosmosDB JS functions with GitLab CI.
The city of CosmosDB is riddled with crime
Working with CosmosDB JavaScript triggers and stored procedures can be problematic, especially when your application consists of three or four separate development environments. After you’ve created your function in Azure’s fun built-in notepad, you have to copy and paste the code to each collection in every environment by hand.
It’s not difficult to imagine the amount of work one has to put into syncing this code across every tier, not to mention that this approach is prone to errors and often results in inconsistencies; it is quite easy to forget to propagate the changes to each environment.
Sometimes your application is branded for different clients. Two apps in four environments with separate databases, each with three triggers and three stored procedures; you end up copying and pasting the functions 48 times, not to mention logging into different accounts in between. Yikes!
Furthermore, we don’t want functions to just lie around in the Azure Portal. Ideally, we would want to keep them in an external repository, safe from frequent collection deletions and migrations, particularly in the early stages of development.
The hero we need and also deserve
The solution for all this madness is putting GitLab CI to work, with a bit of help from Node.js.
With CI configured for our task, we’re not only able to effortlessly deploy the code to all environments via the CosmosDB REST API, but we can also use Babel to minify and optimize the code, making it smaller and faster as a result (which matters at scale).
Azure CosmosDB JS functions: the beginning
The project structure
cosmosdb-functions
│
├── dist                <- Babel destination directory
│
├── node_modules
│
├── src
│   ├── procedures
│   └── triggers
│
├── .gitignore
├── .gitlab-ci.yml
├── package.json
├── package-lock.json
└── synchronizeWithCosmosDB.js
Package.json and its dependencies
First things first, let’s take care of package.json and its dependencies:
{
"name": "cosmosdb-functions",
"version": "1.0.0",
"description": "Sample CosmosDB triggers and stored procedures",
"scripts": {
"start": "babel src --out-dir dist"
},
"babel": {
"presets": [
[
"minify",
{
"removeConsole": true
}
]
],
"plugins": [
"babel-plugin-loop-optimizer"
]
},
"devDependencies": {
"@babel/cli": "^7.1.2",
"@babel/core": "^7.1.2",
"babel-preset-minify": "^0.5.0",
"babel-plugin-loop-optimizer": "^1.4.1",
"crypto": "^1.0.1"
},
"repository": {
"type": "git",
"url": "ssh://git@git.your-git-repo.com:1337/sample-repo/cosmosdb-functions.git"
}
}
Let’s look at devDependencies:
- For Babel to work, first we have to add its essentials: the @babel/cli and @babel/core modules.
- After that come the Babel tools that perform the actual code transformation: babel-preset-minify will “minify” (or “uglify”) our code, so the output files are smaller in size, and babel-plugin-loop-optimizer will transform any map() or forEach() into plain for loops, also reducing execution times.
- Last but not least, we add crypto, which will be needed later on.
Babel configuration
The babel
property in package.json
contains all the info regarding how Babel should process your source files. There is only one preset declared: minify
, but with the additional removeConsole
option which will remove all console.log()
occurrences in the code during the minification process. Inside the plugins property we declare the loop optimizer.
What we configured above is ready to be executed with one simple command: babel src --out-dir dist. With this command we tell Babel to take everything from the src directory and put the results of the transformation into the dist directory. Easy!
This command has to be put inside the scripts
property. Later we’ll ask the CI runner to use it.
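For reference, a source file in src/procedures might look roughly like this – a hypothetical example, not taken from the article’s repository (getContext() is provided by CosmosDB’s server-side JavaScript runtime). After npm start, Babel drops a compact, minified copy of it into dist/procedures, with the console.log() call stripped out by the removeConsole option:

// src/procedures/CreateSampleDocument.js -- hypothetical example
async function createSampleDocument(documentToCreate) {
  // getContext() comes from the CosmosDB server-side runtime
  const collection = getContext().getCollection();
  const response = getContext().getResponse();

  console.log('This call disappears thanks to the removeConsole option');

  // queue the document creation and echo the created document back
  const accepted = collection.createDocument(
    collection.getSelfLink(),
    documentToCreate,
    (error, createdDocument) => {
      if (error) throw error;
      response.setBody(createdDocument);
    }
  );

  if (!accepted) throw new Error('The createDocument call was not accepted.');
}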
The warm-up
We start by importing all the required modules into our synchronizeWithCosmosDB.js file, and then we implement a function for making all the HTTP requests, using the vanilla Node.js https module.
const crypto = require('crypto');
const fileSystem = require('fs');
const https = require('https');
async function makeRequest(requestOptions, requestBody) {
return new Promise((resolve, reject) => {
const request = https.request(requestOptions, (response) => {
const chunks = [];
response.on('data', data => {
chunks.push(data);
});
response.on('end', () => {
let responseBody = Buffer.concat(chunks);
responseBody = JSON.parse(responseBody);
resolve(responseBody);
});
});
request.on('error', (error) => {
reject(error);
});
request.write(JSON.stringify(requestBody));
request.end();
});
}
The https
node module can be a little unintuitive to use, but you don’t really need to know how the makeRequest()
function works, only that it takes in two parameters:
the request options:
const procedurePutOptions = {
hostname: HOSTNAME,
path: `/${COLLECTION_RESOURCE_ID}/sprocs/${requestBody.id}`,
method: 'PUT',
headers: {
'x-ms-version': COSMOSDB_API_VERSION,
'Content-Type': 'application/json',
'Authorization': authTokenPUT,
'x-ms-date': DATE
}
};
and the request body:
const requestBody = {
body: procedure, // the code processed by babel goes in here
id: procedureFileName
};
Now that we’re done with the easy part, it’s time for some Microsoft wisdom.
The main villain
You’ve probably noticed that weirdly named authTokenPUT
sitting in headers. For some reason, the CosmosDB REST API requires a new token for every method and for each single resource you’re trying to modify.
But fear thou not, for I am with thee. I have been through this shadowy valley, so you don’t have to do it yourself.
The result of this journey is a token generation function that looks like this:
function getAuthorizationToken(httpMethod, resourcePath, resourceType, date, databasePrimaryKey) {
const params = `${httpMethod.toLowerCase()}\n${resourceType.toLowerCase()}\n${resourcePath}\n${date.toLowerCase()}\n\n`;
const key = Buffer.from(databasePrimaryKey, 'base64');
const body = Buffer.from(params, 'utf8');
const signature = crypto.createHmac('sha256', key).update(body).digest('base64');
return encodeURIComponent(`type=master&ver=1.0&sig=${signature}`);
}
It’s based on their sample function. Do not modify any of this. It has to have exactly two \n
at the end of the params
string.
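To make it concrete, for a PUT on a stored procedure the params string ends up looking roughly like this (hypothetical values, for illustration only; the method, resource type and date are lowercased, the path is left as-is):

// What gets signed for a PUT on TestProcedure (hypothetical values)
const exampleParams =
  'put\n' +
  'sprocs\n' +
  'dbs/exampleDB/colls/sample_collection/sprocs/TestProcedure\n' +
  'wed, 24 oct 2018 09:51:49 gmt\n' +
  '\n'; // <- the second, mandatory newline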
How does it work? We don’t need to know that! All that we care about is how to use it:
const authTokenPUT = getAuthorizationToken(
'PUT',
`${COLLECTION_RESOURCE_ID}/sprocs/${requestBody.id}`,
'sprocs',
DATE,
DATABASE_PRIMARY_KEY
);
In the first param we specify the HTTP method we want to use with this particular request.
The second param wasn’t so easy to figure out. After some time, the final thing that worked is a path combined of these elements:
- COLLECTION_RESOURCE_ID, the path to your collection in the following format: dbs/{database}/colls/{collectionName} (e.g. dbs/exampleDB/colls/sample_collection)
- /sprocs/, the stored procedures directory associated with sample_collection
- the id (that is, the name) of the stored procedure we want to update.
Next, we have another ’sprocs’
, which is needed, for some reason. It’s the resourceType
, and in our case, we will be using two of them: sprocs
and triggers
.
The fourth param, DATE, is the date we generate once as a constant for all our requests to use. Please note that the date has to be exactly the same as the x-ms-date in the headers. This fact was also omitted from the docs.
The last parameter is your database’s primary key, which is confusingly referred to in the docs as the “master key”.
The token that we get in return is valid for 15 minutes for this particular resource, plenty of time to do our business.
Now that we have everything we need, let’s try to send our stored procedure to CosmosDB.
Azure functions CosmosDB: first call
The synchronizeWithCosmosDB.js
now looks like this:
DATABASE_PRIMARY_KEY = 'databaseprimarykeythatendswithtwoequalsigns==';
COLLECTION_RESOURCE_ID = 'dbs/exampleDB/colls/sample_collection';
HOSTNAME = 'sample-account-1.documents.azure.com';
const COSMOSDB_API_VERSION = '2017-02-22'; // https://docs.microsoft.com/en-us/rest/api/cosmos-db/ <-latest versions here
const PROCEDURES_PATH = './dist/procedures';
const TRIGGERS_PATH = './dist/triggers';
const DATE = new Date().toUTCString();
try {
tryPutProcedure('TestProcedure.js').then(console.log);
} catch (error) {
console.log(error.message);
}
async function tryPutProcedure(procedureFileName) {
const procedure = fileSystem.readFileSync(`${PROCEDURES_PATH}/${procedureFileName}`, 'utf8');
const requestBody = {
body: procedure,
id: procedureFileName.replace('.js', '')
};
const authTokenPUT = getAuthorizationToken(
'PUT',
`${COLLECTION_RESOURCE_ID}/sprocs/${requestBody.id}`,
'sprocs',
DATE,
DATABASE_PRIMARY_KEY
);
const procedurePutOptions = {
hostname: HOSTNAME,
path: `/${COLLECTION_RESOURCE_ID}/sprocs/${requestBody.id}`,
method: 'PUT',
headers: {
'x-ms-version': COSMOSDB_API_VERSION,
'Content-Type': 'application/json',
'Authorization': authTokenPUT,
'x-ms-date': DATE
}
};
let response = await makeRequest(procedurePutOptions, requestBody);
return response;
}
// the makeRequest and getAuthorizationToken functions are somewhere below
We are reading the minified file synchronously with the help of the Node.js fs module. It is important to set the encoding to ‘utf8’.
In the requestBody, the parameter id
receives the name of the file minus the file extension – for aesthetics.
In this snippet, I have provided the name of the procedure in tryPutProcedure()
manually. We’ll take care of this later.
Let’s head to the terminal and say npm start
, and then node synchronizeWithCosmosDB
.
If everything goes right, we should see a result similar to this:
{ body: 'async function testProcedure(a){try{}catch(a){}}',
id: 'TestProcedure',
_rid: 'dr4DAMsTT78EAAAAAAAAgA==',
_self: 'dbs/dr4DAA==/colls/dr4DAMsTT78=/sprocs/dr4DAMsTT78EAAAAAAAAgA==/',
_etag: '"00007fd4-0000-0000-0000-5bcf327d0000"',
_ts: 1540305533 }
But there is a problem with that, and there is a reason why we have named the function tryPutProcedure()
. When we make a PUT request, we are assuming that the procedure already exists in CosmosDB, and most of the time this assumption will prove true. But sometimes we’ll get a response that looks like this:
{ code: 'NotFound',
message:
'Message: {"Errors":["Resource Not Found"]}\r\nActivityId: 756b133f-9b2c-40a1-8f26-ea3f9847719a, Request URI: /apps/f3d2423f-cec0-4d0f-ba12-5146d90faaa7/services/5d583876-e9c2-4794-be01-a5aa1169e7b0/partitions/d78876e1-7ad6-4b49
-aa73-9e1711672df1/replicas/131841768030204810p/, RequestStats: \r\nRequestStartTime: 2018-10-24T09:51:49.5582642Z, Number of regions attempted: 1\r\n, SDK: Microsoft.Azure.Documents.Common/2.0.0.0' }
Let’s fix that by adding a condition right after our makeRequest()
call:
//...more code...//
}
};
let response = await makeRequest(procedurePutOptions, requestBody);
if (response.code === 'NotFound') {
response = await postProcedure(requestBody);
}
return response;
}
This time, when we get a response with NotFound
status, we’ll be ready. The implementation of postProcedure()
:
async function postProcedure(requestBody) {
const authTokenPOST = getAuthorizationToken(
'POST',
COLLECTION_RESOURCE_ID,
'sprocs',
DATE,
DATABASE_PRIMARY_KEY
);
const procedurePostOptions = {
hostname: HOSTNAME,
path: `/${COLLECTION_RESOURCE_ID}/sprocs`,
method: 'POST',
headers: {
'x-ms-version': COSMOSDB_API_VERSION,
'Content-Type': 'application/json',
'Authorization': authTokenPOST,
'x-ms-date': DATE
}
};
return makeRequest(procedurePostOptions, requestBody);
}
Take a look at the token generation. Now we’re providing only the path to the collection and the resource type, because all we are doing is creating a new resource in /sprocs
. Of course, all of this could be done differently; we could get a list of stored procedures belonging to our collection, compare the local and remote data, and send only what needs to be sent. But we don’t want to use a sledgehammer to crack a nut, right?
Map all the things!
All we need to do now is take all the files in ./dist/procedures
and send them to the API:
async function synchronizeProcedures() {
const allProcedures = fileSystem.readdirSync(PROCEDURES_PATH);
return Promise.all(allProcedures.map(tryPutProcedure));
}
Piece of cake!
In the try-catch we used before, we can say:
synchronizeProcedures()
.then(console.log)
.catch(error => {
ci_job_fail_with_error(error); //does not matter for now
}
);
Done! Now all your stored procedures are chillin’ at CosmosDB’s crib.
The triggers
Trigger functions are a bit harder to upload, but that’s nothing we can’t deal with.
As you know, to define a trigger, we need to specify additional info about its behavior.
First, we tell it when it should fire, i.e. the Trigger Type
(Pre, Post), and second, what events it should act upon – the Trigger Operation
(All, Create, Replace, Delete).
Where do we keep that information?
In the file name.
And hence the naming convention of trigger files shall be:
TriggerName-TriggerType-TriggerOperation.
Example: TheBestTrigger-Post-Delete.js
Now, we will construct the payload of the trigger like this:
async function tryPutTrigger(triggerFileName) {
const trigger = fileSystem.readFileSync(`${TRIGGERS_PATH}/${triggerFileName}`, 'utf8');
const triggerInfo = triggerFileName.split('-');
const requestBody = {
body: trigger,
id: triggerInfo[0],
triggerType: triggerInfo[1],
triggerOperation: triggerInfo[2].replace('.js', '')
};
//...the remaining code...
The rest of the code is essentially the same, except that during the token generation we need to replace the 'sprocs'
in the path and the resource type with 'triggers'
. The same applies to the construction of the request options.
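Spelled out, the rest of tryPutTrigger() – plus a synchronizeTriggers() helper mirroring synchronizeProcedures() – might look roughly like this (a sketch; postTrigger() is analogous to postProcedure(), with the same ‘sprocs’ to ‘triggers’ substitution):

  // ...continuing tryPutTrigger() from the snippet above
  const authTokenPUT = getAuthorizationToken(
    'PUT',
    `${COLLECTION_RESOURCE_ID}/triggers/${requestBody.id}`,
    'triggers',
    DATE,
    DATABASE_PRIMARY_KEY
  );
  const triggerPutOptions = {
    hostname: HOSTNAME,
    path: `/${COLLECTION_RESOURCE_ID}/triggers/${requestBody.id}`,
    method: 'PUT',
    headers: {
      'x-ms-version': COSMOSDB_API_VERSION,
      'Content-Type': 'application/json',
      'Authorization': authTokenPUT,
      'x-ms-date': DATE
    }
  };
  let response = await makeRequest(triggerPutOptions, requestBody);
  if (response.code === 'NotFound') {
    response = await postTrigger(requestBody); // same as postProcedure(), but with 'triggers'
  }
  return response;
}

async function synchronizeTriggers() {
  const allTriggers = fileSystem.readdirSync(TRIGGERS_PATH);
  return Promise.all(allTriggers.map(tryPutTrigger));
}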
GitLab CI setup for CosmosDB
GitLab CI variables enter the scene
Eventually, we will want to execute the syncing program on the GitLab CI runner in order to update the code in every development environment.
To make that happen, we have to keep the keys and paths for each of them somewhere, and that place is the GitLab CI variables.
The plan here is simple: we take, for instance, the DATABASE_PRIMARY_KEY
and add an environment prefix – DEV
, STG
, PROD
, etc. – so it becomes DEV_DATABASE_PRIMARY_KEY
.
The runner we will be using to execute the program has access to these variables, and the way we tell our code to get them is easy:
const DATABASE_PRIMARY_KEY = process.env[`DEV_DATABASE_PRIMARY_KEY`];
We also want to be able to specify the environment to which we wish to deploy the code
from outside of the syncing program, in the following fashion: node synchronizeWithCosmosDB DEV
.
Here’s how to do it:
try {
ENVIRONMENT = process.argv[2];
DATABASE_PRIMARY_KEY =
process.env[`${ENVIRONMENT}_DATABASE_PRIMARY_KEY`];
COLLECTION_RESOURCE_ID = process.env[`${ENVIRONMENT}_COLLECTION_RESOURCE_ID`];
HOSTNAME = process.env[`${ENVIRONMENT}_HOSTNAME`];
synchronizeProcedures()
.then(console.log)
.catch(error => {
ci_job_fail_with_error(error);
}
);
synchronizeTriggers()
.then(console.log)
.catch(error => {
ci_job_fail_with_error(error);
}
);
} catch (error) {
console.log(`
Usage example:
node synchronizeWithCosmosDB DEV
`);
ci_job_fail_with_error(error);
}
function ci_job_fail_with_error(error) {
console.log(error.message);
process.exit(1);
}
The essence is in ENVIRONMENT = process.argv[2]
. It takes the first argument after the script name and assigns it to ENVIRONMENT, so we can use it when constructing the environment-specific variable names.
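In other words, for node synchronizeWithCosmosDB DEV the argument vector looks something like this (the exact paths depend on the machine):

console.log(process.argv);
// [ '/usr/local/bin/node',                                  // argv[0] - the node binary
//   '/builds/cosmosdb-functions/synchronizeWithCosmosDB.js', // argv[1] - the script itself
//   'DEV' ]                                                  // argv[2] - the environment we asked for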
Now, we can use our program with node synchronizeWithCosmosDB DEV
and that enables us to take the final step.
Note: ci_job_fail_with_error()
ensures that our CI job will fail whenever something funny happens.
The GitLab CI pipelines
A CI pipeline is basically a group of jobs that run one after another whenever we push code to our GitLab repo. Of course, for that to happen, we first need to have at least one Docker CI runner available. There’s plenty of information around on how to set up a CI runner.
Next, we define the .gitlab-ci.yml
file containing all the instructions for the runner to execute.
Azure CosmosDB JS functions: finish the job
In our .gitlab-ci.yml
file, we should define a few things before we call it a day:
image: node:latest

stages:
  - build
  - deploy

cache:
  paths:
    - node_modules/

build:
  stage: build
  script:
    - npm install
    - npm start
  artifacts:
    paths:
      - dist/
      - node_modules/
    expire_in: 15 minutes

deploy_dev:
  stage: deploy
  only:
    - develop
  script:
    - node synchronizeWithCosmosDB DEV

deploy_stg:
  stage: deploy
  only:
    - /^release\/.*$/
  script:
    - node synchronizeWithCosmosDB STG

deploy_prod:
  stage: deploy
  only:
    - master
  script:
    - node synchronizeWithCosmosDB PROD
Here we have specified two separate stages that the pipeline will want to complete on each code push – build
and deploy
.
The build stage is intended for downloading all the dependencies with the help of the npm install
command. After this task is finished, the job proceeds to the next one
– npm start
– where the transformation of our source files occurs.
The important thing in the build stage is to keep artifacts of all the downloaded dependencies in node_modules
and all the minified files in the dist
directory for the deploy
job, which will then use the dependencies to execute the syncing program and upload the transformed files. We do this with the help of the artifacts
.
The deploy stage is, first and foremost, dependent on the branch name to which the code is pushed.
Any code pushed to a branch with a name that starts with release/
will inevitably end up in the staging environment.
$ git commit -m "That's one small pipeline for GitLabCI, one giant improvement for the development process"
$ git push
That’s it! We’re done!
PS. Don’t forget about a proper .gitignore
file!
Effortless deployment of Azure CosmosDB JS functions – additional considerations
What about the CosmosDB SDK?
Yes, there are many CosmosDB SDKs available. Instead of writing our own syncing code, we could just use a few things from the Node.js CosmosDB SDK and upload the functions more easily (I suppose) with its help.
But of course, that is an additional dependency, and this dependency relies on other dependencies, and the more dependencies we use, the slower the pipelines. And by the way, what happened to our old-fashioned DIY culture?
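For comparison, the SDK route could look roughly like this – an untested sketch assuming the @azure/cosmos package, which is not used anywhere in this article:

// Hypothetical sketch using @azure/cosmos instead of the raw REST API
const { CosmosClient } = require('@azure/cosmos');

async function upsertProcedure(id, body) {
  const client = new CosmosClient({
    endpoint: `https://${HOSTNAME}`,
    key: DATABASE_PRIMARY_KEY
  });
  const scripts = client
    .database('exampleDB')
    .container('sample_collection')
    .scripts;
  try {
    // replace the stored procedure if it already exists...
    return await scripts.storedProcedure(id).replace({ id, body });
  } catch (error) {
    // ...and fall back to creating it when it does not
    return scripts.storedProcedures.create({ id, body });
  }
}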
Deletions?
Our approach doesn’t support deletions, so whenever we delete a function from our repo, that function will still be present in CosmosDB. There are multiple ways this could be solved. You are free to explore the options, improve and expand on this deployment idea for better results. If you have something interesting, please let me know in the comments below.
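To get you started, one option is to list what lives remotely after a sync and flag anything that no longer has a local file – a rough, untested sketch reusing the helpers above. It assumes the list endpoint returns a StoredProcedures array, and makeRequest() would need a small tweak to cope with empty responses before you fire the actual DELETE requests:

// Rough sketch: find remote stored procedures that no longer exist locally
async function findOrphanedProcedures() {
  const localIds = fileSystem
    .readdirSync(PROCEDURES_PATH)
    .map(fileName => fileName.replace('.js', ''));

  const listOptions = {
    hostname: HOSTNAME,
    path: `/${COLLECTION_RESOURCE_ID}/sprocs`,
    method: 'GET',
    headers: {
      'x-ms-version': COSMOSDB_API_VERSION,
      'Content-Type': 'application/json',
      'Authorization': getAuthorizationToken('GET', COLLECTION_RESOURCE_ID, 'sprocs', DATE, DATABASE_PRIMARY_KEY),
      'x-ms-date': DATE
    }
  };
  const remote = await makeRequest(listOptions, {});

  // everything that exists remotely but not locally is a candidate for deletion
  return (remote.StoredProcedures || []).filter(sproc => !localIds.includes(sproc.id));
}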
Alternative CI tools
But wait a minute, do I even need GitLab for this? Of course not! Any CI system will do just fine, so you can choose whichever CI you, your friends and family like.
And speaking of choices, remember that you have a whole variety of Babel plugins available for your code optimization, which is great if your functions have to perform some heavy lifting.
Azure CosmosDB JS functions with GitLab CI: summary
If your application is branded for different clients, you could end up copying and pasting the functions 48 times and logging into different accounts in between. The solution for all this madness is the effortless deployment of Azure CosmosDB JS functions with GitLab CI. More effective work equals more time to learn new things (check out our other tutorials)!
You can find the source code on my GitHub.