GitLab CI
Before we start diving into writing CI configuration files we're going to cover what a CI configuration looks like and go over some basics.
We'll cover concepts like:
- Pipelines versus Stages versus Jobs
- Pipeline configuration
- artifacts: and cache: blocks
- rules: and script: blocks
- The dependencies: block
And we'll use the Terraform .gitlab-ci.yml file as the example, along with some other examples and some visuals to help us along the way.
Let's get started.
Pipelines & Stages & Jobs
In GitLab CI we have three things we need to be aware of: pipelines, stages and jobs.
A pipeline contains stages. Stages contain jobs. Jobs contain configuration that tells GitLab CI what it is you want to do.
We define a single pipeline by providing a .gitlab-ci.yml file. The file itself represents the pipeline. We then define stages in this pipeline (by adding them to the file) and each stage in turn has one or more jobs added to it. Each job defines the stage it belongs to, the rules that decide if the job should execute, and the script that is executed.
It's important to understand the difference between these elements, so let's visualise this with a simple example:
graph LR
a1 --> discord1
a2 --> b1
b1 --> c1
c1 --> d1
subgraph Stage A
a1[Discord Notification]
a2[Test Code]
end
subgraph Stage B
b1[Compile Code]
end
subgraph Stage C
c1[Package Code]
end
subgraph Stage D
d1[Deploy Code]
end
subgraph Discord API
discord1[API]
end
Here we have four stages:
- Stage A
- Stage B
- Stage C
- Stage D
In Stage A we have two jobs: Discord Notification and Test Code. These jobs run in parallel under certain conditions, but we're not going to cover that at this point in time. In a future update to the book we will cover parallel execution. For now let's keep things simple.
Next we have stages B, C, and D. These run in exactly that order, and each executes a single job. Each stage depends on the previous one to do some work or produce some artefact that we need in the next stage(s).
So once the Test Code job in Stage A has completed, GitLab CI runs the next job (Compile Code) in the next stage (Stage B). This repeats: B -> C and finally C -> D, until the whole pipeline is completed.
The whole diagram represents a complete pipeline.
So remember: a pipeline contains stages, stages contain jobs and jobs contain configuration instructing GitLab CI to execute stuff for us.
As YAML
To put this into an example more closely aligned with reality, let's write out the above as actual YAML configuration:
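As a minimal sketch of such a configuration (the stage names, job names and commands here are illustrative placeholders, not a real build), it could look something like this:

```yaml
stages:
  - stage_a
  - stage_b
  - stage_c
  - stage_d

# Stage A: two jobs that GitLab CI may run in parallel
discord_notification:
  stage: stage_a
  script:
    - echo "Pretend we are calling the Discord API here"

test_code:
  stage: stage_a
  script:
    - echo "Pretend we are running tests here"

# Stage B: produces a build directory and keeps it as an artefact
compile_code:
  stage: stage_b
  script:
    - mkdir -p build
    - echo "compiled output" > build/app.bin
  artifacts:
    paths:
      - build/

# Stage C: fetches the build artefact and packages it
package_code:
  stage: stage_c
  dependencies:
    - compile_code
  script:
    - tar -czf package.tar.gz build/
  artifacts:
    paths:
      - package.tar.gz

# Stage D: fetches the package artefact and deploys it
deploy_code:
  stage: stage_d
  dependencies:
    - package_code
  script:
    - echo "Pretend we are deploying package.tar.gz here"
```

Each job names the stage it belongs to, and the order of the stages: list decides the order in which those stages run.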
This is valid YAML and a valid pipeline configuration. It contains the stages we mentioned above and their associated jobs.
Because each job is in its own stage the whole thing will run in a linear manner (except for Stage A, which has two jobs that will attempt to run in parallel). Where it matters, we also make the hand-off between jobs explicit with the artifacts: and dependencies: keywords: a later job declares which earlier job's artefacts it needs. Note that it is the stage order that enforces the linear execution; dependencies: only controls which artefacts a job downloads.
Let's now review the contents of a real CI configuration file. It's the file we'll be writing to configure our Terraform pipeline.
Pipeline Configuration
We've covered the differences between a pipeline, a stage and a job. Let's now start looking at the Terraform .gitlab-ci.yml file and begin to understand the keywords used to construct the whole pipeline.
Here are the very first few lines of our Terraform .gitlab-ci.yml file, which constitute the pipeline's global configuration as well as some default values:
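A sketch of what those opening lines might look like. The image name and the TF_ROOT variable come from this chapter; the value given to TF_ROOT and the cache key are assumptions made purely for illustration:

```yaml
image: registry.gitlab.com/gitlab-org/terraform-images/stable:latest

variables:
  # Assumed location of the Terraform code: the root of the repository
  TF_ROOT: ${CI_PROJECT_DIR}

cache:
  # Assumed cache key; any stable string would do
  key: terraform-example
  paths:
    - ${TF_ROOT}/.terraform

before_script:
  # Move into the Terraform directory before every job's script runs
  - cd ${TF_ROOT}
```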
All of this configures the pipeline to behave in a particular way and to do some tasks for us ahead of each stage. Let's review each of the configuration options above.
image:
This configures the entire pipeline to run all script: configurations (explained below) in a Docker container using a specific image: registry.gitlab.com/gitlab-org/terraform-images/stable:latest.
This particular image is perfect for our needs, not just because it provides Terraform but because it's well suited to GitLab CI pipelines thanks to some bootstrapping that's being done around Terraform. This will become clearer later on.
variables:
This configuration keyword allows us to define variables that are available for use across the entire pipeline, in all stages, and can be used for all kinds of things.
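As a small sketch of the idea, here is a pipeline-wide variable being read inside a job. Both the GREETING variable and the print_greeting job are hypothetical names used only for this example:

```yaml
variables:
  # Hypothetical variable, available to every job in the pipeline
  GREETING: "hello from the pipeline"

print_greeting:
  script:
    # The variable is exposed to the job as an environment variable
    - echo "$GREETING"
```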
cache:
Using the cache: keyword we can have the pipeline cache certain files and/or directories between stages and jobs, and even across pipelines. For us this is important because after we call terraform init we need the .terraform/ directory to be available to the other stages in the pipeline. If it wasn't, we would have to call terraform init for every job.
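A minimal sketch of caching that directory, assuming the Terraform code sits at the root of the repository and using a made-up cache key:

```yaml
cache:
  # Hypothetical key: jobs that use the same key share the same cache
  key: terraform-example
  paths:
    # The directory created by terraform init
    - .terraform/
```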
before_script:
In our stages we use the script: keyword to define the functionality of each stage and actually get our work done. The before_script: configuration is used to have a script execute before the script inside each of our script: blocks. We're using the GitLab CI provided Terraform Docker image, so we need to use this feature to move into the TF_ROOT location.
So as an example, say we had the following before_script: block:
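A minimal sketch, assuming a hypothetical directory named output:

```yaml
before_script:
  # Runs before the script: block of every job
  - mkdir -p output
```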
We're defining a script that makes sure a directory we need in each stage always exists. If we then used the following script: in a stage inside of our pipeline:
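For instance, a job along these lines, where the job name write_result and the output directory are hypothetical:

```yaml
write_result:
  script:
    # Relies on the output directory already existing
    - echo "done" > output/result.txt
```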
Before the stage's script executed, our before_script: would run, which means we'd effectively be getting (as I'm sure you've guessed):
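Written out as if it were one job, and assuming a hypothetical before_script: that creates an output directory, the effective sequence of commands would be something like:

```yaml
write_result:
  script:
    # Contributed by before_script:
    - mkdir -p output
    # The job's own script:
    - echo "done" > output/result.txt
```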
If you had two or more stages that needed this directory, then of course it would get repetitive having to provide the same command every time. Plus, if the name of the directory changed, you would only have to change the mkdir call in a single place, or better still, keep the name in a variable.
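One way to sketch that, with the directory name held in a hypothetical OUTPUT_DIR variable so it only ever needs changing in one place:

```yaml
variables:
  # Hypothetical directory name, defined once
  OUTPUT_DIR: output

before_script:
  - mkdir -p "$OUTPUT_DIR"

write_result:
  script:
    - echo "done" > "$OUTPUT_DIR/result.txt"
```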
Stages
Our Terraform pipeline has the following stages:
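Going by the four stage names used throughout this chapter, the stages: block would presumably read:

```yaml
stages:
  - validate
  - plan
  - apply
  - destroy
```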
These stages are stepped through, one by one, in the order shown. We have four stages:
- validate
- plan
- apply
- destroy
I believe if we explain the jobs behind the validate, plan and apply stages, and their respective job configurations, then we'll have enough information to successfully write the actual files themselves. The destroy stage will be understandable after you've studied the others.
Note
The GitLab CI documentation covers stages in more detail.
Validate
This is the configuration of a single job inside the validate stage:
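A sketch of such a job, pieced together from the rules and script steps described in this section. The exact layout is an assumption, and the commands assume the gitlab-terraform wrapper script that ships in the Terraform image:

```yaml
validate:
  stage: validate
  rules:
    # Manual override: always include the job when RUN_ANYWAY is YES
    - if: '$RUN_ANYWAY == "YES"'
    # Never include the job when a .destroy file exists
    - exists:
        - .destroy
      when: never
    # Otherwise include the job only when Terraform files have changed
    - changes:
        - "*.tf"
  script:
    - gitlab-terraform init
    - gitlab-terraform validate
```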
Let's break this down into its core components.
Rules
When we use the rules: keyword we're telling GitLab CI that our job (not the pipeline or even the stage as a whole) has a list of rules, at least one of which must evaluate to "true" (with a short-circuit effect in place) before this job will be included in its particular stage.
If none of the rules evaluate to "true", then this job does not execute, but the rest of the stage may well run if another job inside that stage has a rule that does evaluate to "true".
Let's look at this visually with a simple, contrived example (and assume all jobs are in a single stage):
graph LR
a[Job 1] --> b
b[Job 2] --> c
c[Job 3]
If we have the following rules for each job, we can make adjustments to them to alter the above flow...
a => rules: A_VAR==1
b => rules: B_VAR==2
c => rules: C_VAR==3
If we execute the pipeline and we set A_VAR=1 and B_VAR=2, but we set C_VAR=99, then the pipeline will look like this:
graph LR
a[Job 1] --> b
b[Job 2]
If we flip that logic on its head entirely, setting A_VAR and B_VAR to 99, and C_VAR=3, then the pipeline will look like this:
graph LR
a[Job 3]
Put another way: if a job's rules exclude it from the stage, then GitLab CI moves on to the next job looking for one that evaluates to "true" which is then included in the stage (which means the stage is included in the pipeline).
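The contrived example can be sketched as real configuration. The job names and echo commands are placeholders; the variables are the A_VAR, B_VAR and C_VAR from the example:

```yaml
job_1:
  script:
    - echo "Job 1"
  rules:
    # Included only when A_VAR is 1
    - if: '$A_VAR == "1"'

job_2:
  script:
    - echo "Job 2"
  rules:
    # Included only when B_VAR is 2
    - if: '$B_VAR == "2"'

job_3:
  script:
    - echo "Job 3"
  rules:
    # Included only when C_VAR is 3
    - if: '$C_VAR == "3"'
```

Running the pipeline with A_VAR=1, B_VAR=2 and C_VAR=99 would leave job_3 out, because its only rule does not match.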
All of our stages only have a single job defined in them.
So what rules do we have in our validate job?
We're using an exists: keyword to determine if a file (.destroy) exists or not. If it does then the when: keyword determines what should happen, and in this case never means this stage should never be included in the pipeline.
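As a sketch, a rule of that kind would be written like this (shown on its own here; in practice it sits inside a job's rules: list):

```yaml
rules:
  # If a file named .destroy exists in the repository, leave the job out
  - exists:
      - .destroy
    when: never
```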
Finally we're asking GitLab CI to check for changes to a list of pattern matches. In our case we're looking for changes to any files that match *.tf, that is, Terraform configuration files. In the event such changes do exist then this rule evaluates to true and the stage is included in the pipeline.
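A sketch of a changes rule matching that description (again shown on its own; it would normally be one entry in a job's rules: list):

```yaml
rules:
  # Include the job when any file matching *.tf has changed
  - changes:
      - "*.tf"
```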
The final rule explains why I've opted to include the first rule, the if: keyword: what if there are no changes to the Terraform files? How do I run the pipeline? Including this if: check means I can "override" the other rules and have the stage included in all cases.
This raises another important point about rules: in GitLab CI. The first rule in the list to evaluate to true determines if the stage is included in the pipeline or not. No other rules are evaluated after this point. That's why the if: clause is included first - it means all other rules are ignored if RUN_ANYWAY = YES.
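To make the ordering concrete, here is a sketch of the full rule list with the if: check first. The quoting of the expression is one common way to write it and is an assumption about the original file:

```yaml
rules:
  # Checked first: when RUN_ANYWAY is YES the job is included
  # and none of the rules below are evaluated.
  - if: '$RUN_ANYWAY == "YES"'
  # Checked second: a .destroy file excludes the job.
  - exists:
      - .destroy
    when: never
  # Checked last: changed Terraform files include the job.
  - changes:
      - "*.tf"
```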
Script
The script: keyword is basically the backbone of most CI configurations. It's how we define the actual functionality of the job within the stage. There are other things we can do with a job, like triggering other remote pipelines, but what you'll see the most is a script: keyword being used to execute some shell code.
In our script we init the Terraform installation. Then we validate that the syntax of the code is valid. If not then the stage will fail and the pipeline will come to a halt.
In the above script we're using the gitlab-terraform
Plan
Now let's review the same thing for the plan stage - its job configuration:
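A simplified sketch of such a job, with the rules left out for brevity. The plan.cache file name comes from this chapter; the plan.json name, the artefact name and the use of the gitlab-terraform wrapper commands are assumptions based on the Terraform image:

```yaml
plan:
  stage: plan
  script:
    - gitlab-terraform plan
    - gitlab-terraform plan-json
  artifacts:
    name: plan
    paths:
      # The plan file that a later job will consume
      - ${TF_ROOT}/plan.cache
    reports:
      # The JSON version handed to GitLab's Terraform integration
      terraform: ${TF_ROOT}/plan.json
```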
We have a keyword here - artifacts: - that we haven't seen before. Let's go over what it does.
artifacts
With the artifacts: keyword we're telling GitLab CI to create two artefacts: the plan file for Terraform to use at later stages, and the JSON version that gets pushed into the back end of the GitLab CI Terraform solution.
Note
We'll ignore the magic behind the latter part of this stage and instead just focus on the first part.
The artifacts: keyword is a bit like the cache: keyword we saw earlier - it stores items of interest for us. However, with artefacts we decide which stages get the artefacts themselves, whereas using cache: means whatever is cached is included in all stages. Not every stage needs whatever we store as artefacts, which is why we use them.
As we need to generate a Terraform plan so that our apply can do its job, we use the artifacts: keyword to store it for later recovery.
Apply
And finally we'll review the one job we've configured for the apply stage:
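A simplified sketch of such a job, with the rules left out for brevity. It assumes the gitlab-terraform wrapper from the Terraform image and an earlier job named plan that stored the plan file as an artefact:

```yaml
apply:
  stage: apply
  script:
    - gitlab-terraform apply
  dependencies:
    # Download only the artefacts produced by the job named plan
    - plan
```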
This job uses another keyword we've not seen yet: dependencies:.
Dependencies
In the plan stage we used the artifacts: keyword to create a downloadable artefact from our Terraform plan file (plan.cache). Now we're using the dependencies: keyword to tell the job which artefacts to download from which job. In this case it's the plan job, as defined in the code above.
This is how we move objects between jobs, stages and even pipelines: we use artifacts: and dependencies: (among other available features).
Conclusion
We've gone over the basics of a simple GitLab CI configuration file. We've now got a feel for the formatting and some of the basic keywords being used. This is enough to work with for the time being, but if you want to know more or simply explore what's available (tinkering is a good idea!) then check out the GitLab CI configuration file reference.
In the next section we're going to discuss the Terraform pipeline configuration and then begin to actually start writing out files.