Continuous Integration
======================

GitLab CI
---------

GitLab provides a convenient framework for running commands in response to Git pushes.
We use it to test merge requests (MRs) before merging them (pre-merge testing),
and to test everything that lands on ``main`` after merging (post-merge testing).
Post-merge testing is necessary because we still allow commits to be pushed
outside of MRs, and even then the MR CI runs in the forked repository, which
might have been modified and thus is unreliable.

The CI runs a number of tests, from trivial build-testing to complex GPU rendering:

- Build testing for a number of configurations and platforms
- Sanity checks (``meson test``)
- Most drivers are also tested using several test suites, such as the
  `Vulkan/GL/GLES conformance test suite <https://github.com/KhronosGroup/VK-GL-CTS>`__,
  `Piglit <https://gitlab.freedesktop.org/mesa/piglit>`__, and others.
- Replay of application traces

A typical run takes between 20 and 30 minutes, although it can take much longer
if the GitLab runners are overwhelmed, which happens sometimes. When it does happen,
not much can be done besides waiting it out or cancelling the pipeline.
You can do your part by only running the jobs you care about by using `our
tool <#running-specific-ci-jobs>`__.

Due to limited resources, we currently do not run the CI automatically
on every push; instead, we only run it automatically once the MR has
been assigned to ``Marge``, our merge bot.

If you're interested in the details, the main configuration file is ``.gitlab-ci.yml``,
and it references a number of other files in ``.gitlab-ci/``.

If the GitLab CI doesn't seem to be running on your fork (or MRs, as they run
in the context of your fork), you should check the "Settings" of your fork.
Under "CI / CD" → "General pipelines", make sure "Custom CI config path" is
empty (or set to the default ``.gitlab-ci.yml``), and that the
"Public pipelines" box is checked.

If you're having issues with the GitLab CI, your best bet is to ask
about it on ``#freedesktop`` on OFTC and tag `Daniel Stone
<https://gitlab.freedesktop.org/daniels>`__ (``daniels`` on IRC) or
`Emma Anholt <https://gitlab.freedesktop.org/anholt>`__ (``anholt`` on
IRC).

The three GitLab CI systems currently integrated are:

.. toctree::
   :maxdepth: 1

   bare-metal
   LAVA
   docker

Farm management
---------------

.. note::
   Never mix disabling/re-enabling a farm with any change that can affect a job
   that runs in another farm!

When a farm starts failing for any reason (power, network, out-of-space), it
needs to be disabled by pushing a separate MR containing:

.. code-block:: console

   git mv .ci-farms{,-disabled}/$farm_name

Once the farm has been restored, it can be re-enabled by pushing a new merge
request containing:

.. code-block:: console

   git mv .ci-farms{-disabled,}/$farm_name

.. warning::
   Pushing (``git push``) directly to ``main`` is forbidden; this change must
   be sent as a :ref:`Merge Request <merging>`.

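The brace expansion in the two ``git mv`` commands above can be easy to misread.
As an illustrative sketch only, the hypothetical helper below (not part of Mesa's
tooling) spells out which source and destination paths each command resolves to;
the farm name ``example-farm`` is made up.

.. code-block:: python

   def farm_move_paths(farm_name: str, disable: bool) -> tuple[str, str]:
       """Return the (source, destination) pair that ``git mv`` receives."""
       enabled = f".ci-farms/{farm_name}"
       disabled = f".ci-farms-disabled/{farm_name}"
       # .ci-farms{,-disabled}/X expands to ".ci-farms/X .ci-farms-disabled/X",
       # while .ci-farms{-disabled,}/X expands to the same pair in reverse order.
       return (enabled, disabled) if disable else (disabled, enabled)

   # Disabling moves the farm out of .ci-farms; re-enabling moves it back.
   print(farm_move_paths("example-farm", disable=True))
   print(farm_move_paths("example-farm", disable=False))
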
Application traces replay
-------------------------

The CI replays application traces with various drivers in two different jobs. The first
job replays traces listed in ``src/<driver>/ci/traces-<driver>.yml`` files, and if any
of those traces fail, the pipeline fails as well. The second job replays traces listed in
``src/<driver>/ci/restricted-traces-<driver>.yml`` and it is allowed to fail. This second
job is only created when the pipeline is triggered by ``marge-bot`` or any other user that
has been granted access to these traces.

A traces YAML file also includes a ``download-url`` pointing to the MinIO
instance from which the traces are downloaded. While the first job should always work with
publicly accessible traces, the second job could point to a URL with restricted access.

Restricted traces are those that have been made available to Mesa developers without a
license to redistribute at will, and thus should not be exposed to the public. Failing to
access that URL does not prevent the pipeline from passing, so changes from forks made by
contributors without permission to download non-redistributable traces can be merged
without friction.

As an aside, only maintainers of such non-redistributable traces are responsible for
ensuring that replays are successful, since other contributors would not be able to
download and test them by themselves.

Mesa contributors who believe they could be granted access to such
non-redistributable traces can request permission from Daniel Stone <[email protected]>.

gitlab.freedesktop.org accounts that are to be granted access to these traces will be
added to the OPA policy for the MinIO repository as per
https://gitlab.freedesktop.org/freedesktop/helm-gitlab-infra/-/commit/a3cd632743019f68ac8a829267deb262d9670958 .

For the jobs to be created in personal repositories, the name of the user's account needs
to be added to the rules attribute of the GitLab CI job that accesses the restricted
traces.

.. toctree::
   :maxdepth: 1

   local-traces

Intel CI
--------

The Intel CI is not yet integrated into the GitLab CI.
For now, special access must be manually given (file an issue in
`the Intel CI configuration repo <https://gitlab.freedesktop.org/Mesa_CI/mesa_jenkins>`__
if you think you or Mesa would benefit from you having access to the Intel CI).
If you are *not* an Intel employee, results can be seen on
`mesa-ci.01.org <https://mesa-ci.01.org>`__; Intel employees
can access a better interface on
`mesa-ci-results.jf.intel.com <http://mesa-ci-results.jf.intel.com>`__.

The Intel CI runs a much larger array of tests, on a number of generations
of Intel hardware and on multiple platforms (X11, Wayland, DRM & Android),
with the purpose of detecting regressions.
Tests include
`Crucible <https://gitlab.freedesktop.org/mesa/crucible>`__,
`VK-GL-CTS <https://github.com/KhronosGroup/VK-GL-CTS>`__,
`dEQP <https://android.googlesource.com/platform/external/deqp>`__,
`Piglit <https://gitlab.freedesktop.org/mesa/piglit>`__,
`Skia <https://skia.googlesource.com/skia>`__,
`VkRunner <https://github.com/Igalia/vkrunner>`__,
`WebGL <https://github.com/KhronosGroup/WebGL>`__,
and a few other tools.
A typical run takes between 30 minutes and an hour.

If you're having issues with the Intel CI, your best bet is to ask about
it on ``#dri-devel`` on OFTC and tag `Nico Cortes
<https://gitlab.freedesktop.org/ngcortes>`__ (``ngcortes`` on IRC).

.. _CI-job-user-expectations:

CI job user expectations
------------------------

To make sure that testing of one vendor's drivers doesn't block
unrelated work by other vendors, we require that a given driver's test
farm produces a spurious failure no more than once a week. If every
driver had CI and failed once a week, we would be seeing someone's
code getting blocked on a spurious failure daily, which is an
unacceptable cost to the project.

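The arithmetic behind this requirement can be sketched as follows. The farm
count of 7 is an assumed figure chosen purely for illustration, and flakes are
assumed to be independent and spread evenly over the week.

.. code-block:: python

   def expected_spurious_failures_per_day(num_farms: int,
                                          flakes_per_farm_per_week: float) -> float:
       """Expected number of spurious failures per day across all farms."""
       return num_farms * flakes_per_farm_per_week / 7

   # Hypothetical: 7 farms, each producing one spurious failure per week.
   print(expected_spurious_failures_per_day(7, 1))  # 1.0

Under these assumptions, the per-farm allowance adds up to roughly one blocked
pipeline per day for the project as a whole.
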
To ensure that, driver maintainers with CI enabled should watch the Flakes panel
of the `CI flakes dashboard
<https://ci-stats-grafana.freedesktop.org/d/Ae_TLIwVk/mesa-ci-quality-false-positives?orgId=1>`__,
particularly the "Flake jobs" pane, to inspect jobs in their driver where the
automatic retry of a failing job produced a success a second time.
Additionally, most CI reports test-level flakes to an IRC channel, and flakes
reported as NEW are not expected and could cause spurious failures in jobs.
Please track the NEW reports in jobs and add them as appropriate to the
``-flakes.txt`` file for your driver.

Additionally, the test farm needs to be able to provide a short enough
turnaround time that we can get our MRs through marge-bot without the pipeline
backing up. As a result, we require that the test farm be able to handle a
whole pipeline's worth of jobs in less than 15 minutes (to compare, the build
stage is about 10 minutes). Given boot times and intermittent network delays,
this generally means that the test runtime as reported by deqp-runner should be
kept to 10 minutes.

If a test farm lacks the hardware to provide these guarantees, consider dropping
tests to reduce runtime. dEQP job logs print the slowest tests at the end of
the run, and Piglit logs the runtime of tests in the ``results.json.bz2`` in the
artifacts. Alternatively, you can add the following to your job to run only a
fraction (in this case, 1/10th) of the dEQP tests:

.. code-block:: yaml

   variables:
     DEQP_FRACTION: 10

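To give a feel for what running a fraction of the test list means, here is a
minimal sketch that keeps one test out of every ten. It assumes that the
fraction is applied by taking every Nth entry of the test list; the exact
selection performed by deqp-runner may differ, and the test names are invented.

.. code-block:: python

   def select_fraction(tests: list[str], fraction: int) -> list[str]:
       """Keep one test out of every ``fraction`` tests (assumed selection rule)."""
       if fraction < 1:
           raise ValueError("fraction must be a positive integer")
       return tests[::fraction]

   # Hypothetical test list of 100 entries.
   tests = [f"dEQP-GLES2.example.test_{i}" for i in range(100)]
   subset = select_fraction(tests, 10)
   print(len(subset))  # 10
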
For Collabora's LAVA farm, the `device types
<https://lava.collabora.dev/scheduler/device_types>`__ page can tell you how
many boards of a specific tag are currently available by adding the "Idle" and
"Busy" columns. For bare-metal, a GitLab admin can look at the `runners
<https://gitlab.freedesktop.org/admin/runners>`__ page. A pipeline should
probably not create more jobs for a board type than there are boards, unless you
clearly have some short-runtime jobs.

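As a rough sanity check of the guidance above, the sketch below estimates how
long one board type needs to drain a pipeline's jobs when the jobs run in waves
across the available boards. The 15-minute budget comes from the text; the
board counts, job counts and per-job duration are hypothetical, and real
scheduling is less regular than this model.

.. code-block:: python

   import math

   def farm_drain_minutes(num_jobs: int, num_boards: int,
                          minutes_per_job: float) -> float:
       """Estimate wall-clock time, assuming equal-length jobs run in waves."""
       if num_boards < 1:
           raise ValueError("need at least one board")
       waves = math.ceil(num_jobs / num_boards)
       return waves * minutes_per_job

   BUDGET_MINUTES = 15  # turnaround requirement stated in this section

   # Hypothetical: 12 minutes per job including boot and network overhead.
   print(farm_drain_minutes(4, 4, 12) <= BUDGET_MINUTES)  # True: one wave
   print(farm_drain_minutes(6, 4, 12) <= BUDGET_MINUTES)  # False: two waves
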
If a HW CI farm goes offline (network dies and all CI pipelines end up
stalled) or its runners are consistently spuriously failing (disk
full?), and the maintainer is not immediately available to fix the
issue, please push through an MR disabling that farm's jobs according
to the `Farm Management <#farm-management>`__ instructions.

Personal runners
----------------

Mesa's CI is currently run primarily on packet.net's m1xlarge nodes
(2.2 GHz Sandy Bridge), with each job getting 8 cores allocated. You
can speed up your personal CI builds (and marge-bot merges) by using a
faster personal machine as a runner. You can find the gitlab-runner
package in Debian, or use GitLab's own builds.

To do so, follow `GitLab's instructions
<https://docs.gitlab.com/ee/ci/runners/runners_scope.html#create-a-project-runner-with-a-runner-authentication-token>`__
to register your personal GitLab runner in your Mesa fork. Then, tell
Mesa how many jobs it should serve (``concurrent=``) and how many
cores those jobs should use (``FDO_CI_CONCURRENT=``) by editing these
lines in ``/etc/gitlab-runner/config.toml``, for example:

.. code-block:: toml

   concurrent = 2

   [[runners]]
   environment = ["FDO_CI_CONCURRENT=16"]

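One way to pick these two values is to divide the machine's cores between the
jobs served at once, so that ``concurrent`` times ``FDO_CI_CONCURRENT`` does not
exceed the core count. This is only a heuristic sketch rather than an official
recommendation, and the helper name is hypothetical.

.. code-block:: python

   import os

   def cores_per_job(total_cores: int, concurrent_jobs: int) -> int:
       """Split the available cores evenly between concurrently served jobs."""
       if concurrent_jobs < 1:
           raise ValueError("concurrent_jobs must be at least 1")
       return max(1, total_cores // concurrent_jobs)

   # The example configuration above (2 jobs, 16 cores each) fits a 32-core machine.
   print(cores_per_job(32, 2))  # 16

   # Suggested values for the machine this script runs on.
   local_cores = os.cpu_count() or 1
   print(f"concurrent = 2, FDO_CI_CONCURRENT={cores_per_job(local_cores, 2)}")
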
Docker caching
--------------

The CI system uses Docker images extensively to cache
infrequently-updated build content like the CTS. The `freedesktop.org
CI templates
<https://gitlab.freedesktop.org/freedesktop/ci-templates/>`__ help us
manage the building of the images to reduce how frequently rebuilds
happen, and trim down the images (stripping out manpages, cleaning the
apt cache, and other such common pitfalls of building Docker images).

When running a container job, the templates will look for an existing
build of that image in the container registry under
``MESA_IMAGE_TAG``. If it's found, it will be reused; if
not, the associated ``.gitlab-ci/containers/<jobname>.sh`` will be run
to build it. So, when developing any change to container build
scripts, you need to update the associated ``MESA_IMAGE_TAG`` to
a new unique string. We recommend using the current date plus some
string related to your branch (so that if you rebase on someone else's
container update from the same day, you will get a Git conflict
instead of silently reusing their container).

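A tag following this recommendation could be generated along these lines. The
``YYYY-MM-DD-suffix`` layout and the helper name are assumptions made for
illustration; any string that is unique to your change works.

.. code-block:: python

   import datetime
   import re
   from typing import Optional

   def make_image_tag(branch_hint: str,
                      today: Optional[datetime.date] = None) -> str:
       """Combine the current date with a short branch-related suffix."""
       today = today or datetime.date.today()
       # Keep only characters that are safe in a container tag.
       suffix = re.sub(r"[^A-Za-z0-9_.-]+", "-", branch_hint).strip("-")
       return f"{today.isoformat()}-{suffix}"

   print(make_image_tag("my/feature", datetime.date(2020, 9, 11)))
   # 2020-09-11-my-feature
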
When developing a given change to your Docker image, you would have to
bump the tag on each ``git commit --amend`` to your development
branch, which can get tedious. Instead, you can navigate to the
`container registry
<https://gitlab.freedesktop.org/mesa/mesa/container_registry>`__ for
your repository and delete the tag to force a rebuild. When your code
is eventually merged to main, a full image rebuild will occur again
(forks inherit images from the main repo, but MRs don't propagate
images from the fork into the main repo's registry).

Building locally using CI docker images
---------------------------------------

It can be frustrating to debug build failures on an environment you
don't personally have. If you're experiencing this with the CI
builds, you can use Docker to use their build environment locally. Go
to your job log, and at the top you'll see a line like::

    Pulling docker image registry.freedesktop.org/anholt/mesa/debian/android_build:2020-09-11

We'll use a volume mount to make our current Mesa tree be what the
Docker container uses, so they'll share everything (their build will
go in ``_build``, according to ``meson-build.sh``). We're going to be
using the image non-interactively so we use ``run --rm $IMAGE
command`` instead of ``run -it $IMAGE bash`` (which you may also find
useful for debugging). Extract your build setup variables from
``.gitlab-ci.yml`` and run the CI meson build script:

.. code-block:: console

   IMAGE=registry.freedesktop.org/anholt/mesa/debian/android_build:2020-09-11
   sudo docker pull $IMAGE
   sudo docker run --rm -v `pwd`:/mesa -w /mesa $IMAGE env PKG_CONFIG_PATH=/usr/local/lib/aarch64-linux-android/pkgconfig/:/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/lib/aarch64-linux-android/pkgconfig/ GALLIUM_DRIVERS=freedreno UNWIND=disabled EXTRA_OPTION="-D android-stub=true -D llvm=disabled" DRI_LOADERS="-D glx=disabled -D gbm=disabled -D egl=enabled -D platforms=android" CROSS=aarch64-linux-android ./.gitlab-ci/meson-build.sh

All you have left over from the build is its output, and a ``_build``
directory. You can hack on Mesa and iterate testing the build with:

.. code-block:: console

   sudo docker run --rm -v `pwd`:/mesa $IMAGE meson compile -C /mesa/_build

Running specific CI jobs
------------------------

You can use ``bin/ci/ci_run_n_monitor.py`` to run specific CI jobs. It
will automatically take care of running all the jobs that yours depend on,
and cancel the rest to avoid wasting resources.

See ``bin/ci/ci_run_n_monitor.py --help`` for all the options.

The ``--target`` argument takes a regex that you can use to select the
job names you want to run, e.g. ``--target 'zink.*'`` will run all the
zink jobs, leaving the other drivers' jobs free for others to use.

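To preview which jobs a pattern would select, you can try the regex against a
list of job names. The sketch below assumes the pattern must match the whole job
name, which may not be exactly how ``ci_run_n_monitor.py`` applies it, and the
job names are invented for the example.

.. code-block:: python

   import re

   def select_jobs(pattern: str, job_names: list[str]) -> list[str]:
       """Return the job names that the pattern matches in full."""
       regex = re.compile(pattern)
       return [name for name in job_names if regex.fullmatch(name)]

   # Hypothetical job names.
   jobs = ["zink-example-a", "zink-example-b", "other-driver-example"]
   print(select_jobs("zink.*", jobs))  # ['zink-example-a', 'zink-example-b']
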
Conformance Tests
-----------------

Some conformance tests require special treatment to be maintained on GitLab CI.
This section lists their documentation pages.

.. toctree::
   :maxdepth: 1

   skqp


Updating GitLab CI Linux Kernel
-------------------------------

GitLab CI usually runs a bleeding-edge kernel. The following documentation has
instructions on how to uprev the Linux kernel in the GitLab CI ecosystem.

.. toctree::
   :maxdepth: 1

   kernel

Reusing CI scripts for other projects
--------------------------------------

The CI scripts in ``.gitlab-ci/`` can be reused by other projects. To
facilitate reuse of the infrastructure, our scripts can be used as tools
to create containers and run tests on the available farms.

.. envvar:: EXTRA_LOCAL_PACKAGES

   Define extra Debian packages to be installed in the container.