Ad-hoc repairs to a failed gitlab-ce upgrade (12.8 -> 13.0.8)


While attempting to upgrade a dockerized instance of gitlab-ce, I hit a number of errors like the ones below. They caused the upgrade to fail, and the rollback to the previous version failed as well:

7/6/2020 11:07:37 AMRunning handlers:
7/6/2020 11:07:37 AMThere was an error running gitlab-ctl reconfigure:
7/6/2020 11:07:37 AM
7/6/2020 11:07:37 AMrunit_service[redis] (redis::enable line 66) had an error: Errno::ENOENT: template[/var/log/gitlab/redis/config] (/opt/gitlab/embedded/cookbooks/cache/cookbooks/runit/libraries/provider_runit_service.rb line 136) had an error: Errno::ENOENT: No such file or directory @ realpath_rec - /opt/gitlab/sv/redis/log/config

Rollback

Since the docker container exited immediately after this error, I was unable to exec into it to troubleshoot the configuration. After some fumbling around, I decided to try adding empty files at the paths named in the error log. The files live in 'log' directories, so their contents probably don't matter for getting the container to start. Fortunately for me, docker cp works on stopped containers, so I was able to run these commands to copy empty files to the locations that fail:

# touch config
# docker cp config f0a53fbfc5ff07:/opt/gitlab/sv/redis/log/config
# docker cp config f0a53fbfc5ff07:/var/log/gitlab/gitaly/config
# docker cp config f0a53fbfc5ff07:/opt/gitlab/sv/gitaly/log/config
# docker cp config f0a53fbfc5ff07:/opt/gitlab/sv/postgresql/log/config
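
The same step could be wrapped in a small function so it is easy to repeat against another container. This is only a sketch: the function name copy_placeholders and the DOCKER override variable are made up for illustration, and the path list is simply the four paths from the commands above.

```shell
#!/bin/sh
# Sketch: copy an empty placeholder file to every path that reconfigure
# reported as missing. Set DOCKER=echo to preview the commands without
# touching any container (DOCKER is a hypothetical override, not a docker feature).
DOCKER="${DOCKER:-docker}"

# The four paths used in the manual commands above, one per line.
MISSING_PATHS="/opt/gitlab/sv/redis/log/config
/var/log/gitlab/gitaly/config
/opt/gitlab/sv/gitaly/log/config
/opt/gitlab/sv/postgresql/log/config"

# copy_placeholders CONTAINER
copy_placeholders() {
    container="$1"
    placeholder="$(mktemp)"   # mktemp creates an empty file
    status=0
    for path in $MISSING_PATHS; do
        # docker cp also works on stopped containers
        "$DOCKER" cp "$placeholder" "$container:$path" || status=1
    done
    rm -f "$placeholder"
    return "$status"
}
```

Calling copy_placeholders f0a53fbfc5ff07 would then issue the same four docker cp commands as above.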

With these commands executed, I could roll back to gitlab-ce 12.8. Attempting to roll forward to a newer version, however, resulted in errors like these:


7/6/2020 11:52:45 AM    * template[/var/log/gitlab/redis/config] action create
7/6/2020 11:52:45 AM      
7/6/2020 11:52:45 AM      ================================================================================
7/6/2020 11:52:45 AM      Error executing action `create` on resource 'template[/var/log/gitlab/redis/config]'
7/6/2020 11:52:45 AM      ================================================================================
7/6/2020 11:52:45 AM
7/6/2020 11:52:45 AM      Errno::ENOENT
7/6/2020 11:52:45 AM      -------------
7/6/2020 11:52:45 AM      No such file or directory @ realpath_rec - /opt/gitlab/sv/redis/log/config
7/6/2020 11:52:45 AM
7/6/2020 11:52:45 AM      Cookbook Trace:
7/6/2020 11:52:45 AM      ---------------
7/6/2020 11:52:45 AM      /opt/gitlab/embedded/cookbooks/cache/cookbooks/runit/libraries/provider_runit_service.rb:255:in `block in <class:RunitService>'
7/6/2020 12:34:40 PMRunning handlers:
7/6/2020 12:34:40 PMThere was an error running gitlab-ctl reconfigure:
7/6/2020 12:34:40 PM
7/6/2020 12:34:40 PMMultiple failures occurred:
7/6/2020 12:34:40 PM* Errno::ENOENT occurred in chef run: template[/var/log/gitlab/gitaly/config] (/opt/gitlab/embedded/cookbooks/cache/cookbooks/runit/libraries/provider_runit_service.rb line 136) had an error: Errno::ENOENT: No such file or directory @ realpath_rec - /opt/gitlab/sv/gitaly/log/config
7/6/2020 12:34:40 PM* Errno::ENOENT occurred in delayed notification: ruby_block[restart_log_service] (/opt/gitlab/embedded/cookbooks/cache/cookbooks/runit/libraries/provider_runit_service.rb line 69) had an error: Errno::ENOENT: template[/var/log/gitlab/gitaly/config] (/opt/gitlab/embedded/cookbooks/cache/cookbooks/runit/libraries/provider_runit_service.rb line 136) had an error: Errno::ENOENT: No such file or directory @ realpath_rec - /opt/gitlab/sv/gitaly/log/config
7/6/2020 12:34:40 PM

Upgrade

To upgrade successfully, it looks like I need to copy the real contents of some of these 'config' files. I pulled the gitlab 12.10 docker image, exec'd into a fresh instance, and read the configs like this:

# docker exec -it ebf58 /bin/bash
# cat /var/log/gitlab/redis/config 
s209715200
n30
t86400
!gzip

# cat /opt/gitlab/sv/gitaly/log/config
cat: /opt/gitlab/sv/gitaly/log/config: No such file or directory

# cat /opt/gitlab/sv/postgresql/log/config
cat: /opt/gitlab/sv/postgresql/log/config: No such file or directory

(** I just used an empty file for these since they don't exist **)
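
For context, the four lines in the redis config are directives for svlogd, the logger that ships with runit: s209715200 caps the current log file at 209715200 bytes, n30 keeps up to 30 rotated files, t86400 rotates the log once it is older than 86400 seconds, and !gzip runs rotated logs through gzip. A minimal sketch for recreating that file locally follows. The function name write_redis_log_config is hypothetical, and the values are just the ones observed in the 12.10 instance above, which other versions may not share.

```shell
#!/bin/sh
# Sketch: recreate the svlogd config observed in the 12.10 container.
# write_redis_log_config OUTPUT_FILE
write_redis_log_config() {
    # Quoted heredoc delimiter so the contents are written literally.
    cat > "$1" <<'EOF'
s209715200
n30
t86400
!gzip
EOF
}
```

The resulting file can then be copied into the stopped container with docker cp, in the same way as the empty placeholders.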

I uploaded the configs (one with data, two empty) to the new docker container that would upgrade/replace the old 12.8 version. This time the upgrade to 12.10 completed successfully!
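
The read-then-upload step might be scripted roughly as below. This is a sketch under assumptions: transplant_config and the DOCKER override are hypothetical names, the source container has to be running for docker exec to work, and the target path in the usage line follows the earlier docker cp commands rather than anything confirmed above.

```shell
#!/bin/sh
# Sketch: read a config file from a running reference container and copy
# it into the (possibly stopped) container that is being upgraded.
DOCKER="${DOCKER:-docker}"

# transplant_config SOURCE_CONTAINER TARGET_CONTAINER SOURCE_PATH TARGET_PATH
transplant_config() {
    src="$1"; dst="$2"; src_path="$3"; dst_path="$4"
    tmp="$(mktemp)"
    # If the file is missing in the reference container (or exec fails),
    # fall back to an empty file, as was done by hand for gitaly and postgresql.
    if ! "$DOCKER" exec "$src" cat "$src_path" > "$tmp" 2>/dev/null; then
        : > "$tmp"
    fi
    "$DOCKER" cp "$tmp" "$dst:$dst_path"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, transplant_config ebf58 f0a53fbfc5ff07 /var/log/gitlab/redis/config /opt/gitlab/sv/redis/log/config would carry the redis log config from the fresh 12.10 instance over to the container being repaired.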

Considerations

I think my approach of bind-mounting certain gitlab directories is what caused this issue. Until now I've been able to upgrade to 'latest' at the click of a button without issue, so clearly something changed, and there seems to be a persistence issue somewhere. I'll attempt an upgrade to 13.x at some point, which will hopefully work out ok.
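
One way to check the bind-mount theory is to list what the container actually has mounted and compare that against the paths in the error messages. A small sketch, with list_mounts and the DOCKER override being hypothetical names:

```shell
#!/bin/sh
# Sketch: print a container's mounts as JSON so they can be compared
# with the paths that reconfigure reported as missing.
DOCKER="${DOCKER:-docker}"

# list_mounts CONTAINER
list_mounts() {
    # .Mounts lists bind mounts and volumes, including source and destination
    "$DOCKER" inspect --format '{{ json .Mounts }}' "$1"
}
```

Running list_mounts f0a53fbfc5ff07 would show whether /var/log/gitlab or anything under /opt/gitlab is backed by a host directory.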