Hello,
Due to the way file systems work, merely using the existence of a
file as a lock is not robust, as has been observed.
Please take the time to review a patch for amprolla3. It makes
sure only one amprolla-related operation is running at any given
time and does not suffer from the
lingering-lock-file-breaks-everything issue.
man 2 flock explains this in more detail than I could.
The patch itself:
https://git.devuan.org/devuan-infrastructure/amprolla3/merge_requests/2
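To illustrate the difference (this is not the code from the merge
request, just a minimal sketch): with flock the lock belongs to an
open file, so it goes away when the holder closes the file or dies,
and a leftover lock file on disk is harmless. The path and the
try_lock helper below are made up for the example.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an open file holding an exclusive flock, or None if busy.

    Hypothetical helper, not part of amprolla3.
    """
    # "a" creates the file if needed; whether it already exists is irrelevant.
    fh = open(path, "a")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fh.close()
        return None
    return fh


# Hypothetical lock path, placed in a scratch directory for the demo.
lock_path = os.path.join(tempfile.mkdtemp(), "amprolla.lock")

first = try_lock(lock_path)   # obtains the lock
second = try_lock(lock_path)  # a separate open() of the same file is refused
first.close()                 # closing the file releases the lock
third = try_lock(lock_path)   # the file is still on disk, yet the lock is free
```

The same release happens if the holding process crashes, which is
exactly what a plain "does the file exist?" check cannot provide.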
Then there is a bit of an issue with the orchestrate.sh script,
and indeed, with locking in general.
The way I understand things, there are three processes (A, B, C):
A. amprolla_update is run with orchestrate.sh (very often!):
   1. there is always a consistent working set in merged, which
      usually points to merged-production
   2. before amprolla_update, merged switches to merged-staging (why
      here and not after amprolla_update?)
   3. amprolla_update works against merged-volatile; during this
      process that directory is not necessarily fully consistent
   4. after amprolla_update, merged-volatile is synchronised to
      merged-production
   5. merged switches to merged-production
   6. merged-volatile is synchronised to merged-staging
   7. merged-production is synchronised to pkgmaster
B. Sometimes amprolla_merge is run
C. Sometimes amprolla_merge_contents + amprolla_merge are run
The *intent* of the implemented locks appears to be that only one
process out of A, B, C is active at a time. In reality this is not
formally guaranteed: the timing just happens to work out, and that
shouldn't be relied upon.
An active lock does prevent A, B or C from starting. But in A the
lock is only held while step A.3 is executed; everything else runs
without a lock. So if, e.g., A.4 to A.7 take long enough that A.4
of the next run is executing while A.7 of the previous run is still
happening, bogus data will be synchronised to pkgmaster.
Since A.7 is a network operation and A.5 and A.6 are disk
operations, delays *could* happen and they could lead to this
scenario (maybe it has happened at some point).
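A toy timeline makes the overlap concrete. All durations and the
start time of the second run are invented for the example; the point
is only that the second run finds the lock free (the first run's A.3
is long over) and its A.4 then lands in the middle of the first
run's A.7.

```python
# Steps of process A with hypothetical durations in seconds.
STEPS = [("A.2", 1), ("A.3", 10), ("A.4", 5),
         ("A.5", 1), ("A.6", 5), ("A.7", 30)]


def timeline(start):
    """Return {step: (begin, end)} for one run of A starting at `start`."""
    t, out = start, {}
    for name, duration in STEPS:
        out[name] = (t, t + duration)
        t += duration
    return out


def overlaps(a, b):
    """True if the half-open intervals a and b share any time."""
    return a[0] < b[1] and b[0] < a[1]


run1 = timeline(0)
run2 = timeline(25)  # assumed: the next run is started 25 s later

# The lock only covers A.3, and the two A.3 windows do not overlap,
# so the lock never gets a chance to stop the second run.
lock_conflict = overlaps(run1["A.3"], run2["A.3"])

# Yet run2 writes merged-production (A.4) while run1 is still
# pushing merged-production to pkgmaster (A.7).
data_race = overlaps(run1["A.7"], run2["A.4"])

print("lock conflict:", lock_conflict, "/ data race:", data_race)
```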
That's why I added support for the --no-lock-I-am-sure argument to
amprolla_update. The orchestrate.sh script instead obtains the lock
as a step A.0 and holds it throughout all of process A.
With that, I think only two directories are needed now:
merged-volatile and merged-production, with A being redefined as:
A. amprolla_update is run with orchestrate.sh (very often!):
   0. obtain the amprolla lock and exit if it can't be obtained
   1. there is always a consistent working set in merged, which
      usually points to merged-production
   2. amprolla_update --no-lock-I-am-sure works against
      merged-volatile; during this process that directory is not
      necessarily fully consistent
   3. merged switches to merged-volatile
   4. merged-volatile is synchronised to merged-production
   5. merged switches to merged-production
   6. merged-production is synchronised to pkgmaster
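The shape of the redefined process A can be sketched as follows.
This is an illustration in Python, not the actual orchestrate.sh:
amprolla_lock and run_process_a are hypothetical names, the lock path
is made up, and each step is only recorded in a list instead of
being executed.

```python
import contextlib
import fcntl
import os
import tempfile


@contextlib.contextmanager
def amprolla_lock(path):
    """Yield True while holding an exclusive flock on path, False if busy."""
    with open(path, "a") as fh:
        try:
            fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        yield True
    # Leaving the with-block closes fh, which releases the lock.


def run_process_a(lock_path, log):
    """Run all of process A under a single lock; return False if it was busy."""
    with amprolla_lock(lock_path) as held:
        if not held:
            log.append("lock busy, exiting")  # step 0 failed
            return False
        # Placeholders for the real commands, all run under the lock.
        log.append("amprolla_update --no-lock-I-am-sure on merged-volatile")
        log.append("merged -> merged-volatile")
        log.append("sync merged-volatile to merged-production")
        log.append("merged -> merged-production")
        log.append("sync merged-production to pkgmaster")
    return True


lock_path = os.path.join(tempfile.mkdtemp(), "amprolla.lock")  # made-up path

log = []
ok = run_process_a(lock_path, log)

# While something else holds the lock, a second run gives up at step 0.
with amprolla_lock(lock_path) as held:
    blocked_log = []
    blocked = run_process_a(lock_path, blocked_log)
```

Because the lock spans every step, a slow synchronisation to
pkgmaster simply makes the next run exit early instead of
overlapping with it.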
I have adapted orchestrate.sh accordingly, so merged-staging is no
longer used with these patches and could, in theory, be removed.
I hope this makes sense.
--
Evilham