Ticket #245 (closed defect: fixed)
Fix Software module failures under several situations
Reported by: juruen@…
Owned by: juruen@…
Description (last modified by juruen@…)
So far we have identified the following errors in the behaviour of the module:
When an update for apache is about to take place, the process is forked. This is necessary for the twisted thing we are doing, which is upgrading apache from within apache. When this case is detected, the process in charge of serving the HTTP request forks and returns immediately. From the user's point of view, the upgrade has already finished, and eBox allows them to upgrade another module or system package. If the user does so, eBox will fail, because apt will still be busy with the previous upgrade and will not allow another apt-get install.
The quick answer to this is that the parent process should wait() until its child process (the one doing the real upgrade) is done, but we have to keep in mind that the parent process will die as soon as apache is restarted by the package upgrade. Also, if the upgrade is pretty big, it will take too long and we will end up with either a timeout or the user refreshing the browser, which leads us to the same situation. To avoid that, we could mark somewhere that we are in the middle of an update and not allow the user to do certain tasks, such as upgrading packages, until it is done.
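One way to mark that an upgrade is in progress is an advisory lock on a file. The following is only a minimal Python sketch, not the eBox implementation (eBox itself is written in Perl); the UpgradeLock class and the lock path are hypothetical names. It relies on flock() locks belonging to the open file rather than to the process, so a forked child that inherits the descriptor keeps the lock held after the parent dies with the apache restart, and the kernel drops the lock once the child exits or crashes.

```python
import fcntl
import os


class UpgradeLock:
    """Advisory lock meaning "an upgrade is in progress" (hypothetical helper)."""

    def __init__(self, path):
        self.path = path      # assumed location, e.g. somewhere under /var/lock
        self._fd = None

    def acquire(self):
        """Try to take the lock without blocking; return True on success."""
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # LOCK_NB makes flock() fail at once if someone else holds the lock.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            os.close(fd)
            return False
        self._fd = fd
        return True

    def release(self):
        """Drop the lock if this object holds it."""
        if self._fd is not None:
            fcntl.flock(self._fd, fcntl.LOCK_UN)
            os.close(self._fd)
            self._fd = None
```

Under these assumptions, the request handler would call acquire() before forking the upgrade and, when it returns False, answer with an "upgrade in progress" message instead of starting a second apt-get install. Only the child should call release(), once the upgrade has finished.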
Apart from this obvious issue, I have experienced several problems when the Perl libraries are upgraded, and had to restart apache manually to get a functional server again. This needs further investigation, but it could be related to the use of Apache::Reload, and a possible workaround would be to disable the reloading of modules until the whole upgrade is done. It is likely that even this will not work and we will have to restart apache anyway.
There is yet another problem that we have to face. Currently, the apt.sources file contains the apt sources provided by us, which hold the ebox and extra packages, and, as usual for a Debian stable version, the security sources, at the moment the ones for sarge.
So right now, for example, let's say we have a package foobar, which has a security update available from Debian, and foobar is a service which is managed by eBox through one of its modules.
Let's say that the new package needs to restart the foobar service through /etc/init.d/foobar, but foobar is now managed through ebox using runit. In the middle of the upgrade, the postinst script of foobar gets stuck forever trying to restart the service, and the user either never gets the interface back or gets a timeout.
This is not Debian's fault, because, just off the top of my head, in Debian you can define a local policy for the packages' init scripts, and as long as the maintainer uses invoke-rc.d, the scenario explained above could work. However, it happens that we did not bear this issue in mind, and, right now, some installations out there will break because of this. What we have to achieve is to minimize the risk of having these unpleasant surprises.
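The local policy mentioned above is the policy-rc.d hook that invoke-rc.d consults before running an init script action. As an illustration only, a policy program could look like the Python sketch below. The MANAGED_SERVICES set is a hypothetical list of services supervised by eBox, and option handling is reduced to skipping flags such as --quiet. The exit codes follow the Debian policy-rc.d interface: 0 allows the action, 101 forbids it and 103 reports a syntax error.

```python
import sys

# Hypothetical: init script IDs of services that eBox supervises itself.
MANAGED_SERVICES = {"foobar"}

ALLOWED = 0          # action allowed
FORBIDDEN = 101      # action forbidden by policy
SYNTAX_ERROR = 103   # bad command line


def policy(initscript_id, actions):
    """Forbid every init script action on services that eBox manages."""
    if initscript_id in MANAGED_SERVICES:
        return FORBIDDEN
    return ALLOWED


def main(argv):
    """Return the exit code for: policy-rc.d [options] <id> <actions> [<runlevel>]."""
    # This sketch ignores options such as --quiet instead of interpreting them.
    args = [a for a in argv[1:] if not a.startswith("--")]
    if len(args) < 2:
        return SYNTAX_ERROR
    return policy(args[0], args[1])
```

Installed as /usr/sbin/policy-rc.d with a final sys.exit(main(sys.argv)), a script along these lines would make invoke-rc.d skip the restart of foobar in the postinst script, leaving the service to eBox. It does not help with packages whose maintainer scripts call /etc/init.d/foobar directly.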
To overcome this issue, what we suggest is not to allow security updates directly from Debian, but only from our servers. This way, when there is a new security update, we will check that it does not break anything; if it does not, we upload the update straight to our security repository. If it does break something, we will fix the package or the ebox packages before making it available.
I think it is very important that our upgrading mechanism works smoothly and in the most reliable way possible. I strongly believe that we have to achieve this as soon as possible in our roadmap.
I am going to be working on this topic for a few days, and I think we should be willing to trim down some of our cool features in the software module if we do not get something stable enough. If we do not work out the above issues soon, we will release a version with just automatic upgrading, either at a given time or triggered by the user, which stops apache, upgrades the system and restarts apache again. Of course, this does not mean we should abandon the cool approach of selecting packages individually and so on.
- Description modified
- Reporter changed from anonymous to juruen@…
- priority changed from normal to high
- Summary changed from module software fails under several situations to Fix Software module failures under several situations
- hours changed from 0.0 to 20.0
- Status changed from new to closed
- Resolution set to fixed
- totalhours changed from 0.0 to 20.0