Ticket #3082 (closed defect: fixed)
LDAP Slaves fail to sync on Zentyal Beta 2.2-rc1
| Reported by: | r.hoelzemer@… | Owned by: | cperez@… |
|---|---|---|---|
| Milestone: | 2.2 | Component: | users |
| Severity: | normal | Keywords: | ldap master slave replication |
| Cc: |
Description
Tested with three (virtual) machines, a master ("ldap") and two slaves ("gateway" and "server"), following the wiki HOWTO here. I already have a master/slave scenario with EBox 1.5.x in production at work, so I know the pitfalls of that setup.
It might also be worth noting that I have done this whole setup three times now to make sure it is indeed a bug and not some silly mistake I might have made.
Here's what I have done to get things going:
- made sure to uninstall apparmor and reboot on each machine
- made sure that all machines are resolvable via dns
- made sure the master only has the "UsersAndGroups" module installed
- set the incoming ldap rule on the master to accept traffic
Configuring the master and joining the slaves afterwards was a piece of cake. Basically, everything went fine up to this point. I can see the slaves listed in the master and the slaves themselves seem to be connected to the master.
After successfully joining the slaves, here's what I did:
- created a new group named "samba" (this went OK, the group is visible on the slaves)
- created a new user "bob" (failed! It is listed in the slave operations list on the master for both slaves)
From this point on, all operations, whether new users or new groups, fail to replicate to the slaves.
Attached are the logfiles of all three machines.
Attachments
Change History
comment:1 Changed 22 months ago by r.hoelzemer@…
Sorry for the doublepost! Trac wouldn't let me upload a zipfile of all logs because it thought it was some sort of spam. :)
Also, I tried to attach all nine logfiles, but apparently there is a six file maximum for attachments here. The posted ones are from the master ("ldap") and one slave ("gateway"). If you still need logs from the other slave ("server"), just ping me. :)
comment:2 Changed 22 months ago by cperez@…
- Status changed from new to closed
- Resolution set to fixed
comment:3 Changed 22 months ago by r.hoelzemer@…
- Status changed from closed to reopened
- Resolution fixed deleted
Hi cperez,
unfortunately changeset 22624 didn't actually fix this bug. I tested this in a similar environment as mentioned above, but manually removed the line
$self->_manageService('stop');
in UsersAndGroups.pm right after installing the module and before configuring anything. Before and after configuring/joining the master and the slaves, I made sure that slapd is indeed running on all machines.
openldap 11568 0.0 1.4 161956 7084 ? Ssl 13:38 0:00 /usr/sbin/slapd -d 0 -h ldap://0.0.0.0:1389/ -u openldap -g openldap -F /etc/ldap/slapd-replica.d
openldap 11589 0.0 1.4 229772 7140 ? Ssl 13:38 0:00 /usr/sbin/slapd -d 0 -h ldap://127.0.0.1:1390/ -u openldap -g openldap -F /etc/ldap/slapd-translucent.d
openldap 11609 0.0 1.4 147780 7196 ? Ssl 13:38 0:00 /usr/sbin/slapd -d 0 -h ldap://0.0.0.0/ ldapi://%2fvar%2frun%2fslapd%2fldapi/????x-mod=0777 -u openldap -g openldap -F /etc/ldap/slapd-frontend.d
The result is exactly the same as before. The errors I get are already available in the last logs I posted.
In the master logfile, lines 77-78:
77 2011/07/31 19:46:05 ERROR> Ldap.pm:701 EBox::Ldap::_errorOnLdap - $VAR1 = 'cn=master,dc=example,dc=de';
78 2011/07/31 19:46:05 ERROR> Ldap.pm:703 EBox::Ldap::_errorOnLdap - Unknown error at EBox::UsersAndGroups::__ANON__ No such object
and in the slave logfile, lines 190-195:
190 2011/07/31 19:49:53 ERROR> Ldap.pm:701 EBox::Ldap::_errorOnLdap - $VAR1 = 'ou=Users,dc=example,dc=de';
191 2011/07/31 19:49:53 ERROR> Ldap.pm:703 EBox::Ldap::_errorOnLdap - Unknown error at EBox::UsersAndGroups::__ANON__ Referral received
192 2011/07/31 19:49:53 ERROR> Ldap.pm:701 EBox::Ldap::_errorOnLdap - $VAR1 = 'ou=Groups,dc=example,dc=de';
193 2011/07/31 19:49:53 ERROR> Ldap.pm:703 EBox::Ldap::_errorOnLdap - Unknown error at EBox::UsersAndGroups::__ANON__ Referral received
194 2011/07/31 19:49:53 ERROR> Ldap.pm:701 EBox::Ldap::_errorOnLdap - $VAR1 = 'cn=__USERS__,ou=Groups,dc=example,dc=de';
195 2011/07/31 19:49:53 ERROR> Ldap.pm:703 EBox::Ldap::_errorOnLdap - Unknown error at EBox::UsersAndGroups::__ANON__ Referral received
and lines 216-217:
216 2011/07/31 20:32:55 ERROR> Ldap.pm:701 EBox::Ldap::_errorOnLdap - $VAR1 = 'cn=samba,ou=Groups,dc=example,dc=de';
217 2011/07/31 20:32:55 ERROR> Ldap.pm:703 EBox::Ldap::_errorOnLdap - Unknown error at EBox::UsersAndGroups::__ANON__ Referral received
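For anyone sifting through longer logs of this kind, here is a minimal sketch that pairs each logged DN (the Ldap.pm:701 line) with the error message that follows it (the Ldap.pm:703 line) and groups the DNs by message. The line format is assumed from the excerpts in this comment only; the function name group_errors and both regular expressions are illustrative and not part of Zentyal.

```python
import re
from collections import defaultdict

# Sample lines copied from the excerpts in this ticket (logfile line numbers stripped).
SAMPLE = """\
2011/07/31 19:46:05 ERROR> Ldap.pm:701 EBox::Ldap::_errorOnLdap - $VAR1 = 'cn=master,dc=example,dc=de';
2011/07/31 19:46:05 ERROR> Ldap.pm:703 EBox::Ldap::_errorOnLdap - Unknown error at EBox::UsersAndGroups::__ANON__ No such object
2011/07/31 19:49:53 ERROR> Ldap.pm:701 EBox::Ldap::_errorOnLdap - $VAR1 = 'ou=Users,dc=example,dc=de';
2011/07/31 19:49:53 ERROR> Ldap.pm:703 EBox::Ldap::_errorOnLdap - Unknown error at EBox::UsersAndGroups::__ANON__ Referral received
2011/07/31 20:32:55 ERROR> Ldap.pm:701 EBox::Ldap::_errorOnLdap - $VAR1 = 'cn=samba,ou=Groups,dc=example,dc=de';
2011/07/31 20:32:55 ERROR> Ldap.pm:703 EBox::Ldap::_errorOnLdap - Unknown error at EBox::UsersAndGroups::__ANON__ Referral received
"""

# Assumed format: the DN is dumped first, the message follows on the next error line.
DN_RE = re.compile(r"_errorOnLdap - \$VAR1 = '([^']*)';")
MSG_RE = re.compile(r"_errorOnLdap - Unknown error at \S+ (.+)$")


def group_errors(text):
    """Map each error message to the list of DNs it was reported for."""
    grouped = defaultdict(list)
    pending_dn = None
    for line in text.splitlines():
        dn_match = DN_RE.search(line)
        if dn_match:
            pending_dn = dn_match.group(1)
            continue
        msg_match = MSG_RE.search(line)
        if msg_match and pending_dn is not None:
            grouped[msg_match.group(1).strip()].append(pending_dn)
            pending_dn = None
    return dict(grouped)


for message, dns in group_errors(SAMPLE).items():
    print(message, dns)
```

Grouping this way makes it easier to see that the master and the slaves report different errors, which is worth keeping apart when reading the logs.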
If I am missing something here, please let me know. For now I am reopening the ticket for reference.
comment:4 Changed 22 months ago by cperez@…
Hi,
Have you applied all my patches in that branch? I left a package in my public dir:
http://people.zentyal.org/~exekias/zentyal-users_2.1.7_all.deb
If you want, you can try it. First reinstall the users module with:
/usr/share/zentyal-users/reinstall (this will remove all your data, so do not do this in production)
Then install the downloaded package with:
dpkg -i zentyal-users_2.1.7_all.deb
and configure and enable the module as usual.
comment:5 Changed 22 months ago by r.hoelzemer@…
OK. I knew I was missing something. Sorry, I assumed the fix was the above changeset alone.
I just did another test with the above-mentioned package and the error is still the same.
Here's what I did:
- went back to a clean install on all three machines
- uninstalled apparmor everywhere
- installed the necessary modules plus the new zentyal-users_2.1.7 package on all machines
- ran /usr/share/zentyal-users/reinstall on all machines
- made sure the master has ldap ports enabled
- made sure master has only the users module installed
- made sure slapd is running everywhere
Still no go, unfortunately.
comment:6 Changed 22 months ago by cperez@…
Sorry, you did this:
- installed the necessary modules plus the new zentyal-users_2.1.7 package on all machines
- ran /usr/share/zentyal-users/reinstall on all machines
The problem is that the reinstall script reinstalls the zentyal-users module from the official repository, so you should install my package with dpkg after running that script :)
comment:7 Changed 22 months ago by r.hoelzemer@…
Hmmm, I am pretty sure I checked that version 2.1.7 was installed prior to configuring master/slaves. After installing with dpkg, wouldn't the new package be in the apt cache and picked up by a reinstall/upgrade anyway?
Ok. I'll do another test. :)
comment:8 Changed 22 months ago by r.hoelzemer@…
Nope! Out of interest, I tried both scenarios: first dpkg -i zentyal-users_2.1.7, then /usr/share/zentyal-users/reinstall, and vice versa. Both times the installed version of zentyal-users is 2.1.7, and both scenarios give the exact same error as before.
I did, however, find something unusual. After some investigation, I decided to activate the firewall logs and discovered that on the master, incoming packets on port 389 are dropped. I then double-checked the firewall settings: everything is fine there.
iptables -L confirms that the port is open:
...
Chain iglobal (1 references)
target prot opt source destination
ACCEPT tcp -- anywhere anywhere tcp dpt:ldap state NEW
drop tcp -- anywhere anywhere tcp dpt:6677 state NEW
ACCEPT udp -- anywhere anywhere udp dpt:ntp state NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:ssh state NEW
ACCEPT tcp -- anywhere anywhere tcp dpt:https state NEW
...
That's OK, I guess. Then why are ldap packets dropped by the master? Even syslog confirms the drop:
Aug 2 19:37:37 ldap kernel: [ 7090.392991] ebox-firewall drop IN=eth0 OUT= MAC=08:00:27:6e:c5:9a:08:00:27:6b:e7:27:08:00 SRC=10.0.0.1 DST=10.0.0.5 LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=34268 DPT=389 WINDOW=0 RES=0x00 RST URGP=0 MARK=0x1
My suspicion was that those packets are dropped by the global input INVALID filter, which itself goes into the drop chain and creates that message in syslog.
-A INPUT -m state --state INVALID -j idrop
...
-A drop -m limit --limit 50/min --limit-burst 10 -j LOG --log-prefix "ebox-firewall drop " --log-level 7
-A drop -j DROP
So to distinguish an invalid packet drop from other drop events in the log, I made a copy of the drop chain with a custom "invalid" message just for that first filter.
-A INPUT -m state --state INVALID -j iinvalid
...
-A invalid -m limit --limit 50/min --limit-burst 10 -j LOG --log-prefix "ebox-firewall invalid " --log-level 7
-A invalid -j DROP
Here's the result:
Aug 2 19:46:32 ldap kernel: [ 7625.351966] ebox-firewall invalid IN=eth0 OUT= MAC=08:00:27:6e:c5:9a:08:00:27:6b:e7:27:08:00 SRC=10.0.0.1 DST=10.0.0.5 LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=51787 DPT=389 WINDOW=0 RES=0x00 RST URGP=0 MARK=0x1
So it seems that ldap packets that arrive at the master are INVALID and therefore dropped by the firewall. :) This is basically as far as I could get for now. Hopefully this helps with further investigation of the issue.
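To make such log lines easier to compare, here is a small sketch that splits a netfilter LOG line into its key=value fields and its bare flags. The helper name parse_netfilter_line is made up for illustration, and the line is the one quoted above. Going by that line alone, the logged packet is a bare TCP RST with WINDOW=0 addressed to port 389; whether that is related to the replication failure is not something the log line can tell us.

```python
# The "ebox-firewall invalid" line quoted in this comment.
LOG_LINE = (
    "Aug 2 19:46:32 ldap kernel: [ 7625.351966] ebox-firewall invalid "
    "IN=eth0 OUT= MAC=08:00:27:6e:c5:9a:08:00:27:6b:e7:27:08:00 "
    "SRC=10.0.0.1 DST=10.0.0.5 LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF "
    "PROTO=TCP SPT=51787 DPT=389 WINDOW=0 RES=0x00 RST URGP=0 MARK=0x1"
)


def parse_netfilter_line(line, prefix):
    """Return (fields, flags) for the part of the line after the log prefix.

    fields holds the KEY=value tokens; flags holds bare tokens such as DF or RST.
    """
    _, _, payload = line.partition(prefix)
    fields, flags = {}, []
    for token in payload.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
        else:
            flags.append(token)
    return fields, flags


fields, flags = parse_netfilter_line(LOG_LINE, "ebox-firewall invalid ")
print(fields["SRC"], "->", fields["DST"], "port", fields["DPT"], "flags", flags)
```

Running the same helper with the prefix "ebox-firewall drop " on the earlier syslog line gives the same kind of breakdown, so the two entries can be compared field by field.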
comment:9 Changed 22 months ago by cperez@…
Uhm, that's weird.
Just a question: if you disable the firewall, does the master-slave setup work?
If the answer is yes, we should close this ticket and work on the firewall issue (in this or another one, actually).
comment:10 Changed 22 months ago by r.hoelzemer@…
Just tested this with the firewall disabled on all machines. Still the same error.
comment:11 Changed 22 months ago by jsalamero@…
Some things to check:
- the NTP module is installed on all machines; check with date that the time is synchronized
- from the master you can resolve the slaves' hostnames, and you can even connect to their SOAP port (the same as the Zentyal interface) using openssl s_client -connect slave:443
- from the slaves you can connect to the master's LDAP port using ldapsearch -h master -b 'dc=foo,dc=bar' -x -w
comment:12 Changed 22 months ago by r.hoelzemer@…
Yes, I made sure time is synchronized on all machines by installing the virtualbox guest additions plus the NTP module. Also checked before every action that time is the same everywhere.
The whole network is resolvable by IP, name and FQDN. I can connect from the slaves to the master with openssl s_client -connect ldap:443. No problem here.
ldapsearch fails with
ldap_bind: Invalid credentials (49)
I assume the password is the one displayed in the web ui? Made sure the password was correct. Also tried to read it from file with the "-y" switch:
PING ldap.example.de (10.0.0.5) 56(84) bytes of data.
64 bytes from ldap.example.de (10.0.0.5): icmp_seq=1 ttl=64 time=0.485 ms
64 bytes from ldap.example.de (10.0.0.5): icmp_seq=2 ttl=64 time=0.669 ms
64 bytes from ldap.example.de (10.0.0.5): icmp_seq=3 ttl=64 time=0.489 ms
ldapsearch -h ldap.example.de -b 'dc=example,dc=de' -x -w *master_password*
ldap_bind: Invalid credentials (49)
ldapsearch -h ldap.example.de -b 'dc=example,dc=de' -x -y /var/lib/zentyal/conf/ebox-ldap.passwd
Warning: Password file /var/lib/zentyal/conf/ebox-ldap.passwd is publicly readable/writeable
ldap_bind: Invalid credentials (49)
comment:13 Changed 22 months ago by cperez@…
comment:14 Changed 22 months ago by r.hoelzemer@…
Yes! After adding a new user at the master, I get many entries like this on the slaves:
Aug 3 17:07:28 gateway slapd[9021]: syncrepl_message_to_entry: rid=110 mods check (objectClass: value #3 invalid per syntax)
Aug 3 17:07:28 gateway slapd[9021]: do_syncrepl: rid=110 rc 21 retrying (4 retries left)
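As a reading aid, the sketch below pulls the rid and rc out of do_syncrepl retry lines and translates the rc into a name. It assumes that the rc printed by slapd here is a standard LDAP result code; under that assumption, 21 is invalidAttributeSyntax in RFC 4511, which matches the "invalid per syntax" message on the line before it. The table is only a small subset of the RFC 4511 codes, and the function name syncrepl_failures is made up for illustration.

```python
import re

# Small subset of the LDAP result codes defined in RFC 4511.
LDAP_RESULT_CODES = {
    0: "success",
    10: "referral",
    21: "invalidAttributeSyntax",
    32: "noSuchObject",
    49: "invalidCredentials",
}

# The two slapd lines quoted in this comment.
SAMPLE = """\
Aug 3 17:07:28 gateway slapd[9021]: syncrepl_message_to_entry: rid=110 mods check (objectClass: value #3 invalid per syntax)
Aug 3 17:07:28 gateway slapd[9021]: do_syncrepl: rid=110 rc 21 retrying (4 retries left)
"""

# Assumed format of the retry line, taken from the sample above.
RETRY_RE = re.compile(r"do_syncrepl: rid=(\d+) rc (-?\d+) retrying")


def syncrepl_failures(text):
    """Return a list of (rid, rc, name) tuples for every do_syncrepl retry line."""
    failures = []
    for line in text.splitlines():
        match = RETRY_RE.search(line)
        if match:
            rc = int(match.group(2))
            failures.append((match.group(1), rc, LDAP_RESULT_CODES.get(rc, "unknown")))
    return failures


for rid, rc, name in syncrepl_failures(SAMPLE):
    print("rid", rid, "rc", rc, name)
```

The other codes in the table correspond to the messages seen earlier in this ticket ("Referral received", "No such object", "Invalid credentials"), which is why they are included.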
comment:15 Changed 22 months ago by cperez@…
Ok,
Now I know where the problem is: low-level replication is not working properly, almost certainly because of schema differences between master and slave.
I'm going to work on fixing it! Thank you for your patience and effort :)
Will keep this ticket updated
comment:16 Changed 22 months ago by r.hoelzemer@…
Ahhh, finally some progress!
Thank you as well for investigating and for putting up with my never-ending poking sessions. :)
I am looking forward to a bugfix.
comment:17 Changed 22 months ago by cperez@…
- Status changed from reopened to closed
- Resolution set to fixed
comment:18 Changed 22 months ago by cperez@…
Thank you very much!
This last commit truly fixes the problem. The quota schemas were moved to the users module (from samba), and that caused all this mess.
You will need to apply the patch to the package and reinstall the slaves (remember /usr/share/zentyal-users/reinstall).
comment:19 Changed 22 months ago by cperez@…
I updated the package if you want to use mine:
http://people.zentyal.org/~exekias/zentyal-users_2.1.7_all.deb
comment:20 Changed 22 months ago by r.hoelzemer@…
Fix confirmed!
Thanks again and have a nice day :)
