As part of the migration of my MacOS Server to Linux, the next service to migrate is my PostgreSQL engine. Although PostgreSQL had already been hidden in MacOS Server for some time, it was still included, as internal services like ProfileManager and the Calendar and Addressbook Server depend on it. Even though it was hidden, I had enabled it manually and hosted my PostgreSQL databases on my MacOS Server for ages. Although migrations were sometimes a pain (i.e. not automatic), this has worked well so far, including integrating it with the MacOS Server way of using transaction logs for offline backups (so I will also have to look for a new way to do this). Continue reading “MacOS Server Replacement #2 – Migrating PostgreSQL”
Today I noticed that my phone could no longer create any new calendar items. In Server.app I saw that the Calendar (and AddressBook) services were no longer running, and when I checked their status it took forever for the panel to load. Enabling the service again also took forever, and in the end it did not start (unfortunately without any error message).
After some digging I found that the PostgreSQL server that the Apple CalDAV service uses internally was no longer running and had issues starting. In the log files in /var/log/caldavd/postgresql/ I found messages like:
2015-03-14 12:59:33.665 CET  LOG: unexpected pageaddr 0/5DC82000 in log segment 000000010000000000000061, offset 13115392
2015-03-14 12:59:33.665 CET  LOG: invalid primary checkpoint record
2015-03-14 12:59:33.679 CET  LOG: unexpected pageaddr 0/5DC7C000 in log segment 000000010000000000000061, offset 13090816
2015-03-14 12:59:33.679 CET  LOG: invalid secondary checkpoint record
2015-03-14 12:59:33.679 CET  PANIC: could not locate a valid checkpoint record
I suspect these were caused by a crash a few days ago of my NAS, which serves the iSCSI disks where the postgres data is stored. I spent a lot of time today looking for a solution (including trying to restore a backup and setting it up from scratch, all of which failed). In the end I found a clue in the manual page of pg_resetxlog:
pg_resetxlog clears the write-ahead log (WAL) and optionally resets some other control information stored in the pg_control file. This function is sometimes needed if these files have become corrupted. It should be used only as a last resort, when the server will not start due to such corruption.
This pretty closely matched my situation, so (after making a backup of the DB folder) I executed the following command in the folder where Server.app stores its data (by default that is /Library/Server/Calendar and Contacts, but in my case that’s /Volumes/Data/Library/Server/Calendar and Contacts, as I store all data on a RAID5 container on my NAS):
sudo -u _calendar pg_resetxlog -f Data/Database.xpg/cluster.pg/
After running this command the PostgreSQL for Services started again and my Calendar (and AddressBook) services were running again. So far it looks like I did not lose any data apart from a calendar entry that I had added on my Macbook in iCal. I am glad it is resolved, but I have to look into how backups are made so that the next time I at least know that I can get my calendar and contacts back…
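For anyone repeating this, the "make a backup of the DB folder first" step mentioned above could be sketched roughly as follows. This is only an illustration, not the exact procedure used here: backup_cluster is a hypothetical helper, the backup location is an assumption, and it presumes that PostgreSQL is stopped and that the shell runs as a user who is allowed to read the cluster directory.

```shell
#!/bin/sh
# Sketch only: copy the cluster directory aside before touching the WAL.
# backup_cluster is a hypothetical helper name, not part of Server.app
# or PostgreSQL.

backup_cluster() {
    cluster_dir=$1   # e.g. Data/Database.xpg/cluster.pg
    backup_root=$2   # assumption: any location with enough free space

    if [ ! -d "$cluster_dir" ]; then
        echo "cluster directory not found: $cluster_dir" >&2
        return 1
    fi

    # Timestamped destination so repeated runs do not overwrite each other
    dest="$backup_root/cluster.pg.$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$backup_root" || return 1

    # -R copies recursively, -p preserves modes and timestamps
    # (and ownership where the invoking user is permitted to)
    cp -Rp "$cluster_dir" "$dest" || return 1

    # Print the backup location so the caller can record it
    echo "$dest"
}
```

With the function loaded, something like backup_cluster Data/Database.xpg/cluster.pg /path/to/backups (the second path being a placeholder) would print the location of the copy, and only then would the pg_resetxlog command shown earlier be run. A file-level copy like this only lets you return to the state before the reset; it does not repair the corrupted WAL.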