This post is a technical one. I am posting for my own notes and hopefully to assist anyone else who might run into the same issue.
SharePoint 2013 farms can be configured for high availability using the new SQL Server 2012 Always-On Availability Groups feature. Initially Microsoft had very limited support for databases provisioned in an Availability Group; since SharePoint 2013 SP1, most of the service application databases are supported in both Async-Commit and Sync-Commit modes. Sync-Commit mode has the broadest support for SharePoint 2013 databases. However, a problem can occur when running the SharePoint Products Configuration Wizard after deploying CUs or security patches: an unhandled exception is thrown near the end of the wizard (Step 8 of 10 or Step 9 of 10). As a best practice, you should always monitor the ULS logs while a patching process is executing. After reviewing the ULS logs, an error such as this might be found:
“… Timer job is exiting due to exception: System.Data.SqlClient.SqlException (0x80131904): The operation cannot be performed on database “WSS_Logging” because it is involved in a database mirroring session or an availability group. Some operations are not allowed on a database that is participating in a database mirroring session or in an availability group. ALTER DATABASE statement failed. at ….”
The database named in the error will vary depending on the name of your Usage Logging database.
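Since the database name varies from farm to farm, it can help to pull it out of the logs automatically. Below is a minimal Python sketch that scans log lines for the error text quoted above and returns the database names it finds. The function name `find_ag_blocked_databases` is my own and not part of any SharePoint tooling, and the pattern is based only on the message shown in this post, so adjust it if your logs word the error differently.

```python
import re
from typing import Iterable, List

# Pattern built from the SQL error text quoted in this post. It accepts
# straight or curly double quotes (or none) around the database name.
AG_ERROR = re.compile(
    r'cannot be performed on database ["“”]?(?P<db>[^"“”]+?)["“”]? '
    r'because it is involved in a database mirroring session or an availability group'
)


def find_ag_blocked_databases(lines: Iterable[str]) -> List[str]:
    """Return the distinct database names named in the error, in first-seen order."""
    found: List[str] = []
    for line in lines:
        match = AG_ERROR.search(line)
        if match and match.group("db") not in found:
            found.append(match.group("db"))
    return found
```

To use it, open a ULS log file as text and pass the file object straight in, for example `find_ag_blocked_databases(open(path, errors="replace"))`, where `path` is wherever your farm writes its ULS logs.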
Cause of Problem
During patching, a range of things happen when the SharePoint Products Configuration Wizard executes. The issue with databases that live in an Always-On Availability Group is that, as the error message says, some ALTER DATABASE operations are not allowed against them. To run such an ALTER DATABASE statement, the database in question needs to be removed from the Availability Group first. In this particular case, the SharePoint Usage Logging database requires this each time the Configuration Wizard runs an upgrade process.
Database Supported in Always-On Availability Groups from Microsoft
Solution for this Issue
The solution for this particular scenario is to remove the Usage Logging database from the Availability Group while you are patching. You can do this safely, assuming that all your service applications are configured to point to the Availability Group listener, so that you can continue to access the databases while patching. The other important points are that you must perform the removal on the primary replica, and that you take a backup of the database on the primary replica just in case something goes wrong.
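As a rough illustration of that step, here is a small Python sketch that builds the T-SQL you would run on the primary replica: a backup followed by the removal from the Availability Group. The helper names, the example group name and the backup path are placeholders of mine, and `COPY_ONLY` is my own choice to avoid disturbing an existing backup chain, so drop it if that does not suit your backup strategy. The script only produces the statements as strings; review them and run them yourself in SSMS or sqlcmd.

```python
from typing import List


def quote_name(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any ']' inside it."""
    return "[" + name.replace("]", "]]") + "]"


def pre_patch_statements(ag_name: str, db_name: str, backup_path: str) -> List[str]:
    """Build the T-SQL to run on the PRIMARY replica before patching."""
    safe_path = backup_path.replace("'", "''")  # escape quotes in the path literal
    return [
        # Safety backup first (COPY_ONLY is an assumption, see the note above).
        f"BACKUP DATABASE {quote_name(db_name)} "
        f"TO DISK = N'{safe_path}' WITH COPY_ONLY;",
        # Then take the database out of the Availability Group.
        f"ALTER AVAILABILITY GROUP {quote_name(ag_name)} "
        f"REMOVE DATABASE {quote_name(db_name)};",
    ]


# Example with hypothetical names: print the statements for review.
for statement in pre_patch_statements("SP-AG", "WSS_Logging", r"D:\Backup\WSS_Logging.bak"):
    print(statement)
```

Keep in mind that once the database is removed on the primary, the copies on the secondary replicas no longer receive changes.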
Once patching has completed successfully across all nodes in the farm, you must then add the Usage Logging database back to the Availability Group. Microsoft still does not recommend making this database a member of an AO-AG. However, assuming you understand what is happening and can work around issues such as this one, it can be considered safe enough to have it be a member of the group.
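For the re-add, a minimal Python sketch of the T-SQL involved might look like the following. The function and the example names are hypothetical. It also assumes that each secondary copy of the database has already been restored from a fresh backup of the primary using `WITH NORECOVERY`; that restore step is not shown here, and without it the join on the secondary will fail.

```python
from typing import Dict, List


def quote_name(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any ']' inside it."""
    return "[" + name.replace("]", "]]") + "]"


def rejoin_statements(ag_name: str, db_name: str) -> Dict[str, List[str]]:
    """Build the T-SQL for re-adding a database, keyed by where to run it."""
    ag, db = quote_name(ag_name), quote_name(db_name)
    return {
        # Run on the primary replica once patching is complete.
        "primary": [f"ALTER AVAILABILITY GROUP {ag} ADD DATABASE {db};"],
        # Run on each secondary after restoring the database WITH NORECOVERY.
        "secondary": [f"ALTER DATABASE {db} SET HADR AVAILABILITY GROUP = {ag};"],
    }


# Example with hypothetical names: print the statements grouped by replica role.
for role, statements in rejoin_statements("SP-AG", "WSS_Logging").items():
    for statement in statements:
        print(f"{role}: {statement}")
```

The same can be done through the Add Database wizard in SQL Server Management Studio if you prefer not to script it.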
Tips and Notes
- The issue described in this post might not be the root cause of your problem; the Configuration Wizard can fail for a number of reasons. It’s always advisable to carefully study and monitor the ULS logs to fully understand what might be causing your upgrade or patching issue.
- Other common patching issues can be related to permissions. Remember to execute the Configuration Wizard with an account that has farm-level administration access and is also a local administrator of the node that you’re patching.
- My tip, and what I generally do, is to use my SP-Farm login and temporarily add it to the local administrators group on the SQL instances and SharePoint nodes while patching, then remove it once complete. I have it set up so it’s just a matter of group changes in Active Directory, and after deploying the patch files I use the Run As option to execute the Configuration Wizard.
- Reference Slides on SharePoint Business Continuity Management with SQL Server Always On can be found here – http://channel9.msdn.com/Events/SharePoint-Conference/2014/SPC343