Exchange 2007 CCR Failover Process


Before failing over in a CCR environment, verify that replication is in a healthy state. If it is not, the clustered mailbox server will still fail over, but the databases will fail to mount.

 

To check the state of the replication, open the Exchange 2007 Management Console and browse to Server Configuration > Mailbox. The window should show the Storage Groups and Databases in a Mounted state with a Copy Status of Healthy. If either condition is not met, troubleshoot why the database is not mounted, or why replication is not healthy, before going any further.

 

The cmdlet that is used to review replication status is:

Get-StorageGroupCopyStatus (pipe the output to FL, as in Get-StorageGroupCopyStatus | FL, to get full details)
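
As a minimal sketch of that check from the Exchange Management Shell, the following lists every storage group copy and then only the ones that are not Healthy. The server name Exchange2K7 is a placeholder, and the columns shown are just a convenient selection of the properties the cmdlet returns.

```powershell
# Placeholder clustered mailbox server name; substitute your own.
$cms = 'Exchange2K7'

# Full details for every storage group copy on the server.
Get-StorageGroupCopyStatus -Server $cms | Format-List

# Only the copies whose summary status is something other than Healthy.
Get-StorageGroupCopyStatus -Server $cms |
    Where-Object { $_.SummaryCopyStatus -ne 'Healthy' } |
    Format-Table Identity, SummaryCopyStatus, CopyQueueLength, ReplayQueueLength -AutoSize
```

If the second command returns nothing, every copy is reporting Healthy.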

 

Possible CCR States

The possible CCR states include:

·         Not Supported – The current configuration does not support local continuous replication.

·         Disabled – Storage group does not have a configured copy. There is no passive node configured for this clustered mailbox server.

·         Failed – Verification failed, or the storage group is only partially configured for CCR.

·         Seeding – Full database seeding is in progress.

·         Stopped – Transaction log copying is stopped.

·         Suspended – Transaction log copying and replay is stopped.

·         Healthy – The CCR copy is healthy and normal, and nothing is blocking or blocked.

 

Failed

A Failed state indicates that CCR is currently broken and needs to be resynced. The section "Recovering from failed CCR Logging (Reseeding the data)" walks you through the process of re-initializing CCR.

 

Seeding

Seeding is the process of making available a baseline copy of a database on the current passive node. Depending on the situation, seeding can be an automatic process or a manual process in which you initiate the seeding. You can use the procedure in situations where you determine seeding is required. The size of the database being copied directly correlates to the amount of time it takes for the seeding task to complete.

 

Seeding is required under the following conditions:

·         When a new passive node is introduced into a cluster continuous replication (CCR) configuration, and the first log file of the production storage group is not available.

·         After a failover occurs in which data is lost as a result of the now passive node having become diverged and unrecoverable.

·         When the system has detected a corrupted log file that cannot be replayed into the database copy.

·         After an offline defragmentation of the production database occurs.

·         After a page scrubbing of a database on the active node occurs, and you want to propagate the changes to the passive node.

·         After the log generation sequence for the storage group has been reset back to 1.

 

You can perform seeding in Microsoft Exchange Server 2007 by using the following methods:

·         Automatic seeding – An automatic seed produces a copy of a storage group’s database on the target. Automatic seeding requires that the first log file (log1) be available on the source. Automatic seeding only occurs during the creation of a new server, creation of a new storage group and database, or on a database that has never been backed up.

·         Seeding using the Update-StorageGroupCopy cmdlet – You can use the Update-StorageGroupCopy cmdlet in the Exchange Management Shell to seed a storage group copy.

·         Manually copying the offline database – This process dismounts the database and copies the database file to the same location on the passive node. If you use this method, there will be an interruption in service because the procedure requires you to dismount the database.

 

This information was copied from Microsoft’s Technet Site. For more details please follow this link:

http://technet.microsoft.com/en-us/library/bb124706.aspx

 

Healthy

This is the state that is most desirable. This indicates that replication is in place and functioning as expected.

 

CCR Failover Process

If the database is mounted and the copy status is Healthy, you can proceed to fail over from one node to the other using the following cmdlet:

Move-ClusteredMailboxServer -Identity SERVER -TargetMachine TARGETSERVERNODE

Here SERVER is the name of the clustered mailbox server (e.g. Exchange2K7) and TARGETSERVERNODE is the node it should move to (e.g. P3EX7M1).

You will be prompted to enter move comments and to confirm the move. Enter any relevant details for the move comments, then enter ‘A’ (Yes to All) to confirm.

Note: Although it is possible to fail over a CCR cluster with the GUI-based cluster manager, Microsoft does not recommend it. If you need to move the clustered mailbox server while a database is dismounted because replication is not functioning properly, add the -IgnoreDismounted switch to the Move-ClusteredMailboxServer cmdlet.

You can watch the failover take place either in Cluster Manager or in the Exchange Management Console. Once it is complete, go to the Exchange Management Console to verify that the databases mounted properly and that the Copy Status is Healthy.
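
The check-then-move sequence described in this section could be sketched as follows. Exchange2K7 and P3EX7M1 are the example names used above, and the move comment text is purely illustrative. Passing -MoveComment on the command line is assumed to take the place of the interactive comment prompt; the confirmation prompt still appears.

```powershell
# Placeholder names taken from the example above; substitute your own.
$cms    = 'Exchange2K7'   # clustered mailbox server
$target = 'P3EX7M1'       # node that should become active

# Collect any storage group copies that are not Healthy.
$unhealthy = @(Get-StorageGroupCopyStatus -Server $cms |
    Where-Object { $_.SummaryCopyStatus -ne 'Healthy' })

if ($unhealthy.Count -gt 0) {
    # Show what is wrong and stop; do not attempt the move.
    $unhealthy | Format-Table Identity, SummaryCopyStatus -AutoSize
    Write-Warning 'Replication is not healthy; resolve this before failing over.'
}
else {
    # Hand the clustered mailbox server over to the target node.
    Move-ClusteredMailboxServer -Identity $cms -TargetMachine $target -MoveComment 'Planned maintenance'

    # Confirm which node now owns the clustered mailbox server.
    Get-ClusteredMailboxServerStatus -Identity $cms | Format-List
}
```

This sketch only checks the copy status; the mounted state of the databases still needs to be confirmed in the Exchange Management Console as described above.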

 

Recovering from failed CCR Logging (Reseeding the data)

If the Copy Status is in a Failed state, failover to the alternate node will not work as expected: the failover will occur, but the database will not mount. CCR must be reestablished for the database to mount successfully without data loss. To reestablish CCR logging, re-seed the failed Storage Group using the following process:

 

1.     On the passive node of the cluster, open the Exchange Management Shell and type:
Suspend-StorageGroupCopy -Identity 'EXCHANGECLUSTERNAME\STORAGEGROUPNAME'

2.     Browse to the location of the logs and the database on the passive node and back up both of those directories by moving the files to a location outside of them.

3.     On the passive node of the cluster, open the Exchange Management Shell and type:
Update-StorageGroupCopy -Identity 'EXCHANGECLUSTERNAME\STORAGEGROUPNAME'
Note: This process can take several hours. In our case, it took about 6 hours for Exchange to fully replicate a 22GB database.

4.     Once the process is complete, go into the Exchange Management Shell and type:
Resume-StorageGroupCopy -Identity 'EXCHANGECLUSTERNAME\STORAGEGROUPNAME'

5.     At that point, you should be able to go into the Exchange Management Console and verify that CCR is healthy, working correctly, and capable of a proper failover from one node to another.
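
Run from the passive node, the steps above could be strung together roughly like this. The identity is the same placeholder used in the steps, the suspend comment is illustrative, and the file move in step 2 is left as a manual action because the log and database paths are specific to each installation.

```powershell
# Placeholder identity in the form used by the steps above.
$sg = 'EXCHANGECLUSTERNAME\STORAGEGROUPNAME'

# Step 1: stop log copying and replay for the storage group copy.
Suspend-StorageGroupCopy -Identity $sg -SuspendComment 'Reseeding passive copy'

# Step 2 (manual): move the existing log and database files on the
# passive node to a backup location before continuing.

# Step 3: seed a fresh copy of the database from the active node.
Update-StorageGroupCopy -Identity $sg

# Step 4: resume replication once seeding has finished.
Resume-StorageGroupCopy -Identity $sg

# Step 5: confirm the copy now reports Healthy.
Get-StorageGroupCopyStatus -Identity $sg | Format-List
```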

 

Verifying CCR Copy Using ESEUtil

If you want to verify that logs and database are in a consistent state, ESEUtil can be used to check the replicated logs on the passive server. This process requires halting the CCR Copy and restarting it once the process is complete.

1.     On the passive node of the cluster, open the Exchange Management Shell and type:
Suspend-StorageGroupCopy -Identity 'EXCHANGECLUSTERNAME\STORAGEGROUPNAME'

2.     Open a cmd prompt and navigate to the bin directory (where eseutil.exe resides)

3.     To verify the state of the log files: eseutil /k <LogFilePrefix>

4.     To verify the state of the Database file: eseutil /k <Path and filename of database file>

5.     If ESEUtil does not report any problems, resume replication by running the following cmdlet: Resume-StorageGroupCopy -Identity 'EXCHANGECLUSTERNAME\STORAGEGROUPNAME'

6.     If ESEUtil reports errors (on the passive node only, not the active node), the database should be re-seeded from the active node. Instructions for this are provided in the previous section.

 

Summary

 

When working with CCR logging it is important to verify that data is in sync and replicating properly. Always verify the status of replication prior to performing any failover activity.

 

Interesting Technet links:

1.     How to halt CCR Copy:
http://technet.microsoft.com/en-us/library/aa998041.aspx

2.     How to restart Replication on a CCR Copy:
http://technet.microsoft.com/en-us/library/bb123565.aspx

3.     How to restore after Database Corruption:
http://technet.microsoft.com/en-us/library/bb124499.aspx

4.     How to view Failover CCR:
http://technet.microsoft.com/en-us/library/bb124499.aspx

5.     How to move the CCR server between nodes:
http://technet.microsoft.com/en-us/library/aa998282.aspx

6.     General CCR Details:
http://technet.microsoft.com/en-us/library/aa997676.aspx

 

 

Provided by Alan Watt
