After the server was restarted, all the replicats we had set up showed a status of "Starting...", but none of them was actually doing anything.
Attempting to stop them produced the following error:
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT STARTING REPLICAT1 00:00:00 00:35:16
REPLICAT STARTING REPLICAT2 00:00:00 00:35:08
GGSCI (serv7) 8> stop r*
Sending STOP request to REPLICAT REPLICAT1 ...
ERROR: opening port for REPLICAT REPLICAT1 (Connection refused).
Sending STOP request to REPLICAT REPLICAT2 ...
ERROR: opening port for REPLICAT REPLICAT2 (Connection refused).
Stopping and starting the manager service didn't help either: the replicats still showed "Starting" and were unresponsive. Even before I attempted to start a replicat for the first time, its status was "Starting", and an attempt to start it gave me "ERROR: REPLICAT REPLICAT2 is already running.".
The cause was the replicat process status file, located in the DIRPCS folder under the GoldenGate home. There should be a file for each replicat that's currently running, giving details about its status. When a replicat stops, this file is deleted. None of the current replicats was doing anything (they were all sitting at the end of the previous trail file), so they should have been in a stopped state, yet their PCR files were still there. I renamed the PCR files for the affected replicat processes, after which the manager reported them as "ABENDED". At that point, I was able to start up each replicat without issue.
$ ls -lrt
total 12
-rwxr----- 1 dba oracle 66 May 29 16:49 REPLICAT1.pcr
-rwxr----- 1 dba oracle 66 May 29 16:50 REPLICAT2.pcr
-rwxr----- 1 dba oracle 56 May 29 16:57 MGR.pcm
prddb1:serv7:prddb1:(392) /dev/prddb1/ggs/12.1.2.1.0/dirpcs
$ mv REPLICAT1.pcr REPLICAT1.pcr.old
prddb1@PRD:serv7:prddb1:(397) /dev/prddb1/ggs/12.1.2.1.0/dirpcs
$ mv REPLICAT2.pcr REPLICAT2.pcr.old
GGSCI (serv7) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT ABENDED REPLICAT1 00:00:00 00:38:55
REPLICAT ABENDED REPLICAT2 00:00:00 00:38:47
GGSCI (serv7) 2> start R*
Sending START request to MANAGER ...
REPLICAT REPLICAT1 starting
Sending START request to MANAGER ...
REPLICAT REPLICAT2 starting
GGSCI (serv7) 3> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING REPLICAT1 00:00:00 00:44:48
REPLICAT RUNNING REPLICAT2 00:00:00 00:44:40
GGSCI (serv7) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING REPLICAT1 00:06:56 00:00:00
REPLICAT RUNNING REPLICAT2 00:00:02 00:00:08
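If there are more than a couple of replicats, the rename step above can be scripted. The following is only a rough sketch of that idea, not a GoldenGate utility: rename_stale_pcr is a made-up helper name, and it assumes you have already confirmed that the replicat processes really are gone, because it does not check that itself. It skips groups that have no PCR file and refuses to overwrite an existing .old copy.

```shell
#!/bin/sh
# Rename stale replicat process status files, mirroring the manual
# "mv GROUP.pcr GROUP.pcr.old" commands used in this post.
# ASSUMPTION: the replicat processes are confirmed not to be running;
# this sketch does not verify that.
# rename_stale_pcr is a hypothetical helper name, not part of GoldenGate.
rename_stale_pcr() {
    dirpcs=$1          # path to the dirpcs directory
    shift              # remaining arguments are replicat group names
    for group in "$@"; do
        pcr="$dirpcs/$group.pcr"
        if [ ! -f "$pcr" ]; then
            echo "skip: $pcr not found"
            continue
        fi
        if [ -e "$pcr.old" ]; then
            # Never clobber an earlier backup copy.
            echo "skip: $pcr.old already exists"
            continue
        fi
        mv "$pcr" "$pcr.old" && echo "renamed: $pcr -> $pcr.old"
    done
}
```

With the paths from this post, it would be called as: rename_stale_pcr /dev/prddb1/ggs/12.1.2.1.0/dirpcs REPLICAT1 REPLICAT2. Renaming rather than deleting keeps the old files around in case you want to look at them afterwards.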