A Sparx DBMS to DBMS Project Transfer (Oracle to SQL Server) was previously successful in our DEVELOPMENT environment, but a recent attempt to migrate PRODUCTION Oracle to PRODUCTION SQL Server has failed.
The project transfer failed between migrating table t_diagramlinks and table t_diagramobjects. Details below.
I suspect that SQL Server was busy building indexes for table t_diagramlinks in SQL Server, then Oracle got tired of waiting for next SQL query on table t_diagramobjects, so Oracle timed out with “ORA-03113: end of file on communication channel”. One of the most frequently reported causes of an ORA-03113 results from the presence of a firewall. This can end the Oracle connection after a certain period of time.
Here are my questions.
1. Would you recommend utilizing a sqlnet.expire_time parameter? See explanation below.
2. After target SQL Server table t_diagramlinks has the same record count as the source Oracle table t_diagramlinks, why does it take a long time before any records start to appear in target SQL Server table t_diagramobjects?
3. After target SQL Server table t_diagramlinks has the same record count as the source Oracle table t_diagramlinks, what is happening during that period before any records appear in target SQL Server table t_diagramobjects?
4. After target SQL Server table t_diagramlinks has the same record count as the source Oracle table t_diagramlinks, does the project Transfer process stop talking to the source DB (Oracle) for a period of time before retrieving t_diagramobjects records from the source DB (Oracle)?
Details of Project Transfer
I launched a Sparx Project Transfer (Oracle to SQL Server) on Thursday 22nd August. I monitored progress by counting the records in the target DB and comparing with records in the source DB. The following record counts illustrate the size of the migration task, by listing just some of the major tables in Oracle. On Monday 23rd August at 18:00 (5 days & 4 hours since the launch) the following progress was noted.
· t_xref (181,891 > 181,891) fully migrated
· t_authors (270 > 270) fully migrated
· t_stereotypes (776 > 776) fully migrated
· t_object (119, 111 > 119,111) fully migrated
· t_package (6,821 > 6,821) fully migrated
· t_objectproperties (1,180,808 > 1,180,808) fully migrated
· t_operation (10,774 > 10,774) fully migrated
· t_attribute (20,959 > 20,959) fully migrated
· t_objectscenarios (3,350 > 3,350) fully migrated
· t_connector (127,944 > 127,944) fully migrated
· t_diagramlinks (78,764 > 78,764) fully migrated
At this progress rate, it was estimated that migration will finish in the evening of Monday 26th August.
After t_diagramlinks was fully migrated, there was a long pause. Nothing seemed to be happening. Maybe new indexes were being built in t_diagramlinks in SQL Server. Then, before migrating any records to t_diagramobjects, the process crashed.
The following errors were displayed
· OraOLEDB [-2147467259]
· ORA-03113: end of file on communication channel
As instructed by a Sparx EA message, I pressed OK to continue table upgrade, but the session ended with message 'Migration Complete'. No more data was migrated.
In Oracle SQL Developer I received message 'DB connection has been reset' at the same time as the Sparx migration failed due to 'ORA-03113: end of file on communication channel'.
No records were migrated to the following remaining tables.
· t_diagramobjects (102,981 > )
· t_snapshot (43,130 > )
I suspect SQL Server was busy building indexes for table t_diagramlinks in SQL Server, then Oracle got tired of waiting for next SQL query on table t_diagramobjects.
One of the most frequently reported causes of an ORA-03113 results from the presence of a firewall. This can end the Oracle connection after a certain period of time. For this, utilize a sqlnet.expire_time parameter. What this will do is enable the server to detect a dead connection after a specified period of time, and if not connected the server process will exit. This will effectively render the firewall to consider that the connection is still live.
Our DB Team suggested that Firewall timeouts will terminate inactive sessions after a fixed period. DB server connections are first received on port 1570 , then rerouted to 1571 or 1561 depending on the config. In the firewall, all these ports have max timeout threshold.
sqlnet.expire_time was not set. I have requested a sqlnet.expire_time parameter change to sqlnet.expire_time=1, but the DB team have said this would impact many other systems, so they will not make the change without extensive testing of all systems.