Author: Nicholas Dritsas Contributers: Hae Jin Park Technical Reviewers: Denny Lee, Burzin Patel
A large online gaming company tested a number of methods to quickly and efficiently move data between an Oracle 9i/UNIX system and SQL Server 2005. Their goal was to move 200‑GB of daily user activity data within two hours. The data was to move without the use of transformations from Oracle’s OLTP system to the SQL Server Operational Data Store (ODS) database. An ODS database is used in data warehouse architecture to hold an exact and intact copy of the source data, representing the only true data and history. The ODS data would be used later by SSIS packages to transform and load data into a staging database, a DW database, and an OLAP cube for reporting purposes.
Note This Technical Note does not describe a suggested method to fully migrate data from Oracle to SQL Server in all cases. Depending on the application, third-party tools and wizards may be a better fit for your migration.
The company started by using a 64-bit server with 32 and 64 cores, testing both x64 and ia64. They installed SQL Server, Analysis Services and SSIS on that server. Since it was a 64-bit platform, the Microsoft OLE DB Provider for Oracle was ruled out because it does not yet support 64-bit environments. The project team decided to use Oracle’s 10.2.0.2.2 version provider. Here are the initial tests and findings:
Solution 1: The first solution they tried was using the SQL Server Integration Services (SSIS) OLE DB Source task to move the Oracle 200-GB data into SQL Server. However, the speed was very slow and they could not meet their goal of two hours so they stopped the test. In this scenario, they used the 64-bit server and the 64-bit provider from Oracle and the projected time of completion was about 18 hours.
Solution 2: The next solution was to use a separate 32-bit server (8 cores, 20-GB RAM) for SSIS only and 32-bit Oracle’s OLEDB Provider for Oracle. They downloaded the Oracle data into a binary file (using the network drive on the 64-bit server) by using the SSIS Raw File destination task. From there they uploaded the data into SQL Server staging tables, and then converted it into SQL ODS tables by using the SSIS Raw File source task.
Result: 2 hours 47 min elapsed.
Issue : Large staging file and staging DB (not ODS) size
Even though the size of the Oracle data was 200 GB, the binary file size was over 600 GB. Then, the 600 GB of data had to be uploaded into SQL Server tables without data conversion. After that, it was necessary to convert it into SQL Server target ODS table format. The process required at least 1.4 terabytes of additional disk space for the binary file and staging tables.
Solution 3: Next they tried using DataDirect Technologies’ Oracle provider with wire protocol on the x64 server (it seems DataDirect does not have an ia64 provider). The achieved speed results were very impressive. In a comparison test, they loaded 10,000 records from Oracle to SQL Server. In that test, the DataDirect provider took 2 seconds versus 22 seconds when using the Oracle provider. However, the overall test failed because of conversion errors.
Issue: They tried to retrieve the Oracle data by using the SSIS OLE component with DataDirect’s provider (x64) but there was a code page conversion error for the SQL Server code page CP949 (Korean character set). They tried to use Unicode conversion by using an output column (in SSIS), but this did not work either. They plan to open a support case with DataDirect. It seems there are some special Korean characters that do not get mapped properly in CP949 and the driver fails.
Solution 4: The company finally decided to use the SSIS OLE DB source and target components to move data from Oracle to SQL Server directly through an 8‑core 32‑bit server using Oracle’s32‑bit OLEDB provider (ver. 10.2.0.2.2) and Korean_90_Bin collation (with the SQL Server ODS database). When Korean_90_bin is used, they do not have to convert Korean data explicitly in Oracle, thus reducing response time.
Result: The response time was 2 hr 33m 32 sec (without the need of an additional 1.4 terabytes of disk space).
· Oracle OLEDB Provider performance appears to be better optimized on 32‑bit compared to x64 or IA64. (Oracle driver speed: x86 > x64 > IA64). So, the company had to use 32‑bit middle server to meet the target time.
· It seems that the 32-bit middle server's spec is important to the data move time. Based on this test result, if they used a 32-bit server with more than 8 cores, it is possible they could reduce the time even more. But, based on the tests, when they went from 8 cores to 16 cores, the difference was small. The total time went from 2h33min to 2h11min.
Following are the test results on an HP superdome IA64 with 16- and 32-way plus a Unisys ES7000 16‑way and 32-way x64 based on EMC and Hitachi storage. The SQL Database DW size is 47 GB and the Analysis Services database with 20 cubes is 19 GB. It appears that a Unisys x64 32-way EMC may be the right option for their requirements.
Based on these results for the specific requirements of this company it appears the best solution is Solution 4. The company plans to get the following servers to maximize price/performance ratio:
· A 4 dual-core 32‑bit server with 20‑GB of RAM using Oracle’s latest 32‑bit OLE DB Provider as a dedicated SSIS server
· An 8 dual-core x64 server with 32‑GB of RAM and EMC SAN to host SQL Server RBMS and Analysis Services
For more information about SSIS connectivity options, see Connectivity and SQL Server 2005 Integration Services.
Pingback from Moving Large Amounts of Data Between Oracle and SQL Server: Findings… | Online Gaming
Oh no...did I just say the O word, oh well I have to live in real world and accept that there are many
Pingback from Websites tagged "olap" on Postsaver
Did you try fastreader for same purpose? It performs direct path. The extract of 200GB with fast storage and multiple cores could take about 10 minutes and then quick load with bcp
Pingback from Bring Oracle Data into SQL Server « Blog Home for MSSQLDUDE
Pingback from Windows Server 2008 R2 Maximum Processor Limit Confusion Drija