Authors: Nicholas Dritsas, Danny TambsReviewers: Mike Ruthruff, Alexei Khalyako

Executive Summary

To aid Contoso’s goal in establishing an enterprise data warehouse (EDW) with the objective of creating a single version of the truth, Microsoft has participated in a proof of concept (POC) to demonstrate the performance, scalability, and value of the Microsoft SQL Server application platform. The Microsoft team used our Fast Track Data Warehouse solution based on HP hardware and SQL Server 2008 R2 RDBMS. This involved conversion work from the current Oracle production system to the test SQL Server system. In summary, we built a Fast Track configuration that scaled according to the demands put on it by the requirements. We were able to deliver excellent performance numbers for both pre and post optimization serial runs. Additionally the performance numbers for volumized data scaled according to the growth in size of data. We did additional testing on different approaches for loading the data, that is, BULK INSERT vs. the bcp utility. We also loaded data into databases with tables using nonclustered and clustered primary keys and were able to show the workload performance numbers for both. The compression scenario delivered impressive compression ratios and indicated areas where it could be used for query performance optimization. The calculated compression ratio before compression was a near estimate of the actual table compression size. The customer’s data compressed better than the estimate predicted.The backup scenario proved that the speed and backup compression will be key wins for manageability of the data warehouse.When testing updates in place, we optimized the performance numbers by setting the fill factor appropriately to avoid page splits. Early feedback indicated that this was much faster than Oracle and other competing vendors, and an order of magnitude faster than the current production system.Finally we were able to conclude that the performance of the Fast Track configuration was better than current production by many orders of magnitude across all the testing options.Given the opportunity to deliver the enterprise data warehouse, the Microsoft Fast Track Data Warehouse can deliver workload performance far surpassing expectations with the latest G7 hardware. We believe we can deliver the best performance through our expertise in optimization and migrating code from different database platforms. 

Introduction

Contoso, one of the biggest banks in Malaysia, called for a POC among Oracle (Teradata), IBM (DB2 on zSeries), Greenplum, and Microsoft. They developed a very elaborate and complete POC scope that can be broken into five tracks:1.       Track A is a simulation of the current environment. To create Track A, load two months of information from an Oracle database. Migrate objects and data to SQL Server but do not do any optimizations. That way, get a baseline of a direct migration by running serially specific queries and stored procedures (six fixed queries, four stored procedures, and 20 or so ad hoc queries)2.       Track B. Apply optimizations to database and queries to Track A. Run the six queries and four stored procedures in parallel while updating the customer master table. Also, switch in a new partition and ensure that there are no issues with currently running queries or dirty data.3.       Track C. Start with Track B, and then multiply all the dimension tables by three and the fact details table to seven years (that is, an additional 82 months). Record load times. 4.       Track D. Modify queries from Track B to get the new bigger data set.

Hardware Configuration

Configuration: HP DL785 G5 with 8 socket quad core: 32 AMD Opteron 8376 HE, 2300 MHz and 256 GB ram. (NOTE: Even though this system is not in the list of the published HP Fast Track configurations, it followed the rules of fast track to achieve a well-balanced system between cores and LUNs.)I/O: Five HP Smart Array P411 controllers and 10 D2700 enclosures direct attached (DAS). Each enclosure had 25 disks x 146 GB each 10k rpm SAS 6G dual port. We used four controllers for FT with 32 RAID 1 LUNs of 2 disks each. The fifth controller, which was used for backup and other storage, had three RAID 5 LUNs of 8 disks each.Operating system: Windows Server 2008 R2 Enterprise with hotfixes KB 2155311, 977977, 976700, and 982383. These are all I/O related fixes for Windows Server2008 R2 and multi core systems.DBMS: SQL Server 2008 R2 Enterprise plus Cumulative Update package 3. We used –E and -T1118 trace flags. Trace flag 1118 forces uniform extent allocations instead of mixed page allocations. It is commonly used to assist in tempdb scalability by avoiding SGAM and other allocation contention points.We allocated a maximum of 250 GB RAM to SQL Server.Database configurationWe used one filegroup for the Fast Track database with 32 data files of equal size, one in each LUN because we had 32 cores. We used the same configuration for tempdb. For the transaction log, we had four extra RAID 1 LUNS of two disks, one in each enclosure.

Track A – Nonclustered Indexes as Primary Keys

Notes

·         This test ensured adherence to the loading of data in waves. The PK indexes are nonclustered, in keeping with Contoso’s wishes.·         Primary key duplicates are ignored by the process and not inserted into the tables. There are other bad-data errors. These are logged in the error log files.·         The data errors were left in the load files and stripped on load to show the worst-case scenario.Here is an example of the command we used to load data in from the command batch.bcp TrackA.dbo.[<Table>] in <Table>.txt -c -F2  -r \n -m 999999 -t"|"  -b 100000  -U %USERNAME%  -S %SERRVERNAME% -P %PASSWORD% -e <Table>.txt_Errors.log -h "TABLOCK"

Loaded Row Count

 
Table Name Actual Loaded Row Count
ACCT_HT       5,592,356
ACCT_XREF    22,898,855
CARD_CREDIT_LINE_HT       1,217,097
CERT_DEPOSIT_HT       1,146,640
CODE_HT            39,796
COLLAT_ACCT_REL_HT       2,744,101
COLLAT_CONS_HT                    71
COLLAT_CUST_REL_HT       3,955,228
COLLAT_FIN_HT          847,391
COLLAT_GUAR_HT          491,094
COLLAT_MACH_HT               7,059
COLLAT_MVEH_HT       1,246,607
COLLAT_OASST_HT               3,660
COLLAT_OVEH_HT                  174
COLLAT_PROP_HT          205,294
COLLATERAL_HT       2,789,530
COMMERCIAL_LOAN_HT          225,983
CONSUMER_INST_LN_HT 1,254,016
CREDIT_PROVISIONS_HT 1,645,043
CUST_ACCT_REL_HT 7,642,494
CUST_CUST_REL_HT 1,073,973
CUST_HT 7,165,192
CUST_NON_PERS_HT 392,467
CUST_PERS_HT 6,773,268
CUST_XREF 12,311,195
PRD_DIMENSION               4,017
PRODUCT 1,497
RETAIL_CHECKING_HT 165,041
RETAIL_CREDIT_LN_HT               8,048
RETAIL_SAVING_HT 1,583,574
TRAN_DETAIL    19,107,244
 

Comments

The total load time was 6,393 seconds. This was due to the fact that all the indexes were kept and the data was not precleaned to remove corrupt rows and duplicate primary key data. 

Track A - Clustered Indexes as Primary Keys – FAST LOAD

Notes

·         The primary key indexes were changed to clustered indexes, and primary key data and bad data were removed from the files to ensure clean fast loads.·         The nonclustered indexes were built on the data after it was loaded in a parallel batch.Here is an example of the command we used to load data in from the command batch.bcp TrackA.dbo.[<Table>] in <Table>.txt -c -F2  -r \n -m 999999 -t"|"  -b 100000  -U %USERNAME%  -S %SERRVERNAME% -P %PASSWORD% -e <Table>.txt_Errors.log -h "TABLOCK"

Comments

The total load time (to build the entire database) was 2,485 seconds. This time is made up of 2,166 seconds for the data load and 319 seconds for the creation of all the nonclustered indexes. This was much faster than the 6,393 seconds it took to load the data using nonclustered indexes as primary keys. The 2,166 seconds can be further reduced if the declarative referential integrity (DRI) constraints are removed. This was not tested because Contoso had a requirement to keep the DRI constraints in the database.

Track A - Best Practices

-                      Normally you should not have constraints on the DW tables. Foreign keys and logical integrity of the data to be loaded should be handled at the ETL layer to minimize data load performance issues. In our case, we could not avoid the constraints due to the POC requirements. -                      SQL Server Integration Services can provide a better means of cleaning the data and taking care of slowly changing dimensions. -                      You need to determine what you want to focus on: load performance for historical data or query performance. Sometimes you cannot optimize for both at the same time. In our case, we optimized for query performance.-                      Consider loading data into partitioned tables for incremental loads.-                      Use the BCP utility to load data in and specify the TABLOCK and ORDER hints. Ensure that the order of the data is the same as it is in the clustered index.

Query and Stored Procedure - Execution Summary

 

Results Before and After Optimization

This table shows the numbers before and after optimization for the execution of procedures and queries. All executions were performed serially.That means the following:
  1. Track A database was used for the before-optimization number, that is, where nonclustered indexes used for primary keys. The queries and stored procedures were migrated from Oracle and not optimized.
  2. Track B database was used for the after-optimization number. The queries and stored procedures were optimized for the after-optimization numbers.

Item

Execution time in current Oracle production Execution time (before optimization) Hr:Min:Sec Execution time (after optimization) Hr:Min:Sec
       
Q1 – Coreplan 00:16:00 00:02:00 00:00:02
Q2 – FINS_CASHDEP 00:05:00 00:03:02 00:00:02
Q3 – FINS_FDPLACE 00:03:00 00:00:02 00:00:00 (1 ms)
Q4 – MUTIARA_CPS 34:07:00 00:02:14 00:01:18
Q5 – PIDM_APP1 > 48 hours 00:00:04 00:00:02 
Q6 – PIDM_APP2 > 48 hours 00:00:17 00:00:14 
Stored Proc 1 -  CTSM_ME_PROD_PARTITION 00:23:00 00:01:06 00:00:35 
Stored Proc 2 -  MUTIARA_MAIN_JUL10 02:51:00 00:11:39 00:06:40
Stored Proc 3 -  DCS_MERGED_IND_SP 03:20:00 Runs for over 3 hours - terminated 00:03:00 - 00:05:00 (3-5 mins)
Stored Proc 4 -  Kpi_Run_On_Age 00:27:00  01:15:41 00:02:00 - 00:04:00 
 

Optimizations Performed – High Level Summary

·         Rewrite CASE and WHERE statements as joins.·         Clean key join column to remove leading and trailing spaces.·         Add nonclustered indexes.·         Create statistics.·         Build derived columns for substring joins.·         Convert Oracle-like function calls from SQL Server Migration Assistant (SSMA) to native function calls, such as:
ssma_oracle.to_char_date(sysdatetime(), 'DD-MON-YYYY HH:MI:SS AM')to CONVERT(varchar(32), sysdatetime(), 109)

Results for Seven Years Data Serial Query Execution

Description Min:Sec
Q1 – Coreplan 00:30
Q2 – FINS_CASHDEP 39:36
Q3 – FINS_FDPLACE 0:02
Q4 – MUTIARA_CPS 10:1
Q5 – PIDM_APP1 0:05
Q6 – PIDM_APP2 10:50
 The results seem proportional to the data volume and the fact that the queries were run in parallel. A data set expanded three times by account base and seven years by time will have an impact on the number of rows to be scanned as well.
Description Hr:Min:Sec
Stored Procedure 1 - CTSM_ME_PROD_PARTITION 0:12:24
Stored Procedure 2 -  MUTIARA_MAIN_JUL10 0:09:24
Stored Procedure 3 -  DCS_MERGED_IND_SP 1:00:35
Stored Procedure 4 -  Kpi_Run_On_Age 0:09:25
 The volumization of data seems proportional to the query run time in some instances. Heavy processing via function calls in search conditions and the SELECT clause of the queries within procedures lengthened the run time of procedures. NOTE: The Fins_Cashdep query has a time window of five days on Tran_Detail but the substring function applied on TXD_ACCT_NO column of TRAN_DETAIL will cause the table to be scanned. Generally, we have seen when that when functions such as CONVERT, SUBSTRING, CAST, LTRIM, and RTRIM are used, the plan becomes serial and performance can be affected negatively. For more information about queries and stored procedures execution plans and optimizations, see Appendix B.

Volume Growth and Scalability

Notes

The purpose of Track C is to measure the load time for volumized data. The data volume was increased in keeping with volumization rules set by Contoso. 

Load Process

The diagram shows the load sequence of bulk copying of the files using the BCP utility. After the BCP files are loaded, the data is switched. The file load is synchronized using a flag-file method.    

Results

 
STEP Table name TIME(ms) TIME(min)
STEP1
Parallel
ACCT_HT.LOG 1,913,961 32
CUST_HT.LOG 1,701,456 28
CUST_NON_PERS_HT.LOG 45,365 1
CUST_PERS_HT.LOG 772,750 13
Subtotal STEP1 1,913,961 32
STEP2 CUST_ACCT_REL_HT 2,807,191 47
Subtotal STEP2 2,807,191 47
STEP3
Parallel
CARD_CREDIT_LINE_HT 800,581 13
CERT_DEPOSIT_HT 248,494 4
CONSUMER_INST_LN_HT 550,090 9
CREDIT_PROVISIONS_HT 413,043 7
RETAIL_CHECKING_HT - -
RETAIL_SAVING_HT 329,147 5
Subtotal STEP3 800,581 13
STEP4-1
Serial
TRAN_DETAIL_ACCT_83 1,923,914 32
TRAN_DETAIL_ACCT_84 1,446,005 24
Subtotal STEP4-1 3,369,919 56
    

Tran_Detail Import Performance

We did two passes of the import; the results are shown for both. Pass 1 uses BCP. Here is the command.bcp %DB_NAME%.dbo.TRAN_DETAIL_ACCT_%NUM% in %3\TRAN_DETAIL_ACCT_%NUM%.txt  -c -r \n -t"|" -S %SERVER% -U%USER% -P %PASS% -b 100000 -h "TABLOCK, order(TXD_CYC_DT ASC, TXD_CYC_FREQ ASC, TXD_A_HLD_ORG_CD ASC, TXD_ACCT_ID ASC, TXD_A_FIN_INST_NO ASC, TXD_APPL_SYS_ID ASC, TXD_ACCT_NO ASC, TXD_TRAN_DT ASC, TXD_TRAN_SEQ_NO ASC, TXD_TRAN_CD ASC)"  -o %LOG_DIR%\STEP4_TRAN_DETAIL_ACCT_%NUM%.log-          This test loads the data into the table unsorted, and the server has to sort it.-          The average import time per table was 1,945 seconds (that is, 32 minutes). The longest tables import was 5,006 seconds (that is, 83 minutes).Pass 2 uses BULK INSERT. Here is the command.osql -S %SERRVERNAME% -d %DATABASE% -U %USERNAME% -P %PASSWORD% -Q "BULK INSERT %DB_NAME%.dbo.tran_detail_acct_%NUM% FROM '%3\TRAN_DETAIL_ACCT_%NUM%.txt' WITH (TABLOCK ,CODEPAGE = 'RAW',FIELDTERMINATOR ='|',ROWS_PER_BATCH = 100000, ORDER(TXD_CYC_DT ASC, TXD_CYC_FREQ ASC, TXD_A_HLD_ORG_CD ASC, TXD_ACCT_ID ASC, TXD_A_FIN_INST_NO ASC, TXD_APPL_SYS_ID ASC, TXD_ACCT_NO ASC, TXD_TRAN_DT ASC, TXD_TRAN_SEQ_NO ASC, TXD_TRAN_CD ASC))”-          The BULK INSERT time also includes the time it took to build the index. The import tables have indexes defined.-          The average import time per table was 1,620 seconds (that is, 27 minutes). The longest table’s import was 2,591 seconds (that is, 43 minutes).-          We used the TABLOCK and –h “ORDER()” parameters in the BCP command for optimization. This decreased the load time almost by half.


 

 

Loaded Row Count

 
Table Name Loaded Bulked Up Row Count
ACCT_HT 5,592,356 16,777,068
CARD_CREDIT_LINE_HT 1,217,097 3,405,424
CERT_DEPOSIT_HT 1,146,640 3,220,152
CONSUMER_INST_LN_HT 1,254,016 3,501,549
CREDIT_PROVISIONS_HT 1,645,043 4,495,969
CUST_ACCT_REL_HT 7,642,494 23,276,637
CUST_HT 7,165,192 21,495,576
CUST_NON_PERS_HT 392,467 1,177,401
CUST_PERS_HT 6,773,268 20,319,804
RETAIL_CHECKING_HT 165,041 490,279
RETAIL_SAVING_HT 1,583,574 4,341,972
TRAN_DETAIL 19,107,244 2,300,000,000
 

Comments

The best approach is to use BULK INSERT with the primary key in place. 

Fast Track Backup Performance – Bonus Item

The performance we achieved was 2,107.8 MB per second.

Command

BACKUP DATABASE [TrackC] TO  DISK = N'C:\MountR5\R5A\Backups\TrackC_F1.bak',  DISK = N'C:\MountR5\R5B\Backups\TrackC_F2.bak',  DISK = N'C:\MountR5\R5C\Backups\TrackC_F3.bak' WITH NOFORMAT, NOINIT,  NAME = N'TrackC-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, COMPRESSION,  STATS = 10GO

Results

We backed up a 1.3-terabyte database in 10:44 minutes. The size of the backup file was 203 GB.

Compression of Tables – Extra Item

Compression Summary

A number of tables were compressed using page compression. The tables show the actual and estimated compression ratio.
Track D - Compression of Big Tables          
  Actual Values     Estimated    
  Before Compression After Compression Actual Compression Ratio Before Compression Estimated Compression Size Estimated Compression Ratio
ACCT_HT 18757 7262 61% 18757 9446 50%
CUST_HT 17644 9241 48% 17644 10790 39%
CUST_ACCT_REL_HT 9762 1928 80% 9762 5451 44%
TRAN_DETAIL 2102900 277507 87% 2102900 699720 67%
 
Items in red show actual compression ratio and sizes. This is higher than the estimated compression.
 Note: For TRAN_DETAIL its actual size is 85 times the partition size, because there are 85 partitions.    

Ad-hoc Query Performance

A set of ad-hoc queries ran in compressed and uncompressed state for the four tables.

Results

Description CompressedHr:Min:Sec Uncompressed Hr:Min:Sec
Query1                               2:21:4 2:10:54
Query2                               2:50:47 2:43:48
Query3                                             0:17:56 0:11:51
_ERR Fixed Sub query Track D Query4                                     0:02:02 0:02:18
Query5                                           0:00:33 0:01:20
Query6                                           0:02:47 0:02:54
Query7 0:11:54 0:12:41
Query8 0:12:9 2:51:17

 

CUST_HT, ACCT_HT, and CUST_ACCT_REL_HT were uncompressed, and TRAN_DETAIL was compressed.

Comments

Different query characteristics and different tables being used in the joins impact the run time differently. Query8 runs for a much shorter time with compression turned on, whereas other queries display different behavior. In most cases, compression turned on for the four key tables seems to work better.The key difference between Query1, Query2, and Query8 seems to be the additional CUST_PERS_HT table involved in the join and the selection of different grouping sets. The compression setting should be set based on the most common queries.

Conclusion

This paper discusses a Fast Track configuration that we built. The configuration scaled well according to the demands put on it by the different tracks. We were able to deliver excellent performance numbers for serial runs both before and after optimization. Additionally, the performance numbers for volumized data scaled according to the growth in size of the data. We did additional testing on different approaches for loading the data, that is, BULK INSERT vs. the bcp utility. We also loaded data into databases with nonclustered and clustered primary keys and were able to show the workload performance numbers for both. The compression scenario delivered impressive compression ratios and indicated areas where it could be used for query performance optimization. The backup scenario proved that the speed and backup compression will be key wins for manageability of the data warehouse.We were able to conclude that the performance of the Fast Track configuration was better than current production by many orders of magnitude.We were able to deliver all of the above with a small team of six and complete the work of setting up the Fast Track config, translating the Oracle code, writing and optimizing the code, running tests, and documenting results within a short span of time. The performance was delivered by an older generation DL785 G5 Fast Track configuration. We believe we have satisfied and even exceeded the criteria of the POC by demonstrating great performance, amazing price/performance, and scalability.Given the opportunity to deliver the enterprise data warehouse, we can deliver workload performance far surpassing your expectations with the latest G7 hardware. Fast Track 3.0 specification with the G7 family of hardware takes advantage of the latest CPU technologies such as additional cores, higher clock speeds and cache sizes, plus larger disk sizes (600 GB vs. 300 GB) that provide more capacity with the same number of enclosures. This specification is expected to be out soon.We believe we can deliver the best performance through our expertise in optimization and migrating code from different database platforms. 

Appendix A – References

 1.       Fast Track Data Warehouse 2.0 Architecture 2.       The Data Loading Performance Guide 

Appendix B – Fast Track Queries, Their Plans, and Tuning Suggestions

 

Q1 – Coreplan – QueryPlan – Not Tuned

  

Execution Results

The execution time was 2 minutes.

 

Q1 – Coreplan

Optimization - Data Fixes

Remove leading and trailing spaces ahead of time rather than as part of the query.

Query Plan

   

Execution Results

The execution time was 2 seconds.  

Q2 – FINS_CASHDEP - Query Plan – Not Tuned

  

Execution Results

The execution time was 3 minutes and 2 seconds.  

Q2 – FINS_CASHDEP

Optimization – Code Changes

Remove the function on the right side of the search condition. Create a parameter and assign the value from the function to it. Refer to the parameter directly.declare @startdate datetimeset @startdate = DATEADD(D, -5, '31-jul-10') -- old way            --(TRAN_DETAIL.TXD_BUS_DT BETWEEN DATEADD(D, -5, ssma_oracle.to_date2('31-jul-10', 'dd-mon-yy')) AND ssma_oracle.to_date2('31-jul-10', 'dd-mon-yy')             -- better way             --(TRAN_DETAIL.TXD_BUS_DT BETWEEN DATEADD(D, -5, '31-jul-10') AND '31-jul-10'              --Best way - eliminate the function altogether.            (TRAN_DETAIL.TXD_BUS_DT BETWEEN @startdate AND '31-jul-10'            

Query Plan

  

Execution Results

The execution time was 2 seconds.

Q3 – FINS_FDPLACE – Query – Not Tuned

 

Query Plan

 

Execution Results

The execution time was 2 seconds.

Q3 – FINS_FDPLACE

Optimization - Schema Changes

Precompute the columns that are used in function calls as part of search condition. Ideally, heavy functions should not be run on columns (especially in large tables) as part of a query. The calculated columns commonly used in queries should be setup as part of an ETL operation or as a computed column, persisted or otherwise.ALTER TABLE dbo.TRAN_DETAIL ADD      DT_CHOPPED_TXD_ACCT_NO_9_3 varchar(3) NULLgo      ALTER TABLE dbo.TRAN_DETAIL ADD      DT_CHOPPED_TXD_ACCT_NO_10_3 varchar(3) NULLgo update TRAN_DETAIL set DT_CHOPPED_TXD_ACCT_NO_9_3 = substring(TRAN_DETAIL.TXD_ACCT_NO, 9, 3),DT_CHOPPED_TXD_ACCT_NO_10_3 = substring(TRAN_DETAIL.TXD_ACCT_NO, 10, 3) 

Optimization - Code Changes

Refer to the precomputed column directly in the query.

Query Plan

 

Execution Results

The execution time was 1 second.

Q4 – MUTIARA_CPS - Query Plan – Not Tuned

 

Execution Results

There were multiple attempts. The best execution time was 2 minutes and 14 seconds.

Q4 – MUTIARA_CPS

Optimization - Code Changes

Replace references to Oracle-like function calls from the SQL Server Migration Assistant (SSMA) tool with native Transact-SQL calls. (ROUND(ssma_oracle.datediff(A.ACCT_BUS_CYC_DT, A.ACCT_LIMIT_APPR_DT) / 30,0)) AS MD_MTH_ON_BOOK, (ROUND(ssma_oracle.datediff(A.ACCT_BUS_CYC_DT, A.ACCT_LIMIT_APPR_DT) / 30,0)) AS MD_LN_AGE, was changed to (ROUND((convert(float(53),convert(datetime,A.ACCT_BUS_CYC_DT) - convert(datetime,A.ACCT_LIMIT_APPR_DT)) ) / 30,0)) AS MD_MTH_ON_BOOK, (ROUND((convert(float(53),convert(datetime,A.ACCT_BUS_CYC_DT) - convert(datetime,A.ACCT_LIMIT_APPR_DT)) ) / 30,0)) AS MD_LN_AGE,  Replace the old-style join with a JOIN statement.FROM    dbo.ACCT_HT  AS A       LEFT OUTER JOIN dbo.PRODUCT  AS P       ON A.ACCT_PROD_CD = P.PROD_PROD_CD,    dbo.CARD_CREDIT_LINE_HT  AS BWHERE    A.ACCT_ACCT_NO = B.VM_ACCT_NO AND was changed to FROM          dbo.ACCT_HT AS A      LEFT OUTER JOIN dbo.PRODUCT AS P    ON A.ACCT_PROD_CD = P.PROD_PROD_CD    JOIN   dbo.CARD_CREDIT_LINE_HT  AS B   ON A.ACCT_ACCT_NO = B.VM_ACCT_NO WHERE     Another possible fix would have been to create a temp table or variable table that contained the values for the IN clause and forming a join rather than a subquery. The temp/variable table would have contained an inclusion list (IN) rather than an exclusion list (NOT IN). This fix was not put in place for the query code due to time limitation.

 

 

Query Plan

  

Execution Results

The execution time was 1 minute and 18 seconds. 

Q5 – PIDM_APP1 - Query Plan – Not Tuned

  

Execution Results

The execution time was 4 seconds.

Q5 – PIDM_APP1

Optimization-Code Changes

There are multiple CASE statements like this one that can be replaced. These statements should be replaced by ETL processes that build the necessary computed columns if the computed columns are frequently used. It’s not ideal to have multiple CASE statements within a query statement.    CASE       WHEN A.ACCT_APPL_SYS_ID = 'STS' AND substring(A.ACCT_ACCT_NO, 9, 3) IN ( '001', '003', '007', '008' ) THEN 2      WHEN A.ACCT_APPL_SYS_ID = 'IMS' AND substring(A.ACCT_ACCT_NO, 10, 3) IN (  '201' ) THEN 2      WHEN A.ACCT_APPL_SYS_ID = 'STS' AND substring(A.ACCT_ACCT_NO, 9, 3) IN ( '002', '004', '013', '009' ) THEN 1      WHEN A.ACCT_APPL_SYS_ID = 'IMS' AND substring(A.ACCT_ACCT_NO, 10, 3) IN (  '202' ) THEN 1   END AS BANK_TYP The following is a low-impact change that precalculates the CONVERT statement in a variable and feeds it into the search condition.WHERE    A.ACCT_ACCT_ID = B.CAR_R_ACCT_ID AND    C.CUS_KEY = B.CAR_CUST_KEY AND    A.ACCT_APPL_SYS_ID IN ( 'STS', 'IMS' ) AND    B.CAR_PRIME_NO = 'Y'/*onli take primary holder*/ AND    (B.CAR_EXPIRATION_DT >= '31-JUL-2010' OR B.CAR_EXPIRATION_DT = CONVERT(datetime2, '01/01/0001', 101))  We did not apply these changes to our query because it was already high-performing.

Query Plan

  

Execution Results

The execution time was 2 seconds.

Stored Procedures

Comments

CTSM_ME_PROD_PARTITION performed approximately 18 times faster than it did in the existing production environment without optimizations. MUTIARA_MAIN_JUL10 performed approximately 21 times faster without optimizations. DCS_MERGED_IND_SP consists of three heavy procedures being called by a large base proc. The three procedures; all used cursors to perform precalculations. This is not an ideal way to perform the operations. Kpi_Run_On_Age performs heavy CASE-based operations that will affect the time taken to execute the queries.

Stored Procedure 1 - CTSM_ME_PROD_PARTITION

Execution Results

The execution time was 01:06 (1 minute and 6 seconds).

Optimization – Code Changes

We converted the execute statements to direct Transact-SQL statements. For example, EXECUTE ('TRUNCATE TABLE CTSM') was changed to TRUNCATE TABLE CTSM.We also converted old-style joins to Transact-SQL joins. We applied the query hint FORCE ORDER to specify that the join order indicated by the query syntax should be preserved during query optimization.We converted Oracle-like function calls from SSMA to direct Transact-SQL calls. INSERT dbo.CTSM_LOG(LOG_TIME, LOG_PROCESS_NAME) VALUES (ssma_oracle.to_char_date(sysdatetime(), 'DD-MON-YYYY HH:MI:SS AM'), 'CTSM START') was changed toINSERT dbo.CTSM_LOG(LOG_TIME, LOG_PROCESS_NAME) VALUES (CONVERT(varchar(32), sysdatetime(), 109), 'CTSM START')

Execution Results

The execution time was 1 minute and 17 seconds.

Stored Procedure 2 - MUTIARA_MAIN_JUL10

Execution Results

The execution time was 11 minutes and 29 seconds.

Optimization – Code Changes

We converted Oracle-like SSMA function calls to native Transact-SQL calls.We converted old-style joins to Transact-SQL join statements.We consolidated individual UPDATE statements into a single statement.

Execution Results

The execution time was 06:40 (6 minutes and 40 seconds).

Stored Procedure 3 - DCS_MERGED_IND_SP

Execution Results

The query ran for a long time, and it was terminated after 3 hours.

Optimization - Data Fixes

We fixed data by removing leading and trailing spaces. The data contained either leading or training spaces. This forced the code to constantly trim left and right in this fashion. We converted EXECUTE statements to direct Transact-SQL statements.As a second phase of fixes, code using cursors in the procedures DCS_IND_2B_DUP_STS, DCS_IND_2B_DUP_IMS, and DCS_IND_2A_DUP_IMS were replaced with straight Transact-SQL statements.

Execution Results

The execution time was between 3 and 5 minutes.

Stored Proc 4 - Kpi_Run_On_Age

Execution Results

The execution time was 1 hour, 15 minutes, and 41 seconds.

Optimization – Code Fixes

We replaced calls to Oracle-like SSMA function calls with a custom view.WHEN floor(MONTHS_BETWEEN_PER_BUS_CYC_DT_PER_CUST_DOB / 12) BETWEEN  0 AND 12 THEN '12 YRS AND BELOW'The following views were created to emulate the MONTHS_BETWEEN function.CREATE VIEW CUST_PERS_HT_MONTHS_BETWEEN_PER_BUS_CYC_DT_PER_CUST_DOBASSELECT -- MONTHS_BETWEEN PER_BUS_CYC_DT PER_CUST_DOBcase when ( (dateadd(dd, - day(dateadd(mm,1,PER_BUS_CYC_DT)), dateadd(mm,1,PER_BUS_CYC_DT)) = PER_BUS_CYC_DT) and (dateadd(dd, - day(dateadd(mm,1,PER_CUST_DOB)), dateadd(m,1,PER_CUST_DOB)) = PER_CUST_DOB) ) or (day(PER_BUS_CYC_DT) = day(PER_CUST_DOB))     then 12*(year(PER_BUS_CYC_DT) - year(PER_CUST_DOB)) + month(PER_BUS_CYC_DT) - month(PER_CUST_DOB)     else 12*(year(PER_BUS_CYC_DT) - year(PER_CUST_DOB)) + month(PER_BUS_CYC_DT) - month(PER_CUST_DOB)          +convert(numeric(38,19), PER_BUS_CYC_DT - dateadd(mm,datediff(mm,0,PER_BUS_CYC_DT),0) - (PER_CUST_DOB - dateadd(mm,datediff(mm,0,PER_CUST_DOB),0)) ) / 31end as MONTHS_BETWEEN_PER_BUS_CYC_DT_PER_CUST_DOBFROM CUST_PERS_HTgo

Execution Results

The execution time was between 2 and 4 minutes.