SQL Server Customer Advisory Team - SQL Server Best Practices

Enabling SQL Server customers to navigate the most challenging frontiers of large scale data management.

Technical Notes

The Many Benefits of Money…Data Type!

Authors: Denny Lee, Thomas Kejser, Matt Neerincx
Contributors / Reviewers: Peter Carlin, Brandon Berg, Eric Jacobsen, Burzin Patel

 

Background

Our initial reason for looking at the money data type can be found within the Precision Considerations for Analysis Services Users white paper. In this white paper, we provide extensive examples of the types of precision issues when your SQL relational data source and your Microsoft® SQL Server® Analysis Services cube have different non-matching data types (e.g., if you query one way you get the value 304253.3251, but run the query in another way and you get the value 304253.325100001).

To avoid these types of problems, you need to ensure that your SQL relational data source and Analysis Services measure groups have matching data types. By default, when you create an Analysis Services measure on a money data type, Microsoft Visual Studio® Business Intelligence Development Studio will set the data type reference to double. To avoid precision loss and have faster performance, you should change the data type to currency within the Source Properties as noted in the screen shot below.

image

 

Show Me The Money! ...Data Type for Faster Processing Performance

Working on customer implementations, we found some interesting performance numbers concerning the money data type. For example, when Analysis Services was set to the currency data type (from double) to match the SQL Server money data type, there was a 13% improvement in processing speed (rows/sec). To get faster performance within SQL Server Integration Services (SSIS) to load 1.18 TB in under thirty minutes, as noted in SSIS 2008 - world record ETL performance, it was observed that changing the four decimal(9,2) columns with a size of 5 bytes in the TPC-H LINEITEM table to money (8 bytes) improved bulk inserting speed by 20%. Note that within SSIS, the equivalent of the money data type is DT_CY, which currently does not support fast parse. Hence, getting money out of text files may incur additional cost.

Note that these tests were performed on 64-bit systems. Relative performance may be different in the 32-bit editions of SQL Server because of differences in the way it performs 64-bit integer (or money) operations.

 

Money vs. Decimal vs. Float Decision Flowchart

Below is a high-level decision flowchart to help you decide which data type you should use. Note that this is a generalization that may not be applicable to all situations. For a more in-depth understanding, you can always refer to Donald Knuth’s The Art of Computer Programming – Volume 1.

 

image

 

As well, remember that different data types have different client API mappings. Some more in-depth references to this include SQL Server Data Types and ADO.NET and A Money type for the CLR.

 

Money (Data Type) Internals

The reason for the performance improvement is because of SQL Server’s Tabular Data Stream (TDS) protocol, which has the key design principle to transfer data in compact binary form and as close as possible to the internal storage format of SQL Server. Empirically, this was observed during the SSIS 2008 - world record ETL performance test using Kernrate; the protocol dropped significantly when the data type was switched to money from decimal. This makes the transfer of data as efficient as possible. A complex data type needs additional parsing and CPU cycles to handle than a fixed-width type.

Let’s compare the different data types that are typically used with money (data types).

 

Breakdown

 

money

 

decimal

 

float

 

Simple/complex data type:

 

Simple data types align more directly to native processor types. Complex data types require CPU to review type metadata and to perform branching.

 

Simple

 

Complex

 

Simple

 

Fixed/variable length writers: Because a variable-length data type may incur a memcpy when moving, causing additional CPU overhead, use a fixed 8-byte or 4-byte integer assignment if possible.

 

Fixed

 

Variable

 

Fixed

 

Storage format: Incurs less overhead if the data type is composed of native literals (e.g., int, uint, long, ulong) instead of approximate data types (e.g., float).

 

8-byte integer

 

Scaled integer (one sign byte plus one to four ulong depending on precision)

 

8-byte integer

 

Comments: This row lists other issues of concern.

 

None

 

TDS wire format is always packed, so extra overhead is required to pack and unpack this data type.

 

Approximate data are types more expensive to compare/convert than native literals; there may be precision issues on conversion.

 

 

The key here is that the money data type is a simple fixed-length integer-based value type with a fixed decimal point. Composed of an 8-byte signed integer (note that small money is a single 4-byte integer) with the 4-byte CPU alignment, it is more efficient to process than its decimal and floating point counterparts. The other side of the coin is that floating points (but not decimal) can be more quickly calculated in the floating point unit of your CPU than money. However, bear in mind the precision issues of float as noted above.

 

Saving (Space for) Your Money!

In the context of SQL Server data compression, the money and small money data types tend to compress well when the absolute value is low (e.g., values between -0.0128 and 0.0127 compress to 1 byte, while values between -3.2768 and 3.2767 compress to 2 bytes). It is the absolute value that matters for compression, not the number of significant digits; both 1,000,000 and 1,234,567.8901 will take 5 bytes compressed. On the other hand, decimal will compress better when there are fewer significant digits. For example, both 1,000,000 and .0001 will compress to 2 bytes, but 1,234,567.8901 will take several more bytes because it has more significant digits.

 

$ummary

There will be many scenarios where you preferred option will still be to use data types such as decimal and float. But before skipping over this detail, take a look at your data and see if you can change your schema to the money data type. After all, a 13% improvement in Analysis Services processing speed and 20% improvement in SSIS processing isn’t chump change.

…and that’s our $0.02.

Comments

 

Henrik Staun Poulsen said:

One word of causion, please:

In this thread groups.google.com/.../8926f7fa31fa4832

Joe Celko shows an example where he gets a calculation error of 5 in 10,000

This only applies if you want to use your money field for multiplication or division.

October 2, 2008 6:55 AM
 

denny.lee said:

that's true, but you will get that problem if you do not choose the right data type, whether it be money or decimal.   When you start doing various calculations, you will invariably run into precision issues - please refer to the Precision Considerations for Analysis Services Users at: sqlcat.com/.../precision-considerations-for-analysis-services-users.aspx

October 2, 2008 9:54 PM
 

Microsoft SQL Server Development Customer Advisory Team said:

After the great work from the ETL World Record (for more information, refer to our other blog at: blogs.msdn.com/.../scaling-heavy-network-traffic-with-windows.aspx

October 9, 2008 1:11 PM
sql, server, best practices, whitepapers, analysis services, data mining, olap, datawarehouse, datawarehousing, availability, clustering, capacity, collation, data types, data warehouse, database, design, index, mirroring, optimization, partitions, performance, precision, processing, querying, scalability, security, reporting services, integration services
Copyright 2008 Microsoft Corporation. All Rights Reserved.