If a filegroup in SQL Server contains more than one file, SQL Server will “stripe” allocations across the files by using a proportional fill algorithm. If the files in the group have the same size (which we recommend), the allocation is essentially a round-robin. The “stripe size” of this round-robin is by default one extent - 64KB. Hence, the first allocated extent goes to the first file in a filegroup, the second extent to the second and so on. This striping mechanism can be quite useful, because you can spread your I/O over several LUNs by allocating a data file to each. You should strive to have all files in a filegroup be of equal size. Using Files and Filegroups on TechNet rovides more background on filegroup and allocation.
If a filegroup receives a lot of insert activity, the pressure on the PFS and GAM/SGAM page access becomes significant. At some point, this becomes a bottleneck, effectively slowing down the insert throughput. In a data warehouse load, the contention is typically on PFS pages.
The format of the resource_description column is: DBID:FILEID:PAGEID. You can use the resource_description to look up the page in sys.dm_os_buffer_descriptors to see if your wait is on a PFS page.
Increasing the number of files is useful if you use the files to “stripe” across LUNs or if you run into the PFS bottleneck as described above.
Testing with TPC-H LINEITEM data shows the following pattern when loading the database using minimally logged operations:
The above numbers are for a single bulk stream to the file.
SQL Server is quite good at bundling I/O operations together in large blocks – a process known as scatter/gather. In our test, we tried to increase the concurrency of the bulk load to utilize this functionality. With 64 bulk streams, we were able to drive block sizes up to 196 KB, even with 32 files in a single filegroup. But still, with 1 file in each of multiple filegroups, we were getting a faster, 256 KB block sized I/O.
Having many filegroups in a database adds an administrative burden: you now have to balance the space usage in the database between the filegroups. Therefore, you probably do not want to go overboard in optimizing your I/O block sizes by adding filegroups with few files and allocating your tables across them. In our TPC-H load test we only saw a 5% increase in disk throughput from optimizing block sizes and the benefit was only realized above several hundred MB / Sec insert speeds.
The SQL Server startup flag –E forces SQL Server to allocate 4 extents at a time to each file, essentially quadrupling the stripe size. In heavy insert scenarios, this drives larger block sizes to the disk. Also, your pages allocation will be more sequential with the same data file, allowing better sequential I/O for range and table scan operations (which are common in OLAP workloads).
To allocate the optimal number of files you must understand your database workload. The amount of insert activity is the determining factor. You also must balance the following factors:
In extreme cases, when inserting hundreds of MB / Sec, you can benefit from partitioning your table into filegroups, each with low number of files to bundle I/O requests together and create larger block sizes with sequential disk access. But, by doing this you assign performance priority over ease of administration.
This is very useful as a reference point to present when considering the use of multiple files in a filegroup. I note that although the -E is not support for x32 it is present - would it be useful for very heavy tempdb usage where I seem to be getting contention, even with multiple files?
Hi Colin
Thanks for your comments. I am not 100% sure what you are asking regarding tempdb but here is a stab at an answer anyway.
If you have PFS or SGAM/GAM contention in tempdb, -E will probably not help you. You should instead look at adding more files or even use the trace flag -T1118 described here: support.microsoft.com/.../328551
In questi giorni sono finalmente riuscito a sentire l'amico Tomas del SQL CAT (Customer Advisory
A question I get asked quite a lot is "how many files / filegroups should my database have?". As we all
A question I get asked quite a lot is "how many files / filegroups should my database have?"
Pingback from Allen Hermann » Blog Archive » How many files should a database have? - Part 1: OLAP workloads
Hi Thomas,
Great article!
Just a quick question about multiple raid-groups to complicate matters a bit.
In general more spindles means better performance. Many hardware vendors selling SAN disk sub systems have a limit to the optimal amounts of disks in a single raid-group. This means that in large databases you will create several raid-groups. Say we create 8 raid-groups. This means that we will present 8 LUNs to the operating system. To use all LUNs (all spindles) we will have to create at least 8 files in our filegroup (one per LUN). This means as I understand it that we in average will write with 64 I/O block sizes to the files (because we have 8 files in our filegroup).
I’m I correct in assuming this or does SQL Server take into account that we are writing to different LUNs and thus uses another algorithm?
Would the –E option make any sense? With the –E option we will write bigger blocks to our files but at the same time limit the amount of spindles over which we distribute the data.
Greetings,
Ernst.
Hi Ernst
Thanks for your feedback. The more spindles = more performance is not always fully true. Once a RAID array becomes very wide, performance on sequential I/O will not always scale linear.
Using multiple files, you can work around this limitation by driving sequential streams directly to a narrow groups. With the -E flag you can benefit even futher, because data is now grouped together.
You may find the I/O configurations in the fast track warehouse a useful read, futher detailing this subject:
msdn.microsoft.com/.../dd459146.aspx
По материалам технической статьи, посвящённой SQL Server: The Data Loading Performance Guide Авторы:
Pingback from SQL 2008 Table Partitioning Example | Drikus's Mind Musings
It is important to optimize TempDB for good performance. In particular, I am focusing on how to allocate
Great information Thomas. Any idea when you might be posting part 2 of this series?
Also, from your example it looks like you were seeing contention on the 358th PFS page (2895504/8088 = 358 exactly). I'm curious how things change when you have a large file with lots of PFS pages like this. If I have an ascending clustered index (like an identity column) vs. a randmon clustered index (like a GUID), it seems like the ID would have more PFS page contention than the GUID. Is that true?
Pingback from AppFabric CAT | BizTalk group with unused extra MessageBoxes (publishing disabled) causes increased CPU usage on the Master MessageBox and lower overall throughput
This question has already been asked - but are there plans to publish part 2 (for OLTP workloads)?