Database Snapshots (http://msdn2.microsoft.com/en-us/library/ms175158.aspx) in SQL Server 2005 Enterprise Edition allow a read-only point-in-time view of data. Recently we encountered an interesting scenario with a customer implementation. The customer creates a database snapshot. Then a query is executed on the database snapshot. The first time a set of queries are executed, they take 17 minutes. Every subsequent execution of the same queries takes only 20 seconds. Obviously, the first execution required reading the data from the disk (physical reads) and takes a longer time, whereas, the second and subsequent execution of the query finds the data in the buffer pool (logical reads), and therefore takes less amount of time. There is nothing unusual here.
What is interesting is that right before querying the database snapshot, the source database was queried for exactly the same data, so essentially, the source database pages required to satisfy the query are already in the buffer pool, by the time we execute the first query on the database snapshot. So, the customer expected that, since the query against the database snapshot actually reads data from the source database (in this case, since the data has not changed after the database snapshot was created), it should have used the pages in the buffer pool, instead of doing the physical reads. But it didn’t.
The reason lies in the way the queries work on a database snapshot. In this blog we will discuss some of the internals that will throw some light on this. This blog assumes an understanding of database snapshots.
We will analyze the behavior by creating a database snapshot on the AdventureWorks database:
So, for the example discussed here, AdventureWorks is the “source database” and AdventureWorksSnapshot is the “database snapshot”. Note that the database snapshot has its own data file (AdventureWorks_DataSnapshot.mdf), which is a Windows sparse file (refer to Books Online for more details).
For our test, no changes have been done to the source database after the database snapshot was created. Therefore, the sparse file of the database snapshot is empty. At this time, if you run a query on the database snapshot, SQL Server will read the data from the data file of the source database.
Now, let’s execute a query on the source database. But before we do that we need to clear the buffer cache by issuing dbcc dropcleanbuffers.
Let’s also turn on the io and time statistics.
Now, let’s execute the following query one time:
The io and time statistics gives us the following output:
The above results indicate that physical reads were performed on the database files to retrieve the result of the query. Now let’s execute the same query one more time:
The second execution of the query didn’t need any physical reads because after the first execution, the required pages are already cached in the buffer pool.
Now, let’s execute the same query on the database snapshot AdventureWorksSnapshot, and observe the io and time statistics.
Note that the first query on the database snapshot required physical reads. Since no data in the source database was changed after the database snapshot was created, the sparse file is empty, and therefore, the data is retrieved from the source database. And, since, the required pages of the source database are already cached in the buffer pool, the customer expected that there is no need to perform any physical reads. But there were physical reads, why?
A peek into the internals of how database snapshots work will reveal the mystery.
When a database snapshot is created, an in-memory bitmap is maintained to indicate whether a page has been changed in the source database or not. If the page is changed, then the original page has been copied to the database snapshot’s sparse file. If the page is not changed, the original page is still there in the source database. While executing a query, this bitmap is referenced to find out whether a particular page is to be retrieved from the database snapshot’s sparse file or from the source database’s data file.
The database snapshot is just another database from the SQL Server engine point of view. Each database maintains its own cache in the buffer pool. Data pages cached in the buffer pool for one database cannot be copied or transferred to the cache managed by another database (or database snapshot) in the same SQL Server instance. Therefore, when we execute a query the first time on the database snapshot, the database snapshot has no pages in the buffer pool, and therefore decides to read from disk. Even though the source database has already read these pages into the buffer pool it still performs physical reads against the data files of the source database, and places the pages in the buffer pool associated with the database snapshot.
The number of data pages for each database read into the buffer pool is found by querying the sys.dm_os_buffer_descriptors.
The output we get:
Once the pages are read from disk, they are put in the buffer pool for the database snapshot, and the subsequent queries that need these pages will be able to use the cache. Let’s execute the same query again on the database snapshot to verify this.
The io and time statistics gives us the following output:
As you can see, the second execution of the query didn’t do any physical reads.
The queries against the source and snapshot database demonstrate that the data pages are maintained in the buffer pool independently of each database. Fundamentally, reading any database’s data pages that do not exist in the buffer pool will resolve to disk. Databases do not share data pages in the buffer pool. In the case of a snapshot database where the queried data pages are not in the buffer pool, the in-memory bitmap is checked to determine if the data pages are in the sparse files for the snapshot database or in the source databases files. The source database’s data pages in the buffer pool will never be read or copied by a snapshot database.
OVERVIEW: One of the common complaints in using database snapshots is how to get queries and reports
Pingback from Make Extra Money » Blog Archive » Microsoft SQL Server Database Snapshots and Synonyms
Pingback from spiritsex » Blog Archive » Microsoft SQL Server Database Snapshots and Synonyms
Overview I recently conducted some tests to double check the exact behavior of database snapshots when: