Sam's SQL Server blog: July 2008

Monday, 28 July 2008

Script to RESEED Database in SQL Server 2005

It is often required to RESEED the database tables with new range of identifiers especially when the table’s Primary Key column is designed with data types such as BIGINT/INT.

The following script allows us to RESEED the database tables with the new SEED value for all the user tables with identity column.

-- Declaration statement to capture the new SEED value
DECLARE @DesiredSeed VARCHAR(20)
SET @DesiredSeed = '5000000000'

DECLARE @TableName VARCHAR(256), @SQLStatement VARCHAR(1000)

-- Cursor to get the list of all user tables which have identity columns
DECLARE curIdentityTables CURSOR
FOR
SELECT b.TABLE_SCHEMA +'.'+ OBJECT_NAME (a.[id])
FROM sysobjects a INNER JOIN INFORMATION_SCHEMA.COLUMNS b
ON OBJECT_NAME(a.id) = b.TABLE_NAME and b.ORDINAL_POSITION = 1
WHERE OBJECTPROPERTY (a.[id], 'IsUserTable') = 1
and OBJECT_NAME (a.[id]) <> 'dtproperties'
and OBJECTPROPERTY (a.[id], 'TableHasIdentity') = 1
ORDER BY a.[name]

--Open the cursor to loop through the table list and reseed it
OPEN curIdentityTables
FETCH NEXT FROM curIdentityTables INTO @TableName

WHILE @@FETCH_STATUS = 0
BEGIN

SET @SQLSTatement = 'DBCC CHECKIDENT (''' + @TableName + ''', RESEED, ' + @DesiredSeed + ')'
EXEC (@SQLSTatement)
FETCH NEXT FROM curIdentityTables INTO @TableName
END

CLOSE curIdentityTables
DEALLOCATE curIdentityTables

Saturday, 19 July 2008

Index maintenance using DMV’s in SQL Server 2005

SQL Server 2005 provides more flexible ways handling index maintenance activities using DMV (Dynamic Management View’s). The DMV’s can be effectively used to identify the index status such as

Indexes that requires maintenance activities such as reindexing / reorganizing
List of indexes that are going for index scan
List of not used indexes

Tables with indexes that require maintenance

SELECT OBJECT_NAME(OBJECT_ID) TableName,
( SELECT NAME FROM SYS.INDEXES
WHERE OBJECT_ID = A.OBJECT_ID
AND INDEX_ID = A.INDEX_ID) IndexName
FROM SYS.DM_DB_INDEX_USAGE_STATS A
WHERE USER_SEEKS >0 AND USER_SCANS >0
AND OBJECT_ID > 97
AND DATABASE_ID = 5
ORDER BY USER_UPDATES, USER_SEEKS DESC

Tables with Indexes going for Scan

SELECT OBJECT_NAME(OBJECT_ID) TableName ,
( SELECT NAME FROM SYS.INDEXES
WHERE OBJECT_ID = A.OBJECT_ID
AND INDEX_ID = A.INDEX_ID) IndexName
FROM SYS.DM_DB_INDEX_USAGE_STATS A
WHERE INDEX_ID <> 0
AND OBJECT_ID > 97
AND DATABASE_ID = 5
AND USER_SCANS > 0 ORDER BY USER_SCANS DESC

Tables with not used indexes

SELECT OBJECT_NAME(OBJECT_ID) TableName,
(SELECT NAME FROM SYS.INDEXES
WHERE OBJECT_ID = A.OBJECT_ID
AND INDEX_ID = A.INDEX_ID) IndexName
FROM SYS.DM_DB_INDEX_USAGE_STATS A
WHERE USER_SEEKS = 0
AND USER_SCANS =0
AND USER_LOOKUPS =0
AND USER_UPDATES = 0
AND OBJECT_ID > 97
AND INDEX_ID <> 0
AND DATABASE_ID = 5

Saturday, 12 July 2008

Parallel Data Loading using SQL Server 2005 partition techniques (BCP)

It is common in Enterprise scale projects to simulate the test environment with very large-scale databases. The preparation of real time data and loading consumes considerable amount of time during the environment setup. The SQL Server 2005 partition techniques allow parallel data loading into the tables using BCP / Bulk Insert statements. The data loading of 10 Million records into “Sales” tables using the partition and non-partition methods is explained below.

Data loading with out Partition

CREATE TABLE Sales
(
UniqueID BIGINT,
ItemName VARCHAR(100),
SaledAmount NUMERIC(10,3),
SalesDate DATETIME
)on [Primary]

The data loading for this table can be done using the BCP / Bulk Insert options. But we cannot load the data in parallel, as it will lead to table locking

Data loading with Partition

CREATE TABLE Sales(
ItemName VARCHAR(100),
SaledAmount NUMERIC(10,3),
SalesDate DATETIME,
PartitionID INT
)on fPartitionID(PartitionID)

The table “Sales” is partitioned using the column PartitionID into 10 different partitions (say PartitionID accepts value between 1.10)
As the table is split into 10 different partitions, data loading of this table can be done in parallel follows

Generate the data into to 10 X 1 Million files
Load data into the table using 10 BCP / Bulk Insert instances in parallel

Conclusion

The performance of the data loading can be improved up to 10 times if the table is partitioned. This option can be preferred only if there are sufficient hardware resources. As CPU usage of BCP is very high, the number of parallel instances can be reduced as per the resource availability