Ads

31 January 2011

SQL SERVER NUMA

NUMA :- Non Uniform Memory Access
============================

Each group of processors has its own memory and possibly its own I/O channels. However, each CPU can access memory associated with the other groups in a coherent way. Each group is called a NUMA node.

The number of CPUs within a NUMA node depends on the hardware vendor. It is faster to access local memory than the memory associated with other NUMA nodes. This is the reason for the name, non-uniform memory access architecture.

On NUMA hardware, some regions of memory are on physically different buses from other regions. Because NUMA uses local and foreign memory, it will take longer to access some regions of memory than others. Local memory and foreign memory are typically used in reference to a currently running thread. Local memory is the memory that is on the same node as the CPU currently running the thread. Any memory that does not belong to the node on which the thread is currently running is foreign. Foreign memory is also known as remote memory. The ratio of the cost to access foreign memory over that for local memory is called the NUMA ratio. If the NUMA ratio is 1, it is symmetric multiprocessing (SMP). The greater the ratio, the more it costs to access the memory of other nodes. Windows applications that are not NUMA aware (including SQL Server 2000 SP3 and earlier) sometimes perform poorly on NUMA hardware.

The main benefit of NUMA is scalability. The NUMA architecture was designed to surpass the scalability limits of the SMP architecture. With SMP, all memory access is posted to the same shared memory bus. This works fine for a relatively small number of CPUs, but not when you have dozens, even hundreds, of CPUs competing for access to the shared memory bus. NUMA alleviates these bottlenecks by limiting the number of CPUs on any one memory bus and connecting the various nodes by means of a high speed interconnection.

NUMA node with 4 processors
Hardware-NUMA vs. Soft-NUMA

NUMA can match memory with CPUs through specialized hardware (hardware NUMA) or by configuring SQL Server memory (soft-NUMA). During startup, SQL Server configures itself based on underlying operating system and hardware configuration or the soft-NUMA setting. For both hardware and soft-NUMA, when SQL Server starts in a NUMA configuration, the SQL Server log records a multimode configuration message for each node, along with the CPU mask.

Hardware NUMA

Computers with hardware NUMA have more than one system bus, each serving a small set of processors. Each group of processors has its own memory and possibly its own I/O channels, but each CPU can access memory associated with other groups in a coherent way. Each group is called a NUMA node. The number of CPUs within a NUMA node depends on the hardware vendor. Your hardware manufacturer can tell you if your computer supports hardware NUMA.

If you have hardware NUMA, it may be configured to use interleaved memory instead of NUMA. In that case, Windows and therefore SQL Server will not recognize it as NUMA. Run the following query to find the number of memory nodes available to SQL Server:

SELECT DISTINCT memory_node_id
FROM sys.dm_os_memory_clerks

If SQL Server returns only a single memory node (node 0), either you do not have hardware NUMA, or the hardware is configured as interleaved (non-NUMA). If you think your hardware NUMA is configured incorrectly, contact your hardware vendor to enable NUMA. SQL Server ignores NUMA configuration when hardware NUMA has four or less CPUs and at least one node has only one CPU.

Soft-NUMA

SQL Server allows you to group CPUs into nodes referred to as soft-NUMA. You usually configure soft-NUMA when you have many CPUs and do not have hardware NUMA, but you can also use soft-NUMA to subdivide hardware NUMA nodes into smaller groups. Only the SQL Server scheduler and SQL Server Network Interface (SNI) are soft-NUMA aware. Memory nodes are created based on hardware NUMA and therefore not impacted by soft-NUMA. So, for example, if you have an SMP computer with eight CPUs and you create four soft-NUMA nodes with two CPUs each, you will only have one memory node serving all four NUMA nodes. Soft-NUMA does not provide memory to CPU affinity.

The benefits of soft-NUMA include reducing I/O and lazy writer bottlenecks on computers with many CPUs and no hardware NUMA. There is a single I/O thread and a single lazy writer thread for each NUMA node. Depending on the usage of the database, these single threads may be a significant performance bottleneck. Configuring four soft-NUMA nodes provides four I/O threads and four lazy writer threads, which could increase performance.

You cannot create a soft-NUMA that includes CPUs from different hardware NUMA nodes. For example, if your hardware has eight CPUs (0..7) and you have two hardware NUMA nodes (0-3 and 4-7), you can create soft-NUMA by combining CPU(0,1) and CPU(2,3). You cannot create soft-NUMA using CPU (1, 5), but you can use CPU affinity to affinitize an instance of SQL Server to CPUs from different NUMA nodes. So in the previous example, if SQL Server uses CPUs 0-3, you will have one I/O thread and one lazy writer thread. If, in the previous example SQL Server uses CPUs 1, 2, 5, and 6, you will access two NUMA nodes and have two I/O threads and two lazy writer threads.

HOW TO SET NUMA ON SQL SERVER

Consider the following example. A computer with eight CPUs does not have hardware NUMA. Three soft-NUMA nodes are configured. Database Engine instance A is configured to use CPUs 1 through 4. A second instance of the Database Engine is installed and configured to use CPUs 5 through 8. The example can be visually represented as:

CPUs 1 2 3 4 5 6 7 8

Soft-NUMA <-N0--><-N1-><----N2---->

SQL Server

Instance A, which experiences significant I/O, now has two I/O threads and two lazy writer threads, while instance B, which performs processor-intensive operations, has only one I/O thread and one lazy writer thread. Differing amounts of memory can be assigned to the instances, but unlike hardware NUMA, they both receive memory from the same operating system memory block and there is no memory-to-processor affinity.

NoteNote

The Soft-NUMA registry keys are not copied when you upgrade an instance of SQL Server.

Set the CPU affinity mask

  1. Run the following statement on instance A to configure it to use CPUs 1, 2, 3, and 4 by setting the CPU affinity mask:

    ALTER SERVER CONFIGURATION SET PROCESS AFFINITY CPU=1 TO 4
  2. Run the following statement on instance B to configure it to use CPUs 5, 6, 7, and 8 by setting the CPU affinity mask:

    ALTER SERVER CONFIGURATION SET PROCESS AFFINITY CPU=5 TO 8

Map soft-NUMA nodes to CPUs

  • Using the Registry Editor program (regedit.exe), add the following registry keys to map soft-NUMA node 0 to CPUs 1 and 2, soft-NUMA node 1 to CPUs 3 and 4, and soft-NUMA node 2 to CPUs 5, 6, 7, and 8.

    SQL Server 2005

    Type

    Value name

    Value data

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node0

    DWORD

    CPUMask

    0x03

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node1

    DWORD

    CPUMask

    0x0c

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node2

    DWORD

    CPUMask

    0xf0

    SQL Server 2008

    Type

    Value name

    Value data

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node0

    DWORD

    CPUMask

    0x03

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node1

    DWORD

    CPUMask

    0x0c

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node2

    DWORD

    CPUMask

    0xf0

    SQL Server 2008 R2

    Type

    Value name

    Value data

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node0

    DWORD

    CPUMask

    0x03

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node0

    DWORD

    Group

    0

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node1

    DWORD

    CPUMask

    0x0c

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node1

    DWORD

    Group

    0

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node2

    DWORD

    CPUMask

    0xf0

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\NodeConfiguration\Node2


27 January 2011

Types of Restore - 2008R2

Thanks to :- http://msdn.microsoft.com/en-us/library/ms191253.aspx

Restore and Recovery Overview (SQL Server)

SQL Server 2008 R2

SQL Server supports restoring data on the following levels:

  • The database (a complete database restore)

    The whole database is restored and recovered, and the database is offline for the duration of the restore and recovery operations.

  • The data file (a file restore)

    A data file or a set of files is restored and recovered. During a file restore, the filegroups that contain the files are automatically offline for the duration of the restore. Any attempt to access an offline filegroup causes an error.

    NoteNote

    You cannot back up or restore individual tables.

  • The data page (a page restore)

    Under the full recovery model or bulk-logged recovery model, you can restore individual databases. Page restores can be performed on any database, regardless of the number of filegroups.

NoteNote

In SQL Server 2008, you can restore a database backup that was created by using SQL Server 2000, SQL Server 2005, or SQL Server 2008. However, backups of master, model and msdb that were created by using SQL Server 2000 or SQL Server 2005 cannot be restored by SQL Server 2008. Also, SQL Server 2008 backups cannot be restored by any earlier version of SQL Server.

SQL Server backup and restore work across all supported operating systems, whether they are 64-bit or 32-bit systems. For information about the supported operating systems, see Hardware and Software Requirements for Installing SQL Server 2008 R2.

Restore Scenarios

A restore scenario in SQL Server is the process of restoring data from one or more backups and then recovering the database. The supported restore scenarios depend on the recovery model of the database and the edition of SQL Server.

The following table introduces the possible restore scenarios that are supported for different recovery models.

Restore scenario

Under simple recovery model

Under full/bulk-logged recovery models

Complete database restore

This is the basic restore strategy. A complete database restore might involve simply restoring and recovering a full database backup. Alternatively, a complete database restore might involve restoring a full database backup followed by restoring and recovering a differential backup.

For more information, see Performing a Complete Database Restore (Simple Recovery Model).

This is the basic restore strategy. A complete database restore involve restoring a full database backup and, optionally, a differential backup (if any), followed by restoring all subsequent log backups (in sequence). The complete database restore is finished by recovering the last log backup and also restoring it (RESTORE WITH RECOVERY).

For more information, see Performing a Complete Database Restore (Full Recovery Model)

File restore *

Restore one or more damaged read-only files, without restoring the entire database. File restore is available only if the database has at least one read-only filegroup.

Restores one or more files, without restoring the entire database. File restore can be performed while the database is offline or, for some editions of SQL Server 2005 and later versions, while the database remains online. During a file restore, the filegroups that contain the files that are being restored are always offline.

Page restore

Not applicable

Restores one or more damaged pages. Page restore can be performed while the database is offline or, for some editions of SQL Server 2005 and later versions, while the database remains online. During a page restore, the pages that are being restored are always offline.

An unbroken chain of log backups must be available, up to the current log file, and they must all be applied to bring the page up to date with the current log file.

For more information, see Performing Page Restores.

Piecemeal restore *

Restore and recover the database in stages at the filegroup level, starting with the primary and all read/write, secondary filegroups.

Restore and recover the database in stages at the filegroup level, starting with the primary filegroup.

* Online restore is supported only in SQL Server 2005 Enterprise Edition and later versions.

Regardless of how data is restored, before a database can be recovered, the SQL Server Database Engine guarantees that the whole database is logically consistent. For example, if you restore a file, you cannot recover it and bring it online until it has been rolled far enough forward to be consistent with the database.

Advantages of a File or Page Restore

Restoring and recovering files or pages, instead of the whole database, provides the following advantages:

  • Restoring less data reduces the time required to copy and recover it.

  • On SQL Server 2005 Enterprise Edition and later versions, restoring files or pages might allow other data in the database to remain online during the restore operation.

Recovery Models and Supported Restore Operations

The restore operations that are available for a database depend on its recovery model. The following table summarizes whether and to what extent each of the recovery models supports a given restore scenario.

Restore operation

Full recovery model

Bulk-logged recovery model

Simple recovery model

Data recovery

Complete recovery (if the log is available).

Some data-loss exposure.

Any data since last full or differential backup is lost.

Point-in-time restore

Any time covered by the log backups.

Disallowed if the log backup contains any bulk-logged changes.

Not supported.

File restore *

Full support.

Sometimes.**

Available only for read-only secondary files.

Page restore *

Full support.

Sometimes.**

None.

Piecemeal (filegroup-level) restore *

Full support.

Sometimes.**

Available only for read-only secondary files.

* Available only in the SQL Server 2005 Enterprise Edition and later versions.

** For the required conditions, see Restore Restrictions Under the Simple Recovery Model.

25 January 2011

Reporting Services Catalog Error.

Thanks to :- http://www.sqlcoffee.com/Troubleshooting028.htm


Reporting Services Catalog Error.


Applies to: Microsoft SQL Server 2008 Release Candidate 0 (RC0).

Error Message.

SQL Server 2008 RC 0 Setup shows the following error messages after running a set of rules to determine if the installation process will be blocked:

bulletRule "Reporting Services Catalog Database File Existence" failed
bulletRule "Reporting Services Catalog Temporary Database File Existence" failed.





Cause.

Setup program found the Reporting Services Catalog database file and Reporting Services Catalog temporary database file in the following path:

C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA



Solution.

Remove the following files from the "C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA" directory:

bulletReportServer.mdf
bulletReportServer_log.LDF
bulletReportServerTempDB.mdf
bulletReportServerTempDB_log.LDF.






After deleting the files, click on the "Re-run" button of the Installation Rules setup page, and click "Next" after passing all the installation rules.


Log Shipping Jobs

Log Shipping Jobs

Log shipping involves four jobs, which are handled by dedicated SQL Server Agent jobs. These jobs include the backup job, copy job, restore job, and alert job.

The user controls how frequently log backups are taken, how frequently they are copied to each secondary server, and how frequently they are applied to the secondary database. To reduce the work required to bring a secondary server online, for example after the production system fails, you can copy and restore each transaction log backup soon after it is created. Alternatively, perhaps on a second secondary server, you can delay applying transaction log backups to the secondary database. This delay provides an interval during which you can notice and respond to a failure on the primary, such as accidental deletion of critical data.

Backup Job

A backup job is created on the primary server instance for each primary database. It performs the backup operation, logs history to the local server and the monitor server, and deletes old backup files and history information. By default, this job will run every 15 minutes, but the interval is customizable.

When log shipping is enabled, the SQL Server Agent job category "Log Shipping Backup" is created on the primary server instance.

SQL Server 2008 Enterprise and later versions support backup compression. When creating a log shipping configuration, you can control the backup compression behavior of log backups. For more information, see Backup Compression (SQL Server).

Copy Job

A copy job is created on each secondary server instance in a log shipping configuration. This job copies the backup files from the primary server to a configurable destination on the secondary server and logs history on the secondary server and the monitor server. The copy job schedule, which is customizable, should approximate the backup schedule.

When log shipping is enabled, the SQL Server Agent job category "Log Shipping Copy" is created on the secondary server instance.

Restore Job

A restore job is created on the secondary server instance for each log shipping configuration. This job restores the copied backup files to the secondary databases. It logs history on the local server and the monitor server, and deletes old files and old history information. The SQL Server job category "Log Shipping Restore" is created on the secondary server instance when log shipping is enabled.

On a given secondary server instance, the restore job can be scheduled as frequently as the copy job, or the restore job can delayed. Scheduling these jobs with the same frequency keeps the secondary database as closely aligned with the primary database as possible to create a warm standby database.

In contrast, delaying restore jobs, perhaps by several hours, can be useful in the event of a serious user error, such as a dropped table or inappropriately deleted table row. If the time of the error is known, you can move that secondary database forward to a time soon before the error. Then you can export the lost data and import it back into the primary database.

Alert Job

If a monitor server is used, an alert job is created on the monitor server instance. This alert job is shared by the primary and secondary databases of all log shipping configurations using this monitor server instance. Any change to the alert job (such as rescheduling, disabling, or enabling the job) affects all databases using that monitor server. This job raises alerts (for which you must specify alert numbers) for primary and secondary databases when backup and restore operations have not completed successfully within specified thresholds. You must configure these alerts to have an operator receive notification of the log shipping failure. The SQL Server Agent job category "Log Shipping Alert" is created on the monitor server instance when log shipping is enabled.

If a monitor server is not used, alert jobs are created locally on the primary server instance and each secondary server instance. The alert job on the primary server instance raises errors when backup operations have not completed successfully within a specified threshold. The alert job on the secondary server instance raises errors when local copy and restore operations have not completed successfully within a specified threshold.

What is a deadlock and what is a live lock?

What is a deadlock and what is a live lock? How will you go about resolving deadlocks?

Deadlock occurs when two user processes/transactions have locks on 2 separate objects and each process is trying to acquire a lock on the object that has been acquired by the other process. In such a scenario each process is waiting for the other process to release the lock to acquire a lock on the object itself. When a request for exclusive lock is denied again and again because a series of overlapping shared locks are interfering with each other and to adapt from each other they keep on changing the status, it is known as live lock.

One can resolve deadlocks by using TRY CATCH blocks. If the code inside a TRY statement fails, the CATCH automatically catches the control of the flow letting the transaction rollback and resume execution.

VLF IN LDF

SQL SERVER – Detect Virtual Log Files (VLF) in LDF

In one of the recent training engagements, I was asked if it true that there are multiple small log files in the large log file (LDF). I found this question very interesting as the answer is yes. Multiple small Virtual Log Files commonly known as VLFs together make an LDF file. The writing of the VLF is sequential and resulting in the writing of the LDF file is sequential as well. This leads to another talk that one does not need more than one log file in most cases.

However, in short, you can use following DBCC command to know how many Virtual Log Files or VLFs are present in your log file.

DBCC LOGINFO

You can find the result of above query to something as displayed in following image.

You can see the column which is marked as 2 which means it is active VLF and the one with 0 which is inactive VLF.

24 January 2011

DONT KILL SPID

Killing a SQL Server thread? Don’t!

Sometimes, when you kill a session (i.e. a spid) in a SQL Server instance, the spid just refuses to go away not because it’s doing a rollback. Perhaps, it’s stuck on a certain dependency on something external to SQL Server or it’s just simply stuck for some decipherable reasons. And the spid may hang around for as long as the instance is online and will only go away when the instance is restarted.

In more than a few occasions, I have heard people wondering if they can just kill the thread of the SQL Server process from the OS, that is, the OS thread corresponding to the SQL Server spid.

That is a very bad idea! Killing such an OS thread will likely crash the SQL Server instance.

You can easily demonstrate the harm of doing this with Process Explorer. You’ll need to run Process Explorer locally on the server of your test SQL Server instance, and needless to say that this should not be done on a production instance.

First, connect to the test SQL Server instance, and open a New Query window in SQL Server Management Studio. Note the spid for the connection, and run the following query (replacing with the spid for the connection) to find the kpid value in the resultset for the spid:

select * from master..sysprocesses where spid =

This kpid is the id the OS assigns to the thread that is currently bound to the worker executing a task or idling on the spid.

Then, fire up Process Explorer and locate the OS process for the SQL Server instance. If you have multiple SQL Server instances running on the server, you can mouse over each sqlservr.exe to identify the right instance.

Double click on the process, and select the Thread tab. This will give you a list of all the OS threads for the SQL Server instances. Highlight the Thread ID corresponding to the kpid found above, and click on the Kill button to terminate the thread.

After you have killed the thread from Process Explorer, you may still be able to use the SQL Server instance for a while. You’ll for sure see stack dump error messages in the errorlog, and you may see error messages such as the following:

2010-02-04 14:45:31.03 Server Potential image corruption/hotpatch detected. This may be a sign of a hardware problem or caused by presence of CLR/jitted images on the stack. Check SQLDUMPER_ERRORLOG.log for details.

The session related to the kpid value would be dead. But the SQL Server instance itself may or may not continue to function. If you repeat the above steps to kill the threads for the other user sessions, eventually this will crash the SQL Server instance, i.e. the SQL Server instance will become unresponsive. After that, the only way to get the instance back online is to restart it. Most likely, you'll have to first kill the corresponding process from the OS because the instance probably won't respond to any of the normal restart methods.