SQL Server tips for developers

These are some tips I have gathered over the years. There are many more to keep in mind when developing for SQL Server, but this is a start.

  • Always create a primary key (pk), preferably on an int/bigint column, not on a GUID column

Sometimes that is the wrong choice: if the table is insert heavy, like a log table, an ever-increasing key funnels every insert to the last page, which can create a hotspot as all sessions compete for the same page.
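
A minimal sketch of the default case, using a hypothetical dbo.Orders table:

CREATE TABLE dbo.Orders (
    OrderId bigint IDENTITY(1,1) NOT NULL CONSTRAINT PK_Orders PRIMARY KEY,
    CustomerId int NOT NULL,
    OrderDate datetime2 NOT NULL
);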

 

  • Always create an index on a foreign key (fk) column.

If we update or delete a row in the referenced table, SQL Server must check every referencing (fk) table for matching rows; without an index on the fk column, that check is a scan.
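
For example, if a hypothetical dbo.OrderLines table has an fk column OrderId referencing dbo.Orders:

CREATE INDEX IX_OrderLines_OrderId ON dbo.OrderLines (OrderId);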

 

  • Do not store the same data in multiple places

Use lookup tables and fks instead to point to where the data is stored; we do not want to have to select/update/delete the same data in multiple places.

NoSQL databases often disregard this, which is part of what makes them fast…ish in some circumstances.
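
A sketch of the idea, with hypothetical tables: the country name is stored once and every customer row just points to it.

CREATE TABLE dbo.Country (
    CountryId int IDENTITY(1,1) NOT NULL CONSTRAINT PK_Country PRIMARY KEY,
    Name nvarchar(100) NOT NULL CONSTRAINT UQ_Country_Name UNIQUE
);

CREATE TABLE dbo.Customer (
    CustomerId int IDENTITY(1,1) NOT NULL CONSTRAINT PK_Customer PRIMARY KEY,
    CountryId int NOT NULL CONSTRAINT FK_Customer_Country
        REFERENCES dbo.Country (CountryId)
);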

 

  • Try to predict which indexes will be needed.

Normally, if a column is used in a where clause it will need to be indexed. Creating an extra, possibly unused, non-clustered index is usually better than not creating it.

 

  • Missing indexes according to the optimizer?
SELECT
  mig.index_group_handle, mid.index_handle,
  CONVERT (decimal (28,1),
    migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans)
    ) AS improvement_measure,
  'CREATE INDEX missing_index_'
    + CONVERT (varchar, mig.index_group_handle) + '_'
    + CONVERT (varchar, mid.index_handle)
    + ' ON ' + mid.statement
    + ' (' + ISNULL (mid.equality_columns,'')
    + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END
    + ISNULL (mid.inequality_columns, '')
    + ')'
    + ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement,
  migs.*, mid.database_id, mid.[object_id]
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
WHERE CONVERT (decimal (28,1), migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans)) > 10
ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC

Do not create separate indexes for every column; try to use a covering index instead, as in the sketch below.
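
For example, assuming a query that filters on CustomerId and only returns OrderDate, one covering index serves it (names are hypothetical):

CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
    ON dbo.Orders (CustomerId) INCLUDE (OrderDate);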

 

  • Indexes that are not used for searching are normally not necessary.

Updating/inserting/deleting rows in an index costs CPU and I/O, so only keep the indexes that are needed.

This is easy to check and can be done when the application has been in production for a while.

SELECT d.name + '.' + s.name + '.' + o.name AS objectName
, i.name AS indexName
, MAX(ius.user_updates) AS userUpdates
, MAX(ius.user_seeks) AS userSeeks
, MAX(ius.user_scans) AS userScans
, MAX(ius.user_lookups) AS userLookups
, CASE
    WHEN SUM(a.used_pages) * 8 < 1024 THEN CONVERT(VARCHAR(32), SUM(a.used_pages) * 8) + ' kB'
    WHEN SUM(a.used_pages) * 8 / 1024 < 1024 THEN CONVERT(VARCHAR(32), SUM(a.used_pages) * 8 / 1024) + ' MB'
    ELSE CONVERT(VARCHAR(32), SUM(a.used_pages) * 8 / 1024 / 1024) + ' GB'
  END AS indexSize
FROM sys.indexes i
INNER JOIN sys.objects o ON i.[object_id] = o.[object_id]
INNER JOIN sys.schemas s ON o.[schema_id] = s.[schema_id]
INNER JOIN sys.dm_db_index_usage_stats ius ON ius.[object_id] = i.[object_id]
    AND i.index_id = ius.index_id
INNER JOIN sys.databases d ON d.database_id = ius.database_id
INNER JOIN sys.partitions AS p ON i.[object_id] = p.[object_id]
    AND i.index_id = p.index_id
INNER JOIN sys.allocation_units AS a ON p.[partition_id] = a.container_id
WHERE ius.user_updates > 0
AND ius.user_seeks + ius.user_scans + ius.user_lookups <= 1000
AND i.is_primary_key = 0
GROUP BY d.name, s.name, o.name, i.name
ORDER BY MAX(ius.user_updates) DESC;

 

  • Is the database made for OLTP or OLAP queries?

OLTP – short, fast transactions; normalized tables

OLAP – DW queries that can run longer; often denormalized

Try not to mix OLTP and OLAP in the same database; preferably keep them on different servers.

 

  • Use a standard naming convention that all team developers agree on, preferably a company-wide standard.

It is harder to work with a database where objects are named non-descriptively.

E.g. calling a table something like tbl_12345asdf gives no clue what it contains.

 

  • Only use transactions where necessary, and keep them as short as possible.

Long transactions cause blocking and use up space in the transaction log that cannot be reused until the transaction ends.

Check the transaction_isolation_level column in the SQL below while the system is under load.

SELECT DB_NAME(r.database_id) AS database_name, r.session_id, r.blocking_session_id, r.cpu_time,
  SUBSTRING(qt.text, (r.statement_start_offset/2) + 1,
    (CASE WHEN r.statement_end_offset = -1
          THEN LEN(CONVERT(nvarchar(max), qt.text)) * 2
          ELSE r.statement_end_offset
     END - r.statement_start_offset)/2) AS stmt_executing,
  s.login_name, r.percent_complete, r.start_time,
  CASE s.transaction_isolation_level
    WHEN 0 THEN 'Unspecified'
    WHEN 1 THEN 'ReadUncommitted'
    WHEN 2 THEN 'ReadCommitted'
    WHEN 3 THEN 'RepeatableRead'
    WHEN 4 THEN 'Serializable'
    WHEN 5 THEN 'Snapshot'
  END AS [Isolation level],
  s.host_name, s.program_name, s.host_process_id,
  r.status, r.wait_time, r.wait_type, r.wait_resource,
  r.total_elapsed_time, r.reads, r.writes, r.logical_reads, r.plan_handle,
  (SELECT query_plan FROM sys.dm_exec_query_plan(r.plan_handle)) AS xml_plan
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS qt
INNER JOIN sys.dm_exec_sessions s ON r.session_id = s.session_id
WHERE r.session_id > 50
AND r.session_id <> @@SPID
ORDER BY r.cpu_time DESC

 

 

  • Do not let a user start a transaction when e.g. opening a form in an application.

No one knows how long it will take until the user closes the form, and the transaction stays open the whole time.

Only start the transaction when the actual database work is about to be done, as in the sketch below.
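
A sketch of the idea (dbo.Orders and @OrderId are hypothetical): collect the user's input first, then keep the transaction around the writes only.

DECLARE @OrderId bigint = 42;  -- gathered from the form, outside any transaction

BEGIN TRAN;
    UPDATE dbo.Orders SET OrderDate = SYSDATETIME() WHERE OrderId = @OrderId;
COMMIT;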

 

  • Be careful with WITH(NOLOCK).

With NOLOCK we can read data that has not yet been committed by another transaction, so it is probably not for banking stuff…

Instead of using WITH(NOLOCK), try running with RCSI (Read Committed Snapshot Isolation) or Snapshot Isolation. It puts a load on tempdb, but it can in many cases really help you get rid of blocking without the nasty side effect of reading uncommitted data. See https://littlekendra.com/2016/02/18/how-to-choose-rcsi-snapshot-isolation-levels/ for a very good explanation of how this works.
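
Enabling it is a database setting (MyDb is a placeholder; ROLLBACK IMMEDIATE kicks out open transactions, so pick a quiet moment):

ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
-- or, for explicit snapshot transactions:
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;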

 

  • Try to have the database on the same compatibility level as the server instance.
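
For example (MyDb is a placeholder; 160 is the SQL Server 2022 level):

SELECT name, compatibility_level FROM sys.databases;
ALTER DATABASE MyDb SET COMPATIBILITY_LEVEL = 160;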

 

  • Use the same collation as the server instance; tempdb uses the instance collation, so a mismatch can cause collation-conflict errors when joining against temp tables.
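
A quick way to compare the two:

SELECT SERVERPROPERTY('Collation') AS instance_collation,
       DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS database_collation;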

 

  • Do not use the serializable isolation level if it can be avoided; read committed should be the default. Snapshot is a possibility.

 

  • Do not use spaces in object names (tables, stored procedures, views, databases, …)

 

  • Use bit columns for booleans, not integer or varchar. Consider making them NOT NULL.
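
For example (table and column names are hypothetical):

ALTER TABLE dbo.Orders
    ADD IsShipped bit NOT NULL CONSTRAINT DF_Orders_IsShipped DEFAULT (0);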

 

  • Make columns only as wide as they need to be; do not just use varchar(255) because it is easiest.

 

  • Avoid select * if you can; the table design may change, and columns may be added or removed.

 

  • Use constraints for data integrity: pk/fk, check, not null and so on.
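
A sketch with hypothetical names, letting the database reject bad data instead of trusting every caller:

CREATE TABLE dbo.OrderLines (
    OrderLineId int IDENTITY(1,1) NOT NULL CONSTRAINT PK_OrderLines PRIMARY KEY,
    OrderId bigint NOT NULL CONSTRAINT FK_OrderLines_Orders REFERENCES dbo.Orders (OrderId),
    Quantity int NOT NULL CONSTRAINT CK_OrderLines_Quantity CHECK (Quantity > 0)
);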

 

  • Try to limit the application's rights to read/write; dbo should be reserved for development and release.
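
For example, a read/write role for the application (names are placeholders, and app_user is assumed to exist):

CREATE ROLE app_readwrite;
GRANT SELECT, INSERT, UPDATE, DELETE ON SCHEMA::dbo TO app_readwrite;
ALTER ROLE app_readwrite ADD MEMBER app_user;  -- the application's database user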

 

  • If a table grows fast, create an archival plan. That could be deleting old data, partitioning with a sliding window, or something else that suits you.
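
A minimal sketch of the delete option (dbo.LogTable and the 90-day cutoff are assumptions); small batches keep the transaction log and blocking under control.

WHILE 1 = 1
BEGIN
    DELETE TOP (5000) FROM dbo.LogTable
    WHERE LoggedAt < DATEADD(day, -90, SYSDATETIME());
    IF @@ROWCOUNT = 0 BREAK;
END;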

 

  • User-defined functions can be slow. Table variables lack statistics, so the optimizer assumes they hold only one row; use a temp table instead.
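
A sketch of the difference (table and column names are hypothetical):

-- Table variable: no statistics, the optimizer assumes 1 row
DECLARE @recent TABLE (OrderId bigint PRIMARY KEY);

-- Temp table: gets statistics, so the optimizer can estimate row counts
CREATE TABLE #recent (OrderId bigint PRIMARY KEY);
INSERT INTO #recent (OrderId)
SELECT OrderId FROM dbo.Orders
WHERE OrderDate >= DATEADD(day, -7, SYSDATETIME());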
