What does a DBA really do?

Good to have when the boss asks 🙂


  • Installations (Configuration/Disk/File placements/MAXDOP/Memory/Collation/Version/Patch level/Cost threshold for parallelism…)


  • Upgrades (See above…)


  • Configurations (Database options and instance configuration…)


  • Sizing (SAN/Network/Memory/CPU/Have to be a hardware wiz too)


  • Performance tuning (Indexes/IO/CPU/Wait stats/SQL Plans/TSQL/EF/Plan cache/Buffer cache/…)


  • Backup (Time taken/IO Performance/Network Performance/Impact on Prod/Secondary backups/Configuration/Test/Disk/Networker/Scripts/Alarms/)


  • Restore (See backups, plus testing/Solve replication marker in log/logins/users/security/permissions/snapshots/Native/striped/Networker……)


  • Consistency (DBCC CheckDB/How to solve inconsistent database, a DBA’s biggest fear)


  • Clients (Advice/Run scripts/Configure instance-clients/Solve problems/Explain-Design-Architect solutions….answer all and any questions…)


  • Knowledge of (SAN/Disk/Network/CPU/Memory/Clients/Protocols/Problem solving/….)


  • Design databases (tables/hardware/firewall/indexes/triggers/views/stored procedures/functions/roles/……)


  • Security procedures and settings (Snapshot/sysadmin/dbo/read/write/….)


  • HA/DR solutions and how they work and their differences, see list below


  • AlwaysOn (2 or multiple nodes/differences Standard-Enterprise/Secondary reads/Secondary backups/Failover/Disaster recovery/Problem solving/Readonly routing/Not synchronizing/Copy-only…)


  • Replication (Merge/Transactional/Snapshot/When to use which and how to configure/Implications of moving a replicated database/Monitoring/Detecting problems)


  • Mirroring (Configure/Setup/Failover/Automated Failover Multiple DB’s/Monitor….)


  • Log shipping (Configuration/When to use/Fix issues….)


  • Stored procedures (Coding/Speed up/Query plans/"Parameter sniffing"/…)


  • Functions (When and when not to use/Performance/Coding/Fix/)


  • Sessions (Blocking processes/Deadlocks/How to solve them/How to monitor for them…)


  • Performance monitor (Perfmon/Spotlight/SQL Monitor/Scripts)


  • Compat mode (How it works/why/optimizer/TSQL compat)


  • Recovery model (Full/Bulk-logged/Simple and their respective implications)


  • Collations (tempdb/instance/how to fix (COLLATE…))


  • Tempdb (Number of files/trace flags/file placements/latch contention/full disk/…..)


  • Transactions (Explain to developers/track down long running/Fix long running)


  • TSQL (Understand developers code/Fix developers code/Make everything faster/…)


  • Isolations levels (read committed/read uncommitted/repeatable read/snapshot/serializable/…)


  • Differences between Standard and Enterprise edition (2005/2008/2008R2/2012/2014/2016/E.g. you cannot rebuild a clustered index online on Standard edition…)


  • Capacity planning (CPU/Disk/Network/Memory/PLE/Buffer cache/CLR/….)


  • Maintenance jobs (Backup/index update stats/index rebuild/index reorg/checkdb…..)
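As a minimal sketch of what the maintenance-jobs bullet might look like in practice (the database, table, and backup path below are placeholders, not a recommendation):

```sql
-- Hypothetical nightly maintenance sketch; object names and paths are made up
ALTER INDEX ALL ON dbo.Orders REORGANIZE;            -- light index defragmentation
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;          -- refresh optimizer statistics
DBCC CHECKDB ('MyDatabase') WITH NO_INFOMSGS;        -- consistency check
BACKUP DATABASE MyDatabase
    TO DISK = 'X:\Backup\MyDatabase.bak' WITH COMPRESSION;
```

In a real environment these steps would typically be scheduled as SQL Agent jobs with thresholds deciding between REORGANIZE and REBUILD.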

TempDB file sizes

So, how large will the tempdb files be after a reboot?

Normally, when you look at the file sizes in SSMS, it shows you the size the files are right now. After a reboot, however, the tempdb files reset to the configured sizes stored in the master database. Use the SQL below to find out.

TempDB file size after reboot

SELECT name, file_id, type_desc, size * 8 / 1024 AS [TempdbSizeInMB]
FROM sys.master_files
WHERE DB_NAME(database_id) = 'tempdb'
ORDER BY type_desc DESC, file_id

TempDB current size

SELECT name, file_id, type_desc, size * 8 / 1024 AS [TempdbSizeInMB]
FROM tempdb.sys.database_files
ORDER BY type_desc DESC, file_id

Fill TempDB (about 9 GB), watch on disk how the files are extended, then compare the configured and current sizes afterwards

DECLARE @i int = 1

CREATE TABLE #temptable (column1 char(3000) DEFAULT 'a', column2 char(3000) DEFAULT 'b')

WHILE @i <= 950000
BEGIN
    INSERT INTO #temptable DEFAULT VALUES
    SET @i += 1
END

DROP TABLE #temptable

SQL Server tips for developers

These are some tips I have gathered over the years. There are many more to keep in mind when developing for SQL Server, but this is a start.

  • Always create a primary key (pk), preferably on an int/bigint column, not on a GUID column

Sometimes that is the wrong solution: if the table is insert heavy, like a log table, a sequential key can create a hotspot at the end of the index, since every insert lands on the same last page.
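A minimal sketch of the recommended pattern (the table and column names are made up for illustration):

```sql
-- Hypothetical table: narrow, ever-increasing int key as clustered PK
CREATE TABLE dbo.Customer
(
    CustomerId int IDENTITY(1,1) NOT NULL,   -- int beats a 16-byte GUID as the key
    Name nvarchar(100) NOT NULL,
    CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerId)
);
```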


  • Always create an index on a foreign key (fk) column.

If we update or delete rows in the referenced table, SQL Server must check all the referencing FK tables.
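A sketch of the idea, with hypothetical table names:

```sql
-- Hypothetical FK plus a supporting index, so deletes/updates on the
-- referenced table do not have to scan the referencing table
ALTER TABLE dbo.OrderRow
    ADD CONSTRAINT FK_OrderRow_Order
    FOREIGN KEY (OrderId) REFERENCES dbo.[Order](OrderId);

CREATE NONCLUSTERED INDEX IX_OrderRow_OrderId ON dbo.OrderRow (OrderId);
```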


  • Do not store the same data in multiple places

Use lookup tables and FKs instead to point to where the data is stored; we do not want to have to select/update/delete the same data in multiple places.

NoSQL databases totally disregard this; that is what makes them fast…ish in some circumstances.


  • Try to predict which indexes will be needed.

Normally, if we use a column in a WHERE clause, that column will need an index. Creating an extra, perhaps unused, non-clustered index is usually better than not creating it.


  • Missing indexes according to the optimizer?

SELECT
  mig.index_group_handle, mid.index_handle,
  CONVERT (decimal (28,1),
    migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans)
    ) AS improvement_measure,
  'CREATE INDEX missing_index_'
    + CONVERT (varchar, mig.index_group_handle) + '_'
    + CONVERT (varchar, mid.index_handle)
    + ' ON ' + mid.statement
    + ' (' + ISNULL (mid.equality_columns,'')
    + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END
    + ISNULL (mid.inequality_columns, '')
    + ')'
    + ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement,
  migs.*, mid.database_id, mid.[object_id]
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
WHERE CONVERT (decimal (28,1), migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans)) > 10
ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC

Do not create separate indexes for every column; try to use a covering index instead.


  • Indexes that are not used for searching are normally not necessary.

Updating, inserting, or deleting rows maintains every index on the table, which costs CPU and I/O, so only keep the indexes that are needed.

This is easy to check and can be done when the application has been in production for a while.

SELECT d.name + '.' + s.name + '.' + o.name AS objectName
, i.name AS indexName
, MAX(ius.user_updates) AS userUpdates
, MAX(ius.user_seeks) AS userSeeks
, MAX(ius.user_scans) AS userScans
, MAX(ius.user_lookups) AS userLookups
, CASE
    WHEN SUM(a.used_pages) * 8 < 1024 THEN CONVERT(VARCHAR(32), SUM(a.used_pages) * 8) + ' kB'
    WHEN SUM(a.used_pages) * 8 / 1024 < 1024 THEN CONVERT(VARCHAR(32), SUM(a.used_pages) * 8 / 1024) + ' MB'
    ELSE CONVERT(VARCHAR(32), SUM(a.used_pages) * 8 / 1024 / 1024) + ' GB'
  END AS indexSize
FROM sys.indexes i
INNER JOIN sys.objects o ON i.[object_id] = o.[object_id]
INNER JOIN sys.schemas s ON o.[schema_id] = s.[schema_id]
INNER JOIN sys.dm_db_index_usage_stats ius ON ius.[object_id] = i.[object_id]
    AND i.index_id = ius.index_id
INNER JOIN sys.databases d ON d.database_id = ius.database_id
INNER JOIN sys.partitions AS p ON i.[object_id] = p.[object_id]
    AND i.index_id = p.index_id
INNER JOIN sys.allocation_units AS a ON p.[partition_id] = a.container_id
WHERE ius.user_updates > 0
AND ius.user_seeks + ius.user_scans + ius.user_lookups <= 1000
AND i.is_primary_key = 0
GROUP BY d.name, s.name, o.name, i.name
ORDER BY MAX(ius.user_updates) DESC;


  • Is the database made for OLTP or OLAP queries?

OLTP – short fast transactions, normalized tables

OLAP – DW queries, can run longer, often denormalized

Try not to mix OLTP and OLAP in the same database; ideally keep them on different servers.


  • Use a standard naming convention all team developers agree on. Preferably a company wide standard.

It is harder to work with a database where objects are named non-descriptively.

E.g. calling a table something like tbl_12345asdf gives no clue what it holds.


  • Only use transactions where it is necessary, keep transactions as short as possible.

Long transactions cause blocking and hold up space in the transaction log that cannot be reused until they finish.

Check the transaction_isolation_level column in the SQL below while the system is under load.

SELECT db_name(r.database_id), r.session_id, r.blocking_session_id, r.cpu_time,
SUBSTRING(qt.text, (r.statement_start_offset/2) + 1,
    (CASE WHEN r.statement_end_offset = -1
          THEN LEN(CONVERT(nvarchar(max), qt.text)) * 2
          ELSE r.statement_end_offset
     END - r.statement_start_offset)/2) AS stmt_executing,
s.login_name, r.percent_complete, r.start_time
,CASE s.transaction_isolation_level
    WHEN 0 THEN 'Unspecified'
    WHEN 1 THEN 'ReadUncommitted'
    WHEN 2 THEN 'ReadCommitted'
    WHEN 3 THEN 'Repeatable'
    WHEN 4 THEN 'Serializable'
    WHEN 5 THEN 'Snapshot'
 END AS 'Isolation level'
,s.host_name, s.program_name, s.host_process_id,
r.status, r.wait_time, r.wait_type, r.wait_resource,
r.total_elapsed_time, r.reads, r.writes, r.logical_reads, r.plan_handle
,(SELECT query_plan FROM sys.dm_exec_query_plan(r.plan_handle)) AS xml_plan
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS qt
INNER JOIN sys.dm_exec_sessions s ON r.session_id = s.session_id
WHERE r.session_id > 50
AND r.session_id <> @@SPID
ORDER BY r.cpu_time DESC



  • Do not let a user start a transaction when e.g. opening a form in an application.

No one knows how long it will take until the user closes the form; until then the transaction stays open.

Only start the transaction when actual database activity is done.


  • Do not use WITH(NOLOCK) as a default

With NOLOCK we can read data that is not yet committed in a transaction, so probably not for banking stuff…

Instead of using WITH(NOLOCK), try running under RCSI (Read Committed Snapshot Isolation) or Snapshot Isolation. It puts a load on tempdb, but it can in many cases really help you get rid of blocking without the nasty side effect of reading uncommitted data. See https://littlekendra.com/2016/02/18/how-to-choose-rcsi-snapshot-isolation-levels/ for a very good explanation of how this works.
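A sketch of how the two options are switched on (the database name is a placeholder; enabling READ_COMMITTED_SNAPSHOT needs exclusive access to the database):

```sql
-- RCSI: plain READ COMMITTED reads become versioned reads for everyone
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Snapshot isolation: sessions opt in with
-- SET TRANSACTION ISOLATION LEVEL SNAPSHOT
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;
```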


  • Try to have the database on the same compatibility level as the server instance.
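A quick way to check, and (with a hypothetical database name) align, the compatibility level:

```sql
-- Compare each database's compatibility level with the instance version
SELECT name, compatibility_level FROM sys.databases;

-- Hypothetical example: 130 = SQL Server 2016
ALTER DATABASE MyDatabase SET COMPATIBILITY_LEVEL = 130;
```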


  • Use the same collation as the server instance, tempdb will use the same as the instance.
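To see whether a database deviates from the instance (and hence tempdb) collation:

```sql
-- Instance collation (tempdb inherits this)
SELECT SERVERPROPERTY('Collation') AS instance_collation;

-- Per-database collations, for spotting mismatches
SELECT name, collation_name FROM sys.databases;
```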


  • Do not use serializable isolation level if it can be avoided, read committed should be default. Snapshot is a possibility.


  • Do not use spaces in object names(tables, sp, views, databases …)


  • Use bit columns for boolean values, not integer or varchar. Consider making them NOT NULL.


  • Make columns only as wide as you need; do not just use varchar(255) because it's easiest.


  • Avoid SELECT * if you can; the table design may change and columns may be added or removed.


  • Use constraints for data control, fk/pk, check, not null and so on.
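A sketch of letting constraints guard the data (all names here are hypothetical):

```sql
-- Hypothetical table where PK, FK, CHECK, NOT NULL and DEFAULT
-- constraints enforce data integrity in the database itself
CREATE TABLE dbo.Account
(
    AccountId  int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Account PRIMARY KEY,
    CustomerId int NOT NULL
        CONSTRAINT FK_Account_Customer REFERENCES dbo.Customer(CustomerId),
    Balance    decimal(18,2) NOT NULL
        CONSTRAINT CK_Account_Balance CHECK (Balance >= 0),
    IsActive   bit NOT NULL
        CONSTRAINT DF_Account_IsActive DEFAULT (1)
);
```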


  • Try to limit the application's rights to read/write; dbo should be reserved for development and release work.


  • If a table grows fast, create an archival plan. Could be deleting data, partitioning with sliding window or something else that suits you.


  • User-defined functions can be slow. Table variables lack statistics, so the optimizer assumes they hold only one row; use a temp table instead.
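The difference can be sketched like this (compare the estimated row counts in the actual execution plans):

```sql
-- Table variable: no statistics, the optimizer estimates 1 row
DECLARE @t TABLE (id int PRIMARY KEY);
INSERT INTO @t SELECT [object_id] FROM sys.objects;
SELECT * FROM @t;

-- Temp table: gets statistics, so cardinality estimates reflect reality
CREATE TABLE #t (id int PRIMARY KEY);
INSERT INTO #t SELECT [object_id] FROM sys.objects;
SELECT * FROM #t;
DROP TABLE #t;
```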