T-SQL Tuesday #016 – Getting Totals And Subtotals With ROLLUP And CUBE

 

olap_1Welcome to TSQL Tuesday!  Click the picture on the right for more information.

TSQL is a great tool for aggregating lots of data but what do you do with it once it’s aggregated?  Often the answer is to aggregate it again through subtotals and totals.  Normally this will require a second query or if you want to see the totals “in-line”, a trip to Excel.  But, when using aggregation, you can use the WITH ROLLUP or WITH CUBE operator in the GROUP BY clause to generate totals.

MSDN explains the difference between ROLLUP and CUBE as follows:

  • CUBE generates a result set that shows aggregates for all combinations of values in the selected columns. 
  •  ROLLUP generates a result set that shows aggregates for a hierarchy of values in the selected columns.

 If that seems a little fuzzy don’t worry about it – just run the query below with each operator and note the difference in output. 

This query will run on SQL 2005 or 2008 DBs and shows the number of reads since the last restart.  Normally these operators (ROLLUP & CUBE) are used to present an application with pre-aggregated data for performance reasons.  However in this example it is quite useful for ad hoc I/O performance queries. 

SELECT

      OBJECT_SCHEMA_NAME(S.object_id)     AS SchemaName

      ,OBJECT_NAME(S.object_id)           AS ObjectName

      ,SUM(S.user_seeks + S.user_scans + S.user_lookups)                           AS TotalReads

FROM

      sys.dm_db_index_usage_stats         AS S

      INNER JOIN

      sys.indexes                         AS I

            ON S.object_id = I.object_id

WHERE

      OBJECTPROPERTY(S.object_id,‘IsUserTable’) = 1

      AND I.index_id = S.index_id

      AND S.database_id = DB_ID()

GROUP BY

      OBJECT_SCHEMA_NAME(S.object_id)

      ,OBJECT_NAME(S.object_id)

      –WITH ROLLUP / CUBE  

ORDER BY

      1,2,3

  In Figure1 we see the results from the basic aggregation.

 

 

 

 

Figure 1

Adding WITH ROLLUP produces the following.  Row 1 (NULL NULL) is a grand total.  Note that row 2 gives the sub-total for the Person group and so on.

 

 

 

 

 

Figure 2

Finally WITH CUBE produces all possible subtotals. 

 

 

 

 

 

 

 

 

Figure 3

 These operators aren’t Rocket Science but if they save you acouple minutes every time you’d have to make a trip to Excel then I think they are worth adding to the toolbox. 

Advertisements

Using PowerShell and SMO to Script SQL Server Database Objects

I’ve been telling myself to learn PowerShell for quite awhile but I’ve never been able to get over the learning curve.  What I needed was a piece of work that was bigger than trivial and smaller than huge that would force me into PowerShell.  It came along this week in the form of a request to script all the views in a test database before the refresh from production and then “putting them back” into the new test database.  Doing this from SSMS is pretty easy but I sure don’t want to do it every time we refresh.  I needed something repeatable, simple, and parameter driven. 

 I knew PS would be relatively simple to manage so I looked into using PS with SQL Server Management Objects (SMO).  First I searched the Web for something like “PowerShell SMO Script Objects” and found this, which pointed me in the right direction. 

The complete script is at the bottom of the article but I’ve broken it up into sections which I’ll cover in detail.  My focus is those of us learning PS so if you are a guru please be patient with the rest of us.

 

Figure 1

All the user configurable parameters can be set in Figure 1 above.  In line 13 note that $ObjType will be inserted as a literal in the code so PowerShell will read MyObject.$ObjType but execute MyObject.views.  This allows you to script tables, views, stored procedures, and other Db objects by simply changing the parameters.  For even more flexibility you could read in parameters from a text file.

Also note in line 15 that the “@” indicates a list of string objects that will be excluded from the script.  You could also add other schemas if you wish.

Finally, the If block (lines 18-21) tests for the output file and deletes it if it exists.  Without this check successive runs of the script would append the same scripts to the file.

 

Figure 2

Next, let’s look at the code to create the SMO we’ll use to script our objects.  Line 25 tells PowerShell which assembly we want to use so it can create the objects we need.  Lines 28 and 29 create the Server and Databases collection objects respectively.  Note that the Server object takes the instance name on creation.

Once you have the DB collection available you are almost ready to start scripting.  The Script method is available for each object (table, view, etc.) but there are a number of options you can use to control the scripts generated.  For example, if you want to test for existence before creating an object you would set the IncludeIfNotExists = 1.  To use the option you will create a ScriptingOptions object (line 32) and set various properties as needed.  A complete list of options is here.

If you want to perform multiple scripting tasks, say Drop and then Create you can perform the scripting multiple times, appending the results as we see in the following two code blocks.

 

Figure 3

First you set the ScriptingOptions as in Figure 3 for the operation you want to perform (lines 38 & 50).  Then you iterate through every object of the specified type in the database using a foreach loop (lines 41 & 53).  Each time the foreach starts it filters out the object if the schema is any one of the values passed in from the list discussed above.

Without the whitespace, the whole PowerShell script takes up only about 20 lines.  As you can see, PowerShell and SMO together make a very powerful tool to put in your toolbox.    I’d be interested to hear about how you modified this code to meet your own needs.

Here is the whole script, ready to use with your parameters.

##########################################################
# PS Script DB Objects
#
# Uses PS and SMO to script the drop and create a type of
# SQL Server object (views, tables, etc.)
#
# Tom Powell                                       2/24/2011
##########################################################
 
#Setup####################################################
    $SQLInstance = “Server\Instance”
    $DB = “DBName ”
    $ObjType = “views”
    $OutFile = “C:\temp\Scripts.sql”
    $ExcludeSchemas = @(“sys”,”Information_Schema”) #List of schemas to exclude
   
#Create smo objects##########################################
    [System.Reflection.Assembly]::LoadWithPartialName(‘Microsoft.SqlServer.SMO’) | out-null
   
    #SQL Instance & DB object
    $s = new-object (‘Microsoft.SqlServer.Management.Smo.Server’) $SQLInstance
    $dbs=$s.Databases
   
    #ScriptingOption object to pass parms for Script method
    $so = new-object (‘Microsoft.SqlServer.Management.Smo.ScriptingOptions’)
    $so.IncludeIfNotExists = 1
    $so.SchemaQualify = 1
    $so.AllowSystemObjects = 0

#Script Drop Objects###########################################
    $so.ScriptDrops = 1         #Script Drop Objects

    #Generate script for all objects##################################
    foreach ($objs in $dbs[$DB].$ObjType)
    {
        If ($ExcludeSchemas -notcontains $objs.Schema )  
        {
            $objs.Script($so) + “`r GO `r ” | out-File $OutFile #-Append
        }
    }

#Script Create Objects#########################################
    $so.ScriptDrops = 0         #Script Create Objects

    #Generate script for all objects#################################
    foreach ($objs in $dbs[$DB].$ObjType)
    {
        If ($ExcludeSchemas -notcontains $objs.Schema )
        {
            $objs.Script($so) + “`r GO `r ” | out-File $OutFile -Append
        }
    }

Tables and Files – What’s the Connection?

What is the relationship between a table and the files that contain it?  This isn’t as easy a question as you might think because there is not a direct relationship between a logical object like a table and a physical object like a file.  For a small database like AdventureWorks (< 200 MB) all you need is a single data file to hold all your tables.  As things start to get larger and more complex you may add files as a way of increasing performance.  For example if you have a single table that is large and heavily queried you may want to put it on a faster (and probably more expensive) drive.  You may also have non-clustered indexes that cover your query that would benefit from being separated.  If terms like non-clustered index and covering query are unfamiliar you might benefit from this excellent article by Joe Webb.

So, how do you control table and index placement?  You do it by using filegroups.  Every relational index (clustered or non-clustered) is assigned to a single filegroup that is composed of one or more files.  The relationship looks like this:

In the picture above the Table01 is actually the clustered index and Table01/Index01 is a non-clustered index.  Note that a file can be in exactly one filegroup, but the components of the table can be located and managed through the use of different filegroups.
Here is a script you can use to explore these relationships:

–Create a test DB.  Does the DB exist?  Drop if it does.
            Use tempdb
            Go
            If Exists (Select name from sys.databases Where name = ‘TabFile’)
                        Begin
                                    Drop Database TabFile
                        End
            Go

–Create DB with two filegroups.  You will need a “C:\temp”
–folder to use the script as written.
            Create Database TabFile On
                        Primary                                                                                                –FileGroup
                                    ( Name = N’TabFileF01′,
                                                Filename = N’C:\Temp\TabFileF01.mdf’),
                         Filegroup FG02                                                                               –FileGroup
                                    ( Name = N’TabFileF02′,                                                   –File
                                                Filename = N’C:\Temp\TabFileF02.ndf’ ),
                                    ( Name = N’TabFileF03′,                                                   –File
                                                Filename = N’C:\Temp\TabFileF03.ndf’)
                         Log ON                                                                                                –Log
                                    ( Name = N’TabFile_log’,
                                    Filename = N’C:\Temp\TabFile_log.ldf’)
            Go

–Create table with two indexes
            Use TabFile
            Go
            Create Table dbo.Table_1
                        (Col1 int Constraint PK_Table_1 Primary Key Clustered Not Null Identity (1, 1),
                        Col2 int Not Null)  On [Primary]
            Go
            Create Nonclustered index IX02_Table_1 ON dbo.Table_1
                        (Col1     ) On FG02
            Go
            Create Table dbo.Table_2
                        (Col1 int Constraint PK_Table_2 Primary Key Clustered Not Null Identity (1, 1),
                        Col2 int Not Null)  On FG02
            Go

At this point you can take a look at the various structures using either the Object Explorer in SSMS or TSQL.  I prefer TSQL because it is repeatable and far more flexible.  The following script shows the relationship between tables, filegroups, and files.

–View tables and files
      Use TabFile
            Go
            Select Distinct
                        O.name                         As TableName
                        ,I.name                         As IndexName
                        ,I.type_desc               As IndexType
                        ,I.index_id                  As IndexID
                        ,FG.name                     As FileGroupName
                        ,F.name                        As FileLogicalName
                        ,F.physical_name    As FilePhysicalName
            From
                        sys.indexes                As I
                        Inner Join
                        sys.filegroups            As FG
                                    On I.data_space_id = FG.data_space_id
                        Inner Join
                        sys.master_files       As F
                                    On FG.data_space_id = F.data_space_id
                                    And F.database_id = DB_ID()
                        Inner Join
                        sys.all_objects          As O
                                    On I.[object_id] = O.[object_id]
            Where
                        I.data_space_id = FG.data_space_id
                        And O.[type] = ‘U’
            Order By
                        TableName
                        ,IndexID
                        ,FileLogicalName
            Go

I recommend you play with this script on the sample DB above but also on of your own DBs.  Not only will it familiarize you with the various catalog views it will also point out how your data is arranged and may help you in managing it even better.

Ad Hoc Automation On TSQL Tuesday

 

Ad hoc Automation – isn’t that an oxymoron?  Isn’t it hard to automate something?  That answer depends on what you mean by automation.  If you want to build SSIS packages and SQL Agent jobs to comb your network and return all kinds of information about the servers and instances and track changes over time then you’re not going to knock that out before your second cup of coffee.  But suppose you just need to know what the SQL Server version is for each instance you own.  Or what if you are tired of cleaning AdventureWorks tables out of master and you want to set everybody’s default database to tempdb?  These tasks are a slice of pastry using the multi-server query feature of SQL Server’s Central Management Servers (CMS).

I’m not going to go over the details of how to set up a CMS  as Kimberly Tripp has already done a great job.  Instead I’d like to focus on some of the things you can do with them to automate repetitive tasks.  

SQL Server CMS allows you to interact with a group of servers much as you would with a single server.  The grouping of servers is entirely up to you.  You can have as many groups as you wish and you can configure them as you please.  This is important because, with a little thought you can slice and dice your environment by version (2K, 2005, 2008), by environment (Dev, Test, Prod), by function (Log Shipping, Replication, Clustered) and so on.  This allows you to customize your code and multiply the power of your CMS.

For example, suppose you have a server discovery query that returns SQL server metadata for an instance like this:

Wouldn’t it be handy to be able to generate that same information for every server in your enterprise in a single, Excel ready, recordset!

At one client site we decided to make every login use tempdb for their default DB.  Connecting to each of the 100+ instances and doing this manually didn’t appeal to me so I wrote and tested a TSQL script to do it on a single server, connected to every server at once, and made the change in about 2 minutes.  Notice I mentioned that I TESTED the script.  Remember, as Uncle Ben reminds us “with great power comes great responsibility.”   

I believe SQL Server’s CMS is a very useful but under-appreciated feature  limited only by your creativity.

 

http://sqlasylum.wordpress.com/2011/02/01/invitation-to-t-sql-tuesday-15-automation-in-sql-server/

I Have HOW MANY filegroups!?!



For small DBs one or maybe two data files (.mdf/.ndf) works fine.  With larger DBs though there is an opportunity to gain performance by the judicious use of additional file groups and files.   Unfortunately some folks go a little crazy and you can end up with more files than you can shake a dead cat at (my apologies to cat lovers).

Here’s a quick script to give you the lay of the land regarding file groups and their files.

Select

            FG.name                                                           As FGName

            ,Count(F.name)                                                 As FileCount

            ,Convert (Integer, Avg((F.size*8096.0)/(POWER(1024.0,2))))

                                                                                    As AvgFileSize_MB

From

            <DBName,sysname,MyDB>.sys.filegroups         As FG

            Inner Join

            sys.master_files                                              As F

                        On FG.data_space_id = F.data_space_id

Where

            F.database_id = DB_ID(‘<DBName,sysname,MyDB>’)

Group By

            FG.name

Having

            COUNT(F.name) > 1

Order By

            FGName

Actually I like cats, I just can’t eat a whole one by myself.

How Bad Are My Statistics?

Here’s a quick query that tells you how many rows have been changed (table & index) since the last statistics update.

Select     OBJECT_NAME(I.id)            As TableName

    ,Sum(Convert(BigInt,I.rowmodctr))    As RowsModSinceStats

    ,C.RowCnt

From     sys.sysindexes                As I

    Left
Outer
Join

    (    Select     OBJECT_NAME(PS.object_id)    As TableName

            ,SUM(PS.row_count)            As RowCnt

        From     sys.dm_db_partition_stats        As PS

            INNER
JOIN

            sys.indexes                As I

                ON I.object_id
= PS.object_id

                AND I.index_id = PS.index_id

        Where I.type_desc IN
(‘CLUSTERED’,‘HEAP’)

            AND I.object_id
> 100

            AND
OBJECT_SCHEMA_NAME(PS.object_id)
<>
‘sys’

        Group
By PS.object_id)    As C

        On
OBJECT_NAME(I.id)
= C.TableName            

Where
Coalesce(C.RowCnt,0)
> 0

Group
By
OBJECT_NAME(I.id)    

    ,C.RowCnt

Order
By TableName

    –RowsModSinceStats Desc

Server Discovery

Need a quick and easy way to gather basic information about the SQL Server instance you are on?  Keep this script in your SSMS Template Explorer and the information is only a double-click away.  This script works for SQL Server 2005 and SQL 2008 only because it uses the DMV sys.dm_os_sys_info.  If you are on SQL Server 2000 just highlight the code down to the comment and execute.

SELECT
    @@ServerName                    AS ServerName
    ,Case Left(Convert(Varchar(128),SERVERPROPERTY(‘productversion’)),2)
        When ‘8.’ Then ‘SQL 2000’
        When ‘9.’ Then ‘SQL 2005′
        When ’10’ Then ‘SQL 2008’
        Else ‘Unknown’
        End                                As SQLVersion
    ,SERVERPROPERTY(‘productlevel’)        As ServicePackLevel
    ,SERVERPROPERTY(‘productversion’)        As BuildNumber
    ,SERVERPROPERTY(‘edition’)            As Edition
    ,SERVERPROPERTY(‘LicenseType’)        As LicenseType
    ,SERVERPROPERTY(‘NumLicenses’)        As NumLicenses   
    ,CASE SUBSTRING(@@VERSION,PATINDEX(‘%Windows NT%’,@@VERSION),15)
            WHEN ‘Windows NT 5.0’ THEN ‘Windows 2000’
            WHEN ‘Windows NT 5.1’ THEN ‘Windows XP’
            WHEN ‘Windows NT 5.2’ THEN ‘Windows Server 2003 or Windows XP (64-bit)’
            WHEN ‘Windows NT 6.0’ THEN ‘Windows Server 2008’
            WHEN ‘Windows NT 6.1’ THEN ‘Windows Server 2008 R2 or Windows 7’
            ELSE ‘Unknown’
        END                            AS WindowsVersion
    ,CASE WHEN SERVERPROPERTY(‘IsClustered’) = 1
            THEN ‘Yes’
            ELSE ‘No’
        END AS IsClustered
–For SQL Server 2K run to here
    ,cpu_count                            AS LogicalCPUs
    ,cpu_count / hyperthread_ratio            AS PhysicalCPUs
    ,max_workers_count                    AS Threads
    ,physical_memory_in_bytes/1048576        AS PhysicalMemory_MB
FROM  
    sys.dm_os_sys_info

Here is an example of the result set on SQL Server 2008:

Results