Inserting Binary Objects Into A SQL Server Table Using A Stored Procedure And PowerShell



Our team was tasked with putting scanned expense report receipts (.PDF) in a SQL Server table for review.  As the DBA I thought my part was pretty straightforward.  I knocked out the table and associated objects, including the stored procedures to service the table, and was enjoying a cup of coffee when I realized I had not tested my insert procedure.  Then it dawned on me – I had no way to test it in SSMS.  How can you use a text-based tool to manipulate binary files into the database?  If you've figured that one out I'd love to see it.  If you haven't, then perhaps my PowerShell approach will be of some use to you.

First let’s create some basic SQL objects, a table and a stored procedure:

CREATE TABLE TAPTEST (Col1 VARBINARY(MAX) NULL)
GO

CREATE PROCEDURE usp_TAPTEST @Col1 VARBINARY(MAX) AS
      INSERT INTO dbo.TAPTEST (Col1) VALUES (@Col1)
GO

Here is the PowerShell code if you are in a hurry, but it's worth going through line by line as we will do below.

#Get the file
      [Byte[]]$file = Get-Content -Encoding Byte C:\Temp\TAPTEST.pdf

#Connect to DB
      $DB = "server=MyInstanceName;integrated security=sspi;database=Bob"
      $conn = new-object System.Data.SqlClient.SqlConnection($DB)

#Build the command and parameters
      $cmd = new-object System.Data.SqlClient.SqlCommand("usp_TAPTEST",$conn)
      $cmd.CommandType = [System.Data.CommandType]::StoredProcedure
      $cmd.Parameters.Add("@Col1", [System.Data.SqlDbType]::VarBinary) | Out-Null
      $cmd.Parameters["@Col1"].Size = -1    #-1 = MAX
      $cmd.Parameters["@Col1"].Value = $file

#Execute the command
      $conn.Open()
      $cmd.ExecuteNonQuery()
      $conn.Close()

Now let's take a closer look.

#Get the file
      [Byte[]]$file = Get-Content -Encoding Byte C:\Temp\TAPTEST.pdf

The square brackets force the data type of the $file variable to a byte array.  Additionally, since Get-Content normally reads a file as text, the -Encoding parameter is used to tell it to read the content as binary.
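If Get-Content proves slow on a large file (it emits the bytes one object at a time), .NET can read the whole file in a single call.  Also note that PowerShell 6 and later drops -Encoding Byte in favor of -AsByteStream.  A quick sketch of both alternatives:

#Any PowerShell version: let .NET read the file in one shot
      [Byte[]]$file = [System.IO.File]::ReadAllBytes("C:\Temp\TAPTEST.pdf")

#PowerShell 6+ only
      [Byte[]]$file = Get-Content -AsByteStream C:\Temp\TAPTEST.pdf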

Connecting to the database is just as you would do for any other ADO.NET connection, nothing special here.

#Connect to DB
      $DB = "server=MyInstanceName;integrated security=sspi;database=Bob"
      $conn = new-object System.Data.SqlClient.SqlConnection($DB)

Building the command and parameter objects was a little more interesting.

 

#Build the command and parameters
      $cmd = new-object System.Data.SqlClient.SqlCommand("usp_TAPTEST",$conn)

Note the command text above, "usp_TAPTEST", which will be executed at the server.  Any valid DML should execute here, including something like "usp_TAPTEST @Col1 = 1", which works great as long as you have a printable data type.  If it weren't for reading the file into a Byte array in the first line we'd be stuck again.  So how DO you tell it this is a binary object?  Read on.

In the first line below we tell the command object to treat the CommandText as a stored procedure.  Then we set the parameter's data type to varbinary.  Setting the Size property to -1 tells it that the size is MAX.  Finally, we set the value of the parameter to the file we read in initially.

 

      $cmd.CommandType = [System.Data.CommandType]::StoredProcedure
      $cmd.Parameters.Add("@Col1", [System.Data.SqlDbType]::VarBinary) | Out-Null
      $cmd.Parameters["@Col1"].Size = -1    #-1 = MAX
      $cmd.Parameters["@Col1"].Value = $file
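As a side note, Parameters.Add has an overload that takes the name, data type, and size together, so the parameter setup can collapse to a single line (same behavior, just more compact):

      $cmd.Parameters.Add("@Col1", [System.Data.SqlDbType]::VarBinary, -1).Value = $file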

At this point you are ready to go: just open the connection, execute, and close.

#Execute the command
      $conn.Open()
      $cmd.ExecuteNonQuery()
      $conn.Close()

 

And voilà!

Notice the French lingo – this is a high class blog!

As you can see, the code is pretty simple, with the only tricky part being fitting the binary nature of the document to the parameter used by the SqlCommand object.
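For completeness, there is a T-SQL-only route as well: OPENROWSET with the BULK provider can read a file as a single blob.  The catch is that the file must be reachable from the server itself and your login needs ADMINISTER BULK OPERATIONS permission, which is why the PowerShell approach is still handy from a workstation.  A quick sketch:

--Read the PDF as a single blob and pass it to the procedure
DECLARE @Col1 VARBINARY(MAX)
SELECT @Col1 = BulkColumn
FROM OPENROWSET(BULK N'C:\Temp\TAPTEST.pdf', SINGLE_BLOB) AS F
EXEC dbo.usp_TAPTEST @Col1 = @Col1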

Hopefully you will find this useful both as stand-alone code and as another step in the drive for PowerShell mastery.

 

T-SQL Tuesday #016 – Getting Totals And Subtotals With ROLLUP And CUBE

 

Welcome to T-SQL Tuesday!  Click the picture on the right for more information.

T-SQL is a great tool for aggregating lots of data, but what do you do with it once it's aggregated?  Often the answer is to aggregate it again through subtotals and totals.  Normally this requires a second query or, if you want to see the totals "in-line", a trip to Excel.  When aggregating, though, you can use the WITH ROLLUP or WITH CUBE operator in the GROUP BY clause to generate totals.

MSDN explains the difference between ROLLUP and CUBE as follows:

  • CUBE generates a result set that shows aggregates for all combinations of values in the selected columns.
  • ROLLUP generates a result set that shows aggregates for a hierarchy of values in the selected columns.

If that seems a little fuzzy, don't worry about it – just run the query below with each operator and note the difference in output.

This query runs on SQL Server 2005 or 2008 databases and shows the number of reads since the last restart.  Normally these operators (ROLLUP & CUBE) are used to present an application with pre-aggregated data for performance reasons, but in this example they are quite useful for ad hoc I/O performance queries.

SELECT
      OBJECT_SCHEMA_NAME(S.object_id)                    AS SchemaName
      ,OBJECT_NAME(S.object_id)                          AS ObjectName
      ,SUM(S.user_seeks + S.user_scans + S.user_lookups) AS TotalReads
FROM
      sys.dm_db_index_usage_stats AS S
      INNER JOIN
      sys.indexes                 AS I
            ON S.object_id = I.object_id
WHERE
      OBJECTPROPERTY(S.object_id, 'IsUserTable') = 1
      AND I.index_id = S.index_id
      AND S.database_id = DB_ID()
GROUP BY
      OBJECT_SCHEMA_NAME(S.object_id)
      ,OBJECT_NAME(S.object_id)
      --WITH ROLLUP / CUBE
ORDER BY
      1, 2, 3

In Figure 1 we see the results of the basic aggregation.

Figure 1

Adding WITH ROLLUP produces the results shown in Figure 2.  Row 1 (NULL, NULL) is a grand total.  Note that row 2 gives the sub-total for the Person group, and so on.

Figure 2
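By the way, if you ever need to tell a subtotal NULL apart from a real NULL in your data, the GROUPING() function returns 1 for the rows the operator generated, so you can label them.  A quick sketch against a hypothetical dbo.Sales table:

SELECT
      CASE WHEN GROUPING(Region) = 1 THEN '(All Regions)' ELSE Region END   AS Region
      ,CASE WHEN GROUPING(Product) = 1 THEN '(Subtotal)' ELSE Product END   AS Product
      ,SUM(Amount)                                                          AS TotalAmount
FROM dbo.Sales
GROUP BY Region, Product WITH ROLLUP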

Finally, WITH CUBE produces all possible subtotals.

Figure 3

These operators aren't rocket science, but if they save you a couple of minutes every time you'd otherwise make a trip to Excel, I think they are worth adding to the toolbox.

Using PowerShell and SMO to Script SQL Server Database Objects

I've been telling myself to learn PowerShell for quite a while, but I've never been able to get over the learning curve.  What I needed was a piece of work bigger than trivial and smaller than huge that would force me into PowerShell.  It came along this week in the form of a request to script all the views in a test database before the refresh from production and then "put them back" into the new test database.  Doing this from SSMS is pretty easy, but I sure don't want to do it every time we refresh.  I needed something repeatable, simple, and parameter-driven.

I knew PS would be relatively simple to manage, so I looked into using PS with SQL Server Management Objects (SMO).  First I searched the Web for something like "PowerShell SMO Script Objects" and found this, which pointed me in the right direction.

The complete script is at the bottom of the article, but I've broken it up into sections which I'll cover in detail.  My focus is on those of us learning PS, so if you are a guru please be patient with the rest of us.

 

Figure 1

All the user-configurable parameters can be set in Figure 1 above.  In line 13 note that $ObjType will be inserted as a literal in the code, so PowerShell will read MyObject.$ObjType but execute MyObject.views.  This allows you to script tables, views, stored procedures, and other DB objects by simply changing the parameters.  For even more flexibility you could read the parameters in from a text file (there's a quick sketch of that below).

Also note in line 15 that the "@" indicates an array of strings, the schemas that will be excluded from the script.  You could add other schemas to it if you wish.

Finally, the If block (lines 18-21) tests for the output file and deletes it if it exists.  Without this check, successive runs of the script would append the same scripts to the file.
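And here is that parameters-from-a-text-file idea as a minimal sketch, assuming a CSV named C:\temp\ScriptParms.csv with one column per parameter:

#Run the scripting once per row in the parameter file
    foreach ($parm in (Import-Csv "C:\temp\ScriptParms.csv"))
    {
        $SQLInstance = $parm.SQLInstance
        $DB          = $parm.DB
        $ObjType     = $parm.ObjType
        $OutFile     = $parm.OutFile
        #...run the rest of the script for each row...
    }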

 

Figure 2

Next, let's look at the code that creates the SMO objects we'll use to script our objects.  Line 25 tells PowerShell which assembly we want to use so it can create the objects we need.  Lines 28 and 29 create the Server object and the Databases collection respectively.  Note that the Server object takes the instance name on creation.

Once you have the Databases collection available you are almost ready to start scripting.  The Script method is available on each object (table, view, etc.), but there are a number of options you can use to control the scripts generated.  For example, if you want to test for existence before creating an object you would set IncludeIfNotExists = 1.  To use the options you create a ScriptingOptions object (line 32) and set the various properties as needed.  A complete list of options is here.
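A few options that often come in handy, set the same way (property names per the SMO ScriptingOptions class; verify against your SMO version):

    $so.Indexes  = 1    #include index definitions in the scripts
    $so.DriAll   = 1    #include declarative referential integrity objects
    $so.Triggers = 1    #include triggers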

If you want to perform multiple scripting tasks, say Drop and then Create, you can run the scripting multiple times, appending the results, as we see in the following two code blocks.

 

Figure 3

First you set the ScriptingOptions as in Figure 3 for the operation you want to perform (lines 38 & 50).  Then you iterate through every object of the specified type in the database using a foreach loop (lines 41 & 53).  Each pass through the foreach skips the object if its schema is one of the values in the exclusion list discussed above.

Without the whitespace, the whole PowerShell script takes up only about 20 lines.  As you can see, PowerShell and SMO together make a very powerful tool to put in your toolbox.  I'd be interested to hear how you modify this code to meet your own needs.

Here is the whole script, ready to use with your parameters.

##########################################################
# PS Script DB Objects
#
# Uses PS and SMO to script the drop and create of a type
# of SQL Server object (views, tables, etc.)
#
# Tom Powell                                    2/24/2011
##########################################################

#Setup####################################################
    $SQLInstance = "Server\Instance"
    $DB = "DBName"
    $ObjType = "views"
    $OutFile = "C:\temp\Scripts.sql"
    $ExcludeSchemas = @("sys","Information_Schema") #List of schemas to exclude

    #Delete the output file if it exists so successive
    #runs don't append the same scripts to it
    If (Test-Path $OutFile)
    {
        Remove-Item $OutFile
    }

#Create SMO objects#######################################
    [System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SMO') | out-null

    #SQL Instance & DB objects
    $s = new-object ('Microsoft.SqlServer.Management.Smo.Server') $SQLInstance
    $dbs = $s.Databases

    #ScriptingOptions object to pass parms to the Script method
    $so = new-object ('Microsoft.SqlServer.Management.Smo.ScriptingOptions')
    $so.IncludeIfNotExists = 1
    $so.SchemaQualify = 1
    $so.AllowSystemObjects = 0

#Script Drop Objects######################################
    $so.ScriptDrops = 1         #Script Drop Objects

    #Generate drop script for all objects
    foreach ($objs in $dbs[$DB].$ObjType)
    {
        If ($ExcludeSchemas -notcontains $objs.Schema)
        {
            $objs.Script($so) + "`r GO `r " | out-File $OutFile -Append
        }
    }

#Script Create Objects####################################
    $so.ScriptDrops = 0         #Script Create Objects

    #Generate create script for all objects
    foreach ($objs in $dbs[$DB].$ObjType)
    {
        If ($ExcludeSchemas -notcontains $objs.Schema)
        {
            $objs.Script($so) + "`r GO `r " | out-File $OutFile -Append
        }
    }

Tables and Files – What’s the Connection?

What is the relationship between a table and the files that contain it?  This isn't as easy a question as you might think, because there is no direct relationship between a logical object like a table and a physical object like a file.  For a small database like AdventureWorks (< 200 MB) all you need is a single data file to hold all your tables.  As things get larger and more complex you may add files as a way of increasing performance.  For example, if you have a single table that is large and heavily queried, you may want to put it on a faster (and probably more expensive) drive.  You may also have non-clustered indexes that cover your queries and would benefit from being separated.  If terms like non-clustered index and covering index are unfamiliar, you might benefit from this excellent article by Joe Webb.

So, how do you control table and index placement?  You do it by using filegroups.  Every relational index (clustered or non-clustered) is assigned to a single filegroup that is composed of one or more files.  The relationship looks like this:

In the picture above, Table01 is actually the clustered index and Table01/Index01 is a non-clustered index.  Note that a file can be in exactly one filegroup, but the components of a table can be located and managed through the use of different filegroups.

Here is a script you can use to explore these relationships:

--Create a test DB.  Does the DB exist?  Drop it if it does.
Use tempdb
Go
If Exists (Select name From sys.databases Where name = 'TabFile')
    Begin
        Drop Database TabFile
    End
Go

--Create DB with two filegroups.  You will need a "C:\Temp"
--folder to use the script as written.
Create Database TabFile On
    Primary                                     --FileGroup
        ( Name = N'TabFileF01',                 --File
          Filename = N'C:\Temp\TabFileF01.mdf' ),
    Filegroup FG02                              --FileGroup
        ( Name = N'TabFileF02',                 --File
          Filename = N'C:\Temp\TabFileF02.ndf' ),
        ( Name = N'TabFileF03',                 --File
          Filename = N'C:\Temp\TabFileF03.ndf' )
    Log On                                      --Log
        ( Name = N'TabFile_log',
          Filename = N'C:\Temp\TabFile_log.ldf' )
Go

--Create two tables with indexes on different filegroups
Use TabFile
Go
Create Table dbo.Table_1
    (Col1 int Constraint PK_Table_1 Primary Key Clustered Not Null Identity (1, 1),
     Col2 int Not Null) On [Primary]
Go
Create Nonclustered Index IX02_Table_1 On dbo.Table_1
    (Col1) On FG02
Go
Create Table dbo.Table_2
    (Col1 int Constraint PK_Table_2 Primary Key Clustered Not Null Identity (1, 1),
     Col2 int Not Null) On FG02
Go

At this point you can take a look at the various structures using either Object Explorer in SSMS or T-SQL.  I prefer T-SQL because it is repeatable and far more flexible.  The following script shows the relationship between tables, filegroups, and files.

--View tables, filegroups, and files
Use TabFile
Go
Select Distinct
    O.name              As TableName
    ,I.name             As IndexName
    ,I.type_desc        As IndexType
    ,I.index_id         As IndexID
    ,FG.name            As FileGroupName
    ,F.name             As FileLogicalName
    ,F.physical_name    As FilePhysicalName
From
    sys.indexes         As I
    Inner Join
    sys.filegroups      As FG
        On I.data_space_id = FG.data_space_id
    Inner Join
    sys.master_files    As F
        On FG.data_space_id = F.data_space_id
        And F.database_id = DB_ID()
    Inner Join
    sys.all_objects     As O
        On I.[object_id] = O.[object_id]
Where
    O.[type] = 'U'
Order By
    TableName
    ,IndexID
    ,FileLogicalName
Go

I recommend you play with this script on the sample DB above, but also on one of your own DBs.  Not only will it familiarize you with the various catalog views, it will also show you how your data is arranged and may help you manage it even better.

Ad Hoc Automation On TSQL Tuesday

 

Ad hoc automation – isn't that an oxymoron?  Isn't it hard to automate something?  The answer depends on what you mean by automation.  If you want to build SSIS packages and SQL Agent jobs to comb your network, return all kinds of information about the servers and instances, and track changes over time, then you're not going to knock that out before your second cup of coffee.  But suppose you just need to know the SQL Server version of each instance you own.  Or what if you are tired of cleaning AdventureWorks tables out of master and want to set everybody's default database to tempdb?  These tasks are a slice of pastry using the multi-server query feature of SQL Server's Central Management Servers (CMS).

I'm not going to go over the details of how to set up a CMS, as Kimberly Tripp has already done a great job.  Instead I'd like to focus on some of the things you can do with one to automate repetitive tasks.

SQL Server CMS allows you to interact with a group of servers much as you would with a single server.  The grouping of servers is entirely up to you.  You can have as many groups as you wish and configure them as you please.  This is important because, with a little thought, you can slice and dice your environment by version (2000, 2005, 2008), by environment (Dev, Test, Prod), by function (Log Shipping, Replication, Clustered), and so on.  This lets you customize your code and multiply the power of your CMS.

For example, suppose you have a server discovery query that returns SQL Server metadata for an instance like this:
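A minimal sketch of such a query, using SERVERPROPERTY (pick whichever properties matter to you):

SELECT
      SERVERPROPERTY('MachineName')      AS MachineName
      ,SERVERPROPERTY('InstanceName')    AS InstanceName
      ,SERVERPROPERTY('ProductVersion')  AS ProductVersion
      ,SERVERPROPERTY('ProductLevel')    AS ProductLevel
      ,SERVERPROPERTY('Edition')         AS Edition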

Wouldn't it be handy to generate that same information for every server in your enterprise in a single, Excel-ready recordset?

At one client site we decided to make every login use tempdb as its default DB.  Connecting to each of the 100+ instances and doing this manually didn't appeal to me, so I wrote and tested a T-SQL script to do it on a single server, connected to every server at once, and made the change in about 2 minutes.  Notice I mentioned that I TESTED the script.  Remember, as Uncle Ben reminds us, "with great power comes great responsibility."
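A sketch of the kind of script that does this – it generates the ALTER LOGIN statements so you can review them before running them (the filters are illustrative; adjust for your environment):

--Generate (don't run) the statements, then eyeball them first
SELECT 'ALTER LOGIN ' + QUOTENAME(name) + ' WITH DEFAULT_DATABASE = tempdb;'
FROM sys.server_principals
WHERE type IN ('S', 'U', 'G')     --SQL logins, Windows logins, Windows groups
      AND name NOT LIKE '##%'     --skip internal certificate-based logins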

I believe SQL Server's CMS is a very useful but under-appreciated feature, limited only by your creativity.

 

http://sqlasylum.wordpress.com/2011/02/01/invitation-to-t-sql-tuesday-15-automation-in-sql-server/

I Have HOW MANY filegroups!?!



For small DBs, one or maybe two data files (.mdf/.ndf) work fine.  With larger DBs, though, there is an opportunity to gain performance through the judicious use of additional filegroups and files.  Unfortunately some folks go a little crazy, and you can end up with more files than you can shake a dead cat at (my apologies to cat lovers).

Here's a quick script to give you the lay of the land regarding filegroups and their files.  It's written as an SSMS template, so use Ctrl+Shift+M to fill in the database name.

Select
    FG.name           As FGName
    ,Count(F.name)    As FileCount
    ,Convert(Integer, Avg((F.size * 8192.0) / POWER(1024.0, 2)))
                      As AvgFileSize_MB    --size is in 8 KB pages
From
    <DBName,sysname,MyDB>.sys.filegroups    As FG
    Inner Join
    sys.master_files                        As F
        On FG.data_space_id = F.data_space_id
Where
    F.database_id = DB_ID('<DBName,sysname,MyDB>')
Group By
    FG.name
Having
    Count(F.name) > 1
Order By
    FGName

Actually I like cats, I just can’t eat a whole one by myself.

How Bad Are My Statistics?

Here's a quick query that tells you how many rows have changed in each table (including its indexes) since the last statistics update.  Note that on SQL Server 2005 and later, rowmodctr is a calculated, approximate value, but it's close enough for this purpose.

Select
    OBJECT_NAME(I.id)                     As TableName
    ,Sum(Convert(BigInt, I.rowmodctr))    As RowsModSinceStats
    ,C.RowCnt
From
    sys.sysindexes    As I
    Left Outer Join
    (
        Select
            OBJECT_NAME(PS.object_id)    As TableName
            ,SUM(PS.row_count)           As RowCnt
        From
            sys.dm_db_partition_stats    As PS
            Inner Join
            sys.indexes                  As I
                On I.object_id = PS.object_id
                And I.index_id = PS.index_id
        Where
            I.type_desc In ('CLUSTERED', 'HEAP')
            And I.object_id > 100
            And OBJECT_SCHEMA_NAME(PS.object_id) <> 'sys'
        Group By PS.object_id
    )    As C
        On OBJECT_NAME(I.id) = C.TableName
Where
    Coalesce(C.RowCnt, 0) > 0
Group By
    OBJECT_NAME(I.id)
    ,C.RowCnt
Order By
    TableName
    --RowsModSinceStats Desc