Categories
Business Intelligence Product Reviews

Redgate SQL Search – Free Download

I have always wanted a good way to search all the databases on a server and find the procs, views, or whatever that have something in their DDL, so I could do what I want with it. There are ways to do it using DMVs or other things in SQL Server, but it just becomes a pain.
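For example, the DMV route means running something like this in each database, one at a time (a quick sketch against sys.sql_modules, with "whatever" as the search string):

SELECT OBJECT_SCHEMA_NAME(m.object_id) AS SchemaName,
       OBJECT_NAME(m.object_id) AS ObjectName,
       o.type_desc
FROM sys.sql_modules m
JOIN sys.objects o ON o.object_id = m.object_id
WHERE m.definition LIKE '%whatever%'   -- the string you are hunting for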

Then along came SQL Search (http://www.red-gate.com/products/SQL_Search/index.htm) from Redgate. And it is free! It works like a charm and eliminates the need to fire up some query and change what you are looking for every time. Check it out the next time you are trying to find every object that has "whatever" in its T-SQL DDL.

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SQL Server 2008 – Intellisense – Update Local Cache

Since we recently upgraded to SQL 2008, I now have some small tidbits I can share! First one is this: Intellisense updating.

SQL 2008 has built-in Intellisense, which is pretty awesome. Until you add new objects, that is, and then everything is red underlined. What to do?

Well, if you are writing some T-SQL in SQL Server Management Studio (SSMS) and run into this issue with newly created objects, just use this command:

CTRL+SHIFT+R

and you are all set. The local Intellisense cache will be rebuilt and your new tables/procs/whatever will show up in Intellisense. Cool!

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

Dev and Prod Systems, Using a HOSTS file to ensure your datasource is pointing at the right system

In many shops, I am guessing there are multiple servers: Development, Production, maybe a Staging, etc.

With SQL Server Analysis Services (SSAS) and SQL Server Integration Services (SSIS), you set up data sources, connections to databases. In SSAS you usually have a connection that you build your data source view off of, and in SSIS you have connections that you push data to and pull data from.

Another thing, in SSAS you can “deploy” right from Visual Studio (BIDS). All these things have a server name. What we have run into is this:

You develop on your local machine, pointing at the development server. You deploy to development, your connections are pointing to development, and everything works great. Then when you deploy to production (usually planned, every 2 weeks, or whatever), what ends up happening?

In SSIS, your config files should have a connection string (or however you store it) and it should point to production. But in SSAS, if you deploy from BIDS, your data source has to change, and in the cube project properties you need to change your deployment server.

I have seen countless times a cube, or a connection in SSIS without a config, that is running in production yet pointing at development. We keep our dev data fresh or very close to it, so sometimes we don't even notice. But then it happens: something weird is reported, we dig into it, and we find the erroneous connection string.

Here is my solution to the problem:

Developers – go to C:\Windows\System32\Drivers\Etc and open your Hosts file with Notepad or another text editor. Then add a couple of entries:

#production
#xxx.xxx.xxx.xxx datawarehouse

#development
yyy.yyy.yyy.yyy datawarehouse

where xxx.xxx.xxx.xxx is the IP of your production system and yyy.yyy.yyy.yyy is the IP of your dev system. The # is the rem/comment-out symbol. You can see above I have everything commented out except the line for the dev system. But notice each is pointed at "datawarehouse", so if I ping or connect to "datawarehouse" from Management Studio, or whatever, it goes to the IP on the line that is not commented out.

Now, go on to each server, but only add the line that corresponds to that server in its hosts file, or better yet just use:

127.0.0.1 datawarehouse

Now, when you deploy to either server, and your connections, etc are set to connect to “datawarehouse” you ensure it will always connect to the local server. Brilliant!

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SQL Server Master Data Services (Nov CTP)

Last night I configured SQL Server Master Data Services (MDS) on a test box. It looks good so far. I ran into a few issues with the box/setup that I had to tweak in order to get it working: I had to allow "handlers" and "modules" in the applicationHost config on the machine, and IIS had also inadvertently been set up with only anonymous access, which was a problem. After I got Windows auth installed and turned on, everything seemed to work.

The app/system itself is pretty slick. Very basic, but it lets you do complex things. Once you get some users set up and a few models (think: Product, Customer), you can add entities (think: Category1, Category2, etc.), and you can set up hierarchies, business rules, and so on.

I haven't played much more with it, but it seems like it could get the job done. I would say some things aren't intuitive enough. Example – they could at least say "drag this over to this area", but there is nothing as far as what to do; it's kind of guess and check.

I’m excited to see where MDS goes.


Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SQL Server Schema Automatic Revision History using DDL Triggers and SVN

SQL 2005 introduced DDL triggers, which let you catch events on your DDL statements in SQL Server. Hopefully most DBAs are catching them, at least to a table, and have some kind of report on who is adding, changing, and deleting which objects.

What I wanted to do was capture that, but also keep an automatic running log in SVN (Subversion) source control. Here is how I went about it.

First, you need SVN set up somewhere. We use Unfuddle (http://unfuddle.com/) – which we also use for Agile BI stuff, but that is another post 🙂 – and Unfuddle also lets us do SVN in the cloud, but you could host it locally or internally or wherever you'd like. I had to create a user for this, to do the automatic commits, and give it rights to commit to the repo.

Second, you probably already have your DDL triggers set up to capture events on your databases and write to a table; writing to a central DB on a server is probably a good idea. Once you have that, or something similar, create a trigger on that table to capture the INSERTED action, parse what you need from the record that was just inserted, and automatically commit to SVN.
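If you don't have that base capture in place yet, here is a minimal sketch of what I mean – the dbo.DDLEventLog table and trigger name are assumptions I picked to match the trigger later in this post, and a server-scoped trigger writing to a central database would work the same way:

CREATE TABLE dbo.DDLEventLog
(
    DDLEventLogID INT IDENTITY(1,1) PRIMARY KEY,
    EventInstance XML NOT NULL,
    InsertedDate DATETIME NOT NULL DEFAULT (GETDATE())
)
GO

-- database-scoped DDL trigger that just saves the raw EVENTDATA() XML
CREATE TRIGGER CaptureDDLEvents ON DATABASE
FOR DDL_DATABASE_LEVEL_EVENTS
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.DDLEventLog (EventInstance)
    SELECT EVENTDATA();
END
GO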

But first, you want to script out all your objects for a database to a folder, say C:\DBSchemas.

I created a folder structure to help me

DBSchemas\ServerName\DatabaseName\ObjectType

and then scripted out all the objects to the corresponding folder for each ObjectType (Tables, StoredProcedures, Functions, and Views).

Once that was done, I did an initial import into SVN and made the folder I was working with a working copy. Then the fun starts.

I created a couple of procs (which I found online – links to the blogs are below) to CREATE and DELETE files from T-SQL using OLE Automation.

CREATE/APPEND (http://sqlsolace.blogspot.com/2009/01/ole-automation-write-text-file-from.html)

CREATE PROCEDURE dbo.usp_OLEWriteFile (@FileName VARCHAR(1000), @TextData NVARCHAR(MAX), @FileAction VARCHAR(12)) AS

BEGIN
DECLARE @OLEfilesytemobject INT
DECLARE @OLEResult INT
DECLARE @FileID INT

EXECUTE @OLEResult =
 sp_OACreate 'Scripting.FileSystemObject', @OLEfilesytemobject OUT
IF @OLEResult <> 0
  PRINT 'Error: Scripting.FileSystemObject'

-- check if file exists
EXEC sp_OAMethod @OLEfilesytemobject, 'FileExists', @OLEresult OUT, @FileName
-- if file exists
IF (@OLEresult=1 AND @FileAction = 'APPEND') OR (@OLEresult=0)
BEGIN

IF (@FileAction = 'CREATENEW')
 PRINT 'New file specified, creating...'
IF (@OLEresult=1 AND @FileAction = 'APPEND')
 PRINT 'File exists, appending...'
IF (@OLEresult=0 AND @FileAction = 'APPEND')
 PRINT 'File doesn''t exist, creating...'

 -- open file (iomode 8 = ForAppending, create = 1 so the file is created if missing)
 EXECUTE @OLEResult = sp_OAMethod @OLEfilesytemobject, 'OpenTextFile', @FileID OUT,
 @FileName, 8, 1
 IF @OLEResult <> 0 PRINT 'Error: OpenTextFile'

 -- write the text to the file
 EXECUTE @OLEResult = sp_OAMethod @FileID, 'WriteLine', Null, @TextData
 IF @OLEResult <> 0
  PRINT 'Error : WriteLine'
 ELSE
  PRINT 'Success'
END
IF (@OLEresult=1 AND @FileAction = 'CREATENEW')
 PRINT 'File Exists, specify APPEND if this is the desired action'

EXECUTE @OLEResult = sp_OADestroy @FileID
EXECUTE @OLEResult = sp_OADestroy @OLEfilesytemobject

END
GO

DELETE (http://www.kodyaz.com/articles/delete-file-from-sql-server-xp-cmdshell-ole-automation-procedures.aspx)

-- wrapped as a proc here so the trigger below can call it as dbo.usp_OLEDeleteFile
CREATE PROCEDURE dbo.usp_OLEDeleteFile (@FileName VARCHAR(1000)) AS
BEGIN
DECLARE @Result INT
DECLARE @FSO_Token INT

EXEC @Result = sp_OACreate 'Scripting.FileSystemObject', @FSO_Token OUTPUT
EXEC @Result = sp_OAMethod @FSO_Token, 'DeleteFile', NULL, @FileName
EXEC @Result = sp_OADestroy @FSO_Token
END
GO

You need to make sure OLE Automation is on, and that the account you are running SQL Server as has modify rights to your DBSchemas folder.
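If they aren't enabled yet, OLE Automation (and xp_cmdshell, which the trigger below shells out to for the svn commands) are both just sp_configure switches – with the usual caveats about turning these on:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Ole Automation Procedures', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;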

But the crux of the solution is the trigger that gets the DDL info, writes/deletes the files, and does the SVN add/delete/commit of the file. Now, this is some ugly one-hour SQL script craziness, tons of IF statements, etc. It could be improved, but it works, and it is a start; it can be modified and tweaked to do whatever you want. Note: if your SVN repo isn't authenticated, you don't need the username/password on the SVN commands.

You can see it gets the DDL, checks the events (and I have it limited to one database), and checks what type of object and what operation: for an add, it adds the file and commits; for an alter, it deletes the file, recreates it, and commits; and for a drop, it does an svn delete and commits. Pretty easy 🙂




CREATE TRIGGER DDLRevisionHistory
	ON dbo.DDLEventLog
	AFTER INSERT
AS

BEGIN
SET NOCOUNT ON;

DECLARE @EventType VARCHAR(50)
DECLARE @DatabaseName VARCHAR(50)
DECLARE @ServerName VARCHAR(50)
DECLARE @ObjectName VARCHAR(100)
DECLARE @SchemaName VARCHAR(10)
DECLARE @CommandText VARCHAR(MAX)
DECLARE @LoginName VARCHAR(50)

SELECT
	@EventType = EventInstance.value('(//EventType)[1]', 'varchar(50)'),
	@DatabaseName = EventInstance.value('(//DatabaseName)[1]', 'varchar(50)'),
	@ServerName = EventInstance.value('(//ServerName)[1]', 'varchar(50)'),
	@ObjectName = EventInstance.value('(//ObjectName)[1]', 'varchar(50)'),
	@SchemaName =EventInstance.value('(//SchemaName)[1]', 'varchar(50)'),
	@CommandText = EventInstance.value('(//TSQLCommand//CommandText)[1]', 'varchar(max)'),
	@LoginName = EventInstance.value('(//LoginName)[1]', 'varchar(50)')
	FROM inserted

DECLARE @filepath VARCHAR(8000)

SET @filepath = 'C:\DBSchemas\' + @ServerName + '\' + @DatabaseName + '\'

	IF (
		@EventType = 'CREATE_VIEW' OR @EventType = 'ALTER_VIEW' OR @EventType = 'DROP_VIEW'
		OR @EventType = 'CREATE_TABLE' OR @EventType = 'ALTER_TABLE' OR @EventType = 'DROP_TABLE'
		OR @EventType = 'CREATE_PROCEDURE' OR @EventType = 'ALTER_PROCEDURE' OR @EventType = 'DROP_PROCEDURE'
		OR @EventType = 'CREATE_FUNCTION' OR @EventType = 'ALTER_FUNCTION' OR @EventType = 'DROP_FUNCTION'
		)

		AND @DatabaseName = 'YourDatabase' BEGIN


		-- write out new file to correct folder
		IF CHARINDEX('VIEW',@EventType) > 0 BEGIN
			SET @filepath = @filepath + 'Views\' + @SchemaName + '.' + @ObjectName + '.View.sql'
		END

		IF CHARINDEX('TABLE',@EventType) > 0 BEGIN
			SET @filepath = @filepath + 'Tables\' + @SchemaName + '.' + @ObjectName + '.Table.sql'
		END

		IF CHARINDEX('PROCEDURE',@EventType) > 0 BEGIN
			SET @filepath = @filepath + 'StoredProcedures\' + @SchemaName + '.' + @ObjectName + '.StoredProcedure.sql'
		END

		IF CHARINDEX('FUNCTION',@EventType) > 0 BEGIN
			SET @filepath = @filepath + 'Functions\' + @SchemaName + '.' + @ObjectName + '.UserDefinedFunction.sql'
		END

		IF CHARINDEX('CREATE',@EventType) > 0 BEGIN

			-- create file
			EXEC dbo.usp_OLEWriteFile @filepath,@CommandText,'CREATENEW'

			-- svn add
			DECLARE @instrAdd VARCHAR(4000)
			SET @instrAdd='svn add ' + @filepath + ' --username dbschema --password yourpassword'
			EXEC xp_cmdshell @instrAdd

			-- svn commit
			DECLARE @instrCommitAdd VARCHAR(4000)
			SET @instrCommitAdd='svn commit ' + @filepath + ' --message "added by '+ @LoginName +'" --username dbschema --password yourpassword'
			EXEC xp_cmdshell @instrCommitAdd

		END

		IF CHARINDEX('ALTER',@EventType) > 0 BEGIN

			--delete and readd file
			EXEC dbo.usp_OLEDeleteFile @filepath
			EXEC dbo.usp_OLEWriteFile @filepath,@CommandText,'CREATENEW'

			-- svn commit
			DECLARE @instrCommitChange VARCHAR(4000)
			SET @instrCommitChange='svn commit ' + @filepath + ' --message "changed by '+ @LoginName + '" --username dbschema --password yourpassword'
			--PRINT @instrCommitChange
			EXEC xp_cmdshell @instrCommitChange
		END

		IF CHARINDEX('DROP',@EventType) > 0 BEGIN
			-- svn delete
			DECLARE @instrDel VARCHAR(4000)
			SET @instrDel='svn delete ' + @filepath + ' --username dbschema --password yourpassword'
			EXEC xp_cmdshell @instrDel

			-- svn commit
			DECLARE @instrCommitDel VARCHAR(4000)
			SET @instrCommitDel='svn commit ' + @filepath + ' --message "deleted by '+ @LoginName +'" --username dbschema --password yourpassword'
			EXEC xp_cmdshell @instrCommitDel
		END

	END

END

As you can see, you can create a homegrown revision history of your DDL objects in SQL Server. I have tested this on the basic operations (no renames, etc.) using the GUI, but if you do use it, you might want to wrap it all in exception handling just to be on the safe side.
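For example, a minimal sketch of that exception handling is just wrapping the body of the trigger in TRY/CATCH, so a failed file write or svn call never rolls back the DDL statement that fired it:

BEGIN TRY
    -- ... the file write / svn add / svn commit logic from the trigger above ...
    EXEC xp_cmdshell @instrCommitAdd
END TRY
BEGIN CATCH
    -- log or print and swallow, so the original DDL change still goes through
    PRINT 'SVN revision history failed: ' + ERROR_MESSAGE()
END CATCH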

Happy DBA’ing 🙂

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SSIS – Custom Control Flow Component – Execute SQL Job And Wait

Sometimes you have some pretty complex ETLs going in SSIS, and you might have multiple projects/solutions that need to call other SSIS packages or SQL Agent jobs – a pretty big production going on. You might have an ETL solution that needs to kick off other packages, and you can either import those into your solution or call them where they live on the file system/SQL Server, etc. You might have to call some SQL Agent jobs, and most often these are async calls (you don't need to wait for them to come back); this works nicely, and I do it all the time. The Execute SQL Agent Job task in SSIS works fine, or you can just call the SQL statement to execute a job. Either way, it kicks off the job, comes back successful right away, and doesn't care whether the job actually succeeds. You might want this in some scenarios, and the built-in functionality works great.

But what if you want to call an existing SQL Agent job and actually wait for the job to finish (success or failure)? There isn't anything I could see built into SSIS to do this, and sp_start_job is asynchronous, so you are out of luck there. I figured I could call sp_start_job, then create a For Loop in SSIS and just check the status every X seconds/minutes, but I would have to either make that a package I could use everywhere or reproduce the same logic in multiple solutions, so I shied away from that approach.
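For reference, that check-and-wait loop is simple enough to sketch in plain T-SQL (the job name here is just a placeholder) – it starts the job and then polls msdb until it is no longer executing, which is the same kind of polling the custom task below does from .NET:

EXEC msdb.dbo.sp_start_job @job_name = N'MyAgentJob'  -- returns immediately

WHILE EXISTS (SELECT 1
              FROM msdb.dbo.sysjobactivity ja
              JOIN msdb.dbo.sysjobs j ON j.job_id = ja.job_id
              WHERE j.name = N'MyAgentJob'
                AND ja.start_execution_date IS NOT NULL
                AND ja.stop_execution_date IS NULL)
BEGIN
    WAITFOR DELAY '00:00:05'  -- check every 5 seconds
END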

What I decided to do was build a custom SSIS control flow task in .NET that will execute a SQL Agent job, check the status, and wait until it finishes. A disclaimer: this is going to be a lot of code 🙂 Also, it could be improved (but what couldn't?) – this was a 1.5–2 hour experiment.

First, I created a VS2008 C# class library. I tried adding a UI to my task, but I couldn't get it working, so there is some code in there for that but it's commented out.

Here is what my solution looks like:

[screenshot: the solution in Visual Studio]

Import the correct namespaces:

using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.SqlServer.Dts.Runtime;
using System.Net;
using System.Net.NetworkInformation;
using System.Xml;
using Microsoft.SqlServer.Dts.Runtime.Design;
using System.Data.SqlClient;

Next, you need to create the actual skeleton/wrapper for your component. You can see I have two properties: job name and server name. It could be expanded to take a connection string or use an existing connection in SSIS, but I wasn't that ambitious. The "Execute" method basically just calls some functions and waits for the result.

namespace ExecuteSQLJobAndWaitControlTask
{

    [DtsTask(
        Description = "Execute SQL Job And Wait",
        DisplayName = "Execute SQL Job And Wait",
        TaskContact = "Steve Novoselac",
        TaskType = "SSIS Helper Task",
        RequiredProductLevel = DTSProductLevel.None)]
    public class ExecuteSQLJobAndWaitControlTask : Task, IDTSComponentPersist

    {
        private string _jobName;
        private string _serverName;

        /// <summary>
        /// The sql job name
        /// </summary>
        public string JobName
        {
            get { return _jobName; }
            set { _jobName = value; }
        }

        /// <summary>
        /// The sql server name
        /// </summary>
        public string ServerName
        {
            get { return _serverName; }
            set { _serverName = value; }
        }

        public override DTSExecResult Execute(Connections connections, VariableDispenser variableDispenser, IDTSComponentEvents componentEvents, IDTSLogging log, object transaction)
        {
            try
            {
                StartJob();
                System.Threading.Thread.Sleep(5000);

                do
                {
                    System.Threading.Thread.Sleep(5000);
                } while (IsJobRunning());

                if (DidJobSucceed())
                {
                    return DTSExecResult.Success;
                }
                else
                {
                    return DTSExecResult.Failure;
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
                return DTSExecResult.Failure;
            }
        }

        public override DTSExecResult Validate(Connections connections, VariableDispenser variableDispenser, IDTSComponentEvents componentEvents, IDTSLogging log)
        {
            if (string.IsNullOrEmpty(_serverName) || string.IsNullOrEmpty(_jobName))
            {
                componentEvents.FireError(0, "You must specify a JobName and ServerName in the properties", "", "", 0);
                return DTSExecResult.Failure;
            }
            else
            {
                return DTSExecResult.Success;
            }
        }

        void IDTSComponentPersist.LoadFromXML(System.Xml.XmlElement node, IDTSInfoEvents infoEvents)
        {
            if (node.Name != "ExecuteSQLJobAndWaitTask")
            {
                throw new Exception(string.Format("Unexpected task element when loading task - {0}.", "ExecuteSQLJobAndWaitTask"));
            }
            else
            {
                this._jobName = node.Attributes.GetNamedItem("JobName").Value;
                this._serverName = node.Attributes.GetNamedItem("ServerName").Value;
            }
        }

        void IDTSComponentPersist.SaveToXML(System.Xml.XmlDocument doc, IDTSInfoEvents infoEvents)
        {
            XmlElement taskElement = doc.CreateElement(string.Empty, "ExecuteSQLJobAndWaitTask", string.Empty);

            XmlAttribute jobNameAttribute = doc.CreateAttribute(string.Empty, "JobName", string.Empty);
            jobNameAttribute.Value = this._jobName.ToString();
            taskElement.Attributes.Append(jobNameAttribute);

            XmlAttribute serverNameAttribute = doc.CreateAttribute(string.Empty, "ServerName", string.Empty);
            serverNameAttribute.Value = this._serverName.ToString();
            taskElement.Attributes.Append(serverNameAttribute);

            doc.AppendChild(taskElement);
        }

And then I have some helper methods; this is where the meat and potatoes of this task are. Now, of course, I could factor out the connection string, etc. Like I said, it was a quick thing :). The heart of it, though, is starting the job, checking whether it is still running, and then checking whether it succeeded. Pretty simple.


        private bool DidJobSucceed()
        {
            SqlConnection dbConn = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=msdb;Data Source=" + ServerName);
            SqlCommand dbCmd = new SqlCommand("exec msdb.dbo.sp_help_job @job_name = N'" + JobName + "' ;", dbConn);
            dbConn.Open();

            SqlDataReader dr = dbCmd.ExecuteReader();
            dr.Read();
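            // sp_help_job: last_run_outcome of 1 means the last run succeeded (0 = failed)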
            int status = Convert.ToInt32(dr["last_run_outcome"]);
            dr.Close();

            dbConn.Close();

            if (status == 1)
            {
                return true;
            }
            else
            {
                return false;
            }
        }

        private bool IsJobRunning()
        {

            SqlConnection dbConn = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=msdb;Data Source=" + ServerName);
            SqlCommand dbCmd = new SqlCommand("exec msdb.dbo.sp_help_job @job_name = N'" + JobName + "' ;", dbConn);
            dbConn.Open();

            SqlDataReader dr = dbCmd.ExecuteReader();
            dr.Read();
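            // sp_help_job: current_execution_status of 1 means the job is still executing (4 = idle)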
            int status = Convert.ToInt32(dr["current_execution_status"]);
            dr.Close();

            dbConn.Close();

            if (status == 1)
            {
                return true;
            }
            else
            {
                return false;
            }

        }

        private void StartJob()
        {
            SqlConnection dbConn = new SqlConnection("Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=msdb;Data Source=" + ServerName);
            SqlCommand dbCmd = new SqlCommand("EXEC dbo.sp_start_job N'" + JobName + "' ;", dbConn);
            dbConn.Open();
            dbCmd.ExecuteNonQuery();
            dbConn.Close();
        }
    }
}

Now, to install this you need to register it in the GAC (Global Assembly Cache) and then copy it to the DTS/Tasks folder. Depending on whether you have VS2005 or VS2008 (or both), your gacutil path might be different.

c:
cd "C:\Program Files\Microsoft SDKs\Windows\v6.0A\bin"
gacutil /uf "ExecuteSQLJobAndWaitTask"
gacutil /if "C:\Projects\SSISCustomTasks\ExecuteSQLJobAndWait\bin\Debug\ExecuteSQLJobAndWaitTask.dll"
copy "C:\Projects\SSISCustomTasks\ExecuteSQLJobAndWait\bin\Debug\ExecuteSQLJobAndWaitTask.dll" "C:\Program Files\Microsoft SQL Server\90\DTS\Tasks"

I have found that once you have done that, you need to actually restart your SSIS service to make it work, but then you can use it in new Visual Studio SSIS packages.


Once you drag it onto your package, you can set the JobName and ServerName properties (from the Properties window – remember, no GUI), and it should run.

Some notes:

If you kill the job, the SSIS task will fail (obviously). If you kill the SSIS package, the job will keep running. Maybe a future enhancement will be to capture the SSIS package fail/cancel and kill the job. Maybe 🙂

Attached is the source code for the task (Vs2008 C#) https://onedrive.live.com/redir?resid=ac05d3c752d3b50a!187358&authkey=!APJ06uEMqWiLlIM&ithint=file%2crar

This has been tested with BIDS VS2005. I take no responsibility if this blows up your system, computer, server, the world, etc.

Happy ETL’ing!

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SSASMeta – C# App to Log Info About SSAS Objects

I manage some servers that have many cubes. OK, a lot of cubes (60+ on one). I needed some way to output a report of last processed time, last schema update, etc. Now, there are about five different ways to do this (one being the SSAS Stored Procedure Project), but this is what I came up with: I wrote a ~100 line C# app that takes a server name and loops through the SSAS databases, cubes, measure groups, partitions, and dimensions and logs info about them.

Here is a C# snippet of a function that just outputs to the console; the app I have actually logs the info to a SQL Server database, and then I can write reports off that.

     private static void LogSSASInfo(string serverName)
        {
            var server = new Server();
            server.Connect(serverName);

            foreach (Database database in server.Databases)
            {
                Console.WriteLine(database.Name + " " + database.LastUpdate + " " + database.EstimatedSize / 1024 + " " + database.CreatedTimestamp);

                foreach (Cube cube in database.Cubes)
                {
                    Console.WriteLine("     Cube: " + cube.Name + " " + cube.LastProcessed + " " + cube.LastSchemaUpdate);

                    foreach (MeasureGroup measureGroup in cube.MeasureGroups)
                    {
                        Console.WriteLine("         Measure Group: " + measureGroup.Name + " " + measureGroup.LastProcessed);

                        foreach (Partition partition in measureGroup.Partitions)
                        {
                            Console.WriteLine("             Partition: " + partition.Name + " " + partition.LastProcessed);
                        }
                    }
                }

                foreach (Dimension dimension in database.Dimensions)
                {
                    Console.WriteLine(" Dimension: " + dimension.Name + " " + dimension.LastProcessed);
                }

                Console.WriteLine("");
                Console.WriteLine("------------------------------------------------");
                Console.WriteLine("");
            }

            server.Disconnect();
        }

As you can see, it isn’t the most elegant code in the world, but it works. In order to get this to work in your project, you need to reference the Microsoft.AnalysisServices assembly.


Use your imagination: you could wrap that function above in an app and log info for all the SSAS instances on your network. There have been a few times in the last year where I have found some cube or measure group not updating correctly, and a report like the one I can get now helps deal with that challenge.
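If you go the database-logging route like I did, the table behind it doesn't need to be fancy. Here is a rough sketch (the table and column names are just hypothetical), along with the kind of "what hasn't processed lately" report it enables:

CREATE TABLE dbo.SSASObjectLog
(
    ServerName VARCHAR(128),
    DatabaseName VARCHAR(128),
    ObjectType VARCHAR(50),     -- Database, Cube, MeasureGroup, Partition, Dimension
    ObjectName VARCHAR(256),
    LastProcessed DATETIME,
    LastSchemaUpdate DATETIME NULL,
    LoggedDate DATETIME DEFAULT (GETDATE())
)

SELECT ServerName, DatabaseName, ObjectType, ObjectName, LastProcessed
FROM dbo.SSASObjectLog
WHERE LastProcessed < DATEADD(DAY, -1, GETDATE())
ORDER BY LastProcessed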

Categories
Business Intelligence SQLServerPedia Syndication

SQL Job – Check Cube Valid Data as Last Step

If you are running a SQL Agent job to do ETL/cube processing, you might also want to check the status of the cube after you process it, just to make sure.

Create a job step that is a T-SQL type, and add something like this:


DECLARE @forecast VARCHAR(10)

SELECT @forecast = CAST("[Measures].[Forecast-Part]" AS VARCHAR(10))
FROM OPENROWSET('MSOLAP', 'Data Source=localhost;Initial Catalog=ComponentForecast;',
    'SELECT { [Measures].[Forecast-Part] } ON COLUMNS FROM [ComponentForecast]')

IF @forecast = '0' OR @forecast IS NULL
    RAISERROR ('Cube Data Not Loaded Correctly', 17, 1)

Of course, your MDX query in the OPENROWSET will need to be different depending on your cube. If you want to get more complicated, you can also just call a stored procedure and let your imagination run wild with what you can do.

* update – fixed sql code – changed from BIGINT to VARCHAR(10)

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

ETL Method – Fastest Way To Get Data from DB2 to Microsoft SQL Server

For a while, I have been working on figuring out a "better" way to get data from DB2 to Microsoft SQL Server. There are many different options, approaches, and environments; this one is mine, and your mileage may vary.

Usually, when pulling data from DB2 to any Windows box, the first thing you might think of is ODBC. You can use the Microsoft DB2 driver (which works, if you are lucky enough to get it configured and working), the IBM iSeries Client Access ODBC driver (which works well), or another 3rd party ODBC driver. Using ODBC, you can access DB2 with a ton of different clients: Excel, WinSQL, any 3rd party SQL tool, an MSSQL linked server, SSIS, etc. ODBC connects just fine and will work for "querying" needs. Also, with the drivers you install, you can usually set up an OLE DB connection if your client supports it (SSIS, for example) and query the data using OLE DB – this works as well, but there are some caveats, which I will talk about.

In comes SSIS, the go-to ETL tool for MSFT BI developers. You want to get data from DB2 to your SQL Server data warehouse, or wherever. You try with an OLE DB connection source, but it is clunky, weird, and sometimes doesn't work at all (PrimeOutput errors, anyone?). If you do manage to get OLE DB configured and working, you are still probably missing out on some performance compared to the method I am going to describe.

Back to SSIS, using ODBC. It works. You have to create an ADO.NET ODBC connection and use a DataReader source instead of an OLE DB source. Everything works fine, except one thing: it is slow! Further proof?

http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/162e55e5-b64b-423e-94c1-dd764ca1f683

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=96977

http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/cfade7e7-50d5-4447-9821-35c5d5ae1b66

http://www.sqlservercentral.com/Forums/Topic702042-148-1.aspx

http://www.sqlservercentral.com/Forums/Topic666993-148-1.aspx

OK, enough links. But if you do read those: SQL 2000 DTS is faster at this than SQL 2005/2008 SSIS. WTF? The best I can guess is that it is because of the .NET wrapper around ODBC – DTS is using "native" ODBC.

So, now what? Do we want to use DTS 2000? No. What to do though?

Well, after a few days of research and just exploring around, I think I have found a good answer: replace DB2 with SQL Server... just kidding. Here is what you need to do:

Install the IBM Client Access tools. There is a tool called "Data Transfer From iSeries Server"; the actual exe is "C:\Program Files\IBM\Client Access\cwbtf.exe".


This little tool allows you to set up data transfers from your DB2 system to multiple output choices (Display, Printer, HTML, and Text). We want to export to a text file on our file system. You have to set up a few options, like the file name, etc. In "Data Options" you can set up a WHERE statement, aggregates, and so on.

If you output to a file, you can go into "Details" and choose a file type, etc. I use ASCII Text, and then in the "ASCII file details" I uncheck all the checkboxes. You set up your options and then hit the "Transfer data from iSeries" button, and it will extract data to the file you chose in the filename field. Pretty sweet. But this is a GUI – how can I use this tool? I am not going to run it manually. Well, you are in luck.

If you hit the "Save" button, it will save a .dtf file for you. If you open this .dtf file in a text editor, you will see all the options defined in text, in a faux INI style. Awesome, we are getting somewhere.

Now, how do you run this from a cmd prompt? Well, we are in luck again. Dig around in C:\Program Files\IBM\Client Access and you will find a little exe called "rxferpcb".


What this tool allows you to do is pass in a "request" (aka a .dtf file) and a user ID/password for your DB2 system, and it will execute the transfer for you. Sweet!

Now what do we do from here?

1) Create an SSIS package

2) Create an Execute Process task, call rxferpcb, and pass in your arguments.

3) Create a BULK Insert task and load up the file that the Execute Process task created. (Note: you have to create a .FMT file for the fixed-width import. I created a .NET app to load the FDF file (the transfer description), which auto-creates a .FMT file for me, and a SQL CREATE statement as well – saving time and tedious work.) A rough example of the BULK INSERT side is shown below.
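For illustration, the BULK INSERT end of step 3 looks roughly like this – the table name, extract file, and format file paths are placeholders:

BULK INSERT dbo.StagingOrders
FROM 'C:\Extracts\ORDERS.txt'
WITH (FORMATFILE = 'C:\Extracts\ORDERS.fmt', TABLOCK)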

Now take two minutes and think about how you could make everything generic/expression/variable driven, and you have yourself a sweet little SSIS package to extract any table from DB2 to text and bulk load it.

[screenshot: SSIS package – Full Load]

What is so great about the .dtf files is that you can modify them with a text editor, which means you can create/modify them programmatically. Think: setting WHERE statements for incremental loads, etc.

[screenshot: SSIS package – Incremental Load]

You can see from the two screenshots above that that is all there is. Everything is expression/variable driven: full load and incremental load, using nothing but .dtf files, rxferpcb, and a little .NET app I wrote to automatically create DTFs for incremental loads (WHERE statements), truncate, delete, and bulk insert. I can load up any table from DB2 to SQL by just setting three variables in a parent package.

After you wrap your head around everything I just went over, stop to think about this: the whole DTF/data transfer mechanism is exposed in a COM API, "Data Transfer Automation Objects".

http://www-912.ibm.com/s_dir/slkbase.NSF/643d2723f2907f0b8625661300765a2a/0c637d6b03f927ff86256a710076ab22?OpenDocument

With that information at your disposal, you could really do some cool things. Why not create an SSIS source adapter that wraps that COM object, dumps the rows directly into the SSIS buffer, and then does an OLE DB insert or bulk insert using the SQL Server destination?

I have found in my tests that I can do a full, complete load of tables with over 100 million rows in about 6–7 hours, and 30–40 million row tables in about 4 hours: 2 to extract, 2 to BULK INSERT. Again, your mileage may vary depending on the width of your table, network speed, disk I/O, etc. To compare: with ODBC, just pulling and inserting 2 million records was taking over 2 hours, and I didn't wait around for it to finish. Pulling 2 million records with the method described in this blog takes about 3–5 minutes (or less!).

I know I have skimmed over most of the nitty-gritty details in this post, but I hope to convey, from a high level, that ODBC/OLE DB just aren't as fast as the method here. I have spent a lot of time over the last few weeks comparing and contrasting performance and manageability. Now, if I could just get that DB2 server upgraded to SQL Server 2008... Happy ETL'ing!

Categories
Business Intelligence SQLServerPedia Syndication

The problem isn’t SQL Server. It’s you.

Throughout all my years in different places, I have seen SQL Server, Oracle, Firebird, MySQL, DB2, Zortec, Access, and probably a few other crazy databases set up, run, and administered. Of course, most of them along the way have been Microsoft SQL Server (6.5, 7, 2000, 2005, 2008). I've worked with some knowledgeable DBAs, and in those cases everything usually turns out OK.

But sometimes, in some department or place – your buddy down the street wanting to start a new company, your girlfriend's place of work that wants to track orders, whatever – they try to get SQL Server running, and what sometimes happens next just makes my head spin. Microsoft, bless them, sometime in the past (not so much now) tried to market SQL Server as "self manageable". Probably sometime between 6.5 and 7 they tweaked some update-stats routines and schedules, and it's all good, right? Set autogrow by default and you are good to go. Wrong.

What this awesome marketing strategy did was get people, places, and organizations, mostly ISVs, to use SQL Server: install it, get their app running, and walk away. Of course it runs for a while; it runs like a champ, even. But then months, even years go by and the system starts running slow. There is no DBA around – they didn't need one, SQL Server manages itself! Wrong again.

What you might end up with, though, are people using the system who know a little bit – enough to be deadly, even – and they start making changes, when in reality you need a full-fledged DBA to manage your server and database, hence the name DBA (database administrator). But before the DBA comes onto the scene to save the day, you will have the people that blame SQL Server. "Oh, SQL Server doesn't work at all, it can't perform"… or "Our other databases run 10x as fast, what gives?" (not mentioning they have 3 DBAs for those "other" databases, but not for MS SQL). And the quotes keep coming.

That is why the title of this post is what it is.

The problem isn't SQL Server, it's you. If you fail to realize that MS SQL is an enterprise-class database system, and treat it like some out-of-the-box, already-configured, plug-and-play system, you are going to run into issues eventually. You need a DBA. It is probably best to have one BEFORE you implement any system, even if it is a consultant to guide your implementation and assist as time goes on.

I sometimes get tired of trying to argue that MS SQL can hold its own against Oracle, DB2, whatever. Trust me, it can. I could probably go find tons of SQL DBAs that would back me up as well. It is all about how you manage and administer it! SQL Server does just fine, as long as you know what you are doing – just like any system. I sometimes think that if we took SSMS away and made everything command line/scripting, people "outside" of the MS SQL community would see how MS SQL compares to their own systems.

This post isn't meant to be a beat-down rant or anything; the same things could be said for .NET compared to Java, C++, or whatever. It just seems that people who live and breathe Microsoft SQL need to know what the other RDBMS/BI systems are capable of, but for some reason the same isn't true for people that use the other systems. They kind of just brush MS SQL off as a play toy, something that shouldn't be taken seriously, a "hobbyist" SQL system – something that any enterprise wouldn't be caught dead running, unless of course you are Microsoft. 🙂

I'm still hedging my bets on MS SQL and .NET. I haven't seen anything better for the price and ease of use, and the best part about it is the community; the MS SQL and development community is huge compared to anything else, and to me that just puts the icing on the cake. Just remember: the next time someone who needs an MS SQL DBA but doesn't have one complains about the performance of their system, you can tell them it's not SQL Server's fault – it's probably the neglect of SQL Server that caused the problems.