
Microsoft Business Intelligence Development in a Team Environment

Today I received an email asking, in essence, about best practices for developing with SQL Server Integration Services (SSIS) and Business Intelligence Development Studio (BIDS) in a team environment. Here is part of the email:

Me and another DBA belong to the same team, we have a SQL server with SSIS running. We use the SSIS transfer data among multiple data sources. In SQL 2000 DTS, both of us can save the package on the server and open/edit it in the enterprise manager. In SQL 2005, I can see the package on server, but can’t open it directly. We came out a solution: create a shared folder on the server called ‘SSIS Projects’, both of us can access to it. We run the ‘SQL Server Business Intelligence Development Studio’ on local PC, to open the project in that shared folder. When done with the change, save the package to the SSIS server. Now, we have more than 50 packages in a project. Problem is: it’s very slow when open a project, ‘Business Intelligence Development Studio’ tends to open/verify every single package inside a project, takes up to 10 mins and getting worse. We really miss the SQL 2000 DTS, but we have to turn to SQL 2005.

  1. Are we doing the right thing? Is there any better solution for SSIS developing in a team environment?

  2. When open a project, does ‘Business Intelligence Development Studio’ has to open/verify every package?

 

This got me thinking, and I figured that instead of writing an email back, it would make good info for a blog post. So here is what I think, and some things I have done that have worked.

First, yes, SQL 2000 DTS lets you just edit packages on the server, it does more than SSIS, it is just way better than SSIS. Wait, what? Well, yeah, some people will say that, because it does one thing that might be a little rigmarole in SSIS, but no, SQL 2000 DTS is not better than SSIS. Just wanted to clear that up.

So, this is meant to be a starting point, by no means all-encompassing, and as always, YMMV.

One thing that I first thought about is this: if BI devs and SQL devs have never really worked in a team environment developing software, how would they know what to do, or what the best practices are? They would just go about “making it work” until everything breaks, or who knows what.

 

So how to develop Microsoft Business Intelligence Solutions in a team environment?

 

1) Standardize on Versions

 

First, figure out what “versions” you are going to support, and what you are going to use, and get standardized on them. I am guessing the majority of BI devs right now are on the 2005 stack. Yeah, there is still probably a bit of 2000 legacy stuff out there, and some people are now getting into the 2008 stuff, but 2005 is pretty much the norm from what I see, at least at this point.

So, 2005. Get all your devs on 2005 on their machines – same patch level, etc. Get BIDS up to the same level. Get BIDS Helper installed everywhere. Strive to get all your ETL packages in SSIS 2005, get all your cubes to SSAS 2005, etc. Come to a consensus on things like config files for SSIS and naming conventions, both within your development environment and on disk – folder structure is key! With a smaller number of versions of things floating around, it makes it easy for anyone on the team to open up a solution and start hammering away without tons of setup.

2) Get Source Control

 

This is crucial! I have talked about source control in the past, and also about some systems that aren’t so great. Really, it doesn’t matter what you use; I prefer SVN. I install TortoiseSVN, SVN proper (to do scripting from the cmd line if I need to), and also purchase VisualSVN, an add-on that integrates SVN with Visual Studio. For 50 bucks you have your source control system. Visual SourceSafe works but is outdated; honestly, I hate it. Team Foundation Server is good, but expensive. Other options might be something like Git, etc. Whatever you do, just get a source control system going, and learn it well. Learn how to create repos, commit, update, revert, merge, etc. Set up a user for each BI dev and make sure they commit often, and make sure they leave comments in the source control log when they commit; history is your lifeline to go back to something if you need to! Note: exclude .suo files, bin and obj directories, .user files, etc. Anything that changes every time you build or open the solution, you want to exclude from source control.

 

3) Development Box

 

You now have your versions standardized and your source control set up. You can get most of your work done on your machine, but you need somewhere to test deployments, run scenarios, etc. Make sure you have a box comparable to your production server. Set it up the same, same software, etc. Make sure it’s backed up. Let all the devs know it’s a dev box; it can be wiped at any time for any reason if need be. It can be rebooted 5 times a day if need be. It’s a dev box! But you can test and develop and tweak and change settings to your heart’s content and not have to worry about breaking Mr. Executive’s reports.

 

4) Developing, Merging, Committing, Collaborating, Communicating

So now you have your setup, well.. set up. Start creating stuff: SSIS packages, ETLs, SSAS cubes, SSRS reports, the whole MSFT BI solution. This is where stuff can start to get tricky in a team environment, though. SSIS/SSAS/SSRS isn’t as clean cut as something like C#/VB.NET. Everything is in some form of XML behind the scenes, and with graphical editing, you can move stuff around and it changes the files. Things like that are going to be your enemy. This is why you need to collaborate and communicate. Usually one person should be working on one project at a time. If you get really good at communicating, then in SSIS at least you can have multiple people working on different packages. Also, in SSAS, dimension editing and the like can be done by multiple people at the same time as long as the dim is already hooked up to the cube. But you want to make sure that you communicate: “Hey, I am checking this in, you might want to do an update”, or “Is anyone working on this, or are they going to? I want to modify something, and I will check it in so you all can see it.”

You want to make sure you have your folder structure and solution/project structure set up well: C:\Projects, and then maybe a folder for each major project, like “CompanySales”, and under that, “ETL”, “Cube”, and “Reports”, with a solution under each containing one project of each type. You can also have a generic SSRS solution with many projects, which might work well for you. In any case, just come up with a standard and stick to it. Trust me, it will make your life easier. As for the question from the email above, it sounds like they have every package in one solution, one project. It sounds like it needs to be split into multiple solutions and multiple projects.
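
For illustration, one possible layout (the project names here are just examples):

C:\Projects\
    CompanySales\
        ETL\        (solution with one SSIS project)
        Cube\       (solution with one SSAS project)
        Reports\    (solution with one SSRS project)
    CompanyInventory\
        ETL\
        Cube\
        Reports\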

 

5) Deployment Scenarios and Strategies

Now that you have everything developed, tested, checked in, what do you do?

Personally, for SSIS I like xcopy deployments. One folder on the server (not on the C drive, but another drive, let’s say “E:\SSIS”), and under that a folder for each project. Put your dtsx and configs in the same folder. 99% of the time you are going to call the dtsx from a SQL Agent job, and most likely you are going to run into a scenario where you need elevated rights to execute it, so learn how to create a credential and a SQL Agent proxy in SQL security so you can run the job step as that account (a sketch follows below). Once you have this folder and its subfolders set up, you can use something like Beyond Compare to compare the folder on the server to the matching one you have locally. Remember to copy files from the bin directory of your project after you build it, not the files directly in your project. As far as BIDS validating every package when you open a project, there are workarounds out there; here is one.
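
To make that concrete, here is a minimal sketch of the credential/proxy setup; every name and the account below are made up, so substitute your own:

-- Create a credential that holds the Windows account the package should run as
USE master;
CREATE CREDENTIAL SSISRunAsCredential
    WITH IDENTITY = 'YOURDOMAIN\YourETLAccount',
    SECRET = 'StrongPasswordHere';

-- Create a SQL Agent proxy that uses the credential
USE msdb;
EXEC dbo.sp_add_proxy
    @proxy_name = N'SSISRunAsProxy',
    @credential_name = N'SSISRunAsCredential',
    @enabled = 1;

-- Allow the proxy to run SSIS package job steps
-- (subsystem_id 11 is the SSIS subsystem on SQL 2005/2008;
--  verify with: SELECT * FROM msdb.dbo.syssubsystems)
EXEC dbo.sp_grant_proxy_to_subsystem
    @proxy_name = N'SSISRunAsProxy',
    @subsystem_id = 11;

Once the proxy exists, it shows up in the “Run as” dropdown on the SSIS job step.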

For SSAS, I lean towards using the Deployment Wizard that comes with SQL Server. You can use BIDS deployment, but if you start doing anything advanced with roles, partitions, etc., you are going to run into trouble. Take control and use the Deployment Wizard. I usually like to deploy and then process manually when developing, and then later use SQL Agent or an SSIS package to actually do the processing when it comes to scheduled processing.

For SSRS, I have become used to the auto deployment from Visual Studio. To really do this, though, you need a project for every folder in SSRS, which can become a pain. You can always just upload the .RDL file and the connection and do it manually, but if you start off right by using the deployment from BIDS, it can make your life easier.

 

So that is just a 10-minute overview to kind of get started. Everything depends on your infrastructure, the way your team is set up, etc. But I think the biggest thing to take from all of the above is to standardize on things. If you standardize on as much as possible (SQL versions, setup of machines, naming conventions, layouts, design patterns, etc.), everyone can do things faster, and pretty soon it will start running like a well-oiled machine!


ETL Method – Fastest Way To Get Data from DB2 to Microsoft SQL Server

For a while, I have been working on figuring out a “better” way to get data from DB2 to Microsoft SQL Server. There are many different options, approaches, and environments; this one is mine, and your mileage may vary.

Usually, when pulling data from DB2 to any Windows box, the first thing you might think of is ODBC. You can use the Microsoft DB2 driver (which works, if you are lucky enough to get it configured and working), the IBM iSeries Client Access ODBC driver (which works well), or another 3rd party ODBC driver. Using ODBC, you can access DB2 with a ton of different clients: Excel, WinSQL, any 3rd party SQL tool, an MSSQL linked server, SSIS, etc. ODBC connects just fine and will work for “querying” needs. Also, with the drivers you install, you can usually set up an OLE DB connection if your client supports it (SSIS, for example) and query the data using OLE DB – this works as well, but there are some caveats, which I will talk about.
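
As a point of reference, here is a hedged sketch of the linked server route over ODBC; it assumes a system DSN already exists on the SQL Server box, and the DSN, linked server name, login, and DB2 objects below are all made up:

-- Linked server over the ODBC DSN, using the OLE DB Provider for ODBC Drivers (MSDASQL)
EXEC master.dbo.sp_addlinkedserver
    @server     = N'DB2LINK',
    @srvproduct = N'',
    @provider   = N'MSDASQL',
    @datasrc    = N'MYDB2DSN';

-- Map local logins to a DB2 user id/password
EXEC master.dbo.sp_addlinkedsrvlogin
    @rmtsrvname  = N'DB2LINK',
    @useself     = 'False',
    @rmtuser     = N'DB2USER',
    @rmtpassword = N'Db2PasswordHere';

-- OPENQUERY sends the statement to DB2 as-is, which is fine for "querying" needs
SELECT *
FROM OPENQUERY(DB2LINK, 'SELECT COL1, COL2 FROM MYLIB.MYTABLE');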

In comes SSIS, the go-to ETL tool for MSFT BI developers. You want to get data from DB2 to your SQL Server data warehouse, or whatever. You try with an OLE DB connection source, but it is clunky, weird, and sometimes doesn’t work at all (PrimeOutput errors, anyone?). If you do manage to get OLE DB configured and working, you are still probably missing out on some performance gains compared to the method I am going to describe.

Back to SSIS, using ODBC. It works. You have to create an ADO.NET ODBC connection, and use a DataReader source instead of an OLEDB source. Everything works fine, except one thing. It is slow! Further proof?

http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/162e55e5-b64b-423e-94c1-dd764ca1f683

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=96977

http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/cfade7e7-50d5-4447-9821-35c5d5ae1b66

http://www.sqlservercentral.com/Forums/Topic702042-148-1.aspx

http://www.sqlservercentral.com/Forums/Topic666993-148-1.aspx

Ok, enough links. But if you do read those, you will see that SQL 2000 DTS is faster than SQL 2005/2008 SSIS over ODBC. WTF? The best I can guess is that it is because of the .NET wrapper around ODBC; DTS uses “native” ODBC.

So, now what? Do we want to use DTS 2000? No. What to do though?

Well, after a few days of research and just exploring around, I think I have found a good answer: replace DB2 with SQL Server. Just kidding. Here is what you need to do:

Install the IBM Client Access tools. There is a tool called “Data Transfer From iSeries Server”; the actual exe is "C:\Program Files\IBM\Client Access\cwbtf.exe".

image

This little tool allows you to set up data transfers from your DB2 system to multiple output choices (Display, Printer, HTML, and Text). We want to export to a text file on our filesystem. You have to set up a few options, like the filename, etc. In “Data Options” you can set up a WHERE statement, aggregates, etc.

If you output to a file, you can go into “Details” and choose a file type, etc. I use ASCII Text, and then in the “ASCII file details” I uncheck all the checkboxes. You set up your options and then hit the “Transfer data from iSeries” button, and it will extract data to the file you chose in the filename field. Pretty sweet. But this is a GUI; how can I use this tool? I am not going to run this manually. Well, you are in luck.

If you hit the “Save” button, it will save a .dtf file for you. If you open this .dtf file in a text editor, you will see all the options defined in text, in a faux INI style. Awesome, we are getting somewhere.

Now, how do you run this from a cmd prompt? Well, we are in luck again. Dig around in C:\Program Files\IBM\Client Access and you will find a little exe called “rxferpcb”.

image

What this tool allows you to do is pass in a “request” (aka a DTF file) and a userid/password for your DB2 system, and it will execute the transfer for you. Sweet!

Now what do we do from here?

1) Create an SSIS package

2) Create an Execute Process Task, call rxferpcb, and pass in your arguments.

3) Create a BULK Insert task, and load up the file that the Execute Process Task created. (Note: you have to create a .FMT file for the fixed-width import. I created a .NET app that loads the FDF file (the transfer description) and auto-creates a .FMT file for me, and a SQL CREATE statement as well, saving time and tedious work. A sketch of the bulk load follows below.)
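
To make step 3 concrete, here is a minimal T-SQL sketch of what the bulk load boils down to; the staging table, paths, and format file names are all hypothetical:

-- Load the text file that rxferpcb just produced into a staging table,
-- using a format file that describes the fixed-width columns
BULK INSERT dbo.Staging_MyDb2Table
FROM 'E:\SSIS\DB2Extracts\MyDb2Table.txt'
WITH (
    FORMATFILE = 'E:\SSIS\DB2Extracts\MyDb2Table.fmt',
    TABLOCK
);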

Now take 2 minutes and think how you could make everything generic/expression/variable driven, and you have yourself a sweet little SSIS package to extract any table from DB2 to text and bulk load it.

image

What is so great about the .DTF files is that you can modify them with a text editor, which means you can create/modify them programmatically. Think – setting where statements for incremental loads, etc.

image

 

You can see from the two screenshots above that this is all there is. Everything is expression/variable driven: full load and incremental load. Using nothing but .dtf files, rxferpcb, and a little .NET app I wrote to automatically create DTFs for incrementals (WHERE statements), plus truncate, delete, and bulk insert tasks, I can load up any table from DB2 to SQL by just setting 3 variables in a parent package.

After you wrap your head around everything I just went over, stop and think about this: the whole DTF/Data Transfer setup is also exposed in a COM API, the “Data Transfer Automation Objects”:

http://www-912.ibm.com/s_dir/slkbase.NSF/643d2723f2907f0b8625661300765a2a/0c637d6b03f927ff86256a710076ab22?OpenDocument

With that information at your disposal, you could really do some cool things. Why not just create an SSIS source adapter that wraps that COM object, dumps the rows directly into the SSIS buffer, and then does an OLE DB insert or bulk insert using the SQL Server destination?

I have found in my tests that I can do a full, complete load of tables with over 100 million rows in about 6-7 hours, and 30-40 million row tables in about 4 hours: 2 hours to extract, 2 hours to BULK insert. Again, your mileage may vary depending on the width of your table, network speed, disk I/O, etc. To compare, with ODBC, just pulling and inserting 2 million records was taking over 2 hours; I didn’t wait around for it to finish. Pulling 2 million records with the method described in this blog takes about 3-5 minutes (or less!).

I know I have skimmed over most of the nitty gritty details in this post, but I hope to convey, from a high level, that ODBC/OLE DB just aren’t as fast as the method described here. I have spent a lot of time over the last few weeks comparing and contrasting performance and manageability. Now, if I could just get that DB2 server upgraded to SQL Server 2008… Happy ETL’ing!


Early Arriving Facts, Late Arriving Dimensions, Inferred Dimensions

Most ETL systems that populate data warehouses (at least the ones I have seen, studied, or worked on) run something like this:

 

1) Load Dims

a) populate an unknown member record

b) populate dim data

2) Load Facts

a) join/lookup to dims, and if no match, set to the “unknown” dimension record

3) Process Cube

 

This type of system works in many cases, but there are some flaws that bubble up over time. First, unless you reload your fact table or update the unknown dimension keys on your fact rows, you could end up with unknowns that stay unknown forever. The system described above also means you need to run it in that order: dims first, facts second.
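
If your fact rows keep the source business key, one way to clean those up later is a periodic re-key pass, something like this hedged sketch (all table and column names are made up):

-- Re-key fact rows that were loaded against the unknown member (-1)
-- now that a matching dimension row exists
UPDATE f
SET    f.CustomerKey = d.CustomerKey
FROM   dbo.FactSales AS f
JOIN   dbo.DimCustomer AS d
       ON d.CustomerBusinessKey = f.CustomerBusinessKey
WHERE  f.CustomerKey = -1;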


Early Arriving Facts/Late Arriving Dimension – If you are an optimist, we have the fact data before we have the dimension data. Or, if you are a pessimist, we don’t have the dimension data when we load the fact. You choose, but in either scenario, we have data missing somewhere.

Like I mentioned earlier, many systems will just treat the early arriving fact as “unknown” and set it to an unknown dimension key (usually –1) in the fact table. Some people might just ignore the fact record completely. You probably don’t want to do that.

But what if we have the “business” key in our fact data select? What can we do with that?

One option is to modify your dimension data select to UNION in all the distinct business keys from your fact data that aren’t in your dimension data. This works on a small data set. If your fact table is 500 million rows, you won’t like the performance of this option.
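
For reference, the UNION approach looks roughly like this (hypothetical names, and again, it only really holds up on smaller fact sources):

-- Dimension source: real customer rows, plus shell rows for any business keys
-- that show up in the fact source but not in the customer source
SELECT c.CustomerBusinessKey, c.CustomerName
FROM   SourceDb.dbo.Customer AS c
UNION
SELECT DISTINCT f.CustomerBusinessKey, 'Unknown' AS CustomerName
FROM   SourceDb.dbo.SalesTransactions AS f
WHERE  NOT EXISTS (SELECT 1
                   FROM   SourceDb.dbo.Customer AS c
                   WHERE  c.CustomerBusinessKey = f.CustomerBusinessKey);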

Another option we can use is the idea of an inferred dimension. As you load your fact table data (preferably through SSIS) you do a lookup to your dimension. If you have a match, cool, take that key and move on. If you don’t have a match, instead of setting the key to –1 (unknown), do this:

1) Insert a new dimension record with your business key from your fact table

2) Grab the newly created dimension key from the record you just inserted

3) Merge the key back into your fact data pipeline.
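
As a rough sketch of those three steps (the table, column, and proc names are all made up), the lookup’s no-match path can call something like this via an OLE DB Command or a stored proc, and merge the returned key back into the pipeline:

CREATE PROCEDURE dbo.GetOrCreateCustomerKey
    @CustomerBusinessKey varchar(50)
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @CustomerKey int;

    -- Try the normal lookup first
    SELECT @CustomerKey = CustomerKey
    FROM   dbo.DimCustomer
    WHERE  CustomerBusinessKey = @CustomerBusinessKey;

    -- No match: insert an inferred (shell) member and grab its new surrogate key
    IF @CustomerKey IS NULL
    BEGIN
        INSERT INTO dbo.DimCustomer (CustomerBusinessKey, CustomerName, IsInferred)
        VALUES (@CustomerBusinessKey, 'Inferred Member', 1);

        SET @CustomerKey = SCOPE_IDENTITY();
    END

    -- This is the key the fact row gets instead of -1
    SELECT @CustomerKey AS CustomerKey;
END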

Awesome. Now, sometime in the future, your dimension process can come through, and if you are doing Slowly Changing Dims, it should just update your inferred dimension records with real data. If your inferred dimension records are one-offs that might never get updated, you might be able to get someone to manually update them through some interface, or whatever; in any event, you aren’t stuck with tons of fact records that are set to –1/unknown.

Of course, the method above works best using SSIS, with a “Get Data -> Lookup -> Insert” pattern.

Happy ETL’ing!


The problem isn’t SQL Server. It’s you.

Throughout all my years in different places, I have seen SQL Server, Oracle, Firebird, MySQL, DB2, Zortec, Access, and probably a few other crazy databases set up, run, and administered. Of course, most of them along the way have been Microsoft SQL Server (6.5, 7, 2000, 2005, 2008). I’ve worked with some knowledgeable DBAs, and in those cases everything usually turns out ok.

But sometimes, in some department or place or whatever (your buddy down the street wanting to start a new company, your girlfriend’s place of work that wants to track orders, whatever), they try to get SQL Server running, and what sometimes happens next just makes my head spin. Microsoft, bless them, sometime in the past (not so much now) tried to market SQL Server as “self manageable”. Probably sometime between 6.5 and 7 they tweaked some update stats routines and schedules, and it’s all good, right? Set autogrow by default, and you are good to go. Wrong.

What this awesome marketing strategy did was get people, places, and organizations, mostly ISVs, to use SQL Server: install it, get their app running, and walk away. Of course it runs for a while; it runs like a champ, even. But then months, even years go by, and the system starts running slow. There is no DBA around; they didn’t need one, because SQL Server manages itself! Wrong again.

What you might end up with, though, are people using the system who know a little bit (enough to be deadly, even), and they start making changes, when in reality you need a full-fledged DBA to manage your server and database, hence the name DBA (database administrator). But before the DBA comes onto the scene to save the day, you will have the people that blame SQL Server. “Oh, SQL Server doesn’t work at all, it can’t perform”… or “Our other databases run 10x as fast, what gives?” (not mentioning they have 3 DBAs for those “other” databases, but none for MS SQL). And the quotes keep coming.

That is why the title of this post is what it is.

The problem isn’t SQL Server, it’s you.

If you fail to realize that MS SQL is an enterprise-class database system, and treat it like some out-of-the-box, already-configured, plug-and-play system, you are going to run into issues eventually. You need a DBA. It is probably best to have one BEFORE you implement any system, even if it is a consultant to guide your implementation and assist as time goes on.

I sometimes get tired of trying to argue that MS SQL can hold its own against Oracle, DB2, whatever. Trust me, it can. I could probably go find tons of SQL DBAs that would back me up as well. It is all about how you manage and administer it! SQL Server does just fine, as long as you know what you are doing. Just like any system. I sometimes think that if we took SSMS away and just made everything cmd line/scripting, people “outside” of the MS SQL community would see how MS SQL stacks up against their own systems.

This post isn’t meant to be a beat-down rant or anything; the same things could be said for .NET compared to Java, C++, or whatever. It just seems sometimes that people who live and breathe Microsoft SQL need to know what the other RDBMS/BI systems are capable of, but for some reason the same isn’t true for people that use the other systems. They kind of just brush MS SQL off as a play toy, something that shouldn’t be taken seriously, a “hobbyist” SQL system, something that any enterprise wouldn’t be caught dead running, that is, of course, unless you are Microsoft. 🙂

I’m still hedging my bets on MS SQL and .NET; I haven’t seen anything better for the price and ease of use, and the best part about it is the community. The MS SQL and development community is huge compared to anything else, and to me that just puts the icing on the cake. Just remember, the next time someone who needs an MS SQL DBA but doesn’t have one complains about the performance of their system, you can tell them it’s not SQL Server’s fault; it’s probably the neglect of SQL Server that caused the problems.


Using Windows Performance Toolkit to find System Issues in Vista/Win2k8/Win7

Windows 7 RC1 just came out. I am a TechNet subscriber, so I wanted to try it out. I have an old (2005) Dell desktop: 2.8 GHz, 2 GB RAM, 160 GB drive. It gets a 3.7 Vista rating (because of the graphics card mostly; it would be 4.4 otherwise, not too bad even for kind of an old box). It has been sitting in the basement doing nothing since I moved into my new place in October. I use a Mac full time at home, so it just sits.

A few times I have tried to get Windows Vista running smoothly on it, as a Media Center or just a file server, etc. Thing is, it was just flaking out. I knew it was a hardware issue, but figured it might be the CPU fan, or overheating, etc. Vista installed fine, but as I was using it, I would see hangs and lockups. Not BSODs, but it would just hang for 30 seconds or a minute, and then come back. WTF?

Nothing in the Reliability Monitor, nothing I could see in the event logs, etc. I rebooted and ran the Windows Memory test; nothing there. If you go into Computer Management, you will see Performance, then Data Collector Sets and Reports, and Monitoring Tools. You can set it up to run a test on metrics of your system, and it will give you a report.

image

I did this, and everything was ok. BUT… Avg. Disk Queue Length was > 2: red flag. Disk issues. But I wanted to know more. So I started digging around, and there is a Windows Performance Toolkit you can download. Here is another good site going into detail about the WPT.

So I fire up a cmd line (as admin! Start -> Run, cmd, Ctrl+Shift+Enter), and run

xperf -providers K

to see what providers are available for the Kernel flags. IOTrace looks like something I want, so I then run

xperf -on IOTrace

and let it run. I go and open/close things, play around, see if I can replicate the issue. Once I feel I have, I want to stop and analyze the trace. You need to stop it and output to a file using this command:

xperf -d iotrace.etl

Side note: the files are named .etl. Coming from a BI background, this makes my world explode, since it has nothing to do with Extract, Transform, and Load.

Now that my trace is done, time to analyze:

xperfview iotrace.etl

And you get some awesome stats like this:
image

Although I didn’t save my stats from my tests that showed the bad IO, what I saw were just gaps in the graphs, glitches in The Matrix. Time missing. Something is really bad here. So I did the drive error checking in Vista:

image

And when that ran, after reboot, it got to 11% and croaked. Bad drive. So I went and bought a new 500 GB SATA drive and loaded it up, and I am running Windows 7 now. Pretty sweet.

After all this fun spelunking into Windows performance, it also got me thinking about things like running these detailed traces on SQL Server boxes or other servers on intervals, saving them somehow, and reporting on the data. IOTrace is just one of hundreds of traces that you can then auto-analyze. I know that there are perfmon tools, but there are some added benefits to xperf that you can utilize, and I am glad I learned more about it and put it to use; just another tool for the sysadmin tool belt.


SSIS – Two Ways Using Expressions Can Make Your Life Easier – Multi DB Select, Non Standard DB Select

In SQL Server Integration Services (SSIS), pretty much every task or transformation lets you set “expressions” up. Expressions are basically ways to set property values programmatically.

Here are two scenarios where you might use expressions (there are hundreds of uses; these are just two that are kind of related).

  1. Multiple Database Select – You have multiple databases with the same schema; let’s say you have 300 installs of a 3rd party product and they all need their own database. I know it might sound impossible, but trust me, it can happen. Now, you want to run the same query over all the databases, pull data from a table, and dump it into a data warehouse, for example. You could write 300 queries and keep adding/removing them based on the databases, you could create some elaborate dynamic SQL proc using loops, or you might have some other way, or you could use SSIS expressions.

    Now, how would you go about doing this? It is pretty easy, actually. First, you need to set up a loop in SSIS. You would want to grab a recordset of database names using an Execute SQL Task (or however you’d like; a sketch of that query appears after this list) and store it in an object variable. Then you can loop through that list. The only difference in your query is the database name, so have a variable for your SELECT statement. Name it whatever, then click on the variable and look at its properties. You will see Expression. Open the expression box and then set it to something like this:

    "SELECT Col1,Col2,Col3 FROM " + @[User::CurrentDatabaseName] + ".dbo.MyTable"

    image

    @[User::CurrentDatabaseName] is another variable that stores the database name you grab as you loop through your list of database names.

    Finally, in your data flow’s OLE DB source, you can change the Data Access Mode to “SQL Command From Variable”, and then it will let you choose your variable. As your loop works through the database names and updates your SELECT variable, you select data from each database in turn.

    image  

  2. Non-Standard Database Select – Not sure how to label this one, but here is what I am talking about. I like to make all my queries stored procedures in SSIS, at least as much as possible. This works great when you are going SQL Server to SQL Server, but what happens if it’s Oracle to SQL Server, DB2 to SQL Server, etc.? Yes, I know you can create stored procs on those systems, but you might be in a place or position where you just can’t or don’t want to. In that case you would want to use plain SELECT statements to get data. You can easily put in params if the source is an OLE DB source, but what if it is an ODBC source? You have to use the DataReader source, and you can’t easily set params, like a WHERE clause. You HAVE to use expressions in order to have a query with a dynamic WHERE clause, or to pass in a variable as a WHERE clause filter.

    So, throw a Data Flow on your package, and inside that, throw a DataReader source, set the connection to your ODBC connection (ADO.NET connection), and set the command text. Good to go. But where do you set this expression? Not very intuitive. Go back to your Data Flow task and look at the expressions for it. You will see one for DataReaderSource.CommandText (where DataReaderSource is the name of your DataReader source). You can set the expression there. Now you can change an Oracle SQL statement, or DB2, or whatever, to something that takes params without the need for a stored proc on that other database server.
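
Going back to scenario 1 for a second, the query behind that Execute SQL Task can be as simple as this sketch (the LIKE filter is just a made-up example of how you might pick out the 300 product databases):

-- Fill the object variable that the loop walks through
SELECT name
FROM   sys.databases
WHERE  name LIKE 'ProductDb%'
  AND  state_desc = 'ONLINE'
ORDER BY name;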

So, while there are hundreds if not thousands of uses for expressions in SSIS, these are just a couple of uses that can make your life easier when trying to do more dynamic type queries in your DataFlow. Happy ETL’ing!


SQL Server 2008 – Saving changes is not permitted

Finally getting around to doing some work on SQL 2008, and after about 3 minutes, I run into this error: “Saving Changes is not permitted.. blah blah blah” See screenshot below.

image

This is different from SQL 2005. Is Microsoft maybe trying to save us from ourselves? The thing is, I never enabled the option “Prevent saving changes that require the table to be re-created”; it seems to be enabled by default. It would be awesome if this error told me exactly where the setting was.

 

Well, it happens to be in Tools -> Options, Designers, Table and Database Designers. Uncheck the box and go about your merry way!

 

image


OLAP PivotTable Extensions on CodePlex

This weekend, I ran across OLAP PivotTable Extensions on CodePlex, which got me thinking back to a post on the Excel blog about adding calculated measures and named sets in VBA (which is another blog post completely).

From CodePlex:

OLAP PivotTable Extensions is an Excel 2007 add-in which extends the functionality of PivotTables on Analysis Services cubes. The Excel 2007 API has certain PivotTable functionality which is not exposed in the UI. OLAP PivotTable Extensions provides an interface for some of this functionality.

What an awesome tool. I have been playing with it for a couple of days, and I have turned some of the “power users” of the OLAP cubes on to it as well. The first thing I thought of when running across this was: “Whoa, ok, when business users request calculated measures that might be more obscure, or just specific to them, they can add them themselves! We don’t have to do a special release, maybe not even a release at all!”

The uses for this tool could be pretty extensive. You can import and export calculation libraries, and you can also see the MDX that Excel is producing, which is another plus (I know there are other ways to get it, but this tool makes it easy!). With the MDX, you can just copy it and run it in SSMS to see the results there. You can see how Excel is doing things behind the scenes with your result set to make it look nice.

Another sweet feature: if you have a cube with tons of attributes, there is a search tab to find the attributes you want.

I haven’t seen any issues yet. One user had to install the Visual Studio 2005 Tools for Office Second Edition Runtime, which the CodePlex site says is required, so no big deal.

If you have tons of users using OLAP cubes with Excel 2007, take a look at this free open source tool on CodePlex; you will probably get some good mileage out of it. I think Microsoft should put these features in the next version of Office!