Categories
Business Intelligence SQLServerPedia Syndication

Microsoft BI Client Tools: Easing the Transition from Excel 2003 to Excel 2007 – PivotTable Layout and Design

Now that it is 2009, you would think that most people have been using the Office 2007 suite for a couple of years now. The truth is that there are many businesses “in the wild” that are still standardized on the Office 2003 suite.

Why? Well, there are a variety of reasons. Some places might cite the cost to upgrade (as in dollars), while others might cite backwards compatibility with legacy applications. Still others might just say that end users “won’t understand” the new ribbon interface, and that the pain and time of training and helpdesk support outweigh the benefits of using Office 2007.

Over the past year I have been at three different places, all standardized on Office 2003, and it puzzles me that there isn’t a harder push to upgrade. The benefits of Office 2007 are huge once you get used to the new interface. I could go into the benefits, but that is probably another blog post; suffice it to say that Outlook 2007 alone is a GTD (Getting Things Done) booster.

As a Business Intelligence guy, it really works for me if every user is on the same client tool: same interface, same quirks, same training, etc. Excel 2007 adds many things when using cubes and pivot tables, especially with SQL Server Analysis Services 2005, so it is a no-brainer to use Excel 2007 with SSAS 2005.

In trying to get cube users onto 2007, I have encountered a few things that can make the transition easier, and today I am going to talk about PivotTable layout and design.

Users of Excel 2003 are used to a pivot table laid out in tabular form, with no subtotals, and maybe grand totals or not. When they use Excel 2007, they are shell-shocked by the default PivotTable layout, get confused, and maybe even sometimes get “scared” of what they have gotten into with 2007.

Well, the thing is, it is really easy to make your PivotTable look like a 2003 pivot table in 2007. When you insert a PivotTable into Excel, you see this kind of layout.

You can see that under the “PivotTable Tools” there is an “Options” and a “Design” tab. Click on the “Design” tab before you set up any dimensions, measures, or filters on your PivotTable.

The settings on the Design tab let you control how your PivotTable looks. To make it “2003 style”: under Subtotals, pick “Do Not Show Subtotals”; under Report Layout, choose “Show in Tabular Form”. If you don’t want to see Grand Totals, you can turn those off as well, and you can fiddle with the various design options too.

One thing not on this tab is the setting for the +/- buttons on the rows. On the PivotTable Options tab, under the PivotTable name, there is an Options button.

Here you can tweak various other settings; for example, you can uncheck “Display expand/collapse buttons” to remove the +/-. As you can see, you can also switch to the “classic PivotTable layout” if you really want.

Moving from Excel 2003 to Excel 2007, at least in the PivotTable and OLAP cube browsing area, shouldn’t be a hard move, and you shouldn’t be scared of it. As you can see, you can make your pivot tables look like 2003, or go wild and shift to the new 2007 style.

Categories
Business Intelligence Geeky/Programming

Trends for GMail?

Google Reader has had “trends” for a while now; it really lets you see what you read, when you read it, etc.


Why doesn’t Gmail have a trends page? Why doesn’t it show you who you get email from, who you email the most, what time of day, etc.? Something like Xobni. I know there is a Python app that a dev at Google made, but why can’t Google just add this to Gmail? They already have the information. Maybe it would be nice as a Labs feature.

There are so many things out there like this, where the metadata is available, but not available to you. From your TV, to your car, to your computer, apps, even your body. You can’t automatically get the times you went to sleep, went to the bathroom, or ate, but it would be nice. Build the ultimate Data Warehouse.

I just wish that apps that had the data would let you see it, use it, and improve your productivity. Someday.


Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SSIS – Slowly Changing Dimensions with Checksum

In the Microsoft Business Intelligence world, most people use SQL Server Integration Services (SSIS) to do their ETL. Usually with your ETL you are building a DataMart or OLAP Database so then you can build a multi-dimensional cube using SQL Server Analysis Services (SSAS) or report on data using SQL Server Reporting Services (SSRS).

When creating the ETL using SSIS, you pull data from one or many sources (usually an OLTP database), summarize it, and put it into a snowflake or star schema in your DataMart. You build your Dimension tables and Fact tables, from which you can then build your dimensions and measures in SSAS.

When building and populating your Dimensions in SSIS, you pull data from a source table, and then move it to your dimension table (in the most basic sense).

You can handle changes in different ways (SCD Type 1, SCD Type 2, etc.) – basically, should you update records in place, or mark them as old and add new records, etc.

The basic problem with dimension loading comes in with grabbing data from the source, checking if you already have it in your destination dimension table, and then either inserting, updating, or ignoring it.

SSIS has a built-in transformation called the “Slowly Changing Dimension” wizard. You give it some source data and go through some wizard screens, choosing which columns from your source are “business keys” (the columns that form the unique key/primary key in your source table) and which other columns you want. You choose whether the other columns are changing or historical attributes, and then choose what should happen when data is different: update the records directly, add date columns, etc.

Once you get through the SCD wizard, some things happen behind the scenes: the insert and update transformations appear, and things just “work” when you run your package.

What most people fail to do is tweak any of the settings on the automagically created transformations. You should tweak some things depending on your environment, the size of your source/destination data, etc.

What I have found in my experience, though, is that the SCD wizard is good for smaller dimensions (fewer than 10,000 records). When a dimension grows larger than that, things slow down dramatically. For example, a dimension with 2 million, 10 million, or more records will run really slowly through the SCD wizard – it can take 15-20 minutes to process a 2 million row dimension.

What else can you do besides use the SCD wizard?

I have tried a few 3rd party SCD “wizard” transformations, and I couldn’t get them to work correctly in my environment. What I have found to work the best, the fastest, and the most reliably is a “checksum SCD method”.

The checksum method works similarly to the SCD wizard, but you basically do it yourself.

First, you need to get a checksum transformation (you can download it here: http://www.sqlis.com/post/Checksum-Transformation.aspx)

The basic layout of your package to populate a dimension will be like this:


What you want to do is add a column to your dimension destination table called “RowChecksum” that is a BIGINT. Allow NULLs; you can update them all to 0 by default if you want, etc.
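As a sketch, the DDL might look like this – note that dbo.DimCustomer and its column names are hypothetical examples, not from any particular DataMart:

```sql
-- Add a nullable BIGINT column to hold the row checksum
-- (dbo.DimCustomer is a hypothetical example dimension table)
ALTER TABLE dbo.DimCustomer ADD RowChecksum BIGINT NULL;

-- Optionally initialize existing rows to 0 so every row has a comparable value
UPDATE dbo.DimCustomer SET RowChecksum = 0 WHERE RowChecksum IS NULL;
```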

In your “Get Source Data” source, add the columns to the result query. In your “Checksum Transformation”, add the columns that you want your checksum to be calculated on. What you want to add is all the columns that AREN’T your business keys.

In your “Lookup Transformation”, your query should grab the checksum column and the business key or keys from your destination. Map the business keys as the lookup columns, add the dimension key as a new column on the output, and add the existing checksum column as a new column as well.
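A minimal sketch of the lookup reference query, again using the hypothetical dbo.DimCustomer table, only needs the surrogate key, the business key(s), and the stored checksum:

```sql
-- Lookup reference query: the business key maps to the incoming rows;
-- CustomerKey and RowChecksum are added as new columns on the lookup output
SELECT CustomerKey,         -- dimension surrogate key
       CustomerBusinessKey, -- business key from the source system
       RowChecksum          -- checksum stored on the last load
FROM dbo.DimCustomer;
```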

So you do a lookup, and if the record already exists, it comes out the valid output (green line), which you should tie to your conditional split transformation. You need to take the error output from the lookup to your insert destination. (Think of it this way: you did a lookup and couldn’t find a match, so it is an “error” condition – the row doesn’t already exist in your destination, so you INSERT it.)

On the conditional split transformation, you need to check whether the existing checksum == the new checksum. If they are equal, you don’t need to do anything; you can just ignore that output, but I find it useful to use a trash destination transform so that when debugging you can see the row counts.

If the checksums aren’t equal, you send that output to the OLE DB Command (where you do your update statement).

Make sure in your insert, you map the “new checksum” column from the checksum output to the RowChecksum column in your table. In the update, you want to update the record that matches your business keys, and set the RowChecksum to the new checksum value.
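A sketch of what the OLE DB Command statement could look like, with ? placeholders mapped to columns from the data flow (the table and attribute names here are hypothetical, and this assumes a straight Type 1 overwrite):

```sql
-- Type 1 update: overwrite changed attributes and store the new checksum.
-- Each ? parameter is mapped, in order, to a column from the data flow.
UPDATE dbo.DimCustomer
SET    CustomerName = ?,
       City         = ?,
       RowChecksum  = ?            -- new value from the Checksum Transformation
WHERE  CustomerBusinessKey = ?;    -- match the record on the business key
```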

One thing you might also run into is this: if your destination table’s key column isn’t an identity column, you will need to create that new ID before your insert transformation. That probably consists of grabbing the MAX key + 1 in the Control Flow tab, dumping it into a variable, and using that in the Data Flow tab. You can use a script component to add 1 each time, or you can get a rownumber transformation as well (both the trash and rownumber transformations are on the SQLIS website – same as the checksum transformation).
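The Execute SQL Task that seeds the variable in the Control Flow could run something like this (same hypothetical table as above):

```sql
-- Seed the next surrogate key for a non-identity key column;
-- ISNULL handles the empty-table case on the very first load
SELECT ISNULL(MAX(CustomerKey), 0) + 1 AS NextKey
FROM dbo.DimCustomer;
```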

After getting your new custom slowly changing dimension data flow set up, you should see much better performance; I have seen 2.5 million rows process in around 2 minutes.

One other caveat I have seen: in the insert destination, you probably want to uncheck the “Table Lock” checkbox. I have seen the insert and update transformations lock up on each other. Basically, they run into some type of race condition: the CPU on your server sits low, but the package just runs forever, never errors out, and sits there. It usually happens with dimension tables that are huge.

Like I said earlier, the checksum method is good for large dimension tables, but I use it for all my dimension population, small or large; I find it easier to manage when everything is standardized.

By no means is this post an extensive list of everything you might run across; it is more of a basic overview of how dimension loading works using the checksum method.

In any event, you can get huge performance gains by using the checksum method instead of the SCD wizard that comes with SSIS, and you will also feel like you have more control over what is going on – because you will. Happy ETL’ing 🙂


Categories
Business Intelligence Geeky/Programming Ramblings

Excel 2003 vs Excel 2007

It is the year 2008, and we are halfway through it. Excel 2003 is 5 years old. Stop using it, please.

Why? Excel 2003 has the old “limits”: 65,536 rows, 256 columns, memory limits, etc. Excel 2007, on the other hand, has a 1,048,576 row limit, 16,384 columns, etc. That, coupled with the way pivot tables work in Excel 2007 compared to 2003, and the SQL Server Analysis Services features in 2007, makes it a no-brainer to go to 2007.

Companies will say – “But we can’t move all our users to 2007, we can’t afford it” – well, think about just moving your power users: the users with huge spreadsheet extracts, etc. It is worth it. They can save files in the 2003 format if they need to share a smaller file or something, and the 2003 users can install the Office Compatibility Pack to open 2007 files.

Other options for huge spreadsheets and extracts are Access, which your users need training on or need to be able to adapt to, or SSAS and cubes, which require executive buy-in, infrastructure, and training to get your users up to speed – and by that time you will want Excel 2007 to connect to the cubes anyway, so…

Just start using Excel 2007 – 2009 will be here soon!

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SQL Server Reporting Services: Quick way to get a 10 character date (all the zeros) using String.Format

Dates are fun. By default, most dates come out like 5/6/2008, but computers, and the programs that consume dates, like them formatted as 05/06/2008. That way, all dates, no matter the month or day, are the same length. Cool, huh?

Well, in Reporting Services, if you have a date field coming back in a dataset and you want to format it as a 10-character string, there are about 50 different ways to do it. You can use the old VBA Left and Mid functions, or you can use String.Format like:

=String.Format("{0:MM}/{0:dd}/{0:yyyy}", CDate(Fields!CalendarDate.Value))

The three format items can also be collapsed into a single one – "{0:MM/dd/yyyy}" produces the same string.

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SQL 2005, SSAS 2005: Using ascmd.exe To Create SQLAgent Jobs That Give You Completion Status

In SQL 2005, you can create SQL Agent jobs. They can be scheduled, have multiple steps, alert, notify, etc. A pretty great setup. There are some downfalls, though.

For example, in order to call a job from another job, you need to execute the second job with T-SQL (sp_start_job). Thing is, it doesn’t run synchronously. It runs asynchronously, which really stinks if you want to wait to see whether the second job completes successfully or not.

Another thing: if you call an XMLA script from a SQL Agent job and the XMLA query or command fails, the SQL Agent job still reports success – that’s no good! What can you do? Use ascmd.exe!

ascmd.exe is a utility that you have to build yourself: download the SQL Server samples from CodePlex (http://www.codeplex.com/SqlServerSamples/Release/ProjectReleases.aspx?ReleaseId=4000) and then build the ascmd solution to get ascmd.exe. A few notes: the samples are VS2005 projects, and I don’t have that installed, so I had to open them with VS2008. The project is also digitally signed, and the build couldn’t find the .snk file to sign it, so I turned signing off. After I had it built, I did some testing locally and made sure it would work as I wanted it to.

You can test by just calling it from the cmd line, for example:

ascmd.exe -S SSASServerName -d SSASDatabaseName -i MyXMLA.xmla

You can use -Q and try to pass in the XMLA inline, but then you have to handle all the special characters, which is a pain. Now, if you just put this on your C: drive (or wherever) and create a SQL Agent job to call this command, the job will fail if ascmd.exe reports back failure. Exactly what we want!!

You will notice in the SQL Agent job step setup that you can specify the process exit code of a successful command – 0 is the default, and that is what we want.

Now, get some XMLA that you know works, set it up in the .xmla file, and test it; the job should succeed. Then change something in the XMLA (like a CubeID) to some fake value and test it again; the job should fail, and you can have it alert your DBAs or whoever.

Pretty sweet, but I wish SQL Agent would handle failed XMLA like a failed query and report failure. I am not sure whether SQL 2008 does that or not, but it would make life a lot easier. Otherwise you could be scratching your head trying to find where stuff is failing, looking in logs, etc., but not seeing any failures. The only way you would be able to tell is to run the XMLA manually. Ugh 🙂


Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SSAS 2005: Cube Perspectives Are Good, But Something Is Missing…

In SQL Server Analysis Services 2005, you can create "perspectives" on cubes. A perspective allows you to hide different dimensions, measure groups, and attributes. This works great, but I still think there are a few things missing.

The first thing is that the cube itself is a perspective – one that is always there and cannot be hidden. In a scenario where you want to build a cube and then expose multiple perspectives, but you don’t want end-user clients to see the main cube (the default perspective), you can’t do it – or at least I cannot find a way to do it. 🙂

The second thing, which really has more to do with linked objects, is that when you link in a dimension, for instance, you cannot hide or show attributes; you are stuck with what is in the dimension in your source cube. So what do you do? Use a perspective. But if SSAS let you hide attributes, etc., on the linked dimension, you could forgo the use of perspectives.

The third thing is security in general. You cannot secure a perspective. If you want Accounting to see the XYZ perspective and HR to see the ABC perspective, but you don’t want them to see each other’s perspectives, you are out of luck and need to come up with a different solution – which probably involves crazy security in your cube, or creating new cubes that link in dimensions and measure groups from the main cube.

Don’t get me wrong, SSAS 2005 is a vast improvement over SSAS 2000, but there are just a few things that I feel are missing – or that I might not know how to enable. 🙂 I know SSAS 2008 has more improvements, and that will be a good change; hopefully there are some cool things that let you manage your cubes and perspectives a little better.

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

Reporting Services: Can't Uninstall – The setup failed to read IIsMimeMap table. The error code is -2147024893

Ran into this error tonight trying to uninstall SQL Server Reporting Services. I am not sure whether it is specific to Vista or also affects XP and other OSes, but the fix is to stop IIS and then re-run the uninstall.

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SSAS: Changing Object Id's Breaks Report Builder Reports

Ugh. Found this one out the hard way. Usually, changing underlying object IDs in SQL Server Analysis Services shouldn’t cause any harm. You might have some XMLA that processes dims and measure groups – if so, you would have to change those, etc. But all Reporting Services reports, Excel 2007 pivot tables, and MDX should keep working. What breaks? Report Builder reports.

You can build reports with Report Builder (the link to open Report Builder is in Reporting Services) off a cube model. They are pared-down reports – you can’t do as much as you can with SSRS proper – but for advanced end users they do the trick.

Thing is, they use SemanticQuery XML behind the scenes for the query, the data source points to the model, and the XML is built off the object IDs of the cube. Ugh again. Even worse, all parameters that were set up as drop-down lists are converted to IN formulas, and all the other parameters are converted to formulas. Graphs break. Matrix reports break. Tabular reports break. It just sucks. They shouldn’t build the query off the underlying IDs of objects; they should build it off the displayed names, like everything else does. Whew 🙂


Categories
Business Intelligence Life

What Have I Been Up To? Windows 2008, Drive Backups, VS2008 Web Programming, SSIS Stuff

I have been pretty busy lately. I have some posts that I want to write up, but it seems they keep getting bigger and longer. More stuff 🙂

Just a highlight: this weekend I decided to do a full drive backup and create a restore CD, following this process: http://www.darkchip.com/DriveImageXML/Complete%20System%20Recovery%20Process.html

I didn’t just stumble upon that. I did some research on drive backups and found DriveImage XML. From there I did a backup to USB. But then I needed a way to restore, so I followed the BartPE tutorial for DriveImage XML and got that all set up. It just gives you a sense of confidence, knowing you have a full backup. Why did I decide to do this? Because I wanted to try Windows Server 2008 on my laptop as a workstation, so that is what I did next.

Installed Windows 2008 and set it all up. I am liking it, but there are some caveats so far: Wi-Fi is off by default (you need to install some extra features), you have to turn on audio services, etc.

IE is totally locked down, but I installed Firefox right away anyway. Windows Live Writer won’t install normally, but you can get it to work – you need to get the MSI – http://on10.net/blogs/sarahintampa/20879/Default.aspx

The OS doesn’t find the iPhone, so I will have to figure that out. I haven’t installed the "Desktop Experience" yet; I want to hold off on installing it for as long as I can. Win2k8 flies with everything turned off by default.

I have also been working on some side projects – websites and other blogs/projects – and getting deep into VS2008 and .NET 3.5, which is fun. I ran into some debacles with the ProfileBase auto-generation stuff; that is probably another post.

Also, there is some cool stuff in SSIS that I could probably write up; we will see.

I am all over the board right now, I know, but it is fun – doing programming, database work, Web 2.0 blog stuff, system admin, the whole gamut. I love it.