Categories
Blogging Geeky/Programming

Twitter or Yammer?

Recently I have been using Yammer more often than Twitter, but they both have their place. You probably know what Twitter is as it has been going mainstream like crazy the last month or two. Yammer on the other hand is less known. Yammer is sort of like Twitter, but just for your organization.

What Yammer brings, besides being internal to your place of work, are other enhancements. Groups, for one, is huge. I can create a group, say for “Microsoft SQL Server”, and anyone on Yammer at my place of work can join that group and share, discuss, and consume anything from it. Instead of me sending out emails with articles and tips, people can just subscribe to them.

Also, departments can create private groups to have discussions just within their department.

Yammer also has an iPhone app and an Adobe AIR desktop app (I think I heard TweetDeck integration is coming soon, if not already here), it supports SMS messages if you want, and you can also get daily digest emails from your groups and the people you follow.

I would suggest that groups of employees who are going back and forth on Twitter take a look at Yammer for internal needs. What I try to do is save anything that is more applicable to everyone for Twitter, and use Yammer for things I just want to share with the people I work with.

On that note as well, I separated out my Twitter and Facebook feeds, and am going to try to use Facebook for more non-technical status updates.

So,

Yammer = company/corporate/internal groups

Twitter = colleagues outside of the company, people I find interesting in my same line of work, etc.

Facebook = family, friends, etc

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SSAS Cube Rule #1 – YOU NEED CALCULATE

Ok, I just spent a few hours debugging a cube that wasn’t bringing back any data. I checked all the relationships, the source data, the marts, etc. Lo and behold, I found out that a colleague had commented out all the calculations, including the CALCULATE statement, which says:

The CALCULATE command controls the aggregation of leaf cells in the cube.
If deleted or modified, the data within the cube will be affected.
This command should only be edited if you intend on manually specifying how the cube will be aggregated.

Remember, you NEED the “CALCULATE” command!
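For reference, this is what the top of a default calculation script looks like – everything else in the MDX script comes after this one statement, and it is the part that must not be commented out:

```mdx
/* The CALCULATE command controls the aggregation of leaf cells in the cube.
   If deleted or modified, the data within the cube will be affected.
   This command should only be edited if you intend on manually specifying
   how the cube will be aggregated. */
CALCULATE;
```

If a colleague comments out the whole script to disable some custom calculations, this line goes with it, and the cube returns empty results even though processing succeeds.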

Categories
Geeky/Programming

SQL Server 2005 Reporting Services – Fix Slow Loading On First Report Load

I was setting up a new instance of SQL Server Reporting Services (SSRS) and getting things configured. What I noticed is that the first time I hit the SSRS site, it would take forever to load; then it would be fine as I used it. Later, after a period of no use, I could come back and it would be slow again until it loaded once, then fast.

The first thing I thought of was the App Pool settings in IIS, so I checked them.


The first setting on the Performance tab of the app pool, “Shutdown worker processes after being idle for (time in minutes)”, is checked by default and set to 20 minutes. That sounds about right: my SSRS instance would get slow after some time of non-use (20+ minutes).

Uncheck that, and you are good to go. Of course, if you have tons of usage, there is no need to change this, but on a site whose usage you are building up slowly, you will see the effects of this right away.


Categories
Geeky/Programming

Pandora for iPhone Security Flaw?

Recently, I restored my iPhone fresh, didn’t restore a backup or anything. I installed the essential apps I use, and one of them was Pandora Radio.

To my surprise, when I opened it, all my stations were there and it was like I didn’t even have to login. Weird, I thought, well maybe the full restore left some setting on it or something, no big deal.

Now, a second incident. My old 2G iPhone, when I gave it to Emily, was wiped clean, clean, clean. She has a few apps, but this morning I thought, try Pandora. So I installed it on her phone, opened it, and whoa, all MY stations were listed, and it was logged in as me. WTF?!?!?!

My only guess is that it somehow ties to the iPhone hardware, since months ago, when I was using the 2G phone, I had Pandora.app loaded (not sure how), but yeah, there is definitely something going on.

Wondering if anyone else has seen an issue like this?


Categories
Business Intelligence Geeky/Programming

Trends for GMail?

Google Reader has had “trends” for a while now; it really lets you see what you read, when you read it, etc.


Why doesn’t Gmail have a trends page? Why doesn’t it show you who you get email from, whom you email the most, what time of day, etc.? Something like Xobni. I know there is a Python app that a dev at Google made, but why can’t Google just add this to Gmail? They already have the information. Maybe it would be nice as a Labs feature.

There are so many things out there like this, where the metadata exists but isn’t available to you. From your TV, to your car, to your computer, apps, even your body. You can’t automatically get the times you went to sleep, used the bathroom, or ate, but it would be nice. Build the ultimate data warehouse.

I just wish that apps that had the data would let you see it, use it, and improve your productivity. Someday.


Categories
Geeky/Programming

How To Speed Up Visual Studio Start Up

Add /nosplash to the end of your Visual Studio shortcut targets.
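For example, a shortcut target might end up looking like this (the path will vary by Visual Studio version and install location):

```
"C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\devenv.exe" /nosplash
```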


You can also go to Tools –> Options and change the startup option to an empty environment to speed up load times.

 


Categories
Geeky/Programming

Programmatically creating Excel (XLS) Files, as XML files – Things to Keep In Mind

I worked on a small project that required exporting data to Excel. The spreadsheets needed to be formatted very precisely, and the best way to do that is with the XML format of an Excel file. But I found some gotchas throughout the project that will cause the .xls files to not load.

First, since it is XML, you need to make sure you handle some special XML characters:

quotes " should be &amp;quot;

ampersands & should be &amp;amp;

apostrophes ' should be &amp;apos;

less than < should be &amp;lt;

greater than > should be &amp;gt;

Now, if you write a function in code to do these replacements, there is one thing to make sure of: replace the & first. If you do it last, like the quick and dirty function I wrote first, you could end up replacing a less than with &amp;lt; and then replacing that new ampersand with &amp;amp;, so you end up with &amp;amp;lt; – whoops…
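Here is that replacement order as a quick Python sketch (the function name is mine, not from any library – in real code you could also just use a proper XML library and let it escape for you):

```python
def escape_xml(text):
    # Ampersands must be replaced FIRST. Otherwise the '&' characters
    # introduced by the other replacements get escaped a second time,
    # e.g. '<' -> '&lt;' -> '&amp;lt;'.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    text = text.replace("'", "&apos;")
    return text

print(escape_xml('a < b & "c"'))  # a &lt; b &amp; &quot;c&quot;
```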

Another few little gotchas: Excel tab names have to be distinct. If you have duplicates, the spreadsheet will still be created just fine – you just won’t be able to open it. Also, tab names can’t contain a division sign ( / ) – the same thing will happen.
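Since the file generates fine and only fails on open, these are painful to track down after the fact. A simple pre-flight check along these lines (my own helper, not part of any Excel library) can catch both problems before you write the file:

```python
def validate_sheet_names(names):
    """Return a list of problems that would make the generated .xls unopenable."""
    problems = []
    seen = set()
    for name in names:
        if name in seen:
            problems.append("duplicate tab name: %s" % name)
        seen.add(name)
        if "/" in name:
            problems.append("tab name contains '/': %s" % name)
    return problems

print(validate_sheet_names(["Summary", "Detail", "Summary", "Q1/Q2"]))
# -> ["duplicate tab name: Summary", "tab name contains '/': Q1/Q2"]
```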

Categories
Geeky/Programming

Technology Hypocrites?

Why do IT Departments, SysAdmins, and techno geeks love to push the latest and greatest server products, use the latest technologies in hardware, etc, yet they are scared out of their boots about Vista?

(Side Note: Testing out the QuickPress feature of WordPress 2.7)

Categories
Geeky/Programming

HowTo: Delete a Custom Event Log Source – XP

I installed a Red Gate trial, and it created a custom event log source in my event logs. I uninstalled it, but the event source still remains – how do you remove it?

Run regedit, go to [HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Services\Eventlog], and remove the custom event log source.

Categories
Business Intelligence Geeky/Programming SQLServerPedia Syndication

SSIS – Slowly Changing Dimensions with Checksum

In the Microsoft Business Intelligence world, most people use SQL Server Integration Services (SSIS) to do their ETL. Usually with your ETL you are building a DataMart or OLAP Database so then you can build a multi-dimensional cube using SQL Server Analysis Services (SSAS) or report on data using SQL Server Reporting Services (SSRS).

When creating the ETL using SSIS, you pull data from one or many sources (usually an OLTP database), summarize it, and put it into a snowflake or star schema in your DataMart. You build your dimension tables and fact tables, from which you can then build your dimensions and measures in SSAS.

When building and populating your Dimensions in SSIS, you pull data from a source table, and then move it to your dimension table (in the most basic sense).

You can handle situations in different ways (SCD type 1, SCD type 2, etc) – basically, should you update records, or mark them as old and add new records, etc.

The basic problem with dimension loading comes in with grabbing data from the source, checking if you already have it in your destination dimension table, and then either inserting, updating, or ignoring it.

SSIS has a built in transformation, called the “Slowly Changing Dimension” wizard. You give it some source data, you go through some wizard screens, choosing what columns from your source are “business keys” (columns that are the unique key/primary key in your source table) and what other columns you want. You choose if the other columns are changing or historical columns, and then choose what should happen when data is different, update the records directly, or add date columns, etc.

Once you get through the SCD wizard, some things happen behind the scenes: you see the insert and update transformations appear, and things just “work” when you run your package.

What most people fail to do is tweak any settings on the automagically created transformations. You should tweak some things depending on your environment, the size of your source/destination data, etc.

What I have found in my experience though, is that the SCD is good for smaller dimensions (less than 10,000 records). When you get a dimension that grows larger than that, things slow down dramatically. For example, a dimension with 2 million, or 10 million or more records, will run really slow using the SCD wizard – it will take 15-20 minutes to process a 2 million row dimension.

What else can you do besides use the SCD wizard?

I have tried a few other 3rd party SCD “wizard” transformations, and I couldn’t get them to work correctly in my environment. What I have found to work the best, the fastest, and the most reliably is a “checksum SCD” method.

The checksum method works similarly to the SCD wizard, but you basically do it yourself.

First, you need to get a checksum transformation (you can download here: http://www.sqlis.com/post/Checksum-Transformation.aspx)

The basic layout of your package to populate a dimension will be like this:


What you want to do is add a column to your dimension destination table called “RowChecksum” that is a BIGINT. Allow nulls; you can default them all to 0 if you want.

In your “Get Source Data” Source, add the column to the result query. In your “Checksum Transformation”, add the columns that you want your checksum to be created on. What you want to add is all the columns that AREN’T your business keys.

In your “Lookup Transformation”, your query should grab from your destination the checksum column and the business key or keys. Map the business keys as the lookup columns, add the dimension key as a new column on the output, and the existing checksum column as a new column as well.

So you do a lookup, and if the record already exists, it’s going to come out of the valid output (the green line), which you should tie to your conditional transformation. You need to take the error output from the lookup to your insert destination. (Think of it this way: you did a lookup and couldn’t find a match, so it is an “error” condition – the row doesn’t already exist in your destination, so you INSERT it.)

On the conditional transformation, you need to check if the existing checksum equals the new checksum. If they are equal, you don’t need to do anything. You can just ignore that output, but I find it useful to use a trash destination transform so that you can see the row counts when debugging.

If the checksums don’t match, you send that output to the OLE DB Command (where you do your update statement).

Make sure that in your insert statement, you map the “new checksum” column from the lookup output to the RowChecksum column in your table. In the update, you want to update the record that matches your business keys and set RowChecksum to the new checksum value.
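To make the lookup / conditional-split logic concrete, here is the same decision flow sketched in plain Python (the table shapes and names are illustrative, not SSIS code, and I’m using CRC32 as a stand-in for whatever the checksum transformation computes):

```python
import zlib

def checksum_row(row, business_keys):
    # Checksum over all NON-business-key columns, like the Checksum Transformation.
    payload = "|".join(str(v) for k, v in sorted(row.items()) if k not in business_keys)
    return zlib.crc32(payload.encode("utf-8"))

def scd_actions(source_rows, dim_lookup, business_keys):
    # dim_lookup maps a business-key tuple -> the existing RowChecksum.
    actions = []
    for row in source_rows:
        key = tuple(row[k] for k in business_keys)
        new_cs = checksum_row(row, business_keys)
        if key not in dim_lookup:            # lookup "error" output -> INSERT
            actions.append(("insert", key))
        elif dim_lookup[key] != new_cs:      # checksums differ -> UPDATE
            actions.append(("update", key))
        else:                                # unchanged -> ignore (trash destination)
            actions.append(("ignore", key))
    return actions

src = [{"CustomerID": 1, "Name": "Ann"}, {"CustomerID": 2, "Name": "Bob"}]
existing = {(1,): checksum_row(src[0], ["CustomerID"])}  # customer 1 already loaded
print(scd_actions(src, existing, ["CustomerID"]))
# -> [('ignore', (1,)), ('insert', (2,))]
```

The win over the SCD wizard is the single set-based lookup and a cheap integer compare per row, instead of a column-by-column comparison against the dimension table.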

One other thing you might run into: if your destination table’s key column isn’t an identity column, you will need to create the new id before your insert transformation. That probably consists of grabbing the MAX key + 1 in the Control Flow tab, dumping it into a variable, and using that in the Data Flow tab. You can use a script component to add 1 each time, or you can get a rownumber transformation as well (both the trash and rownumber transformations are on the SQLIS website, same as the checksum transformation).

After getting your new custom slowly changing dimension data flow set up, you should see way better performance, I have seen 2.5 million rows process in around 2 minutes.

One other caveat I have seen: in the insert destination, you probably want to uncheck the “Table Lock” checkbox. I have seen the insert and update transformations lock up on each other. Basically, they run into some type of race condition: the CPU on your server will just sit low, but the package will run forever, never erroring out, just sitting there. It usually happens with dimension tables that are huge.

Like I said earlier, the checksum method is good for large dimension tables, but I use it for all my dimension population, small or large – I find it easier to manage when everything is standardized.

By no means is this post an extensive list of everything that you would run across, but more of a basic overview of how it works with using the checksum method.

In any event, you can get huge performance gains by using the checksum method instead of the SCD wizard that comes with SSIS, and you will also feel like you have more control over what is going on – because you will. Happy ETL’ing 🙂