In SQL Server Integration Services (SSIS), pretty much every task or transformation lets you set “expressions” up. Expressions are basically ways to set property values programmatically.
Here are two scenarios where you might use expressions (there are 100’s of uses, these are just two that are kind of related).
- Multiple Database Select – You have multiple databases – same schema, let’s say you have 300 installs of a 3rd party product and they all need their own database. I know it might sound impossible, but trust me, it can happen. Now, you want to run the same query over all databases, and pull data from a table, and dump into a data warehouse, for example. You could write 300 queries, and keep adding/removing based on the databases, you could create some elaborate dynamic SQL proc using loops, or you might have some other way, or, you could use SSIS Expressions.
Now, how would you go about doing this? It is pretty easy actually. First step, you need to set up a loop in SSIS. You would want to grab a recordset of database names using an Execute SQL Task, or however you’d like, and store in an object variable. Then you can loop through that list. Your only difference in your query would be database name, so what you would do is have a variable for your SELECT statement. Name it whatever, but what you want to do is click on the variable, the properties of it. You will see Expression. Open the expression box and then set it to something like this
”SELECT Col1,Col2,Col3 FROM “ + @[User:CurrentDatabaseName] + ".dbo.MyTable"
@[User:CurrentDatabaseName] is another variable to store the databasename that you would grab as you loop through your list of databasenames.
Finally, in your dataflow, OLE DB source, you can change the Data Access Mode to “SQL Command From Variable”, and then it will let you choose your variable. As your for loop loops through your database names, and updates your SELECT variable, you can then select data from each database as you loop through them.
- Non-Standard Database Select – Not sure how to label this one, but here is what I am talking about. I like to make all my queries as stored procedures in SSIS, at least as much as possible. This works great when you are doing SQL Server to SQL Server, but what happens if its Oracle to SQL Server, DB2 to SQL Server, etc? Yes I know you can create stored procs on those systems, but you might be in a place or position where you just can’t or don’t want to. In that case you would want to use just standard T-SQL select statements to get data. You can easily put in params if the source is an OLE DB source, but what if it is an ODBC Source? You have to use the DataReader source, and you can’t easily set params – like a WHERE statement. You HAVE to use Expressions in order to have a query with a dynamic WHERE statement or passing in a variable as WHERE statement filter.
So, throw a DataFlow on your package, and inside that, throw a DataReader source, and then set the connection to your ODBC Connection (ADO.NET Connection) and set the command text. Good to go. But where to set the connection? Not very intuitive. Go back to your DataFlow and look at the expressions for it. You will see one for DataReaderSource.CommandText (where DataReaderSource is the name of your DataReaderSource). You can set the expression up there. Now you can change an Oracle SQL Statement or DB2 or whatever to something that takes params without the need for a stored proc on that other database server.
So, while there are hundreds if not thousands of uses for expressions in SSIS, these are just a couple of uses that can make your life easier when trying to do more dynamic type queries in your DataFlow. Happy ETL’ing!