No, the title isn’t a misprint: this installment of the Baker’s Dozen will visit both sides of the planet. These days, many .NET user group meetings focus on database and business intelligence topics as well as hardcore .NET content. Over the last several months, I’ve spent roughly half my time modifying my own development framework for WCF. The result is some basic but functional factory classes I’d like to share. The other half of the time, I’ve been looking at different capabilities in the SQL Server 2005 and 2008 Business Intelligence stack, and solving requirements that BI developers often face. So rather than pick one side and make the other side wait two months, I decided to combine the two.
Perspectives
Recently I had a conversation with a highly skilled .NET programmer about SQL Server transaction isolation levels. The full conversation is too long to repeat, but the short version is that the person wasn’t very interested in talking about them. Not long after that, I had a talk with a DBA who had even less interest in talking about .NET assemblies designed for SQL Server.
Sure, no one person can take an interest in everything. Or can they? Consider an old baseball story about Willie Mays. Early in Mays’ career, his manager, Leo Durocher, would ask Willie such questions as, “There’s one out, you’re on second base, there’s a left-handed hitter at the plate who can’t get around on the pitcher’s fastballs…how long of a lead should you take?” Mays’ reply was, “I don’t need to know, that’s for the manager to tell me, I’m the player.” Durocher replied, “No, Willie, you NEED to know these things, and when you develop an understanding, it’ll make you a better ballplayer.”
OK, it’s not practical to be an expert in everything, but in the Microsoft computing world, being a Jack (or Jill) of many disciplines will ultimately increase the value that you bring to the table. (And it doesn’t preclude you from being a master at one or two.) For all the buzzwords that you hear every day, “value” is one to always keep in the front of your mind.
What’s on the Menu?
This installment will start out with a set of C# tips for WCF, and then cover a few miscellaneous tips for using C# with both SQL Server and other tools in the Microsoft BI stack. Finally, we’ll look at some powerful capabilities in the BI stack. So, here are the menu items for today!
- The Baker’s Dozen Spotlight: Creating some basic WCF factory classes (Part 1 of 3)
- Creating some basic WCF factory classes (Part 2 of 3)
- Creating some basic WCF factory classes (Part 3 of 3)
- Setting up SQL Server 2005 stored procedures as HTTP endpoints and then calling them from .NET
- Executing SQL Server Integration Services (SSIS) packages from within .NET
- Generating e-mails dynamically from inside SSIS
- A revisit of some basic productivity classes in C# for handling data in Crystal Reports.NET
- Reading a Microsoft Analysis Services OLAP database from C#
- Defining roles in an OLAP/Analysis Services environment so that certain users/roles will only see certain dimension members (e.g., brand managers)
- Building an OLAP date dimension and writing some MDX code to perform seasonal analysis of sales across two years whose Easter dates differ
- MDX Named Sets in SQL Server 2008
- Searching data dynamically in SQL Server without using dynamic SQL
- Using the PerformancePoint Service Web Service from .NET
Some Background for the First 3 Tips…
Windows Communication Foundation (WCF) has been out for roughly two years now. Of all the new .NET features in the last few years, WCF is one of my favorites. I never enjoyed maintaining multiple code bases for .NET Remoting and Web services, and even wrote some factory classes in the June/July 2005 issue of CoDe Magazine to provide a single model for different communication protocols. (In retrospect, those classes were a VERY poor man’s WCF!)
Even though WCF may seem like “old news” to some, there are many developers who are still learning WCF. I have a community session (a beginner session) called “Crash Course on WCF Programming” that is well-attended. Because so many developers are still seeking information to get started on WCF, I’ll provide a brief walk-through of a solution in Tips 1 - 3.
However, I also want to provide some value to those who have been through the basics of WCF. So I’ll provide a client-side factory class that you can use to avoid repeated WCF calls to ChannelFactory and CreateChannel. I’ll also talk about some issues you may encounter when hosting WCF services over HTTP.
Tip 1: The Baker’s Dozen Spotlight: Creating Basic WCF Factory Classes (Part 1 of 3, Building Contracts)
OK, I’ll dive right in and talk about the fairly simple requirements for this little WCF exercise: a back-end function to validate a user ID and password. While that may seem like a mundane function, the process allows you to see the basics of WCF in action. So I’ll cover those basic steps in this order:
- Create a class called LoginResult as a WCF DataContract, with WCF DataMember properties for Full User Name, ActiveFlag, LoginMessage, and a Boolean property for whether the user should be allowed in the system. The LoginResult class will reside in a separate assembly with other DataContract classes.
- Create an interface called IUser as a WCF ServiceContract, with a WCF OperationContract called ValidateUserID.
- Create a service that implements the IUser Service Contract, and then host the service in IIS.
- Create a client-side factory method called GetRemoteFactoryChannel that returns a WCF channel for the type of service being referenced.
This tip will cover steps 1 and 2 of the exercise and then the next two tips will cover steps 3 and 4.
I’ll start by creating a class library project called DataContracts. In the project, I’ll add a .NET reference to System.Runtime.Serialization. This reference is required because the DataContract represents the default serialization model in WCF. In the project, I’ll add the class from Listing 1, which is essentially a basic class with four properties, all marked with WCF DataContract and DataMember attributes. I’ll compile this as DataContracts.DLL.
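A minimal sketch of such a data contract appears below. The property names follow the bullet list above; the actual names come from Listing 1 (the prose calls the class LoginResult, though the listings may use a prefix like clsLoginResult, as the client code in Tip 3 suggests), so treat AllowLogin in particular as illustrative.

using System.Runtime.Serialization;

namespace DataContracts
{
    [DataContract]
    public class LoginResult
    {
        [DataMember]
        public string FullUserName { get; set; }

        [DataMember]
        public bool ActiveFlag { get; set; }

        [DataMember]
        public string LoginMessage { get; set; }

        // Illustrative name: whether the user may enter the system
        [DataMember]
        public bool AllowLogin { get; set; }
    }
}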
Next, I’ll create a second class library project called ServiceContracts. In the project, I’ll add a .NET reference to the DataContracts.DLL assembly, as well as the main WCF namespace, System.ServiceModel. In this project, I’ll add the class from Listing 2, which is essentially a .NET interface with a ValidateUserID method that returns a LoginResult. Similar to Listing 1, I’ve marked the interface and method with WCF ServiceContract and OperationContract attributes. I’ll compile this as ServiceContracts.DLL.
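A sketch of the service contract follows. The parameter list mirrors the client call shown later in Tip 3; the Listing 2 signature is the authoritative one.

using System.ServiceModel;
using DataContracts;

namespace ServiceContracts
{
    [ServiceContract]
    public interface IUser
    {
        [OperationContract]
        LoginResult ValidateUserID(string userID,
            string password, string databaseKey);
    }
}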
Tip 2: Creating Basic WCF Factory Classes (Part 2 of 3, Hosting a Service in IIS)
Now that I’ve built a data contract and service contract, I can create a service that implements the service contract IUser. To do this, I’ll create a third class library project called Services. In this project, I’ll add .NET references to the DataContracts.DLL and ServiceContracts.DLL assemblies, as well as the main WCF namespace, System.ServiceModel.
In the project, I’ll add the class from Listing 3. The class (svcUser) implements the IUser contract, and contains the OperationContract method ValidateUserID. I’ll compile this as Services.DLL.
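Listing 3 has the full implementation; the skeleton looks roughly like this sketch (the validation logic itself is a placeholder):

using DataContracts;
using ServiceContracts;

namespace Services
{
    public class svcUser : IUser
    {
        public LoginResult ValidateUserID(string userID,
            string password, string databaseKey)
        {
            LoginResult oResult = new LoginResult();
            // Look up the user in the back-end database and
            // populate the result properties accordingly.
            return oResult;
        }
    }
}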
So at this point I have three DLLs: DataContracts.DLL, ServiceContracts.DLL, and Services.DLL. Next I’ll use IIS to host the Services.DLL so that I can access the service from a client piece.
Generally speaking, hosting a WCF service in IIS is a breeze…most of the time! A while back, I had to host a WCF service on a shared Web hosting provider. When I tried to test my service, I received the following error: “This collection already contains an address with scheme http. There can be at most one address per scheme in this collection.”
Many Web service providers have more than one host header bound to the site. Fortunately, two blogs (http://www.darkside.co.za/Default.aspx and http://www.robzelt.com/blog) have offered excellent workarounds for this. The workaround is to override the ServiceHostFactory with code to check the number of Uri base addresses.
Listing 4 shows a class called CustomServiceHostFactory that overrides the system ServiceHostFactory class, and creates a new ServiceHost object from either the first base address or the second base address. (You may want to modify this to iterate through the baseAddresses collection for the one you want.)
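The core of the workaround looks roughly like the following sketch, which simply picks the first base address rather than handing IIS’s entire collection to ServiceHost (Listing 4 contains the full class):

using System;
using System.ServiceModel;
using System.ServiceModel.Activation;

public class CustomServiceHostFactory : ServiceHostFactory
{
    protected override ServiceHost CreateServiceHost(
        Type serviceType, Uri[] baseAddresses)
    {
        // With multiple host headers bound to the site, IIS
        // passes multiple base addresses; construct the host
        // from just the one you want.
        return new ServiceHost(serviceType, baseAddresses[0]);
    }
}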
The Service.SVC file for the Web project that references the CustomServiceHostFactory is as follows:
<%@ ServiceHost Language="C#" Debug="true"
Service="StoneWorksJob.Services.svcUser"
Factory="CustomServiceHostFactory" %>
Tip 3: Creating Basic WCF Factory Classes (Part 3 of 3, Building the Client Piece)
At this point, I can build a client piece to call the WCF service. In this example, I’ll call the service from a .NET Windows Forms application. In the past, I might do something like this:
ChannelFactory<IUser> UserFactory =
    new ChannelFactory<IUser>("User");
IUser userProxy = UserFactory.CreateChannel();
For years, some have made fun of me for trying to combine two lines of code into one. Guilty as charged! I’m usually interested in simplifying things, especially if I can create a simpler interface where others (or I myself) don’t have to worry about the exact syntax. So given that any client-side call to a WCF service requires the two lines of code above, I’d like to combine them into one, for all the different services that I might call.
Listings 5 - 6 demonstrate a small factory class, ClientRemoteAccess, with a method called GetRemoteFactoryChannel. All I have to do is pass it the contract interface that I’m using, and the method will return a factory channel that I can use to call my service in a strongly-typed fashion.
// Create an instance of the remote access class
ClientRemoteAccess oRemoteAccess = new ClientRemoteAccess();

// Call the factory, passing the interface
IUser oUser = oRemoteAccess.GetRemoteFactoryChannel<IUser>();

// Use the factory class!
clsLoginResult oResult =
    oUser.ValidateUserID(UserID, Password, DatabaseKey);
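Since Listings 5 - 6 aren’t reproduced here, a minimal sketch of such a factory follows. It assumes one convention, namely that each client endpoint in the config file is named after the contract without the leading “I” (matching the “User” endpoint shown earlier), and it omits the channel caching that the full listings handle.

using System.ServiceModel;

public class ClientRemoteAccess
{
    public T GetRemoteFactoryChannel<T>()
    {
        // Assumed convention: contract "IUser" maps to the
        // endpoint configuration name "User".
        string endpointName = typeof(T).Name.Substring(1);
        ChannelFactory<T> oFactory =
            new ChannelFactory<T>(endpointName);
        return oFactory.CreateChannel();
    }
}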
Tip 4: SQL Server 2005 Stored Procedures, HTTP Endpoints, and Web Services
While this next tip may seem “old news” to black-belt SQL developers, many people are just now getting into the new features in SQL Server 2005. Here’s a feature that doesn’t receive as much attention as it deserves: exposing stored procedures through HTTP endpoints. This means you can make SQL Server 2005 an HTTP listener, and consume your stored procedures as a Web service.
I’ll do two things to make this happen. First, I’ll define an HTTP endpoint in SQL Server 2005. Listing 7 shows an example where I’ve exposed the AdventureWorks stored procedure uspGetEmployeeManagers. Note the names I provided for the endpoint, the webmethod, and the namespace.
Second, I’ll create a Web service reference using the URL http://localhost/Employees?wsdl. Once I do that, I can call the stored procedure as a Web service (Listing 8). Note in Listing 8 that it may be necessary to set Credentials based on the current network credentials. Additionally, the account associated with the Web application must have connection permissions to the endpoint.
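In the spirit of Listing 8, the client-side call might look like the following sketch. The proxy class name is hypothetical; the actual class and method names are generated from the endpoint and WEBMETHOD names in Listing 7, so yours will vary.

using System.Net;

// EmployeesService is a hypothetical name for the proxy class
// generated from http://localhost/Employees?wsdl
EmployeesService oProxy = new EmployeesService();
oProxy.Credentials = CredentialCache.DefaultCredentials;

// Depending on the endpoint's FORMAT option, the generated
// method returns an object array (the default) or a DataSet.
// The 21 is an arbitrary EmployeeID for illustration.
object[] aResults = oProxy.uspGetEmployeeManagers(21);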
Tip 5: Executing SQL Server Integration Services (SSIS) Packages from .NET
When I started using SSIS 2005, it was love at first sight. With each passing day, I love it more.
Sometimes I’ve needed to execute an SSIS 2005 package from within .NET. Listing 9 shows how to execute an SSIS package, regardless of whether the package was deployed to the File System or SQL Server. Note that you must add a .NET reference to Microsoft.SQLServer.ManagedDTS.
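Since Listing 9 isn’t reproduced here, a minimal sketch follows, assuming a hypothetical package path and server name. The key classes come from the Microsoft.SqlServer.Dts.Runtime namespace in Microsoft.SQLServer.ManagedDTS.

using Microsoft.SqlServer.Dts.Runtime;

public static class PackageRunner
{
    // Package deployed to the file system
    public static DTSExecResult RunFromFileSystem(string packagePath)
    {
        Application oApp = new Application();
        Package oPackage = oApp.LoadPackage(packagePath, null);
        return oPackage.Execute();
    }

    // Package deployed to SQL Server (for Windows
    // authentication, pass null for user name and password)
    public static DTSExecResult RunFromSqlServer(
        string packageName, string serverName)
    {
        Application oApp = new Application();
        Package oPackage = oApp.LoadFromSqlServer(
            packageName, serverName, null, null, null);
        return oPackage.Execute();
    }
}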
Tip 6: Dynamic E-mail Generation in SSIS 2005
A frequent SSIS question is how to set properties dynamically. For example, SSIS contains a Send Mail task so that you can send e-mails and e-mail attachments after a task has completed. Oftentimes, the recipient information and attachment file locations will be dynamic, and must be read from variables. Fortunately, this is easy to do, and I’ll strip it down to the basics.
I’ll start by creating SSIS variables for e-mail attachment location and file, as well as Contact information (Figure 1). Then in the SSIS Send Mail task, I need to do two things (both in Figure 2):
- First, because I want to set e-mail properties for run-time evaluation, I need to set the DelayValidation property to true.
- Second, I can go into the Expression mode and map my variables to the corresponding properties. Note that the e-mail subject line is a concatenation of two variables.
Before I finish on SSIS, here’s some important news on SSIS 2008. I’ve had to write a few SSIS scripts over the last year. SSIS 2005 only supports the VB language for SSIS scripts. Alas, I am a C# developer, and while I can write VB code, I (and others) have longed for the ability to write C# code in SSIS scripts. Fortunately, SSIS 2008 allows you to choose which language you want for scripts! Figure 3 demonstrates the option in SSIS 2008.
Tip 7: Crystal Reports.NET Redux (redux)
No matter how many times I’ve blogged or written about reusable classes to set data sources in Crystal Reports .NET, many developers still aren’t sure about setting runtime data sources for reports in Crystal.
(Before I continue, I want to clarify that by Crystal Reports.NET, I’m referring to the version of Crystal Reports that comes with the higher-end versions of Visual Studio 2005 and 2008.)
In the January/February 2007 issue of CoDe Magazine, I wrote an article that presented a reusable library (cgsCrystalTools) with some productivity classes for working with Crystal Reports.NET. One of the functions in the main class shows how to push the contents of any DataSet into a Crystal Reports object: the function takes care of all the tables in the main report object, as well as any tables in any subreport objects. I’ve reprinted it here in Listing 10. You can simply create your own class and include this code, and then call the function (PushReportData) from your application.
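For those without the listing at hand, the essence of PushReportData is a condensed sketch like the one below; the full Listing 10 does more per-table work, but the shape is the same.

using System.Data;
using CrystalDecisions.CrystalReports.Engine;

public static class CrystalHelper
{
    public static void PushReportData(
        DataSet dsReportData, ReportDocument oReport)
    {
        // Bind the tables in the main report by name...
        oReport.SetDataSource(dsReportData);

        // ...and do the same for every subreport object.
        foreach (ReportDocument oSubReport in oReport.Subreports)
            oSubReport.SetDataSource(dsReportData);
    }
}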
I’ve also reprinted a small listing (Listing 11) to show how you can push a .NET collection (e.g., a List<T>) into Crystal Reports.
Despite the fact that it’s possible to build reports in Crystal Reports against custom lists, I still recommend strongly-typed ADO.NET DataSets as the best practice for applications that make moderate to heavy use of Crystal Reports. I’ve written about the methodology that I use in my book, Pro VS 2005 Reports with Crystal Reports and .NET (published by Apress). In a nutshell, here is the approach I take:
- I’ll start by designing a physical model for my typed DataSets, in a separate DLL (e.g. DataSets.DLL). If I have many typed DataSets, sometimes I will generate the XSD schemas from SQL Server.
- I’ll design the reports using the Crystal Reports designer and point to the XSD for the design-time data source.
- I’ll build the reports as a separate DLL (Reports.DLL).
- At run time, I’ll set references in my .NET application to both DataSets.DLL and Reports.DLL. I’ll also set a reference to the Crystal productivity library cgsCrystalTools that I mentioned earlier in this section. This allows me to generate reports to the screen if I’m working in a client/UI tier, or directly to a PDF or other storage if the report is executing in an application layer unattended.
I’ll grant that other methodologies will work, though at the cost of a workaround. For instance, using a custom entity class as a design-time data source means that the class (or a proxy) must reside in the same project as the report. So to this day, I continue to recommend strongly-typed DataSets as the best overall approach for reporting applications in .NET.
Tip 8: Defining Data Access Roles in Analysis Services/OLAP Environment
Do any of the following scenarios sound familiar?
- “User X should only see data for Accounts A, B, and C”
- “User Y should only see data for Market A and Product B”
- “For Product A, User Z can see all Markets. But for Product B, user Z should only see data for Market C”
Organizations expect software systems to filter data based on user roles and definitions (and will react very harshly if users are allowed to see data beyond their authorization levels!) In relational database systems using SQL Server, developers usually must build their own authorization methodology: SQL Server does not contain any built-in functionality for row-level authorization. This means developers must modify SQL queries to take into account any user definitions.
By contrast, Microsoft Analysis Services 2005 makes it a breeze to define user authorization roles in OLAP databases. Figures 4 - 10 contain a walkthrough for defining access to the Bikes product category in the AdventureWorks DW database.
- Connect to SQL Server Analysis Services (in SQL Server Management Studio). Then navigate in Object Explorer to the OLAP database you want to access, expand the database, and right-click on the Roles list to add a new role (Figure 4).
- In the edit role dialog box (Figure 5), provide a role name and a description. Also, set the database permissions (for most users, only the Read Definition permission will be checked).
- In the membership page (Figure 6) add the users and/or user groups that will be part of this role. (The user names/groups come from the list of active Windows user names.)
- Select the database and cubes for access (Figures 7 - 8).
- Important! Go to the Dimension Data page (Figure 9) and define the dimension/attribute/dimension member associated with the role. In this example, I’ve chosen the Bike category. (I also could have selected the Geography dimension to further refine access by any level in the geography hierarchy.)
- Finally, I can also restrict access to specific measures in the OLAP cube (Figure 10).
For anyone who has ever had to write custom SQL code to implement user authorizations, the ability in Analysis Services to utilize the UDM (Unified Dimensional Model) makes this task a breeze!
Tip 9: Custom OLAP Date Dimension and Handling Seasonality with MDX
Without question, one of the many reasons that organizations seek OLAP functionality is to analyze performance over a period of time. Therefore, period (i.e., date) handling functions in MDX are critical.
MDX contains several functions, such as PeriodsToDate, ParallelPeriod, LastPeriods, etc., for analytical purposes. The AdventureWorks DW sample database contains examples of these functions, as well as a good example of multiple date hierarchies (Year, Quarter, Month, both for a Fiscal calendar as well as a regular calendar).
Those capabilities are sufficient for many organizations. However, suppose a company analyzes data by seasonality, where the exact dates differ every year. For instance, Thanksgiving (and the famous shopping day, “Black Friday”) falls on a different date every year, sometimes closer to December 1 than others. Companies might want to analyze sales across years for the week before/week after. Similarly, sporting events such as the Super Bowl and World Series fall on different dates every year.
An even larger example is Lent, whose weeks begin at a different time every year. A company that tracks sales by Lent week (e.g., for fish products) cannot simply compare sales for February/March of 2007 to February/March of 2006.
In this instance, those responsible for maintaining an OLAP database must define additional information in the source of a date calendar dimension. One solution would be to add a LentWeekNum numeric column into the source staging area for a date calendar, where the value represents the Lent Week number associated with that specific date/week.
With all the supporting data in place, a developer could write an MDX query like the one in Listing 12, which lines up Lent Weeks for 2007 and 2006 using the Lent Week Num and the MDX ParallelPeriod function.
Tip 10: MDX Named Sets in SQL Server Analysis Services 2008
T-SQL developers making the transition to OLAP will often seek the counterpart for stored procedures and user-defined functions. Some discover that OLAP/MDX calculated members serve as a nice counterpart for scalar user-defined functions.
The counterpart for stored procedures and table-valued user-defined functions is a very different story. MDX provides named sets, which allow developers to define a stored MDX expression that returns a set of information. Developers can refer to the stored MDX expression by an alias, as needed.
For example, I want to create a named set that will return the 10 most profitable product categories from the AdventureWorks DW database. I also want to show the ranking number for each of the 10 product categories.
Figure 11 shows an example of an MDX named set called [Top10ProductsByProfit]. Note that the MDX expression uses the TOPCOUNT function to retrieve the 10 most profitable product categories.
TOPCOUNT(
    [Product].[Product Categories].[Product].Members,
    10,
    [Measures].[Internet Gross Profit])
A developer can enter the code for the named set in Visual Studio for a project associated with the OLAP cube (as shown in Figure 11), or directly into the OLAP cube (e.g. in SSMS 2005).
Next, the following code (an MDX calculated member) will calculate the ranking number for each product within the top 10 set. Note that the second parameter for the MDX RANK function is the actual named set alias from above.
RANK(
    [Product].[Product Categories].CurrentMember,
    [Top10ProductsByProfit])
A developer can then use the named set on the ROW axis of any MDX query, as follows:
// Note the pre-defined named set and
// calculated member
SELECT
    {[Measures].[Internet Gross Profit],
     [Measures].[ProductProfitRank]}
    ON COLUMNS,
    [Top10ProductsByProfit] ON ROWS
FROM [Adventure Works]
This produces the result set in Figure 12. So far, so good. However, the query above generated the top 10 products by profit for all geographies as an aggregate. Now suppose an analyst wants to see the top 10 products for the country of France:
SELECT
    {[Measures].[Internet Gross Profit],
     [Measures].[ProductProfitRank]}
    ON COLUMNS,
    [Top10ProductsByProfit] ON ROWS
FROM [Adventure Works]
WHERE [Customer].[Country].[France]
The query above will generate the result set in Figure 13. Does something look incorrect? While the product profit rank column appears “correct,” the numbers for gross profit are not in order. In fact, a closer examination reveals that the result set contains the same 10 products that the query displayed in the first result set (for all geographies).
This example demonstrates a serious issue with permanently stored MDX named sets in Analysis Services 2005. In 2005, named sets are static; in other words, SSAS evaluates the MDX expression ONCE, and ignores any subsequent dimension slicing that a user might perform. So essentially, SSAS never sees the slicing for the country of France. While the result set correctly shows the profit dollars for each product for France, the ranking and Top 10 logic are incorrect.
Some are unaware of this issue in SSAS 2005 (and may only become aware of it after someone points out incorrect numbers on a report). Others are aware, and use the only real workaround: include the MDX named set expression “in-line” in a query, thereby losing out on reusability.
One of the top enhancements in SQL Server Analysis Services 2008 is dynamic named sets. As the name implies, dynamic named sets perform as developers expected all along. Figure 14 shows how you can define a named set as dynamic in the Visual Studio interface. After executing the same code, Figure 15 shows the correct results, demonstrating that dynamic named sets recognize any subsequent dimension slicing.
Tip 11: Reading OLAP Data with .NET
.NET developers may need to query OLAP data from within a .NET application. Fortunately, developers can use the same OleDbConnection, OleDbDataAdapter, and OleDbCommand objects that they use when querying relational data. Listing 13 shows an example of querying an OLAP cube, using MDX as the command text.
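A minimal sketch in the spirit of Listing 13 follows, assuming a local SSAS instance that hosts the Adventure Works DW sample database:

using System.Data;
using System.Data.OleDb;

string cConnection = "Provider=MSOLAP;Data Source=localhost;" +
                     "Initial Catalog=Adventure Works DW";
string cMDX = "SELECT [Measures].[Internet Gross Profit] " +
              "ON COLUMNS FROM [Adventure Works]";

using (OleDbConnection oConn = new OleDbConnection(cConnection))
{
    OleDbDataAdapter oAdapter = new OleDbDataAdapter(cMDX, oConn);
    DataTable dtResults = new DataTable();
    // The cellset comes back flattened into rows and columns
    oAdapter.Fill(dtResults);
}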
Additionally, developers may want to “discover” the entire dimension model of an OLAP cube. Using the Microsoft Analysis Services ADOMD Client, a developer can open an AdomdConnection and then iterate through the Cubes, Dimensions, Hierarchies, and Levels collections (Listing 14). A developer can use this approach to construct a custom query/browser-like interface for the OLAP dimension hierarchy.
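In the spirit of Listing 14, here is a sketch of walking the dimension model with the ADOMD client (this assumes a reference to Microsoft.AnalysisServices.AdomdClient and the same sample database as above):

using System;
using Microsoft.AnalysisServices.AdomdClient;

using (AdomdConnection oConn = new AdomdConnection(
    "Data Source=localhost;Initial Catalog=Adventure Works DW"))
{
    oConn.Open();
    foreach (CubeDef oCube in oConn.Cubes)
        foreach (Dimension oDim in oCube.Dimensions)
            foreach (Hierarchy oHier in oDim.Hierarchies)
                foreach (Level oLevel in oHier.Levels)
                    Console.WriteLine("{0} > {1} > {2} > {3}",
                        oCube.Name, oDim.Name,
                        oHier.Name, oLevel.Name);
}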
Tip 12: Searching Dynamically in SQL Server without Using Dynamic T-SQL
Not only are stored procedures often preferable, they are also often necessary. Many companies lock down tables in SQL Server (or whatever back-end database applies) so that stored procedures are the only access to data.
Critics of stored procedures claim that stored procedures aren’t as flexible as other approaches, such as dynamic SQL. Take, for example, a search screen for an end-user customer representative: the screen prompts the user for name, address, city, state, zip, etc. The end user may need to search on a full name and a partial address, or a name that starts with “SMI” and an address that ends in “AVE”, etc. (Presumably, the screen might have a small pull-down beside each field, with options for “equals”, “begins with”, “ends with”, “contains”.)
Again, those advocating dynamic SQL will construct the query based on the fields the user filled in, and feel that stored procedures aren’t as flexible. Listing 15 challenges that opinion, with a stored procedure that does the following:
- Accepts all the possible lookup options as parameters
- Sets any NULL parameters (i.e. fields that the user left blank) to ‘%’ (SQL Server wildcard character)
- Uses a series of <Column> LIKE @ColumnParameter statements in the WHERE clause. Any column whose parameter wasn’t specified is simply matched against the ‘%’ wildcard (which SQL Server optimizes)
All the application must do is place the SQL Server wildcard character (‘%’) at the beginning and/or end of any search string, based on how the user wants to apply the search for that column. Here are some examples:
-- Last Name starts with Smith,
-- lives in NY
EXEC [dbo].[GetCustomers]
@lastname = 'SMITH%',
@STATE = 'NY'
-- First Name is Laura,
-- lives on an AVE, in PA
EXEC [dbo].[GetCustomers]
@FIRSTNAME = 'LAURA',
@ADDRESS = '%AVE%',
@STATE = 'PA'
Assuming you have SQL indexes on all of the search columns (names, address, etc.), the execution plan demonstrates that the query cost is dominated by index seeks and key lookups. I’ve used this approach for customer databases with nearly a million rows, and the searches are usually instantaneous.
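From the application side, placing the wildcards and calling the procedure is a few lines of ADO.NET. A sketch, assuming the GetCustomers procedure above and a hypothetical connection string:

using System.Data;
using System.Data.SqlClient;

string cConnection = "...";  // your connection string here

// "begins with" -> value + "%"; "contains" -> "%" + value + "%"
using (SqlConnection oConn = new SqlConnection(cConnection))
using (SqlCommand oCommand = new SqlCommand(
    "[dbo].[GetCustomers]", oConn))
{
    oCommand.CommandType = CommandType.StoredProcedure;
    oCommand.Parameters.AddWithValue("@lastname", "SMITH" + "%");
    oCommand.Parameters.AddWithValue("@STATE", "NY");
    // Fields the user left blank aren't passed at all; the
    // stored procedure converts NULL parameters to '%'.
    oConn.Open();
    using (SqlDataReader oReader = oCommand.ExecuteReader())
    {
        // consume the search results...
    }
}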
Tip 13: The PerformancePoint Server Web Service API
Microsoft PerformancePoint Server 2007 is one of the more exciting products for Business Intelligence. While PPS 2007 is a 1.0 product, Microsoft will surely enhance it to position it firmly as a capable performance monitoring, scorecarding, and planning application.
While the user base for PPS 2007 is primarily OLAP developers and power-users, .NET developers can take advantage of the PPS 2007 Web service if they need to create/access scorecards and dashboards programmatically. Figure 16 shows some of the available methods in the PPS Web service, accessed via the PPS Web Service URL:
http://localhost:40000/WebService/pmservice.asmx
Final Thoughts
Check my blog (www.TheBakersDozen.net) for any follow-up tips, additional resources, and notes on Baker’s Dozen articles.