Danny Sebestyen recently wrote up his experience with reporting on SharePoint data using SSRS and SSIS. Thanks Danny!
To effectively report on SharePoint data using SQL Server Reporting Services (SSRS) you must copy the data into SQL Server using SSIS. Unfortunately, SharePoint out of the box does not have a solution for reporting on its own data other than configuring list views or building data views. Building a reporting database gives developers a powerful toolkit for building high-performance reports while off-loading the burden of report processing from SharePoint.
So how do you get the SharePoint data loaded into SQL Server so reports can be built and executed in SSRS? Simple, write an SSIS package that executes the following steps:
- Executes a small custom utility to extract the SharePoint data to xml files using the built-in 'Lists' web service.
- Delete the current reporting data in the database
- Load each of the list's xml data files into the corresponding tables
- Send e-mail notifying reporting data has been successfully loaded
Finished Control Flow Example

Small Utility to the Generate the List XML Data Files (see the "Run SharePoint Extractor" task in the Control Flow above):
First create custom views for each SharePoint list to contain the data needed for reporting. The views must be defined to return a set amount of records used for making multiple calls to get all the data from large lists . Also, the views can be used to only extract the data needed for reports instead of all the data in the list. Next, write a method named something like ExtractListData that takes the following parameters: string listName, string fileName, string elementName, string viewName. The ExtractListData method basically makes a call to the SharePoint Lists' GetListItems method to retrieve the data. Once you retrieve the data into the NodeListItems, then there are several important items that need to be done:
- Write out the root xml element
- Scrub all lookup values to other lists you only include the ID value, so you have to remove the #;<data value> from the columns value
- Scrub all the Person/Group values to only the data part of the value so you have to remove the ID and the "#;"
- Remove the z: from the z:row element name because the SSIS XML Source Data Flow Component will not load the data
- Convert any number values to integers if they are defined as integers in the database
- Write out the transformed rows
- Write out the ending root xml element
NOTE: There are probably more types of transformations based on your data.
Delete the Data from the Reporting Tables (see the "Delete All Records" task in the Control Flow above):
The design is to completely delete all the data in the SQL Server reporting tables and then reload all the data from the xml data files generated above. Since the entire process happens rather quickly I did not feel the need to do incremental updates based on my performance requirements, but may be needed if there is a tremendous amount of data in the SharePoint lists. I basically had 16 lists where most lists were < 1000 records and one list with ~50,000 records.
Load Each List's XML Data Files:
To create the SSIS XML Source Data Flow Components you need an XML Schema file for each of the lists. I just opened up the XML files generated above in Visual Studio 2005 and selected the XML->Create Schema item for a quick and dirty way to get the XML Schema. The generated schema will need to cleaned up a little, such as setting lookups to xs:integer.

Now you can create a new SSIS Data Flow task and put an XML Source component in it. Right-click and select Edit… on the XML Source to configure the settings including the schema file and the columns used to load the reporting database.

Next, drop a Row Count Data Flow Transformation and create a link between the XML Source and the Row Count tasks. Next, drop an OLE DB Destination Data Flow Destination and configure it to connect to the reporting database and select the appropriate table. You also need to link all the columns from the XML Source columns to the reporting database table columns to define where the data gets loaded.
Data Flow Task Example:

Send Email to Notify Successful Data Load:
Drop a Send Email Task on the control flow and link it to the last Data Flow task. First, you will need to create an SMTP Connection to configure which email server to use to send the email.

Next, right-click on the task and select Edit… and select the SMTP connection created above and fill out the rest of the email information.
