Many organizations rely heavily upon the data sent by their vendors/customers/branches for planning purposes. Organizations depend on various templates for capturing day today operational and other business related data. This data most often is stored and shared in MS Excel/flat file/ CSV formats. The templates specify the format of the Excel files/flat files/ CSV files and the required fields, the data types, etc. that should be in the file. Most often, an automated Report Generator (using Data Transformation Services) is employed to process these files and give the required analysis output to, say, the business stake holders. The system is supposed to work perfectly every time but it rarely happens, the Excel files/flat files/CSV files do not adhere to the data mapping specifications, resulting in processing failures by the automated Report Generator. The common errors observed are:
- Filenames of the MS Excel spreadsheets are not in conformance with the specifications.
- Worksheet names within the MS Excel files are not in compliance with the specifications.
- Date fields are not uniform (clients/branches use different date formats).
- Data in one or more of the mandatory fields are absent or partially filled.
- Data mismatches occur within fields (E.g. Numeric fields contain string data).
The Report Generator will naturally raise an error when it encounters these small exceptions as it is not configured to monitor and manage them. The process stops and modifications to the input data have to be made manually for the process to resume. This greatly reduces the efficiency of an automated process. Ironically, the automated system now involves intensive manual error-detection and correction. This seemingly time-saving automated process now becomes a time-consuming one, involving the intervention of a development team to find out the errors and rectifying them. In a real business scenario, new clients/branches are being added to the list all the time. The Report Generator has to be configured to process the data from the new clients as well, thus involving development team intervention again.
Manipulating the input files or adding/modifying client details involves working with Data Transformation Services (DTS) which can be efficiently performed only by Database Administrators and seasoned professionals. Instead, a new system can be introduced wherein a normal user can work on it and make the same changes on the DTS packages without actually working directly on the DTS packages. This increases the overall operational ease and the efficiency of the report generation process.
Proposed Solution Design
The process would be a lot more streamlined if the development team is almost entirely bypassed and the operations team is able to handle these situations. A person from the operations team can be given control of the proposed system, which will consist of a User Interface (UI) that works directly on top of the Data Transformation Services (DTS). The UI will facilitate indirect handling of the DTS packages without having to go through the hassles of manual creation/modification of the DTS design. This makes the report generation system more user-friendly and controllable. The main aim of the UI is to achieve the following:
- Decrease the Turnaround Time
- Cost reduction
- Increase the overall maintainability
- Increase operational ease
On successful implementation, the operations team will be able to easily handle the various report generation processes greatly minimizing the need for developer intervention. Efficiency of the system is considerably increased as it removes the involvement of an external team to manage and troubleshoot it. This, in turn, brings down the cost involved as the number of teams working on it has been cut down.
Solution Description - General
The UI can be designed using the Active Template Library (ATL) custom task basic template provided by Microsoft. Active Template Library is a set of template classes provided by Microsoft that allows the programmer to build Component Object Models (COM) that can be invoked programmatically. The time taken to write COM components is greatly reduced if the ATL framework is used. The objects that are built using ATL include a whole array of controls like Web controls, property pages, dialog boxes, etc.
The ATL provides a way to create User Interfaces for any custom Data Transformation Service (DTS) tasks that are created. This will give direct access to the DTS packages and enables easy manipulation and addition of Data Pump tasks to the existing packages. The Operations Team will be working on the UI, providing it with the required data such as Client/Branch details, file name/worksheet name changes, ignore/discard inconsistent data, etc. The following diagrams show the proposed solution in action:
Currently (see diagram above), the Development Team plays a vital role in maintaining the automated system. Even minor modifications have to be performed by them, hence resulting in increased turnaround time and cost. The proposed system (see following diagram) almost completely eliminates Development Team intervention, thus removing the delay. The Operations Team monitors and manages the process and has complete control over the troubleshooting too. They can specify the changes that have to be performed on the DTS and the UI will accomplish it using the COM component designed using the ATL.
UI – Details
The following diagram shows the functions that the UI allows to be performed without the intervention of the development team. The operations team will control the UI and provide inputs to it for performing tasks like adding/modifying client/branch details, changing the file names, ignoring/discarding unnecessary data, etc.
Technologies that can be used
The proposed UI can be built with VB/ASP/C# or ASP.NET using the MS Excel Object Model/ Active Template Library (ATL). Visual Studio .NET 2003 can be used for this project development.
Techniques to achieve the solution
This solution makes use of the Active Template Library (ATL) and the Excel Object Model (EOM). The ATL is used for the creation of UI-based custom DTS tasks while the EOM is used for the creation of components performing direct MS Excel file manipulation. The ATL-based component acts as an interface between the Operations Team and the DTS package, while the EOM-based component forms part of the former, performing the required corrections/additions to the Excel files.
Active Template Library (ATL)
The Active Template Library helps in the creation of custom DTS tasks which are UI-based. The custom task framework is designed using the
UserIF class files. Users can provide inputs to these custom tasks and indirectly create or modify new data pumps. This process almost completely eliminates the intervention by the Development Team.
MS Excel Object Model
Using MS Excel Object Model (Visual Studio tools for the MS Office System), components can be designed to manipulate MS Excel files. The Object Model provides several objects such as:
- Application Object
- Workbook Object
- Worksheet Object
- Range Object
These objects can be used to manipulate Excel files. Filenames can be altered, worksheets within the MS Excel file can be changed, Date formats can be changed, data can be removed or added to existing fields in the worksheet, etc.
Another feasible solution would be to directly handle Package Events from Visual Studio .NET using C#. The
CustomTaskUI method (referenced from Microsoft.SQLServer.DTSPkg80.dll) can be employed for this purpose. This solution will result in an UI which works directly with the Package events and can be used for complex Package event handling. The EOM model will be employed in this solution for Excel file manipulation.
Another feasible solution would be to employ the use of a Stored Procedure to create new data pumps in existing DTS packages. The Operations Team will feed the Client’s details into the UI and this will then be stored into a table in the database. After this, the Stored Procedure will be called from the UI and it will use those details for creating/modifying the data pumps. Again, an EOM-component is incorporated into the solution for Excel File Manipulation.
Comparison of Techniques
Technique 1 involves the design of two separate components - the ATL and EOM-component, thus requiring more development time. Nevertheless, this model is highly modularized, hence, easily maintainable.
Technique 2 involves the design of only the EOM-Component with the UI and the Package-handling components incorporated into it. This is an ideal solution with an efficient and quick implementation, but may require minor Development Team intervention.
Technique 3 is similar to technique 2 with the exception that the system interacts with stored procedures to handle packages. After implementation, the Development Team may still be required to collaborate with the system in case of failed stored procedure executions. It follows a very direct way of handling the DTS packages as the stored procedures work directly on them, thus removing the need for indirect code-level manipulation of packages.
This solution will serve as a very good utility that can be used by the organizations facing problems managing their periodic reporting needs. In addition to the operational ease that it brings to the teams, it also reduces the turnaround time and achieves cost reduction involved in the maintenance of the reporting process. It will be a highly efficient supplement to the already existing Report Generator System in the sense that, the amount of manual intervention is dramatically brought down, hence, resulting in a truly automated system. Developer intervention may still be needed, but only at times when core components need to be modified or added. Security will not be compromised as all the three techniques discussed can be made to strictly enforce security policies. To conclude, this solution will greatly aid in converting the existing system into a failure-free, highly-manageable, time & cost-saving and an easily operable automated system.