The Data Tune System






4.98/5 (36 votes)
The DataTune system was one of the earliest data cleansing systems.
Introduction
In 1998 I developed the first version of the DataTune software which was part of the “TargetData” venture.
Further reading
- http://dataoptimisation.wordpress.com/
- http://datatune.wordpress.com/
- https://www.facebook.com/pages/DataTune/239398039404905?ref=ts
- https://www.facebook.com/pages/DataTune/203757142972477
The purpose was different and the focus was on displaying statistic and demographic data about the state of Israel.
During 2001-2003 I have developed DataTune, a data cleansing system, under the brand name of TargetData, which is part of Target Eye Limited.
Among our clients were government institutes, large corporations including The Standards Institution of Israel, The Israel Export and International Cooperation Institute (IEICI), Microsoft Israel, Bezeq Call, and People and Computers. Here is an historic view of the TargetData web site (2002):
The target data services include
- Data enhancement
- Filtering redundancies (duplicates)
- Data conversion
- Unification and filtering of records (including between various databases)
- Location of potential customers (for Database marketing).
- Integration of various information systems
- Generation of automatic reports
- System analysis
- Development of efficient tools for normal and correct entry of new data
- Location of updated information about private individuals and companies (including investigations, sending people into the field, etc., that is, in case there is no updated information)
- Development of customized software for execution of current activities on the customer’s premises, without the need for external help.
- Improvement of information systems performance
- Development of software and tools based on Excel
- Training and assimilation
Information systems and the importance of maintaining their serviceability
One of the most important assets of the organization, if not the most important, are the information systems. Most of the organization’s information is currently stored in its computer network. In most cases use is made of a number of information systems, that are frequently different from one another, thus creating the need for mutual synchronization and updating of items of information between the various databases.
Furthermore, since a significant proportion of the information is entered manually, a large number of errors arise, causing duplication and loss of information. As a result it becomes difficult to locate items of information at a later stage, overloading is produced, mail is returned, and malfunctions related to customers or suppliers occur.
Major importance is attached to the current updating of parts of the information without affecting the information systems in general. Most of the information items comprising the information systems require updating at some given frequency.
In fact, the frequency of updating, which differs from one item to another, also represents part of the problem. Some of the information items are related to the organization’s work:
- Products
- Prices
- Quotations
- Invoices
- Customer databank.
An additional part, over which there is even less control, includes updating of:
- Addresses. For example: a street name that has been changed.
- Phone numbers. For example: changes in Area Codes or changes in the first few digits of the phone number.
All these subjects require complex processes for the current enhancement of the information systems. In most cases it is necessary to occasionally or periodically refresh all the information systems, and to formulate procedures for maintaining the information systems constantly serviceable.
Here is a screenshot of DataTune's main screen:
General data enhancement
The concept of data enhancement refers to a series of operations intended to significantly improve the quality and efficiency of the organization’s information systems. The following is a description of these operations:
- Separation of fields
- Updating addresses
- Location of addresses containing the names of well-known places instead of the street name
- Interpretation of house numbers
- Data enhancement using an Error Bank
- Name of a street
- Name of a company
- Name of a town
- Data enhancement using Soundex-based algorithms
- Attaching postal (zip) codes and other codes
- Enhancing names of companies and/or private individuals
- Locating updated addresses and phone numbers of companies and private individuals
- Filtering redundancies (duplicates)
In some information systems, a number of fields are kept as single fields. This makes it difficult to perform updating, retrieval, and current maintenance.
For example, saving a customer’s address as a single field (i.e.
“1020 Main St. Appt #5”), will make it difficult to locate all the customers living in a suitable street (i.e.
Main St), or alternatively, will make it difficult to update the street name, if replaced, in the entire records of all customers living in this street.
Target Data has developed a unique method for splitting fields into their components. This method comprises separating full addresses, street addresses, names of people and names of companies. In addition, this method may be used to perform further manipulations of data structures of all kinds, at the customer’s request.
From time to time street names are changed. Furthermore, most information systems contain different versions of streets and towns names, as well as different syntax for address format.
Using Target Data’s exclusive method, a process has been developed to attach to each address a unique identifier, based on the coding system employed by the Ministry of the Interior in each country, giving each town and each street it’s unique numeric code.
The process consists of interpretation of the address in the information system and its conversion into town code, street code, house number, entrance number, apartment number, and floor number.
These data permit immediate attachment of the precise, official address: town name, street name, Zip Code, etc. whenever the street name is changed in any town. This process permits easy and effective attachment of the postal code to every address
In many cases the address field contains the name of a well-known place instead of the street name. For example, DATATECH, a firm situated in the Central Bus Station might mistakenly give its official address as follows: DATATECH, Central Bus Station, floor 5.
We shall identify a wide variety of places in major European towns, such as bus and railway stations, airports, cultural centers, squares, bathing beaches (which in certain cases form the address of hotels), as well as sites which do not belong to a town or a local authority (such as Highway projects). We can also identify shopping malls, commercial centers, industrial zones, etc.
In cases of an un-separated Street Address, the system is capable of interpreting and locating house numbers in addresses, using a variety of methods. In general, an address may contain a house number in several ways: house number/ apartment number, apartment number/ house number, and other combinations, such as floor number and entrance number. The most important of the data (an error in which may cause the mail to be returned) is the house number. We cross-check this number in the known range of numbers for that street. For example, Barnauer St. in Berlin should not contain house numbers greater than 200. In addition we check the apartment number and verify that it is logical, by comparing it with the number of floors usual in that district. For example, it is impossible to find an address on the tenth floor in a neighborhood of cottages.
The Error Bank was a concept used by TargetData to collect common errors and misprints in a databases, built up over the years we do this business. This bank permits DataTune to identify common misprints and relate them to the correct name, which may be:
This is all based on the assumption that errors frequently repeat themselves.
In order to identify unfamiliar misprints, we make use of Soundex-based algorithms that help DataTune to identify misprints according to the language root of the names. According to this method, the words Meerdevoor street in Den Haag can be misprinted as Merdevur, Mardevoor, etc. and yet receive an identical root. This method helps to locate and correct difficult and strange misprints of names of companies, streets, and towns, as well as of people’s names, and to enhance the data in an automatic, rapid and quality manner.
DataTune attaches / verify the updated postal code to every address in the customer’s database in accordance with the updated table of postal codes of the postal authority of each country. DataTune also attaches the codes for streets and towns in accordance with the updated table of codes of the Ministry of the Interior in each country. This enables DataTune to update the changes to street names published by the Ministry of the Interior from time to time.
This process is based on the technique of separating fields, developed by Target Data. At the end of the process an updated database is created permitting access to each element in the customer’s name (individual or company). This process permits locating duplications, locating family relations from the customer database and, if necessary, combining households. The process permits creating a single record for each company, and combining all the contact people with whom the organization is in contact into the main record. In this way the need is removed for updating separately details of information common to a company (address, fax number, Internet website), while at the same time access is retained to details of information specific to each contact person (extension number, or direct phone number, email address, etc.).
In our experience the best way to locate the most up-to-date address and phone number of the customer is to use Bezeq’s records. The file of records of residents is actually updated only after the citizen informs the Ministry of the Interior, and is not always up-to-date. In contrast, since most people own a telephone today (even if they are renting their homes), because of the reduced prices of phone lines, during projects done mostly for Israeli companies and institutes, we have found that Bezeq’s (the Israeli local telephone company) records are more up-to-date.
DataTune can update and verify addresses and phone numbers of private individuals and companies using this method, by querying several sources of information until DataTune locates the most up-to-date address and phone number.
As part of the data enhancement services Target Data offered facilities of marking, followed by filtering of redundant records. This service is implemented after data enhancement since only at this stage it is possible to identify duplicate records which originally contained apparently different data because of errors. Only the enhancement process will permit identification and correction of redundancies at a later stage.
The following source code taken from the DataTune project filters duplicates found in a given table of the Database:
Private Sub FilterDuplicates_Click()
Dim rr As Recordset
Dim Ignore() As Long
Dim IgnoreCount As Long
If Combo1.Text = "Choose" Then
MsgBox "Please select " + Label1
Exit Sub
End If
If Combo2.Text = "Choose" Then
MsgBox "Please select " + Label2
Exit Sub
End If
Open Text1.Text For Output As #1
Print #1, "Target Data Duplicates Report"
Print #1, Date$
Print #1, ""
Print #1, ""
Set CustRS = CustDB.OpenRecordset("select * from " + CurrentTable)
CustRS.MoveLast
CustRS.MoveFirst
RecsTotal = CustRS.RecordCount
ProgressBar1.Max = CustRS.RecordCount
ReDim Ignore(1)
While Not CustRS.EOF
RecsRead = Val(RecsRead) + 1
ProgressBar1.Value = Val(RecsRead)
DoEvents
If IsNull(CustRS(Combo1.Text)) Or IsNull(CustRS(Combo1.Text)) Then GoTo 2
For i = 1 To IgnoreCount
If CustRS(Combo2.Text) = Ignore(i) Then GoTo 2
Next
q$ = "select * from " + CurrentTable + " where " + Combo1.Text + " = " + _
GetRightDelim(CustRS(Combo1.Text)) + " and not(" + Combo2.Text + _
" = " + GetRightDelim(CustRS(Combo2.Text)) + ")"
Set rr = CustDB.OpenRecordset(q$)
If rr.RecordCount > 0 Then
RecsFound = Val(RecsFound + 1)
rr.MoveLast
rr.MoveFirst
ReDim Preserve Ignore(UBound(Ignore) + rr.RecordCount)
While Not rr.EOF
IgnoreCount = IgnoreCount + 1
Ignore(IgnoreCount) = rr(Combo2.Text)
rr.MoveNext
Wend
RecsUpdated = Val(RecsUpdated) + rr.RecordCount + 1
Print #1, "Customer: "; CustRS(Combo2.Text)
Print #1, Chr(9); CustRS(Combo1.Text)
Print #1, Chr(9); "Appears "; rr.RecordCount + 1; " times in your database"
rr.MoveFirst
Print #1, "Records are:"
While Not rr.EOF
Print #1, rr(Combo2.Text),
rr.MoveNext
Wend
Print #1, ""
Print #1, " ----------------------- "
Print #1, ""
'Debug.Print "Found "; rr.RecordCount; " "; CustRS(Combo1.Text)
End If
2:
CustRS.MoveNext
Wend
Close #1
End Sub
Private Sub Form_Load()
Combo1.Text = "Choose"
Combo2.Text = "Choose"
For i = 0 To CustRS.Fields.Count - 1
Combo1.AddItem CustRS(i).Name
Combo2.AddItem CustRS(i).Name
Next
End Sub
For example, the firm of “NewCo” appears in the database as “New Co” because of a mistake in data entry, and also under its correct name. Only after enhancement of the company’s name, i.e. after correcting “New Co” to “NewCo” will the redundancy be identified.
After identification of redundant records, the report is sent to the customer, containing several options: unification of the information in each of the redundant records into a single record (i.e. sending details of the information appearing in separate redundant records, in order not to lose important information).
Another options is to ignore and delete one or more of the redundant appearances of the record.
The third option is to leave the records unchanged. (This option is intended for customers who prefer to delete these records by themselves at a later stage.) It should be noted that, even then, data enhancement permits fast and easy identification of redundancies by the customer himself, by making simple queries.
One of the most frequently encountered problems in data enhancement is that of enhancing records containing names and addresses of company branches. In most cases the official publications and most of the databases contain a record of the company management and/or the head office only. On the other hand, a record giving details of one of the branches may appear in the customer’s databank.
For example, under the heading: Bank of America, there will actually appear one of the branches since this branch is the company’s customer. However, if another branch is added to the database of customers, we do not want this addition to be identified as a redundancy but as an additional record, preferably with a link to Bank Leumi as a major name for each of the branch records.
DataTune is prepared to handle data related to the branches of:
- Banks
- Schools
- Health fund branches
- Income tax branches
- National Insurance branches
- Government ministries
- Large companies
- Marketing chains
DataTune can attach the number of the bank and of the branch in each European country, the address, phone number and fax number. DataTune will also give a unique Sort Code to each branch.
DataTune will attach the school network name (e.g., Ort), the classification (e.g., primary school), the address, phone number, and fax number.
DataTune will attach the name of the fund, the branch name, the manager’s name, address, phone number, and fax number.
DataTune will attach the name of the branch, address, phone number, and fax number.
DataTune will attach the name of the branch, address, phone number, and fax number.
DataTune will attach most of the details for branches of government ministries, such as the Ministry of the Interior, the Licensing Offices, etc.
In the case of records containing information about one or more branches of a large company, DataTune will attach the company’s official name to every record, but will store the information for every branch.
DataTune will identify the branch and the chain to which it belongs.
Target Data has unique methods for improving information system performance. DataTune can improve the processes of retrieval, generation of reports, queries, and transfer of data from one database to another. All these processes may be made more efficient and rapid. This is done by analyzing the most and least frequent processes in the organization, and by specifying priorities by making trade-off between the following criteria:
- Storage area and speed. (This is done by constructing search keys as required or deleting keys no longer required.)
- Saving copies of items of information separately, in order to provide fast access to them, as compared to saving pointers to these items of information, in order to save storage space at the expense of the response time required to retrieve them.
The Target Data venture specialized in providing support for companies and organizations that are just starting, or are in the middle of the process of, assimilation of a CRM system.
It is frequently necessary to once-only or currently convert the contents of one database to the format of another. In many cases the company receives a file of data from a customer or from another source and it is necessary to import these data to the organization’s database. In other cases use is made of a number of applications or database and it is necessary to import or export between different formats.
Target Data offers one-time services for data conversion between different systems, providing support for most information systems including databases, off-the-shelf applications such as Access, Excel, Office, dedicated applications such as WinFax, and applications supporting organizers, such as Palm, Visor, etc.
In addition DataTune offers organizations an application for current synchronization between different systems and platforms. This is a more complex process and involves an initial investment of resources, but as a result the need is removed in the future for conversion, and import and export processes. A synchronization system such as the MultiSyncTM system of Target Data is installed in the company’s computers and constantly checks which data have been updated and in which information system. If necessary automatic updating and conversion is performed on the relevant data in the required information systems or software products. For example, in a computer in which the MultiSync system is installed, linking Palm and WinFax applications, the system will identify the addition of a new customer to the WinFax address book, and will automatically update the Palm address book accordingly, and vice versa.
A unique type of applications in which the TargetData venture specialized in, were addons and tools for MS Excel. These are tools used to automatically execute a variety of operations, such as conversion of accounts between various formats (e.g., a customer requires an application to convert accounts submitted to customers in a new format recently specified by one of his major customers). As a result, the need arises to convert hundreds of accounts from the old to the new format. This tools aids in performing automatic conversion of formats.
The following source code is the Excel class used by DataTune
VERSION 1.0 CLASS
BEGIN
MultiUse = -1 'True
Persistable = 0 'NotPersistable
DataBindingBehavior = 0 'vbNone
DataSourceBehavior = 0 'vbNone
MTSTransactionMode = 0 'NotAnMTSObject
END
Attribute VB_Name = "cExcel"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = True
Attribute VB_PredeclaredId = False
Attribute VB_Exposed = False
' Class : cExcel
' Description : Class for working with Microsoft Excel through Automation
' Usage : Instantiate an object of the class, use the class methods and
': vairables to access the object of excel
'Set m_objExcel = new Excel.Application
'In the StartExcel Method
Private m_objExcel As New Excel.Application
Private m_objWorkbook As Excel.Workbook
Public Property Get AppExcel() As Excel.Application
' Returns: A handle to the current instance of Excel
Set AppExcel = m_objExcel
End Property
Public Property Get CurWorkbook() As Excel.Workbook
' Returns: A handle to the currently open workbook
Set CurWorkbook = m_objWorkbook
End Property
Public Sub CloseExcel()
' Comments : Closes Excel
' Parameters: None
' Returns : Nothing
'
On Error GoTo PROC_ERR
m_objExcel.Quit
Set m_objExcel = Nothing
PROC_EXIT:
Exit Sub
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"CloseExcel"
Resume PROC_EXIT
End Sub
Public Sub CloseWorkbook( _
fSave As Boolean)
' Comments : Closes the current workbook
' Parameters: fSave - True to save changes, False to discard changes
' Returns : Nothing
'
On Error GoTo PROC_ERR
m_objWorkbook.Close SaveChanges:=fSave
PROC_EXIT:
Exit Sub
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"CloseWorkbook"
Resume PROC_EXIT
End Sub
Public Sub CreateTableFromAccess( _
strDatabase As String, _
strDataSource As String, _
fFieldNames As Boolean, _
Optional varMaxRecs As Variant)
' Comments : Gets the contents of an Access table or query into
' the current document
' Parameters: strDatabse - full path and name of the Access database
' you want to read from
' strDataSource - name of a table or query in the database
' to read records from
' fFieldNames - True to put the field names in the first
' row, false otherwise.
' varMaxRecs - optional: set to the maximum number of
' records you want to retrieve. To include all records,
' don't specify this argument.
' Returns : Nothing
'
Dim dbs As DAO.Database
Dim rst As DAO.Recordset
Dim intCounter As Integer
Dim intFieldCount As Integer
Dim lngRowCount As Long
Dim varField As Variant
Dim intRow As Integer
Dim intCol As Integer
On Error GoTo PROC_ERR
' Open the database objects
Set dbs = DAO.DBEngine.OpenDatabase(strDatabase)
Set rst = dbs.OpenRecordset(strDataSource)
intFieldCount = rst.Fields.Count
intRow = 1
intCol = 1
' Add the field names if specified
If fFieldNames Then
For intCounter = 1 To intFieldCount
m_objWorkbook.ActiveSheet.Cells(1, intCounter).Value = _
rst.Fields(intCounter - 1).Name
Next intCounter
End If
' Start inserting data on the second row of the table
lngRowCount = 2
With rst
' Loop through all records
Do Until .EOF
For intCounter = 1 To intFieldCount
' Add each fields value
varField = .Fields(intCounter - 1).Value
' Handle null field values
If IsNull(varField) Then
varField = "<null>"
End If
m_objWorkbook.ActiveSheet.Cells(lngRowCount, intCounter).Value = _
varField
Next intCounter
lngRowCount = lngRowCount + 1
' See if we are still in range
If Not IsMissing(varMaxRecs) Then
If lngRowCount > varMaxRecs Then
Exit Do
End If
End If
' Move to the next record
.MoveNext
Loop
End With
' Cleanup
rst.Close
dbs.Close
PROC_EXIT:
Exit Sub
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"CreateTableFromAccess"
Resume PROC_EXIT
End Sub
Public Sub CreateWorkbook( _
strName As String, _
fSave As Boolean)
' Comments : Creates a new workbook and saves it
' Parameters: strName - name for the new workbook
' fSave - True to save, False to leave unsaved
' Returns : Nothing
'
On Error GoTo PROC_ERR
Set m_objWorkbook = m_objExcel.Workbooks.Add
m_objWorkbook.SaveAs FileName:=strName
PROC_EXIT:
Exit Sub
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"CreateWorkbook"
Resume PROC_EXIT
End Sub
Public Sub InsertValue( _
strRange As String, _
varValue As Variant)
' Comments : Inserts values into cells
' Parameters: strRange - string defining the range to insert into
' varValue - value to insert
' Returns : Nothing
'
On Error GoTo PROC_ERR
m_objWorkbook.ActiveSheet.Range(strRange).Value = varValue
PROC_EXIT:
Exit Sub
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"InsertValue"
Resume PROC_EXIT
End Sub
Public Function OpenWorkbook( _
strFileName As String, _
fReadOnly As Boolean, _
Optional varPassword As Variant) As Excel.Worksheet
' Comments : Opens the named file and associates it with the class
' Parameters: strFileName - full path and name of the file to open
' fReadOnly - True to open readonly
' varPassword - Optional: specify the password if the
' workbook file is password protected.
' Returns : Nothing
'
On Error GoTo PROC_ERR
If Not IsMissing(varPassword) Then
Set m_objWorkbook = m_objExcel.Workbooks.Open( _
strFileName, _
, _
fReadOnly, _
, _
varPassword)
Else
Set m_objWorkbook = m_objExcel.Workbooks.Open( _
strFileName, _
, _
fReadOnly)
End If
PROC_EXIT:
Exit Function
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"OpenWorkbook"
Resume PROC_EXIT
End Function
Public Sub OpenWorkbookFromLib( _
strFileName As String, _
fReadOnly As Boolean, _
Optional varPassword As Variant)
' Comments : Opens the named file and associates it with the class.
' This version looks in the Excel library folder.
' Parameters: strFileName - name of the file to open
' fReadOnly - True to open readonly
' varPassword - Optional: specify the password if the
' workbook file is password protected.
' Returns : Nothing
'
Dim strLibPath As String
On Error GoTo PROC_ERR
strLibPath = m_objExcel.LibraryPath & _
m_objExcel.PathSeparator & _
strFileName
If Not IsMissing(varPassword) Then
Set m_objWorkbook = m_objExcel.Workbooks.Open( _
strLibPath, _
, _
fReadOnly, _
, _
varPassword)
Else
Set m_objWorkbook = m_objExcel.Workbooks.Open( _
strLibPath, _
, _
fReadOnly)
End If
PROC_EXIT:
Exit Sub
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"OpenWorkbookFromLib"
Resume PROC_EXIT
End Sub
Public Sub PrintSheet( _
intFrom As Integer, _
intTo As Integer, _
intCopies As Integer, _
fPreview As Boolean, _
fPrintToFile As Boolean, _
fCollate As Boolean)
' Comments : Prints the active workbook
' Parameters: intFrom - starting page number
' intTo - ending page number
' intCopies - number of copies
' fPreview - True for print preview
' fPrintToFile - True to print to a file. Excel will prompt
' for the filename when this is set to True.
' fCollate - True to collate copies
' Returns : Nothing
'
On Error GoTo PROC_ERR
m_objWorkbook.PrintOut _
intFrom, _
intTo, _
intCopies, _
fPreview, _
, _
fPrintToFile, _
fCollate
PROC_EXIT:
Exit Sub
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"PrintSheet"
Resume PROC_EXIT
End Sub
Public Sub SortRange( _
strRange As String, _
strKey As String, _
Optional fAscending As Boolean = False)
' Comments : Sorts the specified range
' Parameters: strRange - range to sort
' strKey - range to use as the key for sorting
' fAscending - True for ascending, False for descending
' Returns : Nothing
'
Dim lngSort As Integer
If fAscending Then
lngSort = xlAscending
Else
lngSort = xlDescending
End If
m_objWorkbook.ActiveSheet.Range(strRange).Sort _
Key1:=ActiveSheet.Range(strKey), order1:=lngSort
PROC_EXIT:
Exit Sub
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"SortRange"
Resume PROC_EXIT
End Sub
Public Function ReadRange(strRange As String) As Variant
' Parameters: strRange - range to sort
' strKey - range to use as the key for sorting
' fAscending - True for ascending, False for descending
' Returns : Nothing
'
Dim lngSort As Integer
ReadRange = m_objWorkbook.ActiveSheet.Range(strRange)
PROC_EXIT:
Exit Function
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"SortRange"
Resume PROC_EXIT
End Function
Public Sub StartExcel(fVisible As Boolean)
' Comments : Starts an instance of Excel
' Parameters: fVisible - True to make Excel visible
' Returns : Nothing
On Error GoTo PROC_ERR
'SEE COMMENTS IN DECLARATIONS FOR REASONS THIS IS COMMENTED OUT
'AND WHY YOU MIGHT WANT TO UNCOMMENT IT
'Set m_objExcel = New Excel.Application
m_objExcel.Visible = fVisible
PROC_EXIT:
Exit Sub
PROC_ERR:
MsgBox "Error: " & Err.Number & ". " & Err.Description, , _
"StartExcel"
Resume PROC_EXIT
End Sub
Special projects
Projects like the one in the screenshot were created using the DataTune technology as turn-key solutions for specific customers. In the image: a project for Del Engineering.
Attaching details about companies
It is possible to attach to the names of companies a broad range of details, including private company number, number of workers, date of establishment, sales turnover, names of contact persons, area of activities.
Location of potential customers
The TargetData venture offered the option of locating potential customers amongst companies and authorized traders, based on criteria such as number of workers, sales turnover, geographic region, where the proposed companies database includes only those companies that do not already exist in your organization’s database. The TargetData venture offered this service as an addition to the project, to be executed on completion of the project, so that the data regarding potential customers will be integrated into the existing data only after enhancement. In other words, if the company name is incorrect it should be enhanced first of all in order to identify it as a recognized name in the database of potential customers, so that it can be filtered from this database.
- Development of a user interface and search keys
DataTune allows defining its outputs of its work in any format the customer chooses. However, it is possible to include in the work output an Access or Excel application including the following:
- A user-friendly interface containing buttons and menus for data retrieval, report generation, and updating of data.
- Search keys permitting extremely rapid data retrieval.
- Construction of ready-made queries for performing retrieval operating known in advance (e.g. retrieval of all the data about customers whose enlistment date lies in a range of given dates).
- Issue of reports based on user’s characteristics.
- Control of several databases from a single window.
Systems analysis
The project may include comprehensive systems analysis. This is required when:
- It is necessary to convert an existing system.
- It is necessary to develop a system which will interface with an existing system.
- It is necessary to develop customized software.
Systems analysis permits the TargetData staff to thoroughly learn about the work processes currently existing, in order to develop a system which will adopt the work processes and the logic behind them so that they can be assimilated in the computerized system. The systems analysis will take about three weeks and will contain the following elements:
- A written questionnaire: At this stage you will be sent a list of questions about various subjects related to the existing work processes, professionals, roles, time durations and products.
- An oral questionnaire: In this stage the TargetData staff interviews the key personnel in the organization in order to acquire in-depth understanding of the work process, locate bottlenecks, and construct a flow chart of the work process. Analysis of the information systems includes the analysis of current and future systems.
The Project Manager assigned by TargetData to work with the customer makes an in-depth investigation of which items of information are required by the customer, which items are collected, which items are collected but not saved, which items may be retrieved at a later stage (for example, for purposes of marketing additional projects to customers), which items are unnecessary, etc.
Technology and infrastructure
Methodology
Target Data makes use of unique tools that it has developed for automatic data processing on a tremendous scale.
At the end of the automatic process a list is produced of the data requiring manual examination and enhancement.
Manual enhancement
Development tools
The unique system for enhancement of addresses
DataTune was developed under the following platforms:
- Microsoft Visual C++ 6.0
- Microsoft Visual Basic 6.0
- Microsoft Access
- VBA (Visual Basic for Application) — automation and macro programs for Office and other Microsoft tools.
The DataTune Scrambler
DataTune has a scrambling feature which can be used for providing a "pilot" prior to starting a project. The advantage of a "pilot" is the ability to randomly select a number of records (usually 500) and perform data cleansing on these records. TargetData has provided such pilot free of charge. The part in the DataTune application that does that is this subroutine:
Private Sub DoScramble()
Dim rs1, rs2 As Recordset
Dim NumRecs As Long
Randomize Timer
Status = "Create temp table"
DoEvents
CreateTDTable "td_temp1"
Status = "Create Table " + Text2.Text
DoEvents
CreateTDTable Text2.Text
Status = "Copy data to temp table"
DoEvents
CopyAllData CurrentTable, "td_temp1"
Status = "Start scrambling"
DoEvents
Set rs2 = CustDB.OpenRecordset(Text2.Text)
Set rs1 = CustDB.OpenRecordset("td_temp1")
rs1.MoveLast
rs1.MoveFirst
NumRecs = rs1.RecordCount
ProgressBar1.Max = Val(Text1.Text)
For k = 1 To Val(Text1.Text)
ProgressBar1.Value = k
DoEvents
jump = Int(Rnd(1) * (NumRecs - 1))
Status = "Sampling record #" + Trim(Str(jump))
rs1.Move jump
rs2.AddNew
For i = 0 To rs1.Fields.Count - 1
rs2(i) = rs1(i)
Next
rs2.Update
rs1.Delete
Set rs1 = CustDB.OpenRecordset("td_temp1")
rs1.MoveLast
rs1.MoveFirst
NumRecs = rs1.RecordCount
Next
rs1.Close
CustDB.TableDefs.Delete "td_temp1"
Status = "Completed"
rTables.FindFirst "TableName = '" + Text2.Text + "'"
If rTables.NoMatch Then
rTables.AddNew
rTables("TABLENAME") = Text2.Text
rTables("PROJECTID") = ProjectID
rTables.Update
End If
DoEvents
End Sub
Test and optimization tools
- Numega — Bounds Check was used for source code testing.
Platforms supported
DataTune supports the following Operating Systems:
- Windows
- DOS
- UNIX
- Mainframe systems
- AS 400
- Mac OS
- LINUX
The unique mechanism for learning the structure of unfamiliar files
The output of DataTune may be supplied as an EXE file. In certain cases the source code may be purchased from TargetData.
The work process Target Data offered to clients was generally implemented as follows:
No. | Activity | Time required |
0 | Meeting in the customer’s offices. Alternatively a pilot scheme may be initiated. | |
1 | Executing a pilot scheme of 500 records (if the customer is interested in this). | Generally within 4 working days |
2 | Preparing a proposal. | Generally within 3 working days |
3 | Executing the work and installing the products. | Generally within 10 working days |
4 | Performing training in the customer’s offices, or by phone, if necessary. | When installing the work products, or at a different time at the customer’s request |
5 | Assimilation and accompaniment. | During one year from the date of completion of the project |
6 | Receiving feedback and introduction improvements to the software as required. | Up to one year from the date of completion of the project |