Sunday, March 4, 2007

Combining Datafiles

Extracting text-based data can be time-consuming and a real hassle, especially when you have to combine data files without letting a file "blow out" with duplicated data. To help automate a couple of processes, I wrote this console application in VB .NET. The file is small (15KB) and does not require installation, but the Microsoft .NET Framework Version 1.1 must be installed on any workstation on which you want to run it.

The zip file containing NGCombine.exe is hosted on my personal website. Once downloaded, open the zip file and copy NGCombine.exe into the system directory of your workstation or into a directory of your choosing. If you put the file in the system directory, you will not need to use a full filepath to call it.

NGCombine.exe is limited to processing 1,000,000 records (that is, lines of text) per file, which I think is plenty for most of us! Details of the call syntax follow:

NGCombine.exe
Combines the contents of two text files, sorting the data and eliminating empty lines. (If applied to a single file, that file is sorted.)

Syntax
NGCombine [/a [X:\...]] [/n [X:\...]] [/o [X:\...]] [/e] [/r] [/?]

Parameters
/a [X:\...]
Required: Specifies filepath for the file containing original or "Archive" data.

/n [X:\...]
Specifies filepath for the file containing "New" or incoming data. If this file is not specified, a data sort will occur on the original or "Archive" data only.

/o [X:\...]
Specifies filepath for the output data. If this file is not specified, the default output filepath is the original or "Archive" data filepath.

/e
Eliminates any duplicate lines of data.

/r
Reverses the sort order.

/?
Displays help at the command prompt.
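
For anyone who wants to see what the switches amount to, here is a minimal Python sketch of the behaviour described above. It is not the VB .NET source of NGCombine.exe, only an illustration under a few assumptions: the function names combine_lines and combine_files are made up for this example, lines are sorted in Python's default string order (the real tool may order them differently), only zero-length lines count as "empty", files are read as UTF-8, and the 1,000,000 record limit is not enforced.

```python
from typing import Iterable, List, Optional


def combine_lines(archive: Iterable[str],
                  new: Optional[Iterable[str]] = None,
                  eliminate_duplicates: bool = False,
                  reverse: bool = False) -> List[str]:
    """Merge, clean and sort lines roughly as the switches describe."""
    merged = list(archive)
    if new is not None:
        # With no "New" data (/n omitted) only the archive data is sorted.
        merged.extend(new)

    # Strip line endings, then drop empty lines.
    # Assumption: whitespace-only lines are kept as data.
    lines = [line.rstrip("\r\n") for line in merged]
    lines = [line for line in lines if line != ""]

    if eliminate_duplicates:
        # Corresponds to /e: keep one copy of each distinct line.
        lines = list(set(lines))

    # Corresponds to /r when reverse is True.
    # Assumption: plain string comparison, not a locale-aware sort.
    return sorted(lines, reverse=reverse)


def combine_files(archive_path: str,
                  new_path: Optional[str] = None,
                  output_path: Optional[str] = None,
                  eliminate_duplicates: bool = False,
                  reverse: bool = False) -> None:
    """File-based wrapper: /a, /n and /o map to the three path arguments."""
    with open(archive_path, "r", encoding="utf-8") as handle:
        archive = handle.readlines()

    new = None
    if new_path is not None:
        with open(new_path, "r", encoding="utf-8") as handle:
            new = handle.readlines()

    result = combine_lines(archive, new, eliminate_duplicates, reverse)

    # With no output path, the archive file itself is overwritten.
    target = output_path if output_path is not None else archive_path
    with open(target, "w", encoding="utf-8") as handle:
        handle.writelines(line + "\n" for line in result)
```

Calling combine_files("archive.txt", "new.txt", eliminate_duplicates=True) would correspond roughly to running NGCombine with /a, /n and /e and letting the output default to the archive file (the two file names here are placeholders).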
