SORT/MERGE/XTRACT
A Package Of OS/8 Compatible Sort - Merge Programs

Small Computer Laboratory
Department of Physiology and Biophysics
West Virginia University Medical Center
Morgantown, West Virginia 26506

SORT    Version 8  June 1, 1977
MERGE   Version 2  June 1, 1977
XTRACT  Version 1  June 1, 1977

This document describes a package of programs for dealing with OS/8 ASCII files. The principal utility is SORT, which was written in its original version by C. G. Roby Jr. MERGE and XTRACT are companion programs to assist in the efficient sorting of large data sets. MERGE was written by James B. Coryell of Dataproducts Corporation and XTRACT by Thomas W. McIntyre and Alan Smothers of WVU. JBC also extensively revised SORT from version 6 to 7. TMC is responsible for the present versions of all the programs.

SORT

SORT is a program to sort OS/8 compatible ASCII files. The sorting is by records. A record is defined to be a string of no more than 512 ASCII characters consisting of one or more lines. A record is terminated by "n" lines or by an arbitrary Record-Mark (rm). A line is terminated with a line feed; normally any of the characters LF, FF, or VT will be taken as a line terminator. CR is always ignored. If /Y is given as an option, FF is also ignored. All other control characters except tabs are ignored.

A record which has only control characters is called a 'null' record. 'Null' records are ignored; that is, there will be no null records in the output file.

The user has the option to define fields for the sorting either by fixed column positions or bounded by arbitrary delimiting characters. The sorting can be either ascending or descending within each field.

The sorting procedure used is a multi-pass sort-merge with intermediate temporary output files. The devices for the output files may be specified by the user to optimize the sorting.
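The record rules given above can be illustrated with a short sketch. This is written in Python for readability and is not taken from the SORT source. The function name split_records and its parameters are hypothetical. The sketch assumes that a Record-Mark ends a record wherever it appears, that DEL counts as a control character, and that the lines of a multi-line record are joined with a line feed. The 512-character limit is not modeled, because the text does not say how an over-long record is handled.

```python
# Illustrative sketch only; names and structure are assumptions, not SORT's code.
LINE_TERMINATORS = {"\n", "\x0b", "\x0c"}  # LF, VT, FF


def split_records(text, lines_per_record=1, record_mark=None, ignore_ff=False):
    """Split text into records following the rules described in the manual."""
    terminators = set(LINE_TERMINATORS)
    if ignore_ff:
        terminators.discard("\x0c")  # /Y option: FF is ignored, like CR

    records = []
    lines = []    # completed lines of the record being built
    current = []  # characters of the line being read

    def flush():
        # Close the current record and keep it unless it is a 'null' record.
        if current:
            lines.append("".join(current))
            current.clear()
        # Only tabs and non-control characters are ever stored, so a record
        # is null when nothing but tabs (or nothing at all) was collected.
        if any(ch != "\t" for line in lines for ch in line):
            records.append("\n".join(lines))
        lines.clear()

    for ch in text:
        if record_mark is not None and ch == record_mark:
            flush()  # assumption: the Record-Mark ends the record immediately
        elif ch in terminators:
            lines.append("".join(current))
            current.clear()
            if len(lines) >= lines_per_record:
                flush()
        elif ch == "\t" or (ch >= " " and ch != "\x7f"):
            current.append(ch)
        # Anything else (CR, other control characters) is dropped.

    flush()  # emit a trailing record that lacks a final terminator
    return records
```

For example, split_records("beta\r\n\r\n\x07\nalpha\x0cgamma\n") yields the three records "beta", "alpha", and "gamma": the carriage returns and the bell character are dropped, the two records left with no text are discarded as null, and the form feed acts as a line terminator. With ignore_ff=True the form feed is dropped instead, so "alpha" and "gamma" run together into one record.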
To run SORT under OS/8, type in response to the monitor ".":

.R SORT
or
.R SORT (field definitions)
or
.R SORT (field definitions) n
or
.R SORT (field definitions) n #rm

When SORT is loaded it calls the Command Decoder for user input of I/O specifications. In response to the asterisk the user may specify up to nine input files and up to three output files. The input files will be construed to be a single file. The first output file is the sorted output; the next two are the intermediate work files. If the intermediate files are omitted, the files DSK:SRTWK1.TM and DSK:SRTWK2.TM are created. If only devices are specified for the second and third output files, then SRTWK1.TM and SRTWK2.TM will be created on the user-specified devices.

For small or slow I/O devices, the temporary files should be on different devices to improve the sort speed. The first pass output is to the third output file. The lengths of the intermediate files will be one and one-half to twice the length of all the input files. The intermediate files must reside on directory devices. If SORT cannot open these two files, an error will be printed out.

The option switch "S" may be used to specify BASIC compatible stripped ASCII compares. The input files are still assumed to be in full ASCII, but the strange compares of BASIC are used (i.e. the control characters and numbers are "later" in the ASCII sequence than the alphabetics). If this option is used, the delimiter form of field specification cannot be used with a numeric field.

Examples of command decoder lines follow:

*SRTOUT