Using DBMS/Copy

DBMS/Copy is a very useful program for converting data from one format, such as an Excel spreadsheet, to another format such as a Stata ".dta" or SPSS ".sav" file. On the PC, it has an easy-to-use graphical interface and can convert among dozens of different file types. On UNIX, it also has an easy-to-use graphical interface as well as a command-line interface. It can even be used to extract data from ascii/text files. Although the instructions here describe many steps, they are rather simple. Typically, converting a file from SAS to Stata, for example, should only take a few minutes.

DBMS/Copy on the PC can be found in any of the CIT clusters as well as the Data and Statistical Services data lab. DBMS/Copy on UNIX can be found on any of the Arizona machines. If you need help using DBMS/Copy, please contact Data and Statistical Services.


Tips for Using DBMS/COPY

When you don't Need DBMS/COPY to Convert File Formats
Some statistical packages can read data files created by other programs. These statistical packages can read the data files directly, without having to convert them first. Here are some instances when you don't need DBMS/COPY to convert file formats.

STATA includes the insheet command for reading tab- and comma-delimited files. Excel and Datastream files can be saved in this format. For details see STATA: How to Import Excel and Datastream Files.

The S-PLUS package has built-in procedures for accessing SAS data files as well as Ascii comma- or tab-delimited files.

A Note for SAS users
SAS has several different types of files and how you convert them depends on how they were created. If you have problems converting a SAS file, please look at these instructions

Verification
Even when a file converts successfully, it is recommended that you verify the conversion with some summary statistics. First generate some summary statistics (e.g, means) and a few frequency distributions using the original file, and then compare them against the same statistics generated from the new file.

Missing data values often do not convert in a way that users expect, and these are easily detected by checking the minimum and maximum values for particular variables along with the number of non-missing observations.

Finally, you can print all of the data for a few observations and compare them across the original and converted files. Select a few observations from the beginning of the file and a few from the end of the file.

Transferring Files
Often you need to not only covert the format of the file you are using, but you also need to copy the file from one computing platform to another. Some file formats, such as STATA, can be transferred from one computer to another without any special procedures. Other files formats, such as SAS and SPSS, must be saved in transport or portable formats before they can be transferred from the computer where they were created.

File Transfer Protocol (FTP) is one of the most common programs used to copy files between UNIX, MVS, PCs and other types of computers. See the documentation on this web for How to Transfer Files.

Errors
When you invoke DBMS/COPY, a file named dbmscopy.log is automatically written to your UNIX home directory. This file contains the commands that were sent to DBMS/COPY as well as responses to the commands. This file is appended with subsequent invocations of DBMS/COPY.

For example, the following entry in dbmscopy.log shows the commands that were used to convert the format of a file. The log file indicates that the commands were sent from an interactive (Xwindows) session. For diagnostic purposes, you can also see the results of the conversion, namely the number of records and variables written to the new file. You should compare these figures against the original file. Any errors encountered during processing would appear in the log file.

*** Input From Interactive Copy              Thu Jan  2 12:23:10 1997
compute;
in=/scratch/bigdata/hsbstud.ssdsun
   out=/scratch/bigdata/hsb.spssport;
run;
3941 Records Of 89 Variables Written To /scratch/bigdata/hsb.spssport