Home
Online Help
Statistical Packages
Stata
Data Management
Converting to and from Excel files
Converting Excel to Stata
Converting Stata to Excel
Spreadsheet and database programs, such as Excel and Datastream, can save and read files in tab- and comma-delimited format. These files contain one observation per line where the values are separated by tabs or commas. In addition, the first line of the file may contain variable names. Comma-delimited files often have an extension of .csv for "comma-separated values."
This is an example of a spreadsheet from Excel. It cannot be brought into Stata without some editing and saving it as a ".csv" file.
The following conventions must be followed for Stata to read your spreadsheet successfully:

Once you have completed the changes and you are in Stata, give the command:
to read in the file.
Here are step-by-step instructions for saving an Excel worksheet as a comma-delimited file and reading that file into Stata. Follow these instructions if you have saved your data as an Excel file (.xls) or as a comma-delimited file (.csv).
a. Make sure that missing data values are coded as blank or as numeric values (e.g., 999 or -1). Do not use periods (.) or any other character values (e.g., N/A) to represent missing data. Simply leave the cells empty or code a number.
b. Make sure that there are no commas in the numbers. You can change this under Format menu, then select Cells....
c. Make sure that variable names are included only in the first row of your spreadsheet. There should be only one row of variable names (some files produced by databases have several header rows). Variable names should be 32 characters or less, start with a letter and contain no special characters, such as $ or &, except the underscore [_]. You should eliminate embedded blanks (spaces).
insheet using country.csv
where country.csv is the name of the comma-delimited file.One particular thing to be aware of after insheet is that sometimes (for various quirky reasons) Stata will mistakenly read numeric variables as strings. Issue the command describe (abreviated d) and look at the output. The way you can tell from describe output is to look in the "storage type" column. Anything that starts with "str" is a string and anything else is a number. If you see that a variable that is supposed to be numeric has a storage type that starts with "str" you know Stata's messed up and you will need to fiddle with your text file to get it right.
Sample Data
This is what the data should look like in Excel.
Notice that variable names are in the first row only,
and variable names are 32 characters or less.
Notice also that the data values contain no
extraneous symbols and that cells with missing data
have been left blank.
country year nomdepr realdeva infl changegdp
CHILE 1981 -.6033136 .1875 .1839487
CHILE 1982 .8828205 -.6142148 .1052632 -.0267065
CHILE 1983 .1920196 -.6747882 .2380952 .2571221
MEXICO 1989 .1578255 -.0900467 .2006079 .2890742
MEXICO 1990 .1152594 -.1551962 .2658228 .4331929
MEXICO 1991 .0426428 -.2513676 .2072532
SPAIN 1990 .1980529 -.3306062 .1549296 .1490039
SPAIN 1991 .229653 -.3478578 .1463415 .1237474
Here is the same data saved in
comma-delimited format.
The first row of the file contains the names of the variables.
Each subsequent row contains a value for each of the six variables.
Each value is separated by a comma.
"country","year","nomdepr","realdeva","infl","changegdp"
"CHILE",1981,,-.6033136,.1875,.1839487
"CHILE",1982,.8828205,-.6142148,.1052632,-.0267065
"CHILE",1983,.1920196,-.6747882,.2380952,.2571221
"MEXICO",1989,.1578255,-.0900467,.2006079,.2890742
"MEXICO",1990,.1152594,-.1551962,.2658228,.4331929
"MEXICO",1991,.0426428,-.2513676,,.2072532
"SPAIN",1990,.1980529,-.3306062,.1549296,.1490039
"SPAIN",1991,.229653,-.3478578,.1463415,.1237474
Notes
The general syntax of the Stata command insheet is:
insheet using [filename]
where filename is the name of your comma-delimited file. For example:
insheet using mydata.csv
Since file extensions vary from program to program, you should provide the file extension as part of the filename on the insheet command. File extensions of .txt and .csv are common.
While it is generally easier to convert an Excel file to Stata with the insheet command, there are some cases where you may want to use DBMS/COPY to accomplish the conversion. You should consider using DBMS/COPY if you have a lot of missing data (see step #2 above) or you are converting multiple worksheets, i.e., a workbook with several spreadsheets.
Sometimes you may want to work with a dataset in Excel rather than Stata. To convert a Stata file to Excel, use the "outsheet" command:
outsheet using mydata.csv,c
This command will create a comma-delimited file called "mydata.csv". This file can be opened in Excel by simply double-clicking on the file's icon. Be sure to use the "csv" extension, as Excel recognizes this by default. Occasionally, when you open the file in Excel, some of the values will not look "right." For example, you may see scientific notation where there was none in the original Stata file. This is due to differences in how Excel and Stata format the values. To correct this, first try using the "Format...Cells" command in Excel. If that does not work, then you must go back to Stata and use Stata's format command. For more information about the format command, see Stata's online help, the Stata manual, or ask a DSS consultant.