Princeton University Library Data and Statistical 

Introduction to Stata

Stata is a statistical analysis package with programming capabilities. A variety of tasks can be accomplished by issuing commands interactively from the command line. There are commands built into Stata that allow the user to do statistical analysis such as regression on pre-formatted data sets.

Stata's commands can also be combined in sequences to solve complex data management and analysis problems. These sequences of commands can be saved in Stata "do files" and run over and over.

Issuing Commands

Commands can be executed one at a time at the Stata prompt (command window in Stata for Windows). Just type the command and hit the "Enter" key. Alternatively, groups of commands can be entered into do files which can then be executed.
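
As a rough illustration of a do file, the sketch below collects a few commands into one file. The file name myanalysis.do is hypothetical, and the example assumes the auto data set that ships with Stata; substitute your own data and commands.

```stata
* myanalysis.do (hypothetical file name)
* Load the example data set that ships with Stata
sysuse auto, clear
* Summary statistics for two variables
summarize price mpg
* A simple linear regression
regress price mpg weight
```

Assuming the file is saved in the current working directory, typing do myanalysis at the Stata prompt runs all of the commands in order.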

Stata's Online Help

There is online help available inside Stata. To get help for a command, simply type "help" and the name of the command. If the command exists and you typed in the name correctly then the help screen for that command will display.

If you are not sure which command you need, you can type "search" and a keyword. Stata will display a list of commands and other resources associated with that keyword, if there are any. Click on the name of one of the commands or resources to display the help screen.

Alternatively, you can use the "Help" menu, and click on "Stata Command" if you know the command or "Search" if you don't.
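
For instance, the two approaches described above might look like this. The command name and keyword are chosen only for illustration.

```stata
* Open the help screen for a command whose name you know
help regress
* Look up commands and other resources by keyword
search logistic
```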

Stata displays information one screen at a time. To proceed to the next screen, hit the space bar or click on the --more-- prompt at the bottom left corner of the results window. If you've seen enough, hit control-k (hold down the control key and type k) on Windows or control-c on Unix to cut off the flow of information. Alternatively, click the break button, a red circle with an X through it, near the top of the Stata window.

Operating System Interface

Stata starts in its default working folder or directory, typically C:\data or C:\stata. If you don't change it to something else, Stata will assume that any file name you type is in the default directory. Since normally your data will be in other directories, you need the cd (Change Directory) command:

. cd c:\temp

This changes the directory to C:\temp, and until you issue another cd command, Stata will now look for files there.

In case you're not sure where you are, you can always find out by giving the pwd (Print Working Directory) command:

. pwd
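
A slightly fuller sketch of moving between folders follows. The folder name is hypothetical; note that a path containing spaces needs to be enclosed in double quotes.

```stata
* Change to a folder whose name contains a space (hypothetical path)
cd "C:\My Documents\project"
* Confirm where you are
pwd
* List the files in the current working directory
dir
```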

Memory requirements

Sometimes you may need to allocate additional memory for your Stata session, such as when you are working with a large file. If you receive this message from Stata when trying to open a data file:

     no room to add more observations

then you should increase the amount of memory available to your Stata session. (Stata 12 and later manage memory automatically, so the steps below apply only to older versions.) Here's how.

  1. Find out how large the file is. First, issue the clear command to remove the file from memory. Then issue the desc using filename command:
          desc using mydata.dta
    At the top of the information listed is the size of the file, in bytes. There are 1,000,000 bytes in a megabyte, so if the size is 11,000,000 then the file is 11 megabytes. For example:
    Contains data                                
      obs:         1,248                                                            
     vars:           104                                                            
     size:    11,576,600                                                            
    This shows that the file has 1248 observations, 104 variables, and is just slightly over 11.5 megabytes.

  2. Issue the command set memory to increase the amount of memory. For example, the following command allocates 15 megabytes of memory to the current Stata session:
         set memory 15m
    Set the memory to a number one and a half to two times the size of the file you are trying to read. Note: setting the memory to an unnecessarily large number will slow Stata down, or even cause it to freeze completely.
  3. Now read your data file.
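
Put together, the three steps might look like the sketch below. It assumes the example file mydata.dta from step 1 and its reported size of 11,576,600 bytes; the 18m figure is simply that size multiplied by 1.5 and rounded up, following the guideline above.

```stata
* Step 1: remove any data from memory, then inspect the file on disk
clear
describe using mydata.dta

* Use display as a calculator: 1.5 times the file size, in megabytes
* (1.5 * 11,576,600 bytes is about 17.4 million bytes)
display 1.5 * 11576600 / 1000000

* Step 2: allocate a little more than that
set memory 18m

* Step 3: read the data file
use mydata.dta
```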

How to Reduce Memory Requirements

You may be able to reduce your memory requirements by saving your data more efficiently. You can use the compress command to reduce the amount of memory your data consumes.
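
A minimal sketch of compress in use, assuming a file named myfile.dta in the current working directory. Running describe before and after lets you compare the reported size.

```stata
* Load the data and note its size
use myfile.dta, clear
describe, short

* Store each variable in the smallest type that holds its values
compress

* Check the size again
describe, short
```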


You need to save the new version of the file in order for the changes to remain in effect. Issue the Stata save command followed either by a new name for your Stata file or by the replace option. For example:

   save myfile2

will save a new Stata file named myfile2.dta. You will then have both the original file, myfile.dta, and the new file.

Alternatively, the following command will overwrite the original file:

   save, replace

Keeping Track of Your Work

It's very important to keep permanent copies of the work you do in Stata. The way to do this is with the log command:

. log using 090904.log

Once you've issued the log command, everything you type and all the output Stata produces is recorded automatically in the .log file you specify. The .log file is saved automatically when you exit Stata.

A .log file is a plain text file that you can open and edit in Word, Excel, Notepad, and many other software packages. It is important that you type ".log" at the end of the log command. Do not do this:

. log using 090904

If you do, Stata will automatically open a log file called 090904.smcl. The file type ".smcl" is Stata's own markup format, which displays properly only in Stata, so it is not very useful if you want to look at your results in another program.
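
The sketch below shows one way a logging session might be opened and closed. The replace option and the log close and translate commands are not covered above, and the file name 090904_converted.log is hypothetical. The last command is only relevant if a .smcl log already exists.

```stata
* Start a plain-text log; replace overwrites an existing file of that name
log using 090904.log, replace

* ... run your commands here ...
display "this line and its output go into the log"

* Stop logging and close the file
log close

* If a .smcl log was created by accident, convert it to plain text
translate 090904.smcl 090904_converted.log
```

Another way to get a plain-text log without typing the extension is the text option, as in log using 090904, text.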

Stata's Built-in Calculator: display

There is a Stata command called display (abbreviated di) that, when used from the command line, acts as a built-in calculator:

. di (2+2)*5
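
A few more illustrative calculations follow. The expressions are chosen only as examples, and the last one reuses the file size from the memory section to show text and a number displayed together.

```stata
* Arithmetic with parentheses: displays 20
display (2+2)*5
* Built-in functions work too: displays 4
display sqrt(16)
* Exponentiation uses ^: displays 1024
display 2^10
* Text in double quotes can be combined with a calculation
display "File size in MB: " 11576600/1000000
```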
