Princeton University Library Data and Statistical Services

Data Preparation for Event Studies using Stata

Preparing your own data

You may have downloaded datasets for an event study, or created them by entering data into Excel sheets. Typically there are two files: one with stock returns and one with the events of interest. In this example, we start with two datasets, one called eventdates and the other called stockdata. The eventdates data contain the company id (company_id) and the date of the event (event_date). The stockdata data contain the matching company id (company_id), the stock return date (date), the stock return (ret), and the market return (market_return).
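To make the layout concrete, here is a minimal sketch that builds a toy eventdates dataset by hand. The company ids and dates are made-up values for illustration only, and event_str is a hypothetical helper variable; only company_id and event_date come from the description above.

```stata
* Toy eventdates layout (made-up values): one row per company and event.
clear
input company_id str9 event_str
10001 "15mar2007"
10001 "20sep2007"
10002 "02may2007"
end
* Convert the text dates to Stata daily dates and display them readably.
gen event_date = date(event_str, "DMY")
format event_date %td
drop event_str
list
```

In this toy data, company 10001 has two events and company 10002 has one, which is the situation the rest of this page is designed to handle. A stockdata file would follow the same pattern, with one row per company and trading date.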

If each company's observations can be matched to a single event date, the study will be much simpler. In some situations, however, you may wish to examine more than one event date per company. When a company has multiple events, you need to create a duplicate set of stock observations for each event date/company combination, because each event requires a full set of stock observations. If you are not sure how many events per company your dataset contains, we recommend going through this exercise to check.

If you already know that you have only one event per company, you may skip the instructions below, merge the eventdates and stockdata data files, and go to the Event Study with Stata page.
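If you are unsure whether any company has more than one event, a quick check along the following lines may help. This is only a sketch: it assumes eventdates.dta is in the current working directory, and n_events and first are hypothetical helper variable names.

```stata
* Count the events recorded for each company.
use eventdates, clear
bysort company_id: gen n_events = _N
* Flag exactly one row per company so each company is counted once.
egen first = tag(company_id)
* Number of companies with 1, 2, 3, ... events.
tabulate n_events if first
```

If every company falls in the row for n_events equal to 1, there is only one event per company and the duplication steps are not needed.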

Using example data

Alternatively, you may try the commands in our event studies example using our sample datasets. There are two: eventdates, which contains the event information, and stockdata, which contains the stock returns. The events in this example are merger announcement dates for 2007 obtained from SDC Platinum. The stock return data for 2007 were obtained from CRSP daily stock (the sample datasets are available only to Princeton University users).

The computer on which you are running Stata needs to be connected to the internet for this download to work. Please be patient with the download; a counter at the bottom of the Stata window shows what percentage of the file has been downloaded. Once the download finishes, save the data on your computer or in a location where you have write permission, such as a thesis folder on your H drive. Make sure the drive has enough free space for these data files.

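* set memory is needed only in Stata 11 and earlier; newer versions allocate memory automatically
set memory 200m
* eventdates is about 11k
use http://dss.princeton.edu/sampleData/eventdates.dta, clear
save H:/thesis/eventdates
* stockdata is about 90m
use http://dss.princeton.edu/sampleData/stockdata.dta, clear
save H:/thesis/stockdata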

Combining event and stock data

First, set memory to a size large enough for the operations below. We will be creating some variables and possibly duplicating observations, so the dataset can get VERY BIG. To check how much memory you have allocated, use the query memory command; to check how big your dataset is, use describe.
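The checks mentioned above might look like the following sketch. It assumes stockdata.dta has been saved in the current working directory; the exact output differs between Stata versions.

```stata
* Show the current memory settings.
query memory
* Summarize the dataset in memory: number of observations and variables.
describe, short
* Inspect a saved file without loading it into memory.
describe using stockdata, short
```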

Now, we need to find out how many event dates there are for each company. Use the dataset of event dates and generate a variable that counts the number of event dates per company.

use eventdates, clear
* bysort sorts by company_id first; by alone fails on unsorted data
bysort company_id: gen eventcount = _N

Cut the dataset down to just one observation for each company. Each company observation is associated with the count of event dates for that company. Save this as a new dataset - don't overwrite your dataset of event dates!

by company_id: keep if _n==1
sort company_id
keep company_id eventcount 
save eventcount

The next step is to merge the new 'eventcount' dataset with your dataset of stock data.

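use stockdata, clear
sort company_id
* old-style merge syntax; in Stata 11 and later the equivalent is: merge m:1 company_id using eventcount
merge company_id using eventcount
tab _merge
* keep only stock observations whose company also appears in eventcount
keep if _merge==3
drop _merge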

Now use Stata's 'expand' command to create the duplicate observations. The 'eventcount' variable has been merged onto each stock observation and tells Stata how many copies of that observation are needed. This is where your dataset can get VERY BIG, because each stock observation ends up appearing once for every event its company has.

expand eventcount		 
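To see what expand does, here is a small self-contained sketch on made-up values (not the sample data). Company 1 has two events and company 2 has one, so the three original rows become five.

```stata
* Toy illustration with made-up values.
clear
input company_id date ret eventcount
1 1 0.01 2
1 2 0.02 2
2 1 -0.01 1
end
* Each row is replaced by eventcount copies of itself.
expand eventcount
sort company_id date
list, sepby(company_id)
```

Each of company 1's rows now appears twice, while company 2's row is unchanged. The set variable created in the next step tells the two copies apart.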

You need to create a variable that indicates which 'set' of observations within the company each observation belongs to. Then sort the dataset to prepare for another merge.

drop eventcount
sort company_id date
by company_id date: gen set=_n
sort company_id set
save stockdata2

Now return to your original event dates dataset (not the 'eventcount' one!). You need to create a matching set variable to identify the different event dates within each company. The final step is to use the set variable to match each event date with a set of stock observations.

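use eventdates, clear
* number each company's events in date order; bysort also handles the sort that by requires
bysort company_id (event_date): gen set = _n
sort company_id set
save eventdates2
use stockdata2, clear
* old-style merge syntax; in Stata 11 and later: merge m:1 company_id set using eventdates2
merge company_id set using eventdates2
tab _merge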

Here, you may find observations that have event information but no stock information (_merge==2). You can list the companies whose stock information is missing.

		  
list company_id if _merge==2
keep if _merge==3
drop _merge

Finally, create a new variable that groups company_id and set so that you have a unique identifier to use in the rest of your analysis.

egen group_id = group(company_id set)
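Before moving on, it may be worth confirming that the combined data have the expected shape. The sketch below assumes the merged dataset is still in memory and that date has no missing values; first_obs is a hypothetical helper variable.

```stata
* Stops with an error if any group_id has more than one row for the same date.
isid group_id date
* Count the distinct company-event groups.
egen first_obs = tag(group_id)
count if first_obs
drop first_obs
```

The count should equal the number of event dates that were matched to stock data.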

During the rest of your analysis, use group_id wherever the event study instructions say company_id. You're now ready to return to the Event Study with Stata page.

This page was last updated on May 30, 2008