As an undergraduate economics student, I learned basic regression syntax in Stata in my econometrics courses, but I was never taught the tools I would need to perform research with Stata. When I got to graduate school, I learned that I was not alone. A bunch of would-be economists/researchers are “learning Stata” and being deprived of simple tools that would multiply their research efficiency and employability by orders of magnitude.
Over the coming years, I intend to build up this section of my website with brief, intuitive guides on how to do everything I wish I had learned after my first undergraduate econometrics course. If there is something you either wish you could learn or that you know now and wish you had learned sooner, I would like to hear from you. The unlinked titles in the page map below are guides I plan to write, in the order I plan to write them. If you don’t see your idea below, DM me on Twitter or send me an email at edgel@wisc.edu.
Page Map:
Long and Short Data
Local and Global Macros
Conditional Statements
Looping
For loop
While loop
Graphics
Basic Coding in Stata
Do Files
Do files are text files that contain a series of Stata commands. They can be thought of as a queue of commands that run one by one through Stata's command prompt, as if you were copying and pasting them. The button near the top-left of the Stata window opens the do file editor. Typing doedit in the command prompt opens it as well.
Within the do file, lines are read sequentially. Each line is empty, contains a command, or contains a comment.
Comments
Comments document the intent of the code. In a do file, everything that follows // on a line is treated as a comment, and a line that begins with * is a comment in its entirety. You can also comment out a block of text, including several lines at once, by enclosing it between /* and */.
The do file editor displays all commented text in green.
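As a minimal sketch of the three comment styles described above, the following do file uses the auto data set that ships with Stata. The choice of data set and variables is only for illustration.

```stata
* A full-line comment: the asterisk comes first on the line
sysuse auto, clear      // load a data set that ships with Stata

/*
Everything between the opening and closing markers is ignored,
even when it spans several lines.
*/
summarize price mpg     // a comment that follows a command on the same line
```

Note that // needs at least one space between it and the command it follows.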
Importing and Exporting Data
Excel and comma-delimited files can be read directly into Stata. It is good practice to include the import command in your do file (even if you comment that command out after importing the data, as you may want to do for large files), because you may want to re-import the original data after making an irreversible mistake, like deleting or replacing a variable.
All Excel files (.xls or .xlsx) can be imported with the import excel command: [1]
import excel using myfilename.xlsx
You will need to add options, depending on the data you're importing. For example, if the first row of the Excel file “mydata.xlsx” contains variable names, you would enter:
import excel using mydata.xlsx, firstrow
All possible options for the import excel command, along with explanations of what they do, are available by entering help import excel in the Stata command prompt.
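To try the Excel import without a file of your own, one option is to write a small Excel file from a built-in data set and read it back. This is only a sketch: auto_example.xlsx is a made-up file name, and it is created in whatever folder Stata is currently working in.

```stata
* Create a small Excel file to practice on (auto_example.xlsx is a hypothetical name)
sysuse auto, clear
export excel make price mpg using "auto_example.xlsx", firstrow(variables) replace

* Read it back, treating the first row as variable names.
* The clear option discards the data currently in memory.
import excel using "auto_example.xlsx", firstrow clear
describe
```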
Comma-delimited files (*.csv) are imported with the import delimited command:
import delimited using myfilename.csv
Though this command is nearly identical to the import excel command, it has different options, even for the same actions. For example, importing “mydata.csv” and treating the first line as variable names would require the command:
import delimited using mydata.csv, varnames(1)
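The same round trip works for comma-delimited files. Again, this is a sketch under the assumption that you are happy to write a practice file, here given the made-up name auto_example.csv, to the current folder.

```stata
* Write a practice .csv file (auto_example.csv is a hypothetical name).
* export delimited puts the variable names on the first line by default.
sysuse auto, clear
export delimited make price mpg using "auto_example.csv", replace

* Read it back, taking variable names from line 1
import delimited using "auto_example.csv", varnames(1) clear
list in 1/5
```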
Exporting to Excel or comma-delimited files works essentially the same way that importing does, but with the export excel and export delimited commands. You can use Stata's help command to see details about the options available for these commands.
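A brief sketch of both export commands, using the built-in auto data and made-up output file names. One difference worth knowing: export excel writes only the data unless you ask for a header row with firstrow(variables), whereas export delimited includes the variable names by default.

```stata
sysuse auto, clear

* Export only the foreign cars to Excel, with variable names in the first row.
* replace allows Stata to overwrite the file if it already exists.
export excel using "foreign_cars.xlsx" if foreign == 1, firstrow(variables) replace

* The same subset as a comma-delimited file
export delimited using "foreign_cars.csv" if foreign == 1, replace
```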
Directories
Of course, if you have a file named “mydata.xlsx” saved in your documents (or anywhere else), then typing import excel mydata.xlsx won't work unless that folder happens to be the one Stata is currently working in. You can do one of two things to import this file. The first is to include the full file path in the import command. For a Windows computer, this would mean something like: [2]
import excel "C:\Users\edgel\Documents\mydata.xlsx"
A cleaner method is to set your working directory earlier in the do file. Setting the directory to a specific folder on your computer essentially tells Stata “everything I import or export goes to or comes from this folder”. You do this with the cd command (which stands for 'change directory'). So in the above example, I would instead enter the following lines in my code:
cd "C:\Users\edgel\Documents"
import excel mydata.xlsx
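To see the working directory in action on any machine, the sketch below moves to Stata's temporary folder, saves and reopens a file there without a full path, and then moves back. It relies on local macros (a topic for a later guide), so run it as a whole do file rather than line by line. The file name auto_copy is made up.

```stata
pwd                          // show the current working directory
local start "`c(pwd)'"       // remember it in a local macro
cd "`c(tmpdir)'"             // move to Stata's temporary folder

sysuse auto, clear
save auto_copy, replace      // written to the new working directory
use auto_copy, clear         // found there without a full path
erase auto_copy.dta          // delete the practice file

cd "`start'"                 // return to the original directory
```

Stata also accepts forward slashes in file paths on Windows, which makes the same do file easier to share across operating systems.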
Saving/Opening Stata Data Sets
An alternative to importing and exporting Excel or delimited files each time you use Stata is to save your data set in Stata's native file format, the .dta file. The command save mydata writes whatever data you have loaded to mydata.dta, and the command use mydata opens that file. If mydata.dta already exists, add the replace option to the save command to overwrite it. If the data in memory have unsaved changes when you run use or import, Stata will stop with an error rather than discard them. You can prevent this error either by clearing the workspace with the clear command beforehand or by adding the clear option to your import or use command.
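The behavior described above can be sketched with the built-in auto data. The file name mydata_example is made up so that the example does not overwrite a real file of yours, and capture noisily is used only so the do file keeps running after the expected error.

```stata
sysuse auto, clear
save mydata_example, replace             // replace permits overwriting an existing file

generate price_thousands = price / 1000  // an unsaved change to the data in memory

capture noisily use mydata_example       // refused: unsaved changes would be lost
use mydata_example, clear                // the clear option discards them and loads the file

erase mydata_example.dta                 // delete the practice file
```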
[1] Some Excel files have a .xls extension rather than .xlsx. Make sure you know the extension of your Excel file before typing out the import command.
[2] If the file path or file name includes spaces, you must put it in quotation marks. If there are no spaces, quotation marks are not necessary.
LJ Ristovska’s language-agnostic guide to coding for economists is especially helpful for establishing and following good coding and project management practices
This extensive introduction to Stata by the Harvard Institute for Quantitative Social Science