As an undergraduate economics student, I learned basic regression syntax in Stata in my econometrics courses, but I was never taught the tools I would need to perform research with Stata. When I got to graduate school, I learned that I was not alone. A bunch of would-be economists/researchers are “learning Stata” and being deprived of simple tools that would multiply their research efficiency and employability by orders of magnitude.

Over the coming years, I intend to build up this section of my website with brief, intuitive guides on how to do everything I wish I had learned after my first undergraduate econometrics course. If there is something you either wish you could learn or that you know now and wish you had learned sooner, I would like to hear from you. The unlinked titles in the page map below are guides I plan to write, in the order I plan to write them. If you don’t see your idea below, DM me on Twitter or send me an email at edgel@wisc.edu.

Page Map:

Basic Coding in Stata

Do Files

Do files are text documents that run a series of Stata commands. They can be thought of as a queue of commands that run one by one through Stata's command prompt, as if you were copying and pasting them. The button near the top-left of the Stata window opens the do file editor:

[Figure: screenshot of the do file editor button in the Stata window]

Within the do file, lines are read sequentially. Each line is either empty, contains a command, or contains a comment.

Comments

Comments can be used to document the intent of the code. Beginning a line with * or // designates the whole line as a comment, and placing // after a command (separated by a space) designates everything that follows on that line as a comment. You can also designate blocks of text as comments by enclosing them in /* and */:

[Figure: screenshot of commented text in the do file editor]

All commented text is displayed in green in the do file editor.
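As a minimal sketch of the three comment styles described above, the short do file below uses the auto example data set that ships with Stata (my choice for illustration, not part of the original guide):

```stata
* A whole-line comment: Stata skips this line entirely
// Another whole-line comment

sysuse auto, clear   // an end-of-line comment after a command

/* A block comment can span
   several lines. */
summarize price mpg
```

Only the sysuse and summarize commands are executed; everything else is ignored when the do file runs.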

Importing and Exporting Data

Excel and comma-delimited files can be read directly into Stata. It is good practice to keep the import command in your do file (even if you comment it out after the first import, as you may want to do for large files), because you may want to re-import the original data after making an irreversible mistake, like deleting or replacing a variable.

All Excel files (.xls or .xlsx) can be imported with import excel using myfilename.xlsx.[1] There are a number of options you may need, depending on the data you're importing. For example, if the first row of the Excel file “mydata.xlsx” contains variable names, you would use the command import excel using mydata.xlsx, firstrow. All possible options for the import excel command, along with explanations of what they do, are available by entering help import excel in the Stata command prompt.

Comma-delimited files (*.csv) are imported with import delimited using myfilename.csv. Though this command is nearly identical to the import excel command, it has different options, even for the same actions. For example, importing “mydata.csv” and treating the first line as variable names would require the command import delimited using mydata.csv, varnames(1).

Exporting to Excel or comma-delimited files works essentially the same way that importing does, but with the export excel and export delimited commands. You can use Stata's help command to see details about the options available for these commands.
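The import and export commands can be tried end to end with a round trip. This is a sketch under assumptions: it uses the auto example data set that ships with Stata, and the file names auto_copy.xlsx and auto_copy.csv are hypothetical names written to whatever folder Stata is currently working in.

```stata
* Load the example data set that ships with Stata
sysuse auto, clear

* Export to Excel, writing variable names in the first row
export excel using auto_copy.xlsx, firstrow(variables) replace

* Export to a comma-delimited file (variable names are written by default)
export delimited using auto_copy.csv, replace

* Re-import the Excel file, reading the first row as variable names
import excel using auto_copy.xlsx, firstrow clear

* Re-import the CSV file, reading line 1 as variable names
import delimited using auto_copy.csv, varnames(1) clear
```

The replace option lets the export commands overwrite an existing file, and the clear option lets the import commands discard the data already in memory.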

Directories

Of course, if you have a file named “mydata.xlsx” saved in your documents (or anywhere else), then typing import excel mydata.xlsx won't work unless Stata's working directory happens to be that folder. You can do one of two things to import this file. The first is to include the full file path in the import command. For a Windows computer, this would mean something like import excel "C:\Users\edgel\Documents\mydata.xlsx".[2]

A cleaner method is to set your working directory earlier in the do file. Setting the directory to a specific folder on your computer essentially tells Stata “everything I import or export goes to or comes from this folder”. You do this with the cd command (which stands for 'change directory'). So in the above example, I would instead enter the following lines in my code:
cd "C:\Users\edgel\Documents"
import excel mydata.xlsx

Saving/Opening Stata Data Sets

An alternative to importing and exporting Excel or delimited files each time you use Stata is saving your data set in Stata's own file type, the .dta file. To save whatever data you have loaded as a .dta file, simply use the command save mydata. You can open this file with use mydata. If the data set you have loaded has unsaved changes when you run the use or import command, Stata will throw an error rather than overwrite the data in memory. You can prevent this error by either clearing the workspace with the clear command, or by adding the clear option to your import or use command.
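A short sketch of the save and use workflow follows, again assuming the auto example data set that ships with Stata and a hypothetical file name, auto_copy. The replace option on save is an addition not mentioned above: it allows save to overwrite a .dta file that already exists.

```stata
* Load the example data set that ships with Stata
sysuse auto, clear

* Save it as auto_copy.dta in the working directory, overwriting any existing copy
save auto_copy, replace

* Change the data in memory so that it differs from the saved file
drop mpg

* Reload the saved file; the clear option discards the unsaved change
use auto_copy, clear

* mpg is back, because the saved copy was never modified
describe mpg
```

Running use auto_copy without the clear option at that point would stop with an error instead of discarding the change.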

[1] Some Excel files have a .xls extension. Make sure you know the extension of your Excel file before typing out the import command.

[2] If the file path or file name includes spaces, you must put it in quotation marks. If there are no spaces, quotation marks are not necessary.

# Other Resources

LJ Ristovska’s language-agnostic guide to coding for economists is especially helpful for establishing and following good coding and project management practices

This extensive introduction to Stata by the Harvard Institute for Quantitative Social Science