Writing Stata ado files




















Also, since foreach has to construct the whole list of numbers before it can start, it can only handle relatively small lists. It's quicker to type too.
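The comparison above can be sketched as follows. This is a minimal illustration that assumes the faster alternative being discussed is forvalues; the range 1/5 is arbitrary.

```stata
* Loop over the numbers 1 to 5 with forvalues (no list is built up front)
forvalues i = 1/5 {
    display `i'
}

* The same loop with foreach, which first expands the whole numlist
foreach i of numlist 1/5 {
    display `i'
}
```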

This section may not be a programming topic, but it is a tool we'll use in our final example, and it's good to know anyway. Many Stata commands store values in an internal array you can access once you know it's there. Estimation commands create an array called e, and you can see what's in it by typing ereturn list. Almost all other commands that return results put them in an array called r, and you can see what's in r by typing return list. The manuals also describe what each command returns.
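A short sketch of inspecting both arrays, assuming the auto data set that ships with Stata; the choice of variables is only for illustration.

```stata
sysuse auto, clear

* summarize is an ordinary command, so its results go in r()
summarize weight
return list

* regress is an estimation command, so its results go in e()
regress price weight
ereturn list
```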

The only trick is that every command that uses the e or r arrays overwrites the previous values. So if you want to do anything with the results of a command, you must do it before you issue another command that returns values. One option is to save the results in a variable or in local macros for later use. For example, if you want to demean weight (subtract the mean from all observations), all you have to do is run summarize on weight and then replace weight with weight minus r(mean).
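A minimal sketch of both ideas, assuming the auto data set; the macro name wmean is made up for this example.

```stata
sysuse auto, clear

* First summarize fills r(), including r(mean)
summarize weight

* Save the mean in a local macro so it survives later commands
local wmean = r(mean)

* Demean weight using the result while it is still in r()
replace weight = weight - r(mean)

* Second summarize overwrites r() with results for the demeaned variable
summarize weight
display "Original mean was " `wmean'
```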

Note that there are issues with numerical precision, but you've accomplished your purpose. Keep in mind that you have also replaced the old values of the r array with a new set of values referring to the second time you ran summarize. Good thing you were done with the old results. Let's put together everything you've learned by writing a program that demeans data.

This is a simple enough task that a program isn't really needed, but we'll go a step further and make it both flexible and error-resistant. In other words, we'll put a lot more effort into it than it's worth (except as a learning experience, of course).

We'll start with the simplest possible version (which is generally a good idea when programming). It will take one argument, a variable name, and demean that variable. Try it out and see how it does (just reload the auto data set if you start running out of variables with non-zero means).
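One way the simplest version might look is sketched below. Reading the argument with args and naming it var are assumptions of this sketch, not necessarily how the original program was written.

```stata
capture program drop demean
program define demean
    * Take the single argument and call it var
    args var
    * Compute the mean quietly, then subtract it from every observation
    quietly summarize `var'
    replace `var' = `var' - r(mean)
end

sysuse auto, clear
demean weight
```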

That's fine as far as it goes. But suppose you wanted to demean 20 different variables? It's time to add a foreach loop. We could have our foreach loop work with this as a variable list, or even a generic list. But the local form of foreach was created for exactly this kind of situation and will run a bit faster. So the next version wraps the demeaning commands in a foreach loop over a local macro. There's just one problem with your demean program.
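A hedged sketch of the looping version follows. It assumes the list being looped over is the local macro 0, which Stata fills with everything typed after the command name.

```stata
capture program drop demean
program define demean
    * Macro 0 holds everything the user typed after "demean"
    foreach var of local 0 {
        quietly summarize `var'
        replace `var' = `var' - r(mean)
    }
end

sysuse auto, clear
demean weight length
```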

To see it, type demean make. The make variable is a string. It has no mean, and so your program crashes. Now, you may be thinking that anyone who tries to demean a string deserves what's coming to them, but let's fix it anyway, just so you can learn how.

You may not be able to demean a string, but you can give a better error message, and then proceed to demean any other variables that were requested and are valid. You're used to using if at the end of commands. That meant "execute the preceding command for a given observation only if this condition is true for that observation."

When if starts a command instead, its meaning is different. You're going to say "only execute the following commands for ANY observation if this condition is true." It is evaluated just once, not once for each observation.

If the condition includes a variable, the value of that variable for the first observation will be used. It is also possible to combine if with else, so you can make arbitrarily complex sets of conditions.
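A small sketch of if used for program flow, combined with else. The macro names n and size and the cutoffs are invented for this example.

```stata
local n = 3

* Each condition is evaluated once, not once per observation
if `n' > 5 {
    local size "big"
}
else if `n' > 1 {
    local size "medium"
}
else {
    local size "small"
}

display "`n' is `size'"
```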

The syntax puts the condition after if and the commands to run inside curly braces, optionally followed by else branches; we'll do something simpler in our program. The problem with your program is that as soon as Stata sees you try to subtract something from a string variable, it crashes with an error message. So your job is to detect strings before you try to demean them, and only subtract things that can be subtracted. You can do this using the confirm command. It's a bit like assert in that you use it to check on things you believe to be true, but it's designed for programmers.

Thus it allows you to check things like whether a file actually exists or, in this case, whether a variable is numeric and thus has a mean. The syntax is confirm numeric variable followed by the variable name. It will do nothing if the variable is numeric, and cause an error if it is not. But you don't want it to crash the program, so put capture in front of it.

But how will you know the result if you use capture? The return code of the captured command is stored in _rc. A return code of zero means the command was successful. Any other value means something went wrong (different errors give different return codes). So if the return code is zero, you demean the variable. If not, you give an error message but the program continues to run and processes the rest of the variables.
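Putting capture, confirm and _rc together, the error-resistant version might look like the sketch below. The wording of the error message is invented, and the loop over local 0 is the same assumption as before.

```stata
capture program drop demean
program define demean
    foreach var of local 0 {
        * Check quietly whether var is numeric; capture stores the result in _rc
        capture confirm numeric variable `var'
        if _rc == 0 {
            * Numeric: safe to demean
            quietly summarize `var'
            replace `var' = `var' - r(mean)
        }
        else {
            * Not numeric: report it and move on to the next variable
            display as error "`var' is not a numeric variable and cannot be demeaned."
        }
    }
end

sysuse auto, clear
demean make weight
```

Here make triggers the message while weight is still demeaned.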

The file demean. You'll also notice some comments and a great deal of indenting to make the logical structure easy to see. Both practices are highly recommended. You now have a nice little program that could be useful in a variety of settings. But you have to run the code that defines it before you can use it.

What if you could make it act like any other Stata command and run as soon as you type it? You can, by making it an ado (automatic do) file. An ado file is just like a do file that defines a program, but the filename ends with .ado. When you type a command, Stata checks the ado directories to see if there is an ado file with that name.

If there is, Stata automatically runs the ado file that defines the program and then executes it. Thus from the user's perspective, using an ado file is just like using a built-in Stata command. In fact many Stata commands are actually implemented as ado files. In order to create an ado file, you need to isolate the demean program in a separate file and save it as demean.ado in one of the ado directories.

You can identify your personal ado directory by typing sysdir. Once that's done, demean behaves much like any other Stata command. Not quite though: note that we made no provision for standard Stata syntax like by: or if. Doing so isn't actually as hard as you might think, but it is still beyond the scope of this article.
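The installation steps can be checked roughly as follows. This sketch assumes you have already saved demean.ado in the PERSONAL directory that sysdir reports; it will fail at which if the file is not there.

```stata
* List Stata's system directories, including PERSONAL
sysdir

* Assumption: demean.ado has been saved in the PERSONAL directory
* which reports where Stata found the ado file
which demean

* The program can now be used without defining it first
sysuse auto, clear
demean weight length
```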

You've now learned a powerful set of tools that can save you a great deal of time and trouble. At first you may need to consciously look for opportunities to use them.

The file would then just need to be run using run levelslist. However, this command is not very useful at this stage: it outputs far too much useless information, particularly when variables take integer or continuous values with many levels.

The next section will introduce code that allows such commands to be customizable within each context you want to use them. The syntax command takes a program block and allows its inputs to be customized based on the context it is being executed in.

The help file for the syntax command is extensive and allows lots of automated checks and advanced features, particularly for modern features like factor variables and time series (fv and ts). For advanced applications, always consult the syntax help file to see how to accomplish your objective. For now, we will take a simple tour of how syntax creates an adaptive command. First, let's add simple syntax allowing the user to select the variables and observations they want to include.

We might write a program whose first line is a syntax command accepting anything followed by an optional [if]. There are several key features to note here. First, we write anything in the syntax command to allow the user to write absolutely anything they like as the arguments to be passed into the program.

Recall that local macros in Stata have strictly local scope; in this case, that means locals from the calling do-file will not be passed into the program, and locals from the program will not be passed back into the calling do-file.

Second, we write [if] in brackets to declare that the user can optionally specify an if-restriction to the command. Stata provides the shortcut marksample to implement this restriction. Then, the if-restriction must be applied: we can preserve the data and then drop the ineligible observations before running more code. This is an appropriate choice here for several reasons: preserve will always restore the data to the original state at the end of program execution, no matter what happens later in the program, due to its scope; restore is not even needed here.
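The pieces described above (syntax with anything and [if], marksample, and preserve) might fit together as in this sketch. The body of levelslist shown here, which prints the distinct values of each variable with levelsof, is an assumption made for illustration, as is the use of the auto data set.

```stata
capture program drop levelslist
program define levelslist
    * Accept anything as arguments, plus an optional if-restriction
    syntax anything [if]
    * Create a temporary indicator equal to 1 for eligible observations
    marksample touse
    * Keep only eligible observations; the data are restored automatically
    * when the program ends because preserve is scoped to the program
    preserve
    quietly keep if `touse'
    foreach var of varlist `anything' {
        quietly levelsof `var', local(levels)
        display as text "`var': " as result `"`levels'"'
    }
end

sysuse auto, clear
levelslist rep78 foreign if price > 5000
```

After the call, all observations are back in memory even though the program dropped some of them.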

For this reason, we will often only use preserve in this context in programming, and prefer other methods for loading and re-loading data inside the program block. Other syntax elements work similarly, although they are not parsed through marksample except in. See the help file for details.

There are two kinds of files that are used in Stata programming, do-files and ado-files.

Do-files are run from the command line using the do command followed by the file name. Ado-files, on the other hand, work like ordinary Stata commands: you just type the file name, without the extension, on the command line. In fact, many of the built-in Stata commands, such as ttest, are just ado-files.

You can look at the source code for the ado commands using the viewsource command followed by the ado-file name. Do-files can be placed in the same folder as the data, but ado-files need to go where Stata can find them, such as the personal ado directory. The location of this directory can vary from system to system. We will create a do-file, hsbcheck.
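The difference between the two kinds of files can be sketched as below. The ttest call on the auto data set is an illustrative assumption, and the do command is left as a comment because it needs hsbcheck.do to exist in the working directory.

```stata
* A do-file is run explicitly by name:
*     do hsbcheck

* Show the directories Stata searches for ado-files
adopath

* Look at the source code of an ado-file
viewsource ttest.ado

* An ado-file runs when you type its name, like any other command
sysuse auto, clear
ttest mpg, by(foreign)
```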

Macro variables have many uses including as variable names or numeric values. We will see additional uses of macro variables in other programs. Now that we know what errors there are in the data we can write a do-file that will fix the errors.
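As a brief sketch of those two uses, the example below stores a variable name in one macro and a numeric value in another. The macro names v and cutoff and the auto data set are assumptions for illustration.

```stata
sysuse auto, clear

* A macro holding a variable name
local v mpg

* A macro holding a numeric value
local cutoff = 25

* Both are substituted into the command before it runs
count if `v' > `cutoff'
```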

When we know the correct value of an observation, we will replace the incorrect value with the correct one. When we do not know the correct value for an observation, we will replace the incorrect value with missing. The do-file hsbfix. Here is what hsbfix. One important thing to note is that after we fix the incorrect values, we will save the data file with a new name.
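The fixing strategy described here can be sketched on a tiny made-up data set. The variables id, gender and score, their values, and the file name scores_clean are all hypothetical and are not taken from hsberr.

```stata
* Hypothetical data with two errors: an invalid gender code and an impossible score
clear
input id str1 gender score
1 "f" 52
2 "x" 47
3 "m" 470
end

* Correct value is known: replace the incorrect value with it
replace gender = "m" if id == 2

* Correct value is unknown: replace the incorrect value with missing
replace score = . if id == 3

* Save under a new name so the original file is never changed
save scores_clean, replace
```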

We will never change any of the values in the original data file, hsberr. First, we will run hsbfix on the original file hsberr; then, as a check, we will run hsbcheck on the new file hsbclean. Next, we will create a do-file that contains all of the commands that we need to run our data analysis. This do-file will be called hsbanalyze.

Return lists are one of the most powerful and useful features of Stata. There are three commonly used return lists: (1) return list for ordinary nonestimation commands; (2) ereturn list for full estimation commands; and (3) creturn list for a list of constants and system parameters.

You can access any of the creturn values using c(name), replacing name with the name of the value you want. The return list (abbr: ret lis) is used following ordinary commands such as summarize. We can use this information to compute a statistic, such as the coefficient of variation, that Stata does not provide. The ereturn list (abbr: eret lis) is used following estimation commands, such as regress, anova, logit, sem, etc.
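A short sketch of return list and c() in use, computing the coefficient of variation from the results of summarize. The auto data set and the variable mpg are assumptions standing in for the tutorial's own data.

```stata
sysuse auto, clear

* summarize leaves its results in r()
summarize mpg
return list

* Coefficient of variation: standard deviation divided by the mean
local cv = r(sd) / r(mean)
display "Coefficient of variation of mpg: " `cv'

* System values are available through c(name)
display "Observations in memory: " c(N)
display "Current date: " c(current_date)
```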

After regress, the matrix e(b) contains the parameter estimates while e(V) has the covariance matrix of the parameter estimates. For example, if you wanted the predicted score for a female with a reading score of 60, you could compute it directly from the stored estimates. Macro variables are a good way to store values for later use.
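The same idea can be sketched with the auto data set, which is an assumption here since the tutorial's own data are not shown: price, weight and foreign stand in for the score, reading score and female variables. The prediction is built from the coefficients with _b[] and stored in a local macro.

```stata
sysuse auto, clear
regress price weight foreign

* Everything the estimation command stored
ereturn list

* Parameter estimates and their covariance matrix
matrix list e(b)
matrix list e(V)

* Predicted price for a foreign car weighing 3000 lbs, saved in a macro
local pred = _b[_cons] + _b[weight]*3000 + _b[foreign]*1
display "Predicted price: " `pred'
```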


