I am trying to use Python for statistical analysis.
In Stata I can define local macros and expand them as necessary:
program define reg2
    syntax varlist(min=1 max=1), indepvars(string) results(string)
    if "`results'" == "y" {
        reg `varlist' `indepvars'
    }
    if "`results'" == "n" {
        qui reg `varlist' `indepvars'
    }
end
sysuse auto, clear
So instead of:
reg2 mpg, indepvars("weight foreign price") results("y")
I could do:
local options , indepvars(weight foreign price) results(y)
reg2 mpg `options'
Or even:
local vars weight foreign price
local options , indepvars(`vars') results(y)
reg2 mpg `options'
Macros in Stata help me write clean scripts, without repeating code.
In Python I tried string interpolation but this does not work in functions.
For example:
def reg2(depvar, indepvars, results):
    print(depvar)
    print(indepvars)
    print(results)
The following runs fine:
reg2('mpg', 'weight foreign price', 'y')
However, both of these fail:
regargs = 'mpg', 'weight foreign price', 'y'
reg2(regargs)
regargs = 'depvar=mpg, covariates=weight foreign price, results=y'
reg2(regargs)
I found a similar question but it doesn't answer my question:
There is also another question about this for R:
However, I could not find anything for Python specifically.
I was wondering if there is anything in Python that is similar to Stata's macros?
It looks like you just want the * and ** operators for calling functions:
Use * to expand a list or tuple into positional arguments, or use ** to expand a dictionary into keyword arguments to a function that requires them.
For your keyword example, you need to change the declaration a little bit:
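A minimal sketch of both forms, reusing the toy reg2 from the question. Renaming the second parameter to covariates is an assumption made so that the parameter names match the keys used in the keyword attempt, and the return statement is added only to make the result easy to check.

```python
def reg2(depvar, covariates, results):
    # Same toy function as in the question; the parameter name
    # `covariates` and the return value are assumptions for illustration.
    print(depvar)
    print(covariates)
    print(results)
    return depvar, covariates, results

# Positional form: * unpacks a tuple (or list) into separate arguments.
regargs = ('mpg', 'weight foreign price', 'y')
positional = reg2(*regargs)

# Keyword form: ** unpacks a dict whose keys match the parameter names.
regkwargs = {'depvar': 'mpg', 'covariates': 'weight foreign price', 'results': 'y'}
keyword = reg2(**regkwargs)
```

Passing the tuple or dictionary without the operator hands reg2 a single object, which is why the calls in the question fail with a missing-argument error.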
Do it the Pythonic way.
The pervasive use of macros in Stata reflects a different programming philosophy. Unlike Python, which is an object-oriented general purpose programming language, Stata's ado language (not mata) requires macros in order to function as something more than a simple scripting language.
Macros can be used almost anywhere in Stata (even in macro definitions) for two purposes:
Using macros, the user can simplify their code, which in turn will reduce the potential for errors and keep it tidy. The disadvantage is that the use of macros renders the syntax of the language fluid.
To answer your question, Pyexpander provides some of this kind of functionality in Python but it is not really a substitute. For different use cases you will need a different approach to mimic macro expansion. In contrast with Stata, there is no uniform way of doing this everywhere.
My advice is to familiarize yourself with Python's conventions rather than trying to program things the "Stata way". For example, it is useful to remember that local and global macros in Stata correspond to variables in Python (local in a function, global outside), while variables in Stata correspond to Pandas.Series or a column of a Pandas.DataFrame. Similarly, Stata ado programs correspond to functions in Python.
The solution provided in @g.d.d.c's answer can be a good tool towards achieving what someone would like. However, extra steps are required here if you want to re-use your code.
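To make the macro-to-variable correspondence concrete, here is a rough sketch of the question's nested locals written with ordinary Python variables. The reg2 below is a hypothetical stand-in that only describes the call; it does not run a regression.

```python
# Stata: local vars weight foreign price
indepvars = ['weight', 'foreign', 'price']

# Stata: local options , indepvars(`vars') results(y)
options = {'indepvars': indepvars, 'results': 'y'}

def reg2(depvar, indepvars, results):
    # Hypothetical stand-in for an ado program: it just describes the call.
    return f"reg {depvar} {' '.join(indepvars)} (results={results})"

# Stata: reg2 mpg `options'
summary = reg2('mpg', **options)
print(summary)
```

The nesting that Stata achieves by expanding one macro inside another is done here by placing one variable inside another object.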
Using your toy example:
Let's assume you want to re-use the following snippet of code but with different variables:
How could you possibly do that?
First, create a function:
However, note that although string interpolation can 'expand' strings, this approach will not work here because the target function for regression analysis does not accept a single string such as 'weight, price, cons'.
Instead, you need to define a list with the regressors:
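A minimal sketch of such a function, assuming the regressors are later handed to a formula-based interface (statsmodels' formula API, for instance, accepts strings of the form "y ~ x1 + x2"). The actual fitting call is deliberately left out, and this reg2 signature is an assumption rather than a reconstruction of any particular implementation.

```python
def reg2(depvar, indepvars, results=True):
    """Hypothetical sketch: build a model formula from a list of regressors."""
    # Accept a Stata-style space-separated string as a convenience.
    if isinstance(indepvars, str):
        indepvars = indepvars.split()
    formula = f"{depvar} ~ {' + '.join(indepvars)}"
    if results:
        print(formula)
    return formula

# Re-use the same function with different variables.
regressors = ['weight', 'foreign', 'price']
formula = reg2('mpg', regressors)
same = reg2('mpg', 'weight foreign price', results=False)
other = reg2('price', ['mpg', 'weight'], results=False)
```

Because the regressors live in a list, swapping variables means changing the list, not editing the function body.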
You can also take this concept to the next level by constructing a decorator:
And use this in your reg2() function:
The example is perhaps very simplistic but demonstrates the power of Python:
As you can see, the decorator abstracts things further while keeping the syntax fixed.
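The decorator idea can be sketched as follows. The name load_and_reg2 comes from this answer, but its signature, the load_auto helper, and the toy rows are assumptions made for illustration only.

```python
import functools

def load_auto():
    """Hypothetical loader standing in for `sysuse auto`; illustrative toy rows."""
    return [
        {'mpg': 22, 'weight': 2930, 'price': 4099},
        {'mpg': 17, 'weight': 3350, 'price': 4749},
    ]

def load_and_reg2(func):
    """Load the data, check the variables, then call the wrapped function."""
    @functools.wraps(func)
    def wrapper(depvar, indepvars, **kwargs):
        data = load_auto()
        missing = [v for v in [depvar, *indepvars] if v not in data[0]]
        if missing:
            raise KeyError(f"variables not in data: {missing}")
        return func(data, depvar, indepvars, **kwargs)
    return wrapper

@load_and_reg2
def reg2(data, depvar, indepvars):
    # Stand-in for the regression step: returns the formula and row count.
    formula = f"{depvar} ~ {' + '.join(indepvars)}"
    return formula, len(data)

result = reg2('mpg', ['weight', 'price'])
print(result)
```

The caller never touches the loading or validation logic; only the variable names change between calls.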
In the Python universe, dictionaries and classes also play important roles in re-using code/results. For example, a dictionary can act as the equivalent of Stata's return space for storing multiple macros, scalars etc.
Consider the slightly altered version of our toy decorator load_and_reg2, which now saves individual objects in a dictionary D and returns it:
You can then easily do:
Classes can introduce further flexibility at the cost of some additional complexity:
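One way to sketch a class-based variant is a callable class used as a decorator, which can keep state between calls. The class name LoadAndReg2, its attributes, and the toy rows are all assumptions for illustration.

```python
import functools

class LoadAndReg2:
    """Hypothetical class-based decorator that remembers its latest results."""

    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func
        self.calls = 0      # number of times the wrapped function has run
        self.last = None    # results of the most recent call

    def __call__(self, depvar, indepvars):
        # Illustrative toy rows; a real version would load a dataset here.
        data = [{'mpg': 22, 'weight': 2930}, {'mpg': 17, 'weight': 3350}]
        self.calls += 1
        self.last = {
            'N': len(data),
            'formula': self.func(data, depvar, indepvars),
        }
        return self.last

@LoadAndReg2
def reg2(data, depvar, indepvars):
    return f"{depvar} ~ {' + '.join(indepvars)}"

first = reg2('mpg', ['weight'])
print(reg2.calls, reg2.last['formula'])
```

The extra complexity buys persistent state: the results of the latest call stay available on the decorated object itself.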
This version of our toy decorator returns: