Here is a sample of code where a function is run repeatedly with new information for most of the input variables except one, good_ens. The input variable good_ens, which should never be changed, gets changed. What is going on here? This defies my understanding of scope.
def doFile(infileName, outfileName, goodens, timetype, flen):
    print('infilename = %s' % infileName)
    print('outfilename = %s' % outfileName)
    print('goodens at input are from %d to %d' % (goodens[0],goodens[1]))
    print('timetype is %s' % timetype)
    maxens = flen # fake file length
    print('%s time variable has %d ensembles' % (infileName,maxens))
    # TODO - goodens[1] has the file size from the previous file run when multiple files are processed!
    if goodens[1] < 0:
        goodens[1] = maxens
    print('goodens adjusted for input file length are from %d to %d' % (goodens[0],goodens[1]))
    nens = goodens[1]-goodens[0]
    print('creating new netCDF file %s with %d records (should match input file)' % (outfileName, nens))
# user settings
datapath = ""
datafiles = ['file0.nc',
             'file1.nc',
             'file2.nc',
             'file3.nc']
# fake file lengths for this demonstration
datalengths = [357056, 357086, 357060, 199866]
outfileroot = 'outfile'
attFile = datapath + 'attfile.txt'
# this gets changed! It should never be changed!
# ask for all ensembles in the file
good_ens = [0,-1]
# -------------- beyond here the user should not need to change things
for filenum in range(len(datafiles)):
    print('\n--------------\n')
    print('Input Parameters before function call')
    print(good_ens)
    inputFile = datapath + datafiles[filenum]
    print(inputFile)
    l = datalengths[filenum]
    print(l)
    outputFile = datapath + ('%s%03d.cdf' % (outfileroot,filenum))
    print(outputFile)
    print('Converting from %s to %s' % (inputFile,outputFile))
    # the variable good_ens gets changed by this calling function, and should not be
    doFile(inputFile, outputFile, good_ens, 'CF', l)
    # this works, but will not work for me in using this function
    #doNortekRawFile(inputFile, outputFile, [0,-1], 'CF', l)
Output for the first two iterations of the for loop is below. Note that good_ens gets changed from [0, -1] to the value of goodens inside the function. Why? Never mind the difference in variable names; they don't even share the same scope.
--------------
Input Parameters before function call
[0, -1]
file0.nc
357056
outfile000.cdf
Converting from file0.nc to outfile000.cdf
infilename = file0.nc
outfilename = outfile000.cdf
goodens at input are from 0 to -1
timetype is CF
file0.nc time variable has 357056 ensembles
goodens adjusted for input file length are from 0 to 357056
creating new netCDF file outfile000.cdf with 357056 records (should match input file)
--------------
Input Parameters before function call
[0, 357056]
file1.nc
357086
outfile001.cdf
Converting from file1.nc to outfile001.cdf
infilename = file1.nc
outfilename = outfile001.cdf
goodens at input are from 0 to 357056
timetype is CF
file1.nc time variable has 357086 ensembles
goodens adjusted for input file length are from 0 to 357056
creating new netCDF file outfile001.cdf with 357056 records (should match input file)
--------------
There is a similar question here:
Python issue value of property changes when falling out of loop scope
However, I do not want to bury the variable good_ens inside the for loop. I want its value to be set by the user once at the head of the script, then used in the for loop.
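In Python, lists are mutable, and a function receives a reference to the same list object the caller holds, not a copy. The assignment goodens[1] = maxens inside doFile therefore changes good_ens as well.
If you want the value to be immutable, consider using a tuple.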
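A minimal sketch of why a tuple protects the setting: item assignment on a tuple raises TypeError, so an accidental in-place change fails loudly instead of silently leaking into the next iteration. The helper name clamp_in_place is hypothetical and only mimics the assignment made inside doFile.

```python
def clamp_in_place(goodens, maxens):
    """Hypothetical stand-in for the in-place assignment inside doFile."""
    if goodens[1] < 0:
        goodens[1] = maxens  # mutates the caller's object if it is a list
    return goodens


good_ens_list = [0, -1]
clamp_in_place(good_ens_list, 357056)
print(good_ens_list)  # the caller's list has been changed

good_ens_tuple = (0, -1)
tuple_error = None
try:
    clamp_in_place(good_ens_tuple, 357056)
except TypeError as err:
    tuple_error = err
    print('tuple refused the assignment: %s' % err)
print(good_ens_tuple)  # still (0, -1)
```

With a tuple, doFile as written would raise on the first file, which points directly at the line that needs to build a new value rather than assign into goodens.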
The other answers cover the idea that lists are mutable. Below is a possible refactoring that gets around this issue in what I think is a sensible way.
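One way such a refactoring might look, as a sketch rather than a definitive version: read the requested range into local names and compute the effective end without assigning into the list. The local names first_ens and last_ens, and the return value, are assumptions added for illustration; the other print calls from the question are omitted for brevity.

```python
def doFile(infileName, outfileName, goodens, timetype, flen):
    maxens = flen  # fake file length, as in the question
    # Copy the requested range into local names; goodens itself is only read.
    first_ens, last_ens = goodens
    if last_ens < 0:
        last_ens = maxens  # rebinds a local name; the caller's list is untouched
    nens = last_ens - first_ens
    print('creating new netCDF file %s with %d records (should match input file)'
          % (outfileName, nens))
    return nens  # returned only so the result is easy to check (an assumption)


good_ens = [0, -1]
lengths = [357056, 357086]
counts = [doFile('file%d.nc' % i, 'outfile%03d.cdf' % i, good_ens, 'CF', n)
          for i, n in enumerate(lengths)]
print(good_ens)  # [0, -1]
print(counts)    # [357056, 357086]
```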
This way, you can still call the function as you have been, your variables within the function are named more aptly, and you never mutate the provided list object.
When you call doFile, try this instead:
doFile(inputFile, outputFile, list(good_ens), 'CF', l)
I think of it this way: a list is a thing that points to each of its elements. When you pass a list into a function, only the reference to the list is handed over; the list itself and its elements are not copied, so the function works on the same object as the caller.
Calling list(good_ens) builds a new list object holding the same elements. Assigning to an index of that new list leaves the original untouched. The copy is shallow, which is enough here because the elements are integers.
EDIT: The reason for this is that, as the other answers have indicated, list is a mutable data type in Python. Mutable objects can be changed in place, whereas immutable objects cannot; operations that appear to update them return new objects instead.
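To make the difference concrete, here is a small sketch. The function name set_end is hypothetical. It shows that the parameter is the same object as the caller's list, that list(...) produces a distinct list, and that the copy is shallow, so nested mutable elements would still be shared.

```python
def set_end(goodens, maxens):
    """Hypothetical helper: assigns into the list it receives and returns it."""
    goodens[1] = maxens
    return goodens


# Passing the list directly: the function works on the caller's object.
original = [0, -1]
returned = set_end(original, 100)
shares_object = returned is original
print(shares_object)  # True
print(original)       # [0, 100]

# Passing a copy: the caller's list is left alone.
protected = [0, -1]
copied = set_end(list(protected), 100)
copy_is_distinct = copied is not protected
print(copy_is_distinct)  # True
print(protected)         # [0, -1]

# The copy is shallow: a nested list is shared between original and copy.
nested = [[0, -1]]
shallow = list(nested)
shallow[0][1] = 100
print(nested)  # [[0, 100]]
```

If the setting ever held nested lists, copy.deepcopy from the standard library would be the safer choice; for a flat list of integers, list(good_ens) is sufficient.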