I am new to Python as I want to expand skills that I learned using R. In R I tend to load a bunch of libraries, sometimes resulting in function name conflicts.
What is best practice in Python? I have seen some specific variations that I do not see a difference between: import pandas, from pandas import *, and from pandas import DataFrame.
What are the differences between the first two, and should I just import what I need? Also, what would be the worst consequences for someone making small programs to process data and compute simple statistics?
UPDATE
I found this excellent guide. It explains everything.
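In general it is better to do explicit imports: either import the module itself (import pandas) and refer to its contents through the module name, or import only the specific names you need (from pandas import DataFrame).
Another option in Python, when you have conflicting names, is import x as y, which binds the imported object to a different local name.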
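A minimal sketch of these explicit forms, using standard-library modules chosen only for illustration (the local mean helper is hypothetical and exists just to create a name clash):

```python
# Explicit import of the module: its names stay behind the module prefix.
import statistics

# Explicit import of a single name from a module.
from collections import Counter

# Aliased import: a different local name avoids a clash with our own `mean`.
from statistics import mean as stats_mean


def mean(values):
    """Local helper that would clash with statistics.mean if imported unaliased."""
    return sum(values) / len(values)


data = [1, 2, 2, 3, 4]

print(statistics.median(data))       # module-qualified call
print(Counter(data).most_common(1))  # directly imported name
print(mean(data), stats_mean(data))  # local function and aliased import coexist
```

Because each name's origin is visible at the import line, a reader can tell at a glance where median, Counter and stats_mean come from.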
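Here are some recommendations from the PEP 8 style guide.
Imports should usually be on separate lines (import os and import sys rather than import os, sys), but it is okay to import several names from one module on a single line, e.g. from subprocess import Popen, PIPE.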
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
Absolute imports are recommended. They are more readable and make debugging easier by giving better error messages if the import system is misconfigured. Explicit relative imports are an acceptable alternative.
Implicit relative imports should never be used and have been removed in Python 3.
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.
Some recommendations about lazy imports can be found in the Python speed performance tips, which explain a scenario where they are useful.
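As a minimal sketch of the lazy-import idea, assuming it means deferring an import until the code that needs it actually runs (the summarize function is hypothetical and not taken from that page):

```python
import sys


def summarize(values):
    """Return the mean and sample standard deviation of a list of numbers."""
    # Lazy import: `statistics` is loaded on the first call, not at start-up.
    # Later calls are cheap because the module is cached in sys.modules.
    import statistics

    return statistics.mean(values), statistics.stdev(values)


mean, stdev = summarize([2, 4, 4, 4, 5, 5, 7, 9])
print(mean, round(stdev, 3))
print("statistics" in sys.modules)  # the module is now loaded and cached
```

This trades the PEP 8 rule about top-of-file imports for a faster start-up, so it is probably only worth it for modules that are slow to load and rarely needed.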
They are all suitable in different contexts (which is why they are all available). There's no deep guiding principle, other than generic motherhood statements around clarity, maintainability and simplicity. Some examples from my own code:
import sys, os, re, itertools avoids name collisions and provides a very succinct way to import a bunch of standard modules.
from math import * lets me write sin(x) instead of math.sin(x) in math-heavy code. This gets a bit dicey when I also import numpy, which doubles up on some of these, but it doesn't overly concern me, since they are generally the same functions anyway. Also, I tend to follow the numpy documentation (import numpy as np), which sidesteps the issue entirely.
from PIL import Image, ImageDraw just because that's the way the PIL documentation presents its examples.

Disadvantage of each form
When reading other people's code (and those people use very different importing styles), I noticed the following problems with each of the styles:
import modulewithaverylongname will clutter the code further down with the long module name (e.g. concurrent.futures or django.contrib.auth.backends) and decrease readability in those places.
from module import * gives me no chance to see syntactically that, for instance, classA and classB come from the same module and have a lot to do with each other. It makes reading the code hard. (That names from such an import may shadow names from an earlier import is the least part of that problem.)
from module import classA, classB, functionC, constantD, functionE overloads my short-term memory with too many names that I mentally need to assign to module in order to coherently understand the code.
import modulewithaverylongname as mwvln is sometimes insufficiently mnemonic to me.

A suitable compromise
Based on the above observations, I have developed the following style in my own code:
1. import module is the preferred style if the module name is short, as is the case for most of the packages in the standard library. It is also the preferred style if I need to use names from the module in only two or three places in my own module; clarity trumps brevity then ("Readability counts").
2. import longername as ln is the preferred style in almost every other case. For instance, I might import django.contrib.auth.backends as dj_abe. By definition of criterion 1 above, the abbreviation will be used frequently and is therefore sufficiently easy to memorize.
Only these two styles are fully pythonic as per the "Explicit is better than implicit." rule.
3. from module import xx still occurs sometimes in my code. I use it in cases where even the as format appears exaggerated, the most famous example being from datetime import datetime.

essentially equals following three statements
That's it, that is it all.
import pandas imports the pandas module under the pandas namespace, so you would need to call objects within pandas using pandas.foo.
from pandas import * imports all objects from the pandas module into your current namespace, so you would call objects within pandas using only foo. Keep in mind this could have unexpected consequences if there are any naming conflicts between your current namespace and the pandas namespace.
from pandas import DataFrame is the same as above, but only imports DataFrame (instead of everything) into your current namespace.
In my opinion the first is generally best practice, as it keeps the different modules nicely compartmentalized in your code.
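The naming-conflict risk of wildcard imports can be sketched with two standard-library modules instead of pandas. The choice of math and cmath is only an illustration: both happen to define sqrt, so the second wildcard import silently replaces the first.

```python
from math import *    # brings in sqrt, pi, sin, ...
from cmath import *   # also defines sqrt, pi, sin, ... and silently replaces them

import math

# `sqrt` now refers to cmath.sqrt, so it returns a complex number.
print(type(sqrt(4)))       # complex, not float
print(type(math.sqrt(4)))  # the module-qualified name is unambiguous
```

No error or warning is raised when the second import overwrites the first, which is why the problem tends to surface later as a confusing result rather than at the import line.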