Question:
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened,
visit the help center for guidance.
Closed 7 years ago.
In the spirit of the existing "what's your most useful C/C++ snippet" thread:
Do you have short, single-purpose Python snippets that you use often and would like to share with the Stack Overflow community? Please keep the entries small (under 25 lines, maybe?) and give only one example per post.
I'll start off with a short snippet I use from time to time to count SLOC (source lines of code) in Python projects:
# prints recursive count of lines of python source code from current directory
# includes an ignore_list. also prints total sloc
import os
cur_path = os.getcwd()
ignore_set = set(["__init__.py", "count_sourcelines.py"])
loclist = []
for pydir, _, pyfiles in os.walk(cur_path):
    for pyfile in pyfiles:
        if pyfile.endswith(".py") and pyfile not in ignore_set:
            totalpath = os.path.join(pydir, pyfile)
            loclist.append( ( len(open(totalpath, "r").read().splitlines()),
                               totalpath.split(cur_path)[1]) )
for linenumbercount, filename in loclist:
    print "%05d lines in %s" % (linenumbercount, filename)
print "\nTotal: %s lines (%s)" %(sum([x[0] for x in loclist]), cur_path)
Answer 1:
Initializing a 2D list
While this can be done safely to initialize a list:
lst = [0] * 3
The same trick won’t work for a 2D list (list of lists):
>>> lst_2d = [[0] * 3] * 3
>>> lst_2d
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
>>> lst_2d[0][0] = 5
>>> lst_2d
[[5, 0, 0], [5, 0, 0], [5, 0, 0]]
The * operator does not copy its operand: it repeats references to the same inner list, so all three rows are one and the same list object. The correct way to do this is:
>>> lst_2d = [[0] * 3 for i in xrange(3)]
>>> lst_2d
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
>>> lst_2d[0][0] = 5
>>> lst_2d
[[5, 0, 0], [0, 0, 0], [0, 0, 0]]
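The aliasing can be checked directly with the is operator. This is a minimal sketch written for Python 3 (range instead of xrange); the names aliased and independent are made up for illustration.

```python
# Build the 2D list both ways.
aliased = [[0] * 3] * 3                      # three references to ONE inner list
independent = [[0] * 3 for _ in range(3)]    # three separate inner lists

# Every row of `aliased` is the very same object as its first row.
rows_are_same_object = all(row is aliased[0] for row in aliased)

# Rows of `independent` are distinct objects, so a write touches one row only.
independent[0][0] = 5
print(rows_are_same_object)   # True
print(independent)            # [[5, 0, 0], [0, 0, 0], [0, 0, 0]]
```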
Answer 2:
I like using any and a generator:
if any(pred(x.item) for x in sequence):
    ...
instead of code written like this:
found = False
for x in sequence:
    if pred(x.item):
        found = True
if found:
    ...
I first learned of this technique from a Peter Norvig article.
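One practical difference is that any stops at the first match, while the loop above keeps scanning to the end. A small Python 3 sketch, assuming a hypothetical predicate is_negative that records which values it was asked about:

```python
checked = []  # records every value the predicate is called with

def is_negative(n):
    """Hypothetical predicate, used only for this illustration."""
    checked.append(n)
    return n < 0

readings = [3, 8, -1, 12]

# The generator is consumed lazily, so any() stops as soon as it sees True.
found = any(is_negative(n) for n in readings)

print(found)     # True
print(checked)   # [3, 8, -1]  (12 was never tested)
```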
Answer 3:
The only 'trick' I know that really wowed me when I learned it is enumerate. It gives you access to the index of each element within a for loop.
>>> l = ['a','b','c','d','e','f']
>>> for (index,value) in enumerate(l):
...     print index, value
...
0 a
1 b
2 c
3 d
4 e
5 f
Answer 4:
zip(*iterable) transposes an iterable.
>>> a=[[1,2,3],[4,5,6]]
>>> zip(*a)
[(1, 4), (2, 5), (3, 6)]
It's also useful with dicts.
>>> d={"a":1,"b":2,"c":3}
>>> zip(*d.iteritems())
[('a', 'c', 'b'), (1, 3, 2)]
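For readers on Python 3, the same idea looks slightly different: zip returns a lazy iterator, and iteritems is gone in favour of items. A short sketch under that assumption (dict order is insertion order from Python 3.7 on):

```python
rows = [[1, 2, 3], [4, 5, 6]]

# zip() is lazy in Python 3, so wrap it in list() to see the transposed rows.
columns = list(zip(*rows))
print(columns)          # [(1, 4), (2, 5), (3, 6)]

d = {"a": 1, "b": 2, "c": 3}

# Split a dict into a tuple of keys and a tuple of values.
keys, values = zip(*d.items())
print(keys, values)     # ('a', 'b', 'c') (1, 2, 3)
```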
Answer 5:
Fire up a simple web server for files in the current directory:
python -m SimpleHTTPServer
Useful for sharing files. In Python 3 the module was renamed, so the equivalent command is:
python -m http.server
Answer 6:
A "progress bar" that looks like:
|#############################---------------------|
 59 percent done
Code:
class ProgressBar():
    def __init__(self, width=50):
        self.pointer = 0
        self.width = width
    def __call__(self, x):
        # x in percent
        self.pointer = int(self.width * (x / 100.0))
        return "|" + "#" * self.pointer + "-" * (self.width - self.pointer) + \
               "|\n %d percent done" % int(x)
Test function (on Windows, change "clear" to "CLS"):
if __name__ == '__main__':
    import time, os
    pb = ProgressBar()
    for i in range(101):
        os.system('clear')
        print pb(i)
        time.sleep(0.1)
Answer 7:
To flatten a list of lists, such as
[['a', 'b'], ['c'], ['d', 'e', 'f']]
into
['a', 'b', 'c', 'd', 'e', 'f']
use
[inner
for outer in the_list
for inner in outer]
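As a runnable check, here is the comprehension next to itertools.chain.from_iterable, a standard-library alternative that is not part of the answer above. Both flatten exactly one level of nesting.

```python
from itertools import chain

the_list = [['a', 'b'], ['c'], ['d', 'e', 'f']]

# Nested comprehension: the outer loop comes first, then the inner loop.
flat_comprehension = [inner for outer in the_list for inner in outer]

# Standard-library alternative: chain the sublists together lazily.
flat_chain = list(chain.from_iterable(the_list))

print(flat_comprehension)   # ['a', 'b', 'c', 'd', 'e', 'f']
```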
Answer 8:
Huge speedup over copy.deepcopy for nested lists and dictionaries, provided every contained object can be pickled:
import cPickle
deepcopy = lambda x: cPickle.loads(cPickle.dumps(x))
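A Python 3 sketch of the same round-trip, where cPickle has been folded into pickle. The helper name pickle_copy is made up here, and no timing claim is made: whether this beats copy.deepcopy depends on the data and the interpreter, so measure before relying on it.

```python
import copy
import pickle

def pickle_copy(obj):
    """Deep-copy obj by serialising and deserialising it (picklable data only)."""
    return pickle.loads(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))

nested = {"a": [1, 2, {"b": (3, 4)}]}
clone = pickle_copy(nested)

# The clone is equal to the original but shares no mutable containers with it.
clone["a"].append(99)
print(nested)   # {'a': [1, 2, {'b': (3, 4)}]}
print(clone)    # {'a': [1, 2, {'b': (3, 4)}, 99]}
```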
Answer 9:
Suppose you have a list of items, and you want a dictionary with these items as the keys. Use fromkeys:
>>> items = ['a', 'b', 'c', 'd']
>>> idict = dict.fromkeys(items, 0)
>>> idict
{'a': 0, 'c': 0, 'b': 0, 'd': 0}
The second argument of fromkeys is the value assigned to every newly created key.
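One caveat worth knowing: fromkeys stores the same value object under every key, which matters when that value is mutable. A small sketch (the variable names are illustrative) contrasting it with a dict comprehension:

```python
items = ['a', 'b', 'c']

# fromkeys evaluates the value once, so all keys share ONE list.
shared = dict.fromkeys(items, [])
shared['a'].append(1)
print(shared)     # {'a': [1], 'b': [1], 'c': [1]}

# A dict comprehension evaluates [] once per key, giving separate lists.
separate = {key: [] for key in items}
separate['a'].append(1)
print(separate)   # {'a': [1], 'b': [], 'c': []}
```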
Answer 10:
To find out whether a line is empty (i.e. it has length 0 or contains only whitespace), use the string method strip in a condition, as follows:
if not line.strip():  # if line is empty
    continue          # skip it
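The same test also works as a filter in a comprehension. A minimal sketch on a made-up sample string:

```python
text = "first\n\n   \t\nsecond\n"

# Keep only the lines that still contain something after stripping whitespace.
non_blank = [line for line in text.splitlines() if line.strip()]

print(non_blank)   # ['first', 'second']
```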
Answer 11:
I like this one to zip everything up in a directory. Hotkey it for instant backups!
import os
import zipfile
z = zipfile.ZipFile('my-archive.zip', 'w', zipfile.ZIP_DEFLATED)
startdir = "/home/johnf"
for dirpath, dirnames, filenames in os.walk(startdir):
    for filename in filenames:
        z.write(os.path.join(dirpath, filename))
z.close()
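A different route to a similar result is shutil.make_archive from the standard library, sketched below for Python 3. This is a swapped-in technique, not the answer's method: member names end up relative to the start directory rather than mirroring the full path. The function name backup_directory and the throwaway temp directories are assumptions made for the demo.

```python
import os
import shutil
import tempfile
import zipfile

def backup_directory(startdir, archive_base):
    """Zip everything under startdir and return the path of the archive.

    archive_base is the target path WITHOUT the ".zip" suffix.
    """
    return shutil.make_archive(archive_base, "zip", root_dir=startdir)

# Demo on temporary directories so nothing real is touched.
source = tempfile.mkdtemp()
target = tempfile.mkdtemp()
with open(os.path.join(source, "hello.txt"), "w") as handle:
    handle.write("hello")

archive_path = backup_directory(source, os.path.join(target, "my-archive"))

# Read the archive back to confirm the file made it in.
with zipfile.ZipFile(archive_path) as archive:
    archived_names = archive.namelist()
```

Writing the archive outside the directory being backed up, as the demo does, avoids the archive trying to include itself.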
Answer 12:
For list comprehensions that need the current and next element (lst is the input list):
[fun(curr, nxt)
 for curr, nxt
 in zip(lst, lst[1:] + [None])
 if condition(curr, nxt)]
For a circular list: zip(lst, lst[1:] + [lst[0]])
For previous, current: zip([None] + lst[:-1], lst)
Circular: zip([lst[-1]] + lst[:-1], lst)
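The current/next pairing can be wrapped in a small helper. This is a Python 3 sketch; the function name current_next_pairs and its circular flag are made up for illustration. The padded list is built with + because list.append and list.extend return None.

```python
def current_next_pairs(seq, circular=False):
    """Return (current, next) pairs; the last 'next' is None or wraps around."""
    items = list(seq)
    if not items:
        return []
    # Pad the shifted copy so it has the same length as the original.
    padding = [items[0]] if circular else [None]
    return list(zip(items, items[1:] + padding))

print(current_next_pairs([1, 2, 3]))                  # [(1, 2), (2, 3), (3, None)]
print(current_next_pairs([1, 2, 3], circular=True))   # [(1, 2), (2, 3), (3, 1)]
```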
Answer 13:
Hardlink identical files in the current directory (on Unix, this means they share physical storage, which takes much less space):
import os
import hashlib
dupes = {}
for path, dirs, files in os.walk(os.getcwd()):
    for file in files:
        filename = os.path.join(path, file)
        # read in binary mode so the hash reflects the exact bytes on disk
        hash = hashlib.sha1(open(filename, 'rb').read()).hexdigest()
        if hash in dupes:
            print 'linking "%s" -> "%s"' % (dupes[hash], filename)
            os.rename(filename, filename + '.bak')
            try:
                os.link(dupes[hash], filename)
                os.unlink(filename + '.bak')
            except:
                # linking failed: put the original file back
                os.rename(filename + '.bak', filename)
        else:
            dupes[hash] = filename
Answer 14:
Here are a few which I think are worth knowing but might not be useful on an everyday basis.
Most of them are one-liners.
Removing duplicates from a list (the original order is not preserved)
L = list(set(L))
Getting integers from a string (space separated)
ints = [int(x) for x in S.split()]
Finding the factorial
import operator
fac = lambda n: reduce(operator.mul, range(1, n + 1), 1)
Finding the greatest common divisor
>>> def gcd(a, b):
...     while b: a, b = b, a % b
...     return a
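On Python 3 several of these have ready-made counterparts. The sketch below assumes Python 3.7 or later, where dicts keep insertion order; math.factorial and math.gcd are standard-library functions, and the sample data is made up.

```python
import math

L = [3, 1, 3, 2, 1]

# Remove duplicates while keeping the first occurrence of each item in order.
unique_in_order = list(dict.fromkeys(L))
print(unique_in_order)      # [3, 1, 2]

# Integers from a space-separated string, as in the one-liner above.
ints = [int(x) for x in "10 20 30".split()]
print(ints)                 # [10, 20, 30]

print(math.factorial(5))    # 120
print(math.gcd(12, 18))     # 6
```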
Answer 15:
Like another person above, I said 'Wooww !!' when I discovered enumerate().
I sang Python's praises when I discovered repr(), which let me see the precise content of strings that I wanted to analyse with a regex.
I was very pleased to discover that print '\n'.join(list_of_strings) displays much faster than for ch in list_of_strings: print ch.
splitlines(1), called with a true argument, keeps the newlines.
Combined in one snippet, these four "tricks" are very useful for quickly displaying the source code of a web page line by line, with each line numbered, special characters such as '\t' or newlines shown uninterpreted, and the newlines kept:
import urllib
from time import clock,sleep
sock = urllib.urlopen('http://docs.python.org/')
ch = sock.read()
sock.close()
te = clock()
for i,line in enumerate(ch.splitlines(1)):
    print str(i) + ' ' + repr(line)
t1 = clock() - te
print "\n\nIn 3 seconds, I will print the same content, using '\\n'.join(....)\n"
sleep(3)
te = clock()
# here's the point of interest:
print '\n'.join(str(i) + ' ' + repr(line)
for i,line in enumerate(ch.splitlines(1)) )
t2 = clock() - te
print '\n'
print 'first display took',t1,'seconds'
print 'second display took',t2,'seconds'
With my not very fast computer, I got:
first display took 4.94626048841 seconds
second display took 0.109297410704 seconds
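The combination of enumerate, repr, splitlines and join can be tried without a network connection. A Python 3 sketch on a made-up string, with no timing claim attached:

```python
text = "a\tb\nsecond line\r\nlast"

# splitlines(True) keeps the line endings; repr() shows them (and the tab) escaped.
numbered = "\n".join(
    "%d %r" % (i, line) for i, line in enumerate(text.splitlines(True))
)

print(numbered)
# 0 'a\tb\n'
# 1 'second line\r\n'
# 2 'last'
```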
Answer 16:
Emulating a switch statement. For example switch(x) {..}:
def a():
    print "a"
def b():
    print "b"
def default():
    print "default"
# look up the handler for x, fall back to default, then call it
{1: a, 2: b}.get(x, default)()
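A Python 3 sketch of the same dispatch idea. The handlers return strings instead of printing so the result is easy to check, and the wrapper name switch is made up for illustration.

```python
def a():
    return "a"

def b():
    return "b"

def default():
    return "default"

def switch(x):
    """Pick the handler registered for x, or the default one, and call it."""
    handlers = {1: a, 2: b}
    return handlers.get(x, default)()

print(switch(1))    # a
print(switch(42))   # default
```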
Answer 17:
import tempfile
import cPickle
class DiskFifo:
    """A disk based FIFO which can be iterated, appended and extended in an interleaved way"""
    def __init__(self):
        self.fd = tempfile.TemporaryFile()
        self.wpos = 0
        self.rpos = 0
        self.pickler = cPickle.Pickler(self.fd)
        self.unpickler = cPickle.Unpickler(self.fd)
        self.size = 0
    def __len__(self):
        return self.size
    def extend(self, sequence):
        map(self.append, sequence)
    def append(self, x):
        self.fd.seek(self.wpos)
        self.pickler.clear_memo()
        self.pickler.dump(x)
        self.wpos = self.fd.tell()
        self.size = self.size + 1
    def next(self):
        try:
            self.fd.seek(self.rpos)
            x = self.unpickler.load()
            self.rpos = self.fd.tell()
            return x
        except EOFError:
            raise StopIteration
    def __iter__(self):
        self.rpos = 0
        return self
Answer 18:
For Python 2.4 or earlier:
listDict = {}
for x, y in someIterator:
    listDict.setdefault(x, []).append(y)
In Python 2.5+ there is an alternative using defaultdict.
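The defaultdict variant could look like the following sketch. The sample pairs are made up; collections.defaultdict(list) creates an empty list the first time a key is seen.

```python
from collections import defaultdict

pairs = [("fruit", "apple"), ("veg", "leek"), ("fruit", "pear")]

grouped = defaultdict(list)
for key, value in pairs:
    # Missing keys are created on first access with an empty list.
    grouped[key].append(value)

print(dict(grouped))   # {'fruit': ['apple', 'pear'], 'veg': ['leek']}
```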
Answer 19:
A custom list that, when multiplied by another list, returns a cartesian product. The good thing is that the cartesian product is indexable, unlike that of itertools.product (but the multiplicands must be sequences, not iterators).
import operator
class mylist(list):
    def __getitem__(self, args):
        if type(args) is tuple:
            return [list.__getitem__(self, i) for i in args]
        else:
            return list.__getitem__(self, args)
    def __mul__(self, args):
        seqattrs = ("__getitem__", "__iter__", "__len__")
        if all(hasattr(args, i) for i in seqattrs):
            return cartesian_product(self, args)
        else:
            return list.__mul__(self, args)
    def __imul__(self, args):
        return self.__mul__(args)
    def __rmul__(self, args):
        # args * self: repeat for an integer count, otherwise build the product
        if isinstance(args, (int, long)):
            return list.__mul__(self, args)
        return mylist(args).__mul__(self)
    def __pow__(self, n):
        return cartesian_product(*((self,)*n))
    def __rpow__(self, n):
        return cartesian_product(*((self,)*n))
class cartesian_product:
    def __init__(self, *args):
        self.elements = args
    def __len__(self):
        return reduce(operator.mul, map(len, self.elements))
    def __getitem__(self, n):
        return [e[i] for e, i in zip(self.elements, self.get_indices(n))]
    def get_indices(self, n):
        # decompose n into one index per factor, last factor varying fastest
        sizes = map(len, self.elements)
        tmp = [0]*len(sizes)
        i = -1
        for w in reversed(sizes):
            tmp[i] = n % w
            n //= w
            i -= 1
        return tmp
    def __add__(self, arg):
        return mylist(map(None, self)+mylist(map(None, arg)))
    def __imul__(self, args):
        return mylist(self)*mylist(args)
    def __rmul__(self, args):
        return mylist(args)*mylist(self)
    def __mul__(self, args):
        if isinstance(args, cartesian_product):
            return cartesian_product(*(self.elements+args.elements))
        else:
            return cartesian_product(*(self.elements+(args,)))
    def __iter__(self):
        for i in xrange(len(self)):
            yield self[i]
    def __str__(self):
        return "[" + ",".join(str(i) for i in self) +"]"
    def __repr__(self):
        return "*".join(map(repr, self.elements))
Answer 20:
I actually just created this, but I think it's going to be a very useful debugging tool.
def dirValues(instance, all=False):
    retVal = {}
    for prop in dir(instance):
        # unless all is set, skip private and special (underscore-prefixed) names
        if not all and prop.startswith("_"):
            continue
        retVal[prop] = getattr(instance, prop)
    return retVal
I usually use dir() in a pdb context, but I think this will be much more useful:
(pdb) from pprint import pprint as pp
(pdb) from myUtils import dirValues
(pdb) pp(dirValues(someInstance))
Answer 21:
Iterate over any iterable (list, set, file, stream, string, whatever) of ANY size (including unknown size), in chunks of x elements:
from itertools import chain, islice
def chunks(iterable, size, format=iter):
    it = iter(iterable)
    while True:
        yield format(chain((it.next(),), islice(it, size - 1)))
>>> l = ["a", "b", "c", "d", "e", "f", "g"]
>>> for chunk in chunks(l, 3, tuple):
...     print chunk
...
('a', 'b', 'c')
('d', 'e', 'f')
('g',)
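The version above relies on Python 2 behaviour: it.next() and a StopIteration that silently ends the generator. Since Python 3.7 (PEP 479) a StopIteration escaping inside a generator becomes a RuntimeError, so a Python 3 variant needs an explicit end check. The sketch below is one possible adaptation, not the author's code. It builds each chunk as a tuple before handing it to format, and defaults format to tuple rather than iter.

```python
from itertools import islice

def chunks(iterable, size, format=tuple):
    """Yield successive chunks of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, size))   # take up to `size` items
        if not chunk:                     # source exhausted: stop cleanly
            return
        yield format(chunk)

for chunk in chunks(["a", "b", "c", "d", "e", "f", "g"], 3):
    print(chunk)
# ('a', 'b', 'c')
# ('d', 'e', 'f')
# ('g',)
```

Because only one chunk is held in memory at a time, this still works on generators and other sources of unknown length.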
Answer 22:
When debugging, you sometimes want to see a string in a basic editor. To show a string in Notepad:
import os, tempfile, subprocess
def get_rand_filename(dir_=os.getcwd()):
    "Function creates an empty temporary file and returns its name."
    fd, path = tempfile.mkstemp('.tmp', '', dir_)
    os.close(fd)  # mkstemp returns an open descriptor; close it so it is not leaked
    return path
def open_with_notepad(s):
    "Function gets a string and shows it in Notepad"
    with open(get_rand_filename(), 'w') as f:
        f.write(s)
    subprocess.Popen(['notepad', f.name])