I realise that if you have an iterable you should always use .join(iterable)
instead of for x in y: str += x
. But if there's only a fixed number of variables that aren't already in an iterable, is using .join()
still the recommended way?
For example I have
user = 'username'
host = 'host'
should I do
ret = user + '@' + host
or
ret = '@'.join([user, host])
I'm not so much asking from a performance point of view, since both will be pretty trivial. But I've read people on here say always use .join()
and I was wondering if there's any particular reason for that or if it's just generally a good idea to use .join()
.
If you're creating a string like that, you normally want to use string formatting:
>>> user = 'username'
>>> host = 'host'
>>> '%s@%s' % (user, host)
'username@host'
Python 2.6 added another form, which doesn't rely on operator overloading and has some extra features:
>>> '{0}@{1}'.format(user, host)
'username@host'
As a general guideline, most people will use +
on strings only if they're adding two strings right there. For more parts or more complex strings, they either use string formatting, like above, or assemble elements in a list and join them together (especially if there's any form of looping involved.) The reason for using str.join()
is that adding strings together means creating a new string (and potentially destroying the old ones) for each addition. Python can sometimes optimize this away, but str.join()
quickly becomes clearer, more obvious and significantly faster.
I take the question to mean: "Is it ok to do this:"
ret = user + '@' + host
..and the answer is yes. That is perfectly fine.
You should, of course, be aware of the cool formatting stuff you can do in Python, and you should be aware that for long lists, "join" is the way to go, but for a simple situation like this, what you have is exactly right. It's simple and clear, and performance will not be an issue.
(I'm pretty sure all of the people pointing at string formatting are missing the question entirely.)
Creating a string by constructing an array and joining it is for performance reasons only. Unless you need that performance, or unless it happens to be the natural way to implement it anyway, there's no benefit to doing that rather than simple string concatenation.
Saying '@'.join([user, host])
is unintuitive. It makes me wonder: why is he doing this? Are there any subtleties to it; is there any case where there might be more than one '@'? The answer is no, of course, but it takes more time to come to that conclusion than if it was written in a natural way.
Don't contort your code merely to avoid string concatenation; there's nothing inherently wrong with it. Joining arrays is just an optimization.
I'll just note that I've always tended to use in-place concatenation until I was rereading a portion of the Python general style PEP PEP-8 Style Guide for Python Code.
- Code should be written in a way that does not disadvantage other
implementations of Python (PyPy, Jython, IronPython, Pyrex, Psyco,
and such).
For example, do not rely on CPython's efficient implementation of
in-place string concatenation for statements in the form a+=b or a=a+b.
Those statements run more slowly in Jython. In performance sensitive
parts of the library, the ''.join() form should be used instead. This
will ensure that concatenation occurs in linear time across various
implementations.
Going by this, I have been converting to the practice of using joins so that I may retain the habit as a more automatic practice when efficiency is extra critical.
So I'll put in my vote for:
ret = '@'.join([user, host])
I use next:
ret = '%s@%s' % (user, host)