os.path.basename works with URLs, why?

Posted 2019-04-19 21:09

>>> os.path.basename('http://example.com/file.txt')
'file.txt'

I thought the os.path.* functions worked only on local paths, not URLs. Note that I also ran the above example on Windows, with the same result.

Tags: python url path
6 answers
ら.Afraid
#2 · 2019-04-19 21:15

Why? Because it's useful for parsing URLs as well as local file paths. Why not?

小情绪 Triste *
#3 · 2019-04-19 21:16

On Windows, look at the source code in C:\Python25\Lib\ntpath.py:

def basename(p):
    """Returns the final component of a pathname"""
    return split(p)[1]

os.path.split (in the same file) just splits the string at the last "\" or "/", after setting aside any drive letter.
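
To see this without being on Windows, here is a small sketch that calls the two os.path implementations directly. It assumes Python 3, where both ntpath and posixpath can be imported on any platform; the variable names are only for illustration.

```python
import ntpath      # the os.path implementation used on Windows
import posixpath   # the os.path implementation used on POSIX systems

url = "http://example.com/file.txt"

# Both modules treat the URL as a plain string and cut at the last separator.
nt_result = ntpath.basename(url)
posix_result = posixpath.basename(url)
print(nt_result, posix_result)

# Only ntpath also treats the backslash as a separator.
win_path = r"C:\Python25\Lib\ntpath.py"
print(ntpath.basename(win_path))     # last component after the final "\"
print(posixpath.basename(win_path))  # no "/" in the string, so it comes back unchanged
```

Neither module knows or cares that the string is a URL; the "/" in it is simply a separator both of them recognise.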

聊天终结者
#4 · 2019-04-19 21:19

In practice, many os.path functions are just string-manipulation functions that happen to be especially handy for paths. Since that is harmless and occasionally useful, even if formally "incorrect", I doubt it will change anytime soon. For more details, run the following one-liner at a shell/command prompt:

$ python -c"import sys; import StringIO; x = StringIO.StringIO(); sys.stdout = x; import this; sys.stdout = sys.__stdout__; print x.getvalue().splitlines()[10][9:]"

Or, for Python 3:

$ python -c"import sys; import io; x = io.StringIO(); sys.stdout = x; import this; sys.stdout = sys.__stdout__; print(x.getvalue().splitlines()[10][9:])"
淡お忘
#5 · 2019-04-19 21:26

Beware of URLs with parameters, anchors or anything that isn't a "plain" URL:

>>> import os.path
>>> os.path.basename("protocol://fully.qualifie.host/path/to/file.txt")
'file.txt'
>>> os.path.basename("protocol://fully.qualifie.host/path/to/file.txt?param1&param1#anchor")
'file.txt?param1&param1#anchor'
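
If you do want the file name from a URL like that, one possible approach (a sketch assuming Python 3, not the only way) is to let urllib.parse strip the query and fragment first and then take the basename of what remains. The helper name url_basename is made up for this example.

```python
import posixpath
from urllib.parse import urlsplit


def url_basename(url):
    """Return the last path segment of a URL, ignoring query and fragment.

    Hypothetical helper for illustration; it does no percent-decoding.
    """
    path = urlsplit(url).path          # drops "?param1&param1" and "#anchor"
    return posixpath.basename(path)    # URL paths always use "/"


print(url_basename("protocol://fully.qualifie.host/path/to/file.txt"))
print(url_basename("protocol://fully.qualifie.host/path/to/file.txt?param1&param1#anchor"))
```

Using posixpath rather than os.path keeps the behaviour the same on every platform, since a backslash in a URL should not count as a separator.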
地球回转人心会变
#6 · 2019-04-19 21:27

Forward slash is also an acceptable path delimiter in Windows.

The only catch is that the command line does not accept paths that begin with a /, because that character is reserved for command switches.
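
A quick way to check how Python's Windows path module handles forward slashes, sketched under the assumption of Python 3 (ntpath is importable on any platform); the variable name is arbitrary:

```python
import ntpath

forward = "C:/Python25/Lib/ntpath.py"

# ntpath splits at "/" just as it would at "\".
head, tail = ntpath.split(forward)
print(head, tail)

# normpath rewrites forward slashes as backslashes.
print(ntpath.normpath(forward))
```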

贪生不怕死
#7 · 2019-04-19 21:29

Use the source, Luke (this is the POSIX version, from posixpath.py):

def basename(p):
    """Returns the final component of a pathname"""
    i = p.rfind('/') + 1
    return p[i:]

Edit (response to clarification):

It works for URLs by accident, that's it. Because of that, exploiting this behaviour could be considered a code smell by some.

Trying to "fix" it (by checking that the passed path is not a URL) is also surprisingly difficult:

www.google.com/test.php
me@other.place.com/12
./src/bin/doc/goto.c

are at the same time valid pathnames and valid (relative) URLs. So is http:/hello.txt (with one /, and only on Linux, and it's kinda stupid :)). You could "fix" it for absolute URLs, but relative ones would still work. Handling one special case differently is a big no-no in the Python world.
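
To make the ambiguity concrete, here is a rough sketch (assuming Python 3) of the kind of check someone might try. The function looks_like_absolute_url is hypothetical, not anything from the standard library, and it only recognises strings that have both a scheme and a host.

```python
from urllib.parse import urlsplit


def looks_like_absolute_url(text):
    """Hypothetical check: True only when both a scheme and a host are present."""
    parts = urlsplit(text)
    return bool(parts.scheme and parts.netloc)


candidates = [
    "http://example.com/file.txt",  # absolute URL
    "www.google.com/test.php",      # relative URL or relative path
    "me@other.place.com/12",        # relative URL or relative path
    "./src/bin/doc/goto.c",         # relative URL or relative path
    "http:/hello.txt",              # scheme but no host
]

for candidate in candidates:
    print(candidate, looks_like_absolute_url(candidate))
```

Only the first string is flagged; the rest pass straight through such a check, which is why special-casing URLs in basename would not buy much.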

To sum it up: import this
