I'm trying to use a NamedTemporaryFile and pass this object to an external program to use, before collecting the output using Popen
. My hope was that this would be quicker than creating a real file on the hard disk and would avoid as much IO as possible. The temp files I am creating are small, on the order of a KB or so, and I am finding that creating a temp file to work with is actually slower than using a normal file for reading/writing. Is there a trick I am missing here? What is going on behind the scenes when I use a NamedTemporaryFile?
# Using named temp file
with tempfile.NamedTemporaryFile(mode="w", delete=False) as temp:  # delete=False keeps the file around for the process calls; mode="w" for text writes
    for idx, item in enumerate(r):
        temp.write(">{}\n{}\n".format(idx, item[1]))
>>> 8.435 ms
# Using normal file io
with open("test.fa", "w") as temp:
for idx, item in enumerate(r):
temp.write(">{}\n{}\n".format(idx, item[1]))
>>> 0.506 ms
#--------
# Read using temp file
[i for i in open(name, "r")]
>>> 1.167 ms
[i for i in open("test.fa", "r")]
>>> 0.765 ms
Doing a bit of profiling, it seems almost the entire time is spent creating the temp object: the tempfile.NamedTemporaryFile(delete=False)
call alone takes over 8 ms in this example.
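For reference, a minimal timeit harness along these lines reproduces the comparison (the file contents and iteration count here are placeholders, not the real data from `r`):

```python
import os
import tempfile
import timeit

def create_named_temp():
    # Mirrors the question: text mode, delete=False to keep the file around
    with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
        f.write(">0\nACGT\n")
    os.unlink(f.name)  # clean up so repeated runs don't crowd the temp dir

def create_plain_file():
    with open("test.fa", "w") as f:
        f.write(">0\nACGT\n")

n = 100
t_temp = timeit.timeit(create_named_temp, number=n) / n
t_plain = timeit.timeit(create_plain_file, number=n) / n
print("NamedTemporaryFile: {:.3f} ms per call".format(t_temp * 1000))
print("plain open():       {:.3f} ms per call".format(t_plain * 1000))
```

Absolute numbers will of course vary with the machine and with how full the temp directory is.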
I will try to answer your question, although I am not very experienced with Python runtime efficiency.
Drilling into the code of Python's tempfile.py, you can find a clue about what might take some time: the
_mkstemp_inner
function may open several files, raising an exception for each failed attempt. The more temp files your directory contains, the more file-name collisions you get, and the longer this takes. Try emptying your temp directory. Hope that helped.
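To illustrate the mechanism, here is a simplified sketch of what that collision-and-retry loop does. The real CPython code draws candidate names from an internal random-name sequence; `uuid4` and the function name `mkstemp_sketch` are just stand-ins for this example:

```python
import os
import uuid

def mkstemp_sketch(directory, tries=100):
    # O_EXCL makes os.open fail if the name already exists,
    # which is how tempfile detects a collision.
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    for _ in range(tries):
        path = os.path.join(directory, "tmp" + uuid.uuid4().hex[:8])
        try:
            fd = os.open(path, flags, 0o600)
            return fd, path
        except FileExistsError:
            continue  # name taken: another attempt, another syscall
    raise FileExistsError("No usable temporary file name found")
```

Each collision costs a failed syscall plus exception handling, so a crowded temp directory makes every NamedTemporaryFile creation slower.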