tqdm in Jupyter Notebook

2019-01-22 16:43发布

问题:

I am using tqdm to print progress in a script I'm running in a Jupyter notebook. I am printing all messages to the console via tqdm.write(). However, this still gives me a skewed output like so:

That is, each time a new line has to be printed, a new progress bar is printed on the next line. This does not happen when I run the script via terminal. How can I solve this?

回答1:

Try using tqdm_notebook instead of tqdm, as outlined here. It's experimental at this stage, but works pretty well in most cases.

This could be as simple as changing your import to:

from tqdm import tqdm_notebook as tqdm

Good luck!

EDIT: After testing, it seems that tqdm actually works fine in 'text mode' in Jupyter notebook. It's hard to tell because you haven't provided a minimal example, but it looks like your problem is caused by a print statement in each iteration. The print statement is ouputting a number (~0.89) in between each status bar update, which is messing up the output. Try removing the print statement.



回答2:

This is an alternative answer for the case where tqdm_notebook doesn't work for you.

Given the following example:

from time import sleep
from tqdm import tqdm

values = range(3)
with tqdm(total=len(values)) as pbar:
    for i in values:
        pbar.write('processed: %d' %i)
        pbar.update(1)
        sleep(1)

The output would look something like this (progress would show up red):

  0%|          | 0/3 [00:00<?, ?it/s]
processed: 1
 67%|██████▋   | 2/3 [00:01<00:00,  1.99it/s]
processed: 2
100%|██████████| 3/3 [00:02<00:00,  1.53it/s]
processed: 3

The problem is that the output to stdout and stderr are processed asynchronously and separately in terms of new lines.

If say Jupyter receives on stderr the first line and then the "processed" output on stdout. Then once it receives an output on stderr to update the progress, it wouldn't go back and update the first line as it would only update the last line. Instead it will have to write a new line.

Workaround 1, writing to stdout

One workaround would be to output both to stdout instead:

import sys
from time import sleep
from tqdm import tqdm

values = range(3)
with tqdm(total=len(values), file=sys.stdout) as pbar:
    for i in values:
        pbar.write('processed: %d' % (1 + i))
        pbar.update(1)
        sleep(1)

The output will change to (no more red):

processed: 1   | 0/3 [00:00<?, ?it/s]
processed: 2   | 0/3 [00:00<?, ?it/s]
processed: 3   | 2/3 [00:01<00:00,  1.99it/s]
100%|██████████| 3/3 [00:02<00:00,  1.53it/s]

Here we can see that Jupyter doesn't seem to clear until the end of the line. We could add another workaround for that by adding spaces. Such as:

import sys
from time import sleep
from tqdm import tqdm

values = range(3)
with tqdm(total=len(values), file=sys.stdout) as pbar:
    for i in values:
        pbar.write('processed: %d%s' % (1 + i, ' ' * 50))
        pbar.update(1)
        sleep(1)

Which gives us:

processed: 1                                                  
processed: 2                                                  
processed: 3                                                  
100%|██████████| 3/3 [00:02<00:00,  1.53it/s]

Workaround 2, set description instead

It might in general be more straight forward not to have two outputs but update the description instead, e.g.:

import sys
from time import sleep
from tqdm import tqdm

values = range(3)
with tqdm(total=len(values), file=sys.stdout) as pbar:
    for i in values:
        pbar.set_description('processed: %d' % (1 + i))
        pbar.update(1)
        sleep(1)

With the output (description updated while it's processing):

processed: 3: 100%|██████████| 3/3 [00:02<00:00,  1.53it/s]

Conclusion

You can mostly get it to work fine with plain tqdm. But if tqdm_notebook works for you, just use that (but then you'd probably not read that far).



回答3:

It works as expected if you do as the following:

pbar = tqdm(range(10))
try:
    for i in pbar:
        sleep(1)
        if i==5:
            raise NotImplementedError()
finally:
    pbar.close()

or

with tqdm(range(10)) as pbar:
    for i in pbar:
        sleep(1)
        if i==5:
            raise NotImplementedError()