Simple way to choose which cells to run in ipython

Posted 2019-01-23 08:55

I have an IPython notebook that runs several steps in a data processing routine and saves information to files along the way. This way, while developing my code (mostly in a separate .py module), I can skip to and run various steps. I'd like to set it up so that I can use Cell -> Run All but have it execute only certain steps that are easy to select. For example, I'd envision defining the steps I want to run in a dict like so:

process = {
    'load files':False,
    'generate interactions list':False,
    'random walk':True,
    'dereference walk':True,
    'reduce walk':True,
    'generate output':True
}

then the steps would run based on this dict. BTW, each step comprises multiple cells.

I think %macro is not quite what I want since anytime I changed anything or restarted the kernel I'd have to redefine the macro, with changing cell numbers.

Is there like a %skip or %skipto magic or something along those lines? Or perhaps a clean way to put at the beginning of cells, if process[<current step>]: %dont_run_rest_of_cell?

4 answers
Emotional °昔
#2 · 2019-01-23 09:29

You can create your own skip magic with the help of a custom kernel extension.

skip_kernel_extension.py

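from IPython import get_ipython

def skip(line, cell=None):
    '''Skips execution of the current line/cell if line evaluates to True.'''
    if eval(line):
        return

    # In line mode (%skip) there is no cell body, so there is nothing to run.
    if cell is not None:
        get_ipython().ex(cell)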

def load_ipython_extension(shell):
    '''Registers the skip magic when the extension loads.'''
    shell.register_magic_function(skip, 'line_cell')

def unload_ipython_extension(shell):
    '''Unregisters the skip magic when the extension unloads.'''
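    # 'line_cell' registered the magic under both kinds, so remove both.
    del shell.magics_manager.magics['line']['skip']
    del shell.magics_manager.magics['cell']['skip']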

Load the extension in your notebook:

%load_ext skip_kernel_extension

Run the skip magic command in the cells you want to skip:

%%skip True  #skips cell
%%skip False #won't skip

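You can use a variable to decide whether a cell should be skipped by prefixing it with $. Define the variable in an earlier cell, since %%skip has to be the first line of the cell it controls: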

should_skip = True
%%skip $should_skip
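
To tie this back to the process dict in the question, the core of the magic can be sketched in plain Python. This is only an illustration under assumptions: run_unless and its explicit namespace argument are hypothetical names, not part of IPython. Unlike the extension above, which relies on $ expansion to get notebook variables into the line, the condition here is evaluated directly against a namespace dict.

```python
def run_unless(condition, cell_source, namespace):
    """Mimic the skip magic: skip cell_source when condition is truthy.

    Hypothetical helper for illustration; `namespace` stands in for the
    notebook's user namespace.
    """
    if eval(condition, namespace):
        return False  # skipped
    exec(cell_source, namespace)
    return True  # executed


# Stand-in for notebook state, using two steps from the question's dict.
namespace = {"process": {"load files": False, "random walk": True}}

ran_load = run_unless("not process['load files']", "loaded = True", namespace)
ran_walk = run_unless("not process['random walk']", "walked = True", namespace)

print(ran_load, ran_walk)
```

The condition is written as "not process[...]" because the magic skips on True, while the dict marks steps to run with True.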
女痞
#3 · 2019-01-23 09:47

I am new to Jupyter Notebook and am loving it. I had heard of IPython before but didn't look into it seriously until a recent consulting job.

One trick my associate showed me to disable blocks from execution is to change them from "Code" type to "Raw NBConvert" type. This way I sprinkle diagnostic blocks through my notebook, but only turn them on (make them "Code") if I want them to run.

This method isn't exactly dynamically selectable in a script, but may suit some needs.

Bombasti
#4 · 2019-01-23 09:49

Explicit is always better than implicit. Simple is better than complicated. So why not use plain Python?

With one cell per step you can do:

if process['load files']:
    load_files()
    do_something()

and

if process['generate interactions list']:
    do_something_else()

If you want to stop the execution when a particular step is skipped you could use:

if not process['reduce walk']:
    stop
else:
    reduce_walk()
    ...

stop is not a defined name, so Python raises a NameError, and that exception halts execution when using Cell -> Run all.
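
If relying on an undefined name feels too implicit, one possible variant is to raise a dedicated exception instead. This is a minimal sketch: StepDisabled and require_step are hypothetical names, and the process dict is abbreviated from the question.

```python
class StepDisabled(Exception):
    """Raised to halt a run when a required step is switched off."""


def require_step(process, step):
    """Raise StepDisabled if `step` is switched off in `process`."""
    if not process[step]:
        raise StepDisabled(f"step {step!r} is disabled; stopping here")


process = {"random walk": True, "reduce walk": False}

require_step(process, "random walk")  # enabled, so execution continues

try:
    require_step(process, "reduce walk")
except StepDisabled as exc:
    # Caught here only to show the message; in a notebook you would let
    # the exception propagate so that it stops the run.
    message = str(exc)
```

The traceback then names the step that stopped the run, which is easier to read later than a NameError on stop.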

You can also make conditional steps like:

if process['reduce walk'] and process['generate output']:
    save_results()
    ...

But, as a rule of thumb, I wouldn't make conditions that are much more complex than that.
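
When each step fits in a function, the same idea can be written once as a loop over the dict. This is a sketch under assumptions: run_enabled_steps and the lambda step functions are placeholders, not taken from the original code, and it relies on dicts preserving insertion order (Python 3.7+).

```python
def run_enabled_steps(process, steps):
    """Call steps[name]() for every name enabled in process, in dict order."""
    executed = []
    for name, enabled in process.items():
        if enabled:
            steps[name]()
            executed.append(name)
    return executed


log = []
process = {"load files": False, "random walk": True, "reduce walk": True}

# Placeholder step functions; real ones would do the actual work.
steps = {
    "load files": lambda: log.append("loaded"),
    "random walk": lambda: log.append("walked"),
    "reduce walk": lambda: log.append("reduced"),
}

executed = run_enabled_steps(process, steps)
```

This keeps all the conditions in one cell, at the cost of moving each step's code out of its own cells and into functions.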

Lonely孤独者°
#5 · 2019-01-23 09:51

If you are using nbconvert to execute your notebook, you can write a custom preprocessor that looks at cell metadata to know which cells to execute.

from nbconvert.preprocessors import ExecutePreprocessor

class MyExecutePreprocessor(ExecutePreprocessor):

    def preprocess_cell(self, cell, resources, cell_index):
        """
        Executes a single code cell. See base.py for details.
        To execute all cells see :meth:`preprocess`.

        Checks cell.metadata for 'execute' key. If set, and maps to False, 
          the cell is not executed.
        """

        if not cell.metadata.get('execute', True):
            # Don't execute this cell in output
            return cell, resources

        return super().preprocess_cell(cell, resources, cell_index)

By editing cell metadata, you can specify whether that cell should be executed.

You can get fancier by adding a master dictionary to your notebook metadata. This would look like the dictionary in your example, mapping sections to a boolean specifying whether that section would be called.

Then, in your cell metadata, you can use a "section" keyword mapping to the section ID in your notebook metadata.
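
The lookup described above could look roughly like this. It is a sketch on plain dicts so that it runs without nbconvert. The "process" key in the notebook metadata and the function name should_execute are assumptions; "execute" and "section" follow the text above.

```python
def should_execute(cell_metadata, notebook_metadata):
    """Decide whether a cell runs, based on cell and notebook metadata."""
    # A per-cell "execute": False always wins.
    if not cell_metadata.get("execute", True):
        return False
    section = cell_metadata.get("section")
    if section is None:
        return True  # cells without a section always run
    # Assumed layout: notebook metadata holds {"process": {section: bool}}.
    return notebook_metadata.get("process", {}).get(section, True)


notebook_metadata = {"process": {"load files": False, "random walk": True}}

skip_load = should_execute({"section": "load files"}, notebook_metadata)
run_walk = should_execute({"section": "random walk"}, notebook_metadata)
run_plain = should_execute({}, notebook_metadata)
forced_off = should_execute(
    {"execute": False, "section": "random walk"}, notebook_metadata
)
```

Inside preprocess_cell, the same check would be applied to cell.metadata and the notebook's metadata. How the preprocessor gets hold of the notebook metadata is left open here.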

When executing nbconvert, you can tell it to use your preprocessor.

See the docs on Notebook preprocessors for more information.
