Assuming this list
nestedList = ["a", "b", [1, 2, 3], "c",[4, 5, 6, [100, 200, 300]], "d"]
I have a function that returns a position list for a nested list of arbitrary depth.
Examples:
[2, 1] -> "2"
[5] -> "d"
[4, 3, 2] -> "300"
As you can see it is not clear in the beginning how many levels of nesting there are.
Additional Problem
For list modifications I want to use the [:] or [4:] or [0:1] notations.
For a human its very easy to do: simply add as many index position as you need to.
nestedList[2][1]
nestedList[5]
nestedList[4][3][2]
nestedList[4][1:] = NewItem + nestedList[4][1:] #insert item
nestedList[2][1] = [] #remove item
However this approach does not lead anywhere since I had to append strings together and eval them later. Obvious nonsense :)
What is the best way to handle a nested list with an unknown number of index positions and still have the functionality to handle it like a normal list (read, modify, insert, remove)
I hope there is an answer to that.
P.S. the list must remain nested. Flattening is not an option.
I finally had some time to fiddle around with this. I got a little carried away. It's long, but I'm pasting it anyway. I added set_item
, insert
, delete
, find
, and find_left
methods, as well as some private methods to allow low-level manipulation that breaks the cursor abstraction. I also added a move_cursor
method that throws an IndexError
for index tuples that are out of range or point to a non-toplevel object.
Basically, it (should) be guaranteed that as long as you use public functions only, the cursor always points to a top-level object, and insertions and deletions all happen at the top level. From here, you should be able to safely implement __getitem__
, __setitem__
, __delitem__
, etc, and maybe even __getslice__
, __setslice__
.
However, there are a couple of wrinkles. The restriction that the cursor always points to a toplevel object makes it very easy to iterate over the nested list as though it were a flat list. But it also means that the cursor can't point at lower-level objects, and hence some kinds of insertions can't happen using insert
alone. For example, say you have three lists:
>>> l1 = [1, 2, 3, 4]
>>> l2 = [5, 6, 7, 8]
>>> l3 = [l1, l2]
>>> l3
[[1, 2, 3, 4], [5, 6, 7, 8]]
Now put this nested structure in a NLI, move to 5
, and try to insert.
>>> nli = NestedListIter(l3)
>>> nli.find(5)
>>> nli.insert(9)
>>> nli.nested_list
[[1, 2, 3, 4], [9, 5, 6, 7, 8]]
As you can see, you can insert something into l2
, but you can't easily insert something into l3
. In fact, to do so right now, you have to use a private function, which breaks the cursor abstraction in an unpleasant way:
>>> nli._insert_at(nli.stack[:-1], 10)
>>> nli.nested_list
[[1, 2, 3, 4], 10, [9, 5, 6, 7, 8]]
>>> nli.get_item()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "nestedlistiterator.py", line 130, in get_item
return self._get_item_at(self.stack)
File "nestedlistiterator.py", line 39, in _get_item_at
item = item[i]
TypeError: 'int' object is unsubscriptable
There are surely ways to implement a safe public insert_between_branches
method, but they involve more complication than I care to bother with right now.
Another problem appears when one tries to insert a value after 4
. As you've seen, you can insert a value into l2
before 5
, but if you move the cursor to 4
and insert
, you'll quickly realize that you can't insert something after 4
inside l1
.
>>> nli.go_to_head()
>>> nli.find(4)
>>> nli.insert(11)
>>> nli.nested_list
[[1, 2, 3, 11, 4], 10, [9, 5, 6, 7, 8]]
From the perspective of flat access, inserting after 4 and inserting before 5 are the same thing, but from the perspective of the nested list, they are different. Since insert
is effectively a left_insert
, this problem could be partially rectified with a right_insert
method (which would, in turn, be unable to insert at the beginning of l1).
These problems could probably be dealt with more generally by allowing the cursor to point to lower-level objects, but that would make flat access more complicated. In short, any attempt to rectify these problems will lead to greater complexity, either on the flat or the nested side of the interface.
(That is actually why I still prefer the simple enumerate_nested
method! A proper tree structure with values at all nodes (and not just top-level nodes) might also be simpler and better. But this was fun to code nonetheless.)
import collections
class NestedListIter(object):
'''A mutable container that enables flat traversal of a nested tree of
lists. nested_list should contain only a list-like mutable sequence.
To preserve a clear demarcation between 'leaves' and 'branches', empty
sequences are not allowed as toplevel objects.'''
def __init__(self, nested_list):
if not nested_list:
raise ValueError, 'nested_list must be a non-empty sequence'
self.nested_list = nested_list # at some point, vet this to make sure
self.go_to_head() # it contains no empty sequences
def _is_sequence(self, item=None):
'''Private method to test whether an item is a non-string sequence.
If item is None, test current item.'''
if item is None:
item = self._get_item_at(self.stack)
return isinstance(item, collections.Sequence) and not isinstance(item, basestring)
def _is_in_range(self, index_tuple=None):
'''Private method to test whether an index is in range.
If index is None, test current index.'''
if index_tuple is None:
index_tuple = self.stack
if any(x < 0 for x in index_tuple):
return False
try:
self._get_item_at(index_tuple)
except IndexError:
return False
else:
return True
def _get_item_at(self, index_tuple):
'''Private method to get item at an arbitrary index, with no bounds checking.'''
item = self.nested_list
for i in index_tuple:
item = item[i]
return item
def _set_item_at(self, index_tuple, value):
'''Private method to set item at an arbitrary index, with no bounds checking.
Throws a ValueError if value is an empty non-string sequence.'''
if self._is_sequence(value) and not value:
raise ValueError, "Cannot set an empty list!"
containing_list = self._get_item_at(index_tuple[:-1])
containing_list[index_tuple[-1]] = value
def _insert_at(self, index_tuple, value):
'''Private method to insert item at an arbitrary index, with no bounds checking.
Throws a ValueError if value is an empty non-string sequence.'''
if self._is_sequence(value) and not value:
raise ValueError, "Cannot insert an empty list!"
containing_list = self._get_item_at(index_tuple[:-1])
containing_list.insert(index_tuple[-1], value)
def _delete_at(self, index_tuple):
'''Private method to delete item at an arbitrary index, with no bounds checking.
Recursively deletes a resulting branch of empty lists.'''
containing_list = self._get_item_at(index_tuple[:-1])
del containing_list[index_tuple[-1]]
if not self._get_item_at(index_tuple[:-1]):
self._delete_at(index_tuple[:-1])
def _increment_stack(self):
'''Private method that tires to increment the top value of the stack.
Returns True on success, False on failure (empty stack).'''
try:
self.stack[-1] += 1
except IndexError:
return False
else:
return True
def _decrement_stack(self):
'''Private method that tries to decrement the top value of the stack.
Returns True on success, False on failure (empty stack).'''
try:
self.stack[-1] -= 1
except IndexError:
return False
else:
return True
def go_to_head(self):
'''Move the cursor to the head of the nested list.'''
self.stack = []
while self._is_sequence():
self.stack.append(0)
def go_to_tail(self):
self.stack = []
'''Move the cursor to the tail of the nested list.'''
while self._is_sequence():
self.stack.append(len(self.get_item()) - 1)
def right(self):
'''Move cursor one step right in the nested list.'''
while self._increment_stack() and not self._is_in_range():
self.stack.pop()
if not self.stack:
self.go_to_tail()
return False
while self._is_sequence():
self.stack.append(0)
return True
def left(self):
'''Move cursor one step left in the nested list.'''
while self._decrement_stack() and not self._is_in_range():
self.stack.pop()
if not self.stack:
self.go_to_head()
return False
while self._is_sequence():
self.stack.append(len(self.get_item()) - 1)
return True
def move_cursor(self, index_tuple):
'''Move cursor to the location indicated by index_tuple.
Raises an error if index_tuple is out of range or doesn't correspond
to a toplevel object.'''
item = self._get_item_at(index_tuple)
if self._is_sequence(item):
raise IndexError, 'index_tuple must point to a toplevel object'
def get_item(self):
'''Get the item at the cursor location.'''
return self._get_item_at(self.stack)
def set_item(self, value):
'''Set the item a the cursor locaiton.'''
return self._set_item_at(self.stack, value)
def insert(self, value):
'''Insert an item at the cursor location. If value is a sequence,
cursor moves to the first toplevel object in value after insertion.
Otherwise, cursor does not move.'''
temp_stack = self.stack[:]
self.left()
self._insert_at(temp_stack, value)
self.right()
def delete(self):
'''Deete an item at the cursor location. Cursor does not move.'''
temp_stack = self.stack[:]
self.left()
self._delete_at(temp_stack)
self.right()
def iterate(self):
'''Iterate over the values in nested_list in sequence'''
self.go_to_head()
yield self.get_item()
while self.right():
yield self.get_item()
def iterate_left(self):
'''Iterate over the values in nested_list in reverse.'''
self.go_to_tail()
yield self.get_item()
while self.left():
yield self.get_item()
def find(self, value):
'''Search for value in nested_list; move cursor to first location of value.'''
for i in self.iterate():
if i == value:
break
def find_left(self, value):
'''Search for value backwards in nested_list; move cursor to last location of value.'''
for i in self.iterate_left():
if i == value:
break
def _NLI_Test():
l = [1, 2, 3, ['a', 'b', 'c'], 4, ['d', 'e', [100, 200, 300]], 5, ['a', 'b', 'c'], 6]
nli = NestedListIter(l)
print nli.nested_list
for i in nli.iterate():
print i,
print
for i in nli.iterate_left():
print i,
print
nli.go_to_head()
for i in range(5):
nli.right()
nli.insert('cow')
nli.insert(['c', ['o', 'w']])
print nli.nested_list
nli.find('cow')
print nli.get_item()
nli.delete()
print nli.nested_list
nli.find('c')
nli.delete()
print nli.nested_list
nli.find_left('w')
nli.delete()
nli.find('o')
nli.delete()
print nli.nested_list
print nli.nested_list == l
nli.find(100)
nli.set_item(100.1)
print nli.nested_list
if __name__ == '__main__':
_NLI_Test()
The first part is easy enough.
>>> reduce(lambda x, y: x[y], [4, 3, 2], nestedList)
300
The second part requires much more effort, but is still doable. Hint:
>>> a = [1, 2, 3]
>>> a[slice(1, None)] = [4, 5]
>>> a
[1, 4, 5]