Why is num::One needed for iterating over a range?

Posted 2019-05-21 02:05

Question:

Why does the following for loop with a char range fail to compile?

fn main() {
    for c in 'a'..'z' {
        println!("{}", c);
    }
}

Error...

main.rs:11:5: 14:2 error: the trait `core::num::One` is not implemented for the type `char` [E0277]
main.rs:11     for c in 'a'..'z' {
main.rs:12         println!("{}", c);
main.rs:13     }
main.rs:14 }
main.rs:11:5: 14:2 error: the trait `core::iter::Step` is not implemented for the type `char` [E0277]
main.rs:11     for c in 'a'..'z' {
main.rs:12         println!("{}", c);
main.rs:13     }
main.rs:14 }

Why do you even need core::num::One for iterating over a range?

Answer 1:

The x..y syntax is sugar for std::ops::Range { start: x, end: y }. This type (Range<A>) is iterable because it implements Iterator; specifically, from its documentation page:

impl<A> Iterator for Range<A>
   where A: One + Step,
         &'a A: Add<&'a A>,
         &'a A::Output == A {
    type Item = A;

This is saying that Range<A> can behave as an iterator over As if the type A implements One and Step, and can be added in the right way.

In this case, char satisfies none of those: it is semantically nonsense for char to have One or be addable, and it doesn't implement Step either.

That said, since char doesn't implement those traits (and hence Range<char> doesn't behave like an iterator via that impl), it should be possible to have a manual impl:

impl Iterator for Range<char> {
    type Item = char;

which would allow for x in 'a'..'z' to work.

However, this probably isn't semantically what we want: the .. range doesn't include the last element, which would be surprising for characters; one would have to write 'a'..'{' to get the letters a through z. There have been proposals for inclusive-range syntax, e.g. 'a'...'z' (more dots == more elements), and I would imagine that there would be an Iterator implementation for this type with chars.
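For readers on a newer toolchain: inclusive ranges were later stabilized with the spelling ..= (rather than the ... floated above), and the standard library gained range iteration for char. The following is a minimal sketch, assuming a reasonably recent compiler (to my knowledge, Rust 1.45 or later); the helper name lowercase_letters is made up for illustration.

```rust
// Hypothetical helper: collects 'a' through 'z' using an inclusive char range.
// Assumes a toolchain where inclusive ranges of `char` implement `Iterator`.
fn lowercase_letters() -> String {
    ('a'..='z').collect()
}

fn main() {
    // The upper bound is included, so no 'a'..'{' trick is needed.
    for c in 'a'..='z' {
        println!("{}", c);
    }
    println!("{}", lowercase_letters());
}
```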

As others have demonstrated, for ASCII characters one can use byte literals, and more generally, one can cast characters to u32s:

for i in ('à' as u32)..('æ' as u32) + 1 {
    let c = std::char::from_u32(i).unwrap();
    println!("{}", c);
}

Which gives:

à
á
â
ã
ä
å
æ

NB. This approach isn't perfect: it will panic if the range crosses the surrogate range, 0xD800-0xDFFF, because from_u32 returns None for those values and the unwrap fails.
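One way to avoid that panic, sketched here as an assumption rather than as what any particular crate does, is to skip code points that are not valid chars instead of unwrapping. std::char::from_u32 returns None for surrogates, so filter_map drops them. The function name chars_between is hypothetical.

```rust
// Hypothetical helper: yields every valid `char` from `start` to `end`, inclusive.
// Code points in the surrogate range (0xD800-0xDFFF) make `from_u32` return `None`
// and are skipped rather than causing a panic.
fn chars_between(start: char, end: char) -> impl Iterator<Item = char> {
    let (lo, hi) = (start as u32, end as u32);
    (lo..=hi).filter_map(std::char::from_u32)
}

fn main() {
    for c in chars_between('à', 'æ') {
        println!("{}", c);
    }
}
```

The trade-off is that invalid code points disappear silently, which may or may not be what a caller wants.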

I just published a crate, char-iter, which handles the latter correctly and behaves like one would expect. Once added (via cargo), it can be used like:

extern crate char_iter;
// ...

for c in char_iter::new('a', 'z') {
    // ...
}
for c in char_iter::new('à', 'æ') {
    // ...
}


Answer 2:

Well, you need Step to denote that the structure can be stepped over in both directions.

  /// Objects that can be stepped over in both directions.
  ///
  /// The `steps_between` function provides a way to efficiently compare
  /// two `Step` objects.
  pub trait Step: PartialOrd 

One, on the other hand, supplies the unit value that next() adds to the current start in order to advance the range while returning the previous value:

  #[inline]
  fn next(&mut self) -> Option<A> {
      if self.start < self.end {
          let mut n = &self.start + &A::one();
          mem::swap(&mut n, &mut self.start);
          Some(n)
      } else {
          None
      }
  }

Source
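To make the role of One concrete, here is a self-contained sketch of a range iterator written in the same style. The One trait and the MyRange type below are stand-ins defined locally for illustration, not the standard library's definitions, and the Add bound is simplified to work on Copy values instead of references.

```rust
use std::ops::Add;

// Local stand-in for the old `core::num::One`: a type that has a unit value.
trait One {
    fn one() -> Self;
}

impl One for u32 {
    fn one() -> Self {
        1
    }
}

// Local stand-in for `std::ops::Range`.
struct MyRange<A> {
    start: A,
    end: A,
}

impl<A> Iterator for MyRange<A>
where
    A: One + PartialOrd + Copy + Add<Output = A>,
{
    type Item = A;

    fn next(&mut self) -> Option<A> {
        if self.start < self.end {
            let current = self.start;
            // Advancing needs some notion of "one step"; that is what `One` supplies.
            self.start = current + A::one();
            Some(current)
        } else {
            None
        }
    }
}

// Hypothetical helper used to exercise the iterator.
fn collect_range(start: u32, end: u32) -> Vec<u32> {
    MyRange { start, end }.collect()
}

fn main() {
    println!("{:?}", collect_range(3, 7));
}
```

A type like char has no sensible one() and no Add, so it cannot satisfy bounds of this shape, which is what the compiler error is reporting.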


What you could do is make your range a u8 and then convert it back to char, like this:

fn main() {
    for c in b'a'..b'z' + 1 {
        println!(" {:?}", c as char);
    }
}

Note: ranges are exclusive, so ('a'..'z') is actually ('a', 'b', ... 'y'), or in math notation [a,z) ;).

That's why I use b'z'+1 as the upper bound instead of b'z'.

Note: u8 is valid here only because the characters are ASCII.



Tags: rust