How do I replace a single character in a string in

Posted 2020-04-21 09:08

Question:

This isn't the exact use case, but it is basically what I am trying to do:

let mut username = "John_Smith";
println!("original username: {}",username);
username.set_char_at(4,'.'); // <------------- The part I don't know how to do
println!("new username: {}",username);

I can't figure out how to do this in constant time and using no additional space. I know I could use "replace" but replace is O(n). I could make a vector of the characters but that would require additional space.

I think you could create another variable that is a pointer using something like as_mut_slice, but this is deemed unsafe. Is there a safe way to replace a character in a string in constant time and space?

Answer 1:

If you only want to handle ASCII, there is a separate type for that:

use std::ascii::{AsciiCast, OwnedAsciiCast};

fn main() {
    let mut ascii = "ascii string".to_string().into_ascii();
    *ascii.get_mut(6) = 'S'.to_ascii();
    println!("result = {}", ascii);
}

There are some missing pieces (like into_ascii for &str), but it does what you want. The current implementation of to_ascii/into_ascii fails if the input string is not valid ASCII. There is also to_ascii_opt (the old naming for methods that might fail), but it will probably be renamed to to_ascii in the future (and the failing method removed or renamed).
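
The AsciiCast API used above predates Rust 1.0 and is no longer part of the standard library. As a rough sketch of the same ASCII-only idea in current Rust, the hypothetical helper set_ascii_at below (the name is an assumption, not a standard API) reuses the string's own buffer through String::into_bytes and String::from_utf8. No second buffer is allocated, but note that from_utf8 re-validates the bytes, which is O(n):

```rust
/// Hypothetical helper (not a std API): overwrite the byte at `index`
/// with the ASCII character `new`.
/// Returns None if `new` is not ASCII, `index` is out of bounds,
/// or the result would no longer be valid UTF-8.
fn set_ascii_at(s: String, index: usize, new: char) -> Option<String> {
    if !new.is_ascii() {
        return None;
    }
    // into_bytes hands over the String's buffer without copying it.
    let mut bytes = s.into_bytes();
    let slot = bytes.get_mut(index)?;
    *slot = new as u8;
    // Validation is O(n); it fails if we overwrote part of a
    // multi-byte character.
    String::from_utf8(bytes).ok()
}

fn main() {
    let result = set_ascii_at("ascii string".to_string(), 6, 'S');
    println!("result = {:?}", result);
}
```

This trades the constant-time goal for safety: the write itself is O(1), the validation is not.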



Answer 2:

In general? For any pair of characters? It's impossible.


A string is not an array. It may be implemented as an array, in some limited contexts.

Rust supports Unicode, which brings some challenges:

  • a Unicode code point is an integer between 0 and 0x10FFFF (it fits in 21 bits)
  • a grapheme may be composed of multiple Unicode code points

In order to represent this, a Rust string is (for now) a UTF-8 byte sequence:

  • a single Unicode code point might be represented by 1 to 4 bytes
  • a grapheme might be represented by 1 or more bytes (no upper limit)
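
To make the sizes above concrete, here is a small sketch using only standard library calls (char::len_utf8, str::chars, str::len). The sample characters are arbitrary choices for illustration:

```rust
fn main() {
    // Code points that need 1, 2, 3 and 4 bytes in UTF-8.
    for c in ['a', 'é', '€', '🦀'] {
        println!("{:?} takes {} byte(s)", c, c.len_utf8());
    }

    // "e" followed by a combining acute accent (U+0301):
    // one grapheme on screen, but two code points.
    let combined = "e\u{301}";
    println!(
        "{} code points, {} bytes",
        combined.chars().count(),
        combined.len()
    );
}
```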

and therefore, the very notion of "replacing character i" brings a few challenges:

  • the byte position of character i lies somewhere between index i and the end of the string; finding exactly where requires reading the string from the beginning, which is O(N)
  • switching the i-th character in place for another requires that both characters take up exactly the same number of bytes

In general? It's impossible.
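
Both challenges can be seen in a short sketch. The helper locate_char is hypothetical (the name is made up for this example); it uses str::char_indices to walk the string from the start and report where the i-th code point begins and how many bytes it occupies:

```rust
/// Hypothetical helper: byte offset and UTF-8 length of the `i`-th
/// code point, or None if the string has fewer code points.
fn locate_char(s: &str, i: usize) -> Option<(usize, usize)> {
    // char_indices decodes from the beginning, so this is O(N).
    s.char_indices()
        .nth(i)
        .map(|(offset, c)| (offset, c.len_utf8()))
}

fn main() {
    let s = "Jöhn_Smith";
    // 'ö' takes 2 bytes, so character 4 ('_') starts at byte 5, not 4.
    println!("{:?}", locate_char(s, 4));

    // An in-place swap is only possible when the lengths match.
    let (_, old_len) = locate_char(s, 4).unwrap();
    println!("'.' fits in place: {}", '.'.len_utf8() == old_len);
    println!("'é' fits in place: {}", 'é'.len_utf8() == old_len);
}
```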

In the particular and very specific case where the byte index is known and the old and new characters are known to have encodings of the same length, it is doable by directly modifying the byte sequence returned by as_mut_bytes (as_bytes_mut in current Rust), which is duly marked unsafe since you may inadvertently corrupt the string (remember, this byte sequence must be valid UTF-8).
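
A minimal sketch of that special case, assuming current Rust where the method is str::as_bytes_mut. The helper name replace_ascii_byte is hypothetical. It restricts itself to swapping one ASCII byte for another, which is the simplest situation in which the two encodings are guaranteed to have the same length:

```rust
/// Hypothetical helper: overwrite one ASCII byte with another, in place.
/// Returns false (and changes nothing) if the index is out of bounds or
/// either the old or the new byte is not ASCII.
fn replace_ascii_byte(s: &mut str, byte_index: usize, new: u8) -> bool {
    let old_is_ascii = s
        .as_bytes()
        .get(byte_index)
        .map_or(false, |b| b.is_ascii());
    if !(old_is_ascii && new.is_ascii()) {
        return false;
    }
    // SAFETY: both the old and the new byte are ASCII, i.e. complete
    // one-byte UTF-8 sequences, so the string stays valid UTF-8.
    unsafe {
        s.as_bytes_mut()[byte_index] = new;
    }
    true
}

fn main() {
    let mut username = String::from("John_Smith");
    replace_ascii_byte(&mut username, 4, b'.');
    println!("new username: {}", username);
}
```

The checks keep the unsafe block sound; dropping them would make it the caller's job to guarantee that the bytes remain valid UTF-8.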



Answer 3:

As of Rust 1.27 you can now use String::replace_range:

let mut username = String::from("John_Smith");
println!("original username: {}", username);  // John_Smith
username.replace_range(4..5, ".");
println!("new username: {}", username);       // John.Smith

replace_range won't work with &mut str. If the size of the range and the size of the replacement string aren't the same, it has to be able to resize the underlying String, so &mut String is required. But in the case you ask about (replacing a single-byte character with another single-byte character) its memory usage and time complexity are both O(1).
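
One caveat: replace_range takes byte offsets and panics if the range does not fall on character boundaries. The following is a rough sketch of a set_char_at helper like the one imagined in the question (the function is hypothetical, not a std method). It looks up the byte range with char_indices first, so it is O(n) in general and may shift the tail of the string when the old and new characters differ in size:

```rust
/// Hypothetical helper mirroring the `set_char_at` from the question.
/// Replaces the code point at `char_index`; returns false if the string
/// has fewer code points than that.
fn set_char_at(s: &mut String, char_index: usize, new: char) -> bool {
    // Find the byte offset of the target character: O(n).
    let found = s.char_indices().nth(char_index);
    if let Some((start, old)) = found {
        let mut buf = [0u8; 4];
        let replacement: &str = new.encode_utf8(&mut buf);
        // The range covers exactly the old character's bytes.
        s.replace_range(start..start + old.len_utf8(), replacement);
        true
    } else {
        false
    }
}

fn main() {
    let mut username = String::from("Jöhn_Smith");
    set_char_at(&mut username, 4, '.');
    println!("new username: {}", username);
}
```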

There is a similar method on Vec, Vec::splice. The primary difference between them is that splice returns an iterator that yields the removed items.
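
For comparison, a short sketch of Vec::splice on a vector of characters. The helper name swap_item is made up for this example:

```rust
/// Hypothetical helper: replace the item at `index` and return what was
/// removed. Panics if `index` is out of bounds, as `splice` itself does.
fn swap_item(items: &mut Vec<char>, index: usize, new: char) -> Vec<char> {
    // splice replaces the range and yields the removed items.
    let removed: Vec<char> = items.splice(index..index + 1, [new]).collect();
    removed
}

fn main() {
    let mut letters: Vec<char> = "John_Smith".chars().collect();
    let removed = swap_item(&mut letters, 4, '.');
    let joined: String = letters.iter().collect();
    println!("{} (removed {:?})", joined, removed);
}
```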