Why does Rust have `String` and `str`? What are the differences between `String` and `str`? When does one use `String` instead of `str` and vice versa? Is one of them getting deprecated?
I have a C++ background and I found it very useful to think about `String` and `&str` in C++ terms:

- `String` is like a `std::string`; it owns the memory and does the dirty job of managing it.
- `&str` is like a `char*` (but a little more sophisticated); it points us to the beginning of a chunk in the same way you can get a pointer to the contents of `std::string`.

Are either of them going to disappear? I do not think so. They serve two purposes:

`String` keeps the buffer and is very practical to use. `&str` is lightweight and should be used to "look" into strings. You can search, split, and parse without needing to allocate new memory (operations that produce new text, such as `replace`, return a new `String`).

`&str` can look inside of a `String`, just as it can point to some string literal. Creating a `String` from a literal requires copying the literal into the memory managed by the `String`, whereas binding the literal to a `&str` lets you use the literal itself without a copy (read-only, though).
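A minimal sketch of the two cases described above. The literal `"hello rust"` and the function names are illustrative assumptions rather than part of the answer; the point is only that `.into()` produces an owned copy, while a `&str` binding borrows the literal in place.

```rust
/// Copies the literal into a freshly allocated buffer owned by the `String`.
fn owned_copy() -> String {
    let a: String = "hello rust".into();
    a
}

/// Borrows the literal itself; no allocation or copy takes place (read-only).
fn borrowed_literal() -> &'static str {
    let a: &str = "hello rust";
    a
}

/// A `&str` can also "look" into a `String`: this returns a view of the
/// first space-separated word without allocating.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

fn main() {
    let owned = owned_copy();
    let borrowed = borrowed_literal();
    assert_eq!(owned, borrowed);

    // Passing `&owned` works because `&String` coerces to `&str`.
    println!("{}", first_word(&owned));
}
```

Taking `&str` as the parameter type in `first_word` lets the same function accept both literals and borrowed `String`s.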
In easy words, `String` is a datatype stored on the heap (just like `Vec`), and you have access to that location.

`&str` is a slice type. That means it is just a reference to string data that already exists somewhere, for example inside a `String` on the heap or in a string literal.

`&str` doesn't do any allocation at runtime. So, for memory reasons, you can use `&str` over `String`. But keep in mind that when using `&str` you might have to deal with explicit lifetimes.

`String`
is the dynamic heap string type, like `Vec`: use it when you need to own or modify your string data.

`str` is an immutable[1] sequence of UTF-8 bytes of dynamic length somewhere in memory. Since the size is unknown, one can only handle it behind a pointer. This means that `str` most commonly[2] appears as `&str`: a reference to some UTF-8 data, normally called a "string slice" or just a "slice". A slice is just a view onto some data, and that data can be anywhere, e.g.

- in static storage: `"foo"` is a `&'static str`. The data is hardcoded into the executable and loaded into memory when the program runs.
- inside a heap-allocated `String`: `String` dereferences to a `&str` view of the `String`'s data.
- on the stack: e.g. a stack-allocated byte array can be created and then viewed as a `&str` (after checking that it is valid UTF-8).

In summary, use
`String` if you need owned string data (like passing strings to other threads, or building them at runtime), and use `&str` if you only need a view of a string.

This is identical to the relationship between a vector `Vec<T>` and a slice `&[T]`, and is similar to the relationship between by-value `T` and by-reference `&T` for general types.

[1] A
`str` is fixed length; you cannot write bytes beyond the end, or leave trailing invalid bytes. Since UTF-8 is a variable-width encoding, this effectively forces all `str`s to be immutable. In general, mutation requires writing more or fewer bytes than there were before (e.g. replacing an `a` (1 byte) with an `ä` (2+ bytes) would require making more room in the `str`).

[2] `str` can also appear behind other pointer types: dynamically sized types allow things like `Box<str>`, or `Rc<str>` for a sequence of reference-counted UTF-8 bytes. Even so, `str` doesn't quite fit into the DST scheme perfectly, since there is no fixed-size version of it.

They are actually completely different. First off, a
`str` is nothing but a type-level thing; it can only be reasoned about at the type level because it's a so-called dynamically-sized type (DST). The size the `str` takes up cannot be known at compile time and depends on runtime information; it cannot be stored in a variable because the compiler needs to know at compile time what the size of each variable is. A `str` is conceptually just a row of `u8` bytes with the guarantee that it forms valid UTF-8. How large is the row? No one knows until runtime, hence it can't be stored in a variable.

The interesting thing is that a
`&str` or any other pointer to a `str`, like `Box<str>`, does exist at runtime. This is a so-called "fat pointer"; it's a pointer with extra information (in this case the size of the thing it's pointing at), so it's twice as large. In fact, a `&str` is quite close to a `String` (but not to a `&String`). A `&str` is two words: one pointer to the first byte of a `str` and another number that describes how many bytes long the `str` is.

Contrary to what is said, a
`str` does not need to be immutable. If you can get a `&mut str` as an exclusive pointer to the `str`, you can mutate it, and all the safe functions that mutate it guarantee that the UTF-8 constraint is upheld. If that constraint were violated, we would have undefined behaviour, as the library assumes it is true and does not check for it.

So what is a
`String`? That's three words; two are the same as for `&str`, and the third is the capacity of the `str` buffer on the heap: how much it can hold before it is filled and has to reallocate. The buffer a `String` manages is always on the heap (a `str` in general is not necessarily on the heap). The `String` basically owns a `str`, as they say; it controls it and can resize and reallocate it when it sees fit. So a `String` is, as said, closer to a `&str` than to a `str`.

Another thing is a
`Box<str>`; this also owns a `str`, and its runtime representation is the same as a `&str`. Unlike the `&str` it owns the `str`, but it cannot resize it because it does not know its capacity, so basically a `Box<str>` can be seen as a fixed-length `String` that cannot be resized (you can always convert it into a `String` if you want to resize it).

A very similar relationship exists between
`[T]` and `Vec<T>`, except there is no UTF-8 constraint and it can hold any type whose size is not dynamic.

The use of
`str` on the type level is mostly to create generic abstractions with `&str`; it exists on the type level to be able to conveniently write traits. In theory `str` as a type didn't need to exist, only `&str`, but that would mean a lot of extra code would have to be written that can now be generic.

`&str` is super useful for having multiple different substrings of a `String` without having to copy; as said, a `String` owns the `str` on the heap it manages, and if you could only create a substring of a `String` with a new `String`, it would have to be copied, because everything in Rust can only have one single owner to deal with memory safety. So, for instance, you can take two slices of one string, and we then have two different substring
`str`s of the same string. The `String` being sliced is the one that owns the actual full `str` buffer on the heap, and the `&str` substrings are just fat pointers to that buffer on the heap.

A `String` is a growable buffer of UTF-8 bytes (not a vector of `char`) that you can access and modify; a `str` is normally handled as an immutable `&str`.
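The slicing and layout points made above can be sketched as follows. The sample text, the byte ranges, and the helper names `two_substrings` and `shout` are illustrative assumptions. The three-word size of `String` is what the current standard library happens to use, not a documented guarantee.

```rust
use std::mem::size_of;

/// Returns two overlapping views into the same string data, without copying.
/// The byte ranges must fall on UTF-8 character boundaries, or slicing panics.
fn two_substrings(string: &str) -> (&str, &str) {
    (&string[1..3], &string[2..4])
}

/// Mutates a string slice in place through an exclusive `&mut str`.
/// ASCII case changes never alter the byte length, so UTF-8 validity holds.
fn shout(s: &mut str) {
    s.make_ascii_uppercase();
}

fn main() {
    let mut string: String = "a string".to_string();

    let (substring1, substring2) = two_substrings(&string);
    assert_eq!(substring1, " s");
    assert_eq!(substring2, "st");
    // No copy was made: the slice points one byte into the owner's buffer.
    assert_eq!(substring1.as_ptr(), string[1..].as_ptr());

    // `&mut String` coerces to `&mut str`, which can be mutated in place.
    shout(&mut string);
    assert_eq!(string, "A STRING");

    // `&str` and `Box<str>` are fat pointers: address plus length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<str>>(), 2 * size_of::<usize>());
    // Assumption: pointer, length, and capacity in the current std layout.
    assert_eq!(size_of::<String>(), 3 * size_of::<usize>());
}
```

Because the substrings only borrow from `string`, the compiler rejects any attempt to mutate or drop `string` while they are still in use.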
`str`, usually used as `&str`, is a string slice, a reference to a UTF-8 byte array.

`String` is what used to be `~str`, a growable, owned UTF-8 byte array.
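As a rough illustration of "a reference to a UTF-8 byte array" versus "a growable, owned UTF-8 byte array", the sketch below views the same text from static storage, the heap, and the stack, then grows an owned `String`. The helper names `view_as_str` and `append` and the sample bytes are assumptions made for this example.

```rust
use std::str;

/// Views a byte slice as a `&str`, or returns `None` if it is not valid UTF-8.
fn view_as_str(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

/// Takes ownership of a `String`, appends to it, and hands it back.
fn append(mut owned: String, extra: &str) -> String {
    owned.push_str(extra); // may reallocate the heap buffer
    owned
}

fn main() {
    // Data in the executable's static storage.
    let literal: &'static str = "foo";

    // Data on the heap, owned by a `String`; `&owned` derefs to `&str`.
    let owned: String = String::from("foo");
    let heap_view: &str = &owned;

    // Data on the stack: a byte array viewed as a `&str`.
    let bytes: [u8; 3] = [b'f', b'o', b'o'];
    let stack_view: &str = view_as_str(&bytes).expect("bytes are valid UTF-8");

    assert_eq!(literal, heap_view);
    assert_eq!(heap_view, stack_view);

    // Only the owned `String` can grow.
    let longer = append(owned, "bar");
    assert_eq!(longer, "foobar");
}
```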