What does "stream did not contain valid UTF-8" mean?
I am creating a simple HTTP server. I need to read the requested image and send it to the browser. I am using this code:
    use std::fs::File;
    use std::io::Read;
    use std::path::Path;

    fn read_file(mut file_name: String) -> String {
        file_name = file_name.replace("/", "");
        if file_name.is_empty() {
            file_name = String::from("index.html");
        }
        let path = Path::new(&file_name);
        if !path.exists() {
            return String::from("Not Found!");
        }
        let mut file_content = String::new();
        let mut file = File::open(&file_name).expect("Unable to open file");
        let res = match file.read_to_string(&mut file_content) {
            Ok(content) => content,
            Err(why) => panic!("{}", why),
        };
        return file_content;
    }
This works if the requested file is text-based, but when I try to read an image, I get the following message:
    stream did not contain valid UTF-8
What does this mean and how can I fix it?
The documentation for String describes it as:
"A UTF-8 encoded, growable string."
Wikipedia's definition of UTF-8 will give you a lot of background on what it is. The short version is that computers use a unit called a byte to represent data. Unfortunately, these blobs of bytes have no intrinsic meaning; the meaning has to be supplied from outside. UTF-8 is one way of interpreting a sequence of bytes, and a file format such as JPEG is another.
UTF-8, like most text encodings, has specific rules about which byte sequences are valid and which are invalid. Whatever image you are trying to load contains a sequence of bytes that cannot be interpreted as a UTF-8 string; that is what the error message is telling you.
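To see the error in isolation, here is a minimal sketch that reads an in-memory byte slice the same way read_to_string reads a file. The helper name read_as_text and the sample bytes are assumptions made for illustration; the four bytes are the start of a typical JPEG (JFIF) file, and 0xFF can never appear in valid UTF-8.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads a byte slice as text, the same way
/// `File::read_to_string` would read the contents of a file.
fn read_as_text(mut bytes: &[u8]) -> io::Result<String> {
    let mut text = String::new();
    // `&[u8]` implements `Read`, so the same UTF-8 validation applies.
    bytes.read_to_string(&mut text)?;
    Ok(text)
}

fn main() {
    // Plain ASCII is valid UTF-8, so this read succeeds.
    let html = b"<h1>Hello</h1>";
    // Assumed sample data: the first bytes of a typical JPEG (JFIF) file.
    let jpeg_header: [u8; 4] = [0xFF, 0xD8, 0xFF, 0xE0];

    println!("{:?}", read_as_text(html));
    match read_as_text(&jpeg_header) {
        Ok(text) => println!("text: {}", text),
        // The error kind here is `io::ErrorKind::InvalidData`.
        Err(why) => println!("error: {}", why),
    }
}
```

The second call fails for the same reason the image file does: the bytes are fine as data, they just are not text.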
To fix this, you shouldn't use a String to store arbitrary collections of bytes. In Rust, those are better represented by a Vec<u8>:
    use std::fs::File;
    use std::io::Read;
    use std::path::Path;

    fn read_file(mut file_name: String) -> Vec<u8> {
        file_name = file_name.replace("/", "");
        if file_name.is_empty() {
            file_name = String::from("index.html");
        }
        let path = Path::new(&file_name);
        if !path.exists() {
            // A String converts into its underlying UTF-8 bytes.
            return String::from("Not Found!").into();
        }
        let mut file_content = Vec::new();
        let mut file = File::open(&file_name).expect("Unable to open file");
        // read_to_end copies the raw bytes without any UTF-8 validation.
        file.read_to_end(&mut file_content).expect("Unable to read");
        file_content
    }
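Since the goal is to send the file to the browser, the bytes can then be written to the connection as-is. The following is only a sketch under assumptions: write_response, its parameters, and the hard-coded 200 status are hypothetical and not part of the code in the question. It accepts any Write implementor, so a TcpStream would work in the server and a Vec<u8> works for trying it out.

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes a minimal HTTP response whose body is raw bytes.
/// The status is always 200 here; a real server would pick it per request.
fn write_response<W: Write>(out: &mut W, content_type: &str, body: &[u8]) -> io::Result<()> {
    // The header section is text, so formatting it with `write!` is fine.
    write!(
        out,
        "HTTP/1.1 200 OK\r\nContent-Type: {}\r\nContent-Length: {}\r\n\r\n",
        content_type,
        body.len()
    )?;
    // The body is written byte for byte, with no UTF-8 conversion.
    out.write_all(body)?;
    out.flush()
}

fn main() -> io::Result<()> {
    // Assumed sample data standing in for the result of `read_file`.
    let body: Vec<u8> = vec![0xFF, 0xD8, 0xFF, 0xE0];
    // A Vec<u8> stands in for the network connection.
    let mut out: Vec<u8> = Vec::new();
    write_response(&mut out, "image/jpeg", &body)?;
    println!("wrote {} bytes", out.len());
    Ok(())
}
```

Keeping the body as &[u8] from the file to the socket means an image never has to pass through a String at any point.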
To evangelize a little, this is a great example of why Rust is a nice language. Because there is a type that represents "a sequence of bytes that is guaranteed to be a valid UTF-8 string", we can write safer programs, since we know this invariant always holds. We don't have to keep checking throughout the program to "make sure it is still a string."
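As a small illustration of that invariant, the sketch below validates the bytes once, at the boundary between Vec<u8> and String. The function names count_chars and describe are hypothetical; String::from_utf8 and FromUtf8Error::into_bytes are the standard library calls doing the work.

```rust
/// Hypothetical helper: takes `&str`, so it can rely on valid UTF-8
/// without performing any check of its own.
fn count_chars(text: &str) -> usize {
    text.chars().count()
}

/// Hypothetical helper: the single place where bytes are validated as text.
fn describe(bytes: Vec<u8>) -> String {
    match String::from_utf8(bytes) {
        // From here on the type system guarantees the data is valid UTF-8.
        Ok(text) => format!("text with {} characters", count_chars(&text)),
        // On failure the original bytes are handed back untouched.
        Err(err) => format!("binary data, {} bytes", err.into_bytes().len()),
    }
}

fn main() {
    println!("{}", describe("héllo".as_bytes().to_vec()));
    println!("{}", describe(vec![0xFF, 0xD8, 0xFF, 0xE0]));
}
```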