How to Split a String in Rust

This tutorial goes through how to split a string in Rust using split(), split_whitespace(), lines(), the regex crate, and multiple delimiters.

Split String in Rust Using split()

We can use the split() method to split a string on a given separator. The split() method returns an iterator over string slices (&str), which we can loop over or pass to collect() to build a vector. Let’s look at an example where we split a string on a comma followed by a space (", ") and then loop over the iterator using a for loop.

fn main() {
    let particles = "electron, muon, positron, neutrino, higgs boson, gluon".split(", ");
    for p in particles {
        println!("{}", p);
    }
}

Let’s run the code to see the result:

electron
muon
positron
neutrino
higgs boson
gluon

Let’s look at an example of splitting a string on the same ", " separator; this time we call collect() to gather the pieces into a Vec<&str>.

fn main() {
    let p: Vec<&str> = "electron, muon, positron, neutrino, higgs boson, gluon".split(", ").collect();
    println!("{:?}", p);
}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "higgs boson", "gluon"]
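
The examples above rely on every comma being followed by exactly one space. As a minimal sketch for less regular input, we can split on the comma alone and trim each piece. The helper name split_fields is hypothetical and only used for illustration.

```rust
/// Hypothetical helper: split on commas and trim the whitespace
/// around each piece, so "a,b" and "a ,  b" give the same result.
fn split_fields(input: &str) -> Vec<&str> {
    input.split(',').map(|field| field.trim()).collect()
}

fn main() {
    let fields = split_fields("electron,muon ,  positron, neutrino");
    // Prints ["electron", "muon", "positron", "neutrino"]
    println!("{:?}", fields);
}
```

Note that split() keeps empty pieces, so an input such as "a,,b" still yields an empty string between the two commas.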

Split String in Rust Using split_whitespace()

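If the pieces of a string are separated only by whitespace, we can use the split_whitespace() method. Like split(), it returns an iterator, but it splits on any Unicode whitespace (spaces, tabs, newlines), treats a run of consecutive whitespace characters as a single separator, and ignores leading and trailing whitespace. Let’s look at an example where we split a string on whitespace and then loop over the strings in the iterator and print each one.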

fn main() {
    let particles = "electron muon positron neutrino gluon".split_whitespace();
    for p in particles {
        println!("{}", p);
    }
}

Let’s run the code to see the result:

electron
muon
positron
neutrino
gluon

Let’s look at an example where we split a string on whitespace and then call the collect() method to get a vector, which we print to the console.

fn main() {

    let p: Vec<&str> = "electron muon positron neutrino gluon".split_whitespace().collect();

    println!("{:?}", p);
}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]
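
To see how split_whitespace() differs from split(' '), here is a small sketch that runs both on a string containing a double space, a tab, and a trailing newline. The function names are hypothetical wrappers added so the two results are easy to compare.

```rust
/// Hypothetical wrapper: split on the single space character only.
fn split_on_space(text: &str) -> Vec<&str> {
    text.split(' ').collect()
}

/// Hypothetical wrapper: split on any run of Unicode whitespace.
fn split_on_whitespace(text: &str) -> Vec<&str> {
    text.split_whitespace().collect()
}

fn main() {
    let text = "electron  muon\tpositron\n";
    // Prints ["electron", "", "muon\tpositron\n"]
    println!("{:?}", split_on_space(text));
    // Prints ["electron", "muon", "positron"]
    println!("{:?}", split_on_whitespace(text));
}
```

With split(' '), the double space produces an empty string and the tab and newline stay inside the last piece; split_whitespace() returns only the words.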

Split String in Rust Using lines()

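If we have a string where each line ends with a newline (\n) or a carriage return followed by a line feed (\r\n), we can split the string on these line endings using lines(). The line endings are not included in the returned string slices, and the final line does not need a line ending.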

Let’s look at the use of lines() to split a string and collect it into a vector:

fn main() {

    let p: Vec<&str> = "electron\nmuon\npositron\nneutrino\ngluon".lines().collect();

    println!("{:?}", p);

}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]
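
As a small sketch of why lines() can be preferable to split('\n'), the following compares the two on input with mixed line endings and a trailing newline. The function names are hypothetical and exist only for the comparison.

```rust
/// Hypothetical wrapper around lines().
fn split_with_lines(text: &str) -> Vec<&str> {
    text.lines().collect()
}

/// Hypothetical wrapper around split('\n'), for comparison.
fn split_on_newline(text: &str) -> Vec<&str> {
    text.split('\n').collect()
}

fn main() {
    let text = "electron\r\nmuon\npositron\n";
    // Prints ["electron", "muon", "positron"]
    println!("{:?}", split_with_lines(text));
    // Prints ["electron\r", "muon", "positron", ""]
    println!("{:?}", split_on_newline(text));
}
```

lines() strips both \n and \r\n and does not produce an empty final piece for the trailing newline, whereas split('\n') leaves the \r in place and ends with an empty string.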

Split String in Rust Using Regex

We can use the regex crate to split strings in Rust. The crate provides a library for parsing, compiling, and executing regular expressions. To use regex, we must add it to the dependencies in the project’s Cargo.toml.

[dependencies]
regex = "1"

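Let’s look at how to use regex to split a string on whitespace with \s. Note that \s matches a single whitespace character, so two consecutive spaces would produce an empty string between them; the pattern \s+ matches a whole run of whitespace instead.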

use regex::Regex;

fn main() {

    let s: &str = "electron muon positron neutrino gluon";

    let p: Vec<&str> = Regex::new(r"\s").unwrap().split(s).collect();

    println!("{:?}", p);

}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]

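We can also use regex to split the string when the separators are spaces, commas, or full stops. The character class [ ,.] matches any one of these characters, and the + quantifier treats a run of them, such as ", ", as a single separator. Let’s look at the code to do this: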

use regex::Regex;

fn main() {

    let s: &str = "electron, muon, positron, neutrino, gluon";

    // No capture group is needed for splitting; the character class is enough.
    let p: Vec<&str> = Regex::new(r"[ ,.]+").unwrap().split(s).collect();

    println!("{:?}", p);

}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]
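
If adding a dependency is not desirable, a similar result can be sketched with the standard library alone by passing a closure to split() and filtering out empty pieces. This is an alternative to the regex version above, not a drop-in replacement, and the function name split_words is hypothetical.

```rust
/// Hypothetical helper: split on spaces, commas, and full stops,
/// discarding the empty pieces left between adjacent separators.
fn split_words(input: &str) -> Vec<&str> {
    input
        .split(|c: char| c == ' ' || c == ',' || c == '.')
        .filter(|piece| !piece.is_empty())
        .collect()
}

fn main() {
    let words = split_words("electron, muon, positron. neutrino");
    // Prints ["electron", "muon", "positron", "neutrino"]
    println!("{:?}", words);
}
```

This sketch drops every empty piece, including any at the start or end of the input, which may differ from what the regex version returns for strings that begin or end with a separator.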

Split String on Multiple Delimiters in Rust

We can also use the split() method to split on multiple delimiters by passing it a closure. The closure receives each character and returns true if that character is a separator; here we combine the comparisons with the logical OR operator ||. Let’s look at an example:

fn main() {

    let p: Vec<&str> = "electron:muon;positron neutrino-gluon"
        .split(|c| c == ':' || c == ';' || c == ' ' || c == '-')
        .collect();

    println!("{:?}", p);

}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]

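We can also pass split() a slice of separator characters, in which case the string is split wherever any of them occurs. Calling as_ref() on the array converts it to a character slice (&[char]). Let’s look at an example: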

fn main() {

    let p: Vec<&str> = "electron:muon;positron neutrino-gluon".split([':', ';', ' ', '-'].as_ref()).collect();

    println!("{:?}", p);

}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]
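
Both approaches return an empty string wherever two delimiters sit next to each other. As a minimal sketch, assuming empty pieces are not wanted, we can wrap the slice-based split in a small function that filters them out. The name split_any is hypothetical.

```rust
/// Hypothetical helper: split on any of the given delimiter characters
/// and drop the empty pieces produced by adjacent delimiters.
fn split_any<'a>(input: &'a str, delimiters: &[char]) -> Vec<&'a str> {
    input
        .split(delimiters)
        .filter(|piece| !piece.is_empty())
        .collect()
}

fn main() {
    let p = split_any("electron:;muon - positron", &[':', ';', ' ', '-']);
    // Prints ["electron", "muon", "positron"]
    println!("{:?}", p);
}
```

Taking the delimiters as a parameter keeps the function reusable for different separator sets.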

Summary

Congratulations on reading to the end of this tutorial! We have gone through how to split a string with several methods and for different types of separators.

Have fun and happy researching!


Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.
