How to Split a String in Rust

This tutorial goes through several ways to split a string in Rust: split(), split_whitespace(), lines(), the regex crate, and splitting on multiple delimiters.

Split String in Rust Using split()

We can use the split() method to split a string on a given separator. The split() method returns an iterator over the substrings, which we can loop over or collect into a vector by calling collect(). Let’s look at an example where we split a string on a comma followed by a space, ", ", and then loop over the iterator using a for loop.

fn main() {
    let particles = "electron, muon, positron, neutrino, higgs boson, gluon".split(", ");
    for p in particles {
        println!("{}", p);
    }
}

Let’s run the code to see the result:

electron
muon
positron
neutrino
higgs boson
gluon
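
Because split() returns a lazy iterator, we can pull out only the pieces we need without building a vector first. The following is a minimal sketch of that idea; nth_field is a hypothetical helper name used for illustration, not part of the standard library.

```rust
/// Returns the n-th field (0-based) of `s` when split on `sep`,
/// or None if there are fewer fields.
/// `nth_field` is a hypothetical helper for illustration.
fn nth_field<'a>(s: &'a str, sep: &str, n: usize) -> Option<&'a str> {
    // split() is lazy, so nth() stops once the requested field is reached.
    s.split(sep).nth(n)
}

fn main() {
    let s = "electron, muon, positron, neutrino, higgs boson, gluon";
    println!("{:?}", nth_field(s, ", ", 2));
    println!("{:?}", nth_field(s, ", ", 10));
    // Counting the pieces does not need an intermediate vector either.
    println!("{}", s.split(", ").count());
}
```

With the string above, the third field is "positron", asking for a field past the end gives None, and the count is 6.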

Let’s look at an example where we split a string on the same ", " separator and then call collect() to get a vector. The Vec<&str> type annotation tells collect() which collection to build.

fn main() {
    let p: Vec<&str> = "electron, muon, positron, neutrino, higgs boson, gluon".split(", ").collect();
    println!("{:?}", p);
}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "higgs boson", "gluon"]
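
The examples above rely on every comma being followed by exactly one space. When the spacing is inconsistent, one option is to split on the bare comma and trim each piece. The sketch below shows this, and also that consecutive separators produce empty strings. The helper name split_trimmed is hypothetical, and it returns owned Strings as an illustrative choice.

```rust
/// Splits on a bare comma, trims surrounding whitespace from each piece,
/// and returns owned Strings. `split_trimmed` is a hypothetical helper.
fn split_trimmed(s: &str) -> Vec<String> {
    s.split(',')
        .map(|piece| piece.trim().to_string())
        .collect()
}

fn main() {
    // The spacing after the commas varies, so splitting on ", " would not work here.
    println!("{:?}", split_trimmed("electron,muon,  positron , gluon"));

    // Two separators in a row yield an empty string between them.
    let raw: Vec<&str> = "electron,,muon".split(',').collect();
    println!("{:?}", raw);
}
```

The first call yields ["electron", "muon", "positron", "gluon"], and the second vector is ["electron", "", "muon"].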

Split String in Rust Using split_whitespace()

If the separator is whitespace, we can use the split_whitespace() method. It takes no arguments and, like split(), returns an iterator. Unlike split(' '), it splits on any run of Unicode whitespace (spaces, tabs, newlines) and never yields empty strings. Let’s look at an example where we split a string on whitespace, loop over the iterator, and print each piece.

fn main() {
    let particles = "electron muon positron neutrino gluon".split_whitespace();
    for p in particles {
        println!("{}", p);
    }
}

Let’s run the code to see the result:

electron
muon
positron
neutrino
gluon

Let’s look at an example where we split a string using the whitespace separator, and we will then call the collect() method to get a vector and print it to the console.

fn main() {
    let p: Vec<&str> = "electron muon positron neutrino gluon".split_whitespace().collect();
    println!("{:?}", p);
}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]
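
To make the difference between split(' ') and split_whitespace() concrete, here is a small sketch that applies both to a string containing a double space, a tab, a newline, and a trailing space. The function names by_space and by_whitespace are hypothetical, chosen for this illustration.

```rust
/// Splits on a single space character only (hypothetical helper).
fn by_space(s: &str) -> Vec<&str> {
    s.split(' ').collect()
}

/// Splits on any run of Unicode whitespace (hypothetical helper).
fn by_whitespace(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

fn main() {
    let s = "electron  muon\tpositron\n gluon ";
    // split(' ') keeps the tab and newline inside a piece and yields empty strings.
    println!("{:?}", by_space(s));
    // split_whitespace() treats every run of whitespace as one separator.
    println!("{:?}", by_whitespace(s));
}
```

Here split(' ') gives ["electron", "", "muon\tpositron\n", "gluon", ""], while split_whitespace() gives ["electron", "muon", "positron", "gluon"].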

Split String in Rust Using lines()

If the lines in a string end with a newline \n or a carriage return followed by a line feed \r\n, we can split the string at these line endings using lines(). The line endings are not included in the returned pieces.

Let’s look at the use of lines() to split a string and collect it into a vector:

fn main() {
    let p: Vec<&str> = "electron\nmuon\npositron\nneutrino\ngluon".lines().collect();
    println!("{:?}", p);
}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]
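
It can help to see how lines() differs from split('\n') when the input mixes line endings and ends with a newline. The sketch below compares the two; with_lines and with_split_newline are hypothetical helper names.

```rust
/// Splits with lines(), which handles both \n and \r\n (hypothetical helper).
fn with_lines(s: &str) -> Vec<&str> {
    s.lines().collect()
}

/// Splits on the \n character only (hypothetical helper).
fn with_split_newline(s: &str) -> Vec<&str> {
    s.split('\n').collect()
}

fn main() {
    let s = "electron\r\nmuon\npositron\n";
    // lines() strips \r\n and \n and does not add an empty piece for the final newline.
    println!("{:?}", with_lines(s));
    // split('\n') leaves the \r in place and yields a trailing empty string.
    println!("{:?}", with_split_newline(s));
}
```

For this input, lines() gives ["electron", "muon", "positron"], whereas split('\n') gives ["electron\r", "muon", "positron", ""].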

Split String in Rust Using Regex

We can use the regex crate to split strings in Rust. The regex crate provides a library for parsing, compiling, and executing regular expressions. To use regex, we must add it to the dependencies in our project’s Cargo.toml.

[dependencies]
regex = "1"

Let’s look at how to use regex to split a string on whitespace. The pattern \s matches a single whitespace character.

use regex::Regex;

fn main() {
    let s: &str = "electron muon positron neutrino gluon";
    // Compile the pattern, then split the string wherever it matches.
    let p: Vec<&str> = Regex::new(r"\s").unwrap().split(s).collect();
    println!("{:?}", p);
}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]

We can also use regex to split on a run of separators. The character class [ ,.]+ matches one or more spaces, commas, or full stops, so the ", " between items counts as a single separator. Let’s look at the code to do this:

use regex::Regex;

fn main() {
    let s: &str = "electron, muon, positron, neutrino, gluon";
    // No capture group is needed; the character class alone is enough for split().
    let p: Vec<&str> = Regex::new(r"[ ,.]+").unwrap().split(s).collect();
    println!("{:?}", p);
}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]

Split String on Multiple Delimiters in Rust

We can use the split() method to split on multiple delimiters by passing a closure instead of a fixed separator. The closure receives each character and returns true if that character is a separator, so we can combine several comparisons with the OR operator ||. Let’s look at an example:

fn main() {
    let p: Vec<&str> = "electron:muon;positron neutrino-gluon"
        .split(|c| (c == ':') || (c == ';') || (c == ' ') || (c == '-'))
        .collect();
    println!("{:?}", p);
}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]

We can also pass split() a slice of separator characters. Calling as_ref() on the array converts it to a &[char] slice, and split() then splits on any character in that slice. Let’s look at an example:

fn main() {
    let p: Vec<&str> = "electron:muon;positron neutrino-gluon"
        .split([':', ';', ' ', '-'].as_ref())
        .collect();
    println!("{:?}", p);
}

Let’s run the code to see the result:

["electron", "muon", "positron", "neutrino", "gluon"]
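
When two delimiters appear next to each other, split() yields an empty string between them. One way to drop those, sketched below under the assumption that empty pieces are never wanted, is to filter the iterator before collecting. The helper names split_raw and split_nonempty are hypothetical.

```rust
/// Splits on ':', ';', ' ' or '-' and keeps every piece, including empty ones.
/// `split_raw` is a hypothetical helper.
fn split_raw(s: &str) -> Vec<&str> {
    s.split(|c: char| matches!(c, ':' | ';' | ' ' | '-')).collect()
}

/// Same split, but discards the empty pieces left by adjacent delimiters.
/// `split_nonempty` is a hypothetical helper.
fn split_nonempty(s: &str) -> Vec<&str> {
    s.split(|c: char| matches!(c, ':' | ';' | ' ' | '-'))
        .filter(|piece| !piece.is_empty())
        .collect()
}

fn main() {
    let s = "electron:; muon--positron";
    println!("{:?}", split_raw(s));
    println!("{:?}", split_nonempty(s));
}
```

For this input the unfiltered result is ["electron", "", "", "muon", "", "positron"], and the filtered result is ["electron", "muon", "positron"].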

Summary

Congratulations on reading to the end of this tutorial! We have gone through how to split a string with several methods and for different types of separators.

Have fun and happy researching!

Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research.