In this guide, we’ll explore different methods to convert strings to upper and lower case in C++. We’ll cover both traditional approaches and modern C++ techniques, helping you choose the most appropriate method for your needs.
Basic Character Conversion
The simplest way to convert string case is using the C-style functions toupper() and tolower() from the cctype library.
#include <iostream>
#include <string>
#include <cctype>

int main() {
    std::string text = "Hello, World!"; // Original text

    // Convert to upper case
    for (char &c : text) { // Iterate through each character
        // Cast to unsigned char first: passing a negative char to toupper is undefined behavior
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    std::cout << "Upper case: " << text << '\n';

    // Convert to lower case
    for (char &c : text) { // Iterate through each character
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    std::cout << "Lower case: " << text << '\n';

    return 0;
}
Upper case: HELLO, WORLD!
Lower case: hello, world!
Using STL Algorithms
A more modern approach uses the STL algorithm transform along with toupper or tolower.
#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>

int main() {
    std::string text = "Hello, World!"; // Original text

    // Create a copy for upper case
    std::string upper = text; // Make a copy of original text
    std::transform(upper.begin(), // Transform each character
                   upper.end(),   // in the string
                   upper.begin(), // writing the result back in place
                   // Taking unsigned char avoids undefined behavior for negative char values
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

    // Create a copy for lower case
    std::string lower = text; // Make a copy for lower case
    std::transform(lower.begin(), // Transform each character
                   lower.end(),   // in the string
                   lower.begin(), // writing the result back in place
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Print results
    std::cout << "Original: " << text << '\n'
              << "Upper: " << upper << '\n'
              << "Lower: " << lower << '\n';

    return 0;
}
Original: Hello, World!
Upper: HELLO, WORLD!
Lower: hello, world!
Custom Conversion Functions
Creating wrapper functions can make case conversion more convenient and reusable.
#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>

// Function to convert string to upper case
std::string toUpper(std::string str) { // Takes string by value
    std::transform(str.begin(), // Transform entire string
                   str.end(),   // from beginning to end
                   str.begin(), // store result from beginning
                   // Taking unsigned char avoids undefined behavior for negative char values
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return str; // Return modified copy
}

// Function to convert string to lower case
std::string toLower(std::string str) { // Takes string by value
    std::transform(str.begin(), // Transform entire string
                   str.end(),   // from beginning to end
                   str.begin(), // store result from beginning
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return str; // Return modified copy
}

int main() {
    std::string text = "Hello, World!";

    // Using our custom functions
    std::cout << "Original: " << text << '\n'
              << "Upper: " << toUpper(text) << '\n'
              << "Lower: " << toLower(text) << '\n'
              << "Original remains: " << text << '\n';

    return 0;
}
Original: Hello, World!
Upper: HELLO, WORLD!
Lower: hello, world!
Original remains: Hello, World!
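One place such wrappers pay off is case-insensitive comparison. The sketch below uses two hypothetical helpers, toLowerCopy and equalsIgnoreCase (neither is part of the standard library), and assumes ASCII or another single-byte encoding under the default "C" locale.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: lower-case copy of a string (single-byte encodings only).
std::string toLowerCopy(std::string str) {
    std::transform(str.begin(), str.end(), str.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return str;
}

// Hypothetical helper: compare two strings while ignoring letter case.
bool equalsIgnoreCase(const std::string& a, const std::string& b) {
    return toLowerCopy(a) == toLowerCopy(b);
}
```

With these in place, equalsIgnoreCase("Hello, World!", "hELLO, wORLD!") returns true. Lower-casing both sides creates two temporary copies, which is fine for short strings but may be worth avoiding in hot paths.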
Locale-Aware Conversion
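For characters outside ASCII in single-byte encodings, use the locale-aware overloads from the locale header. These still convert one char at a time, so in a UTF-8 string a multibyte character such as 'ü' is left unchanged.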
#include <iostream>
#include <string>
#include <algorithm>
#include <locale>

int main() {
    std::string text = "Hello, München!"; // Text with umlaut
    // Create locale object; throws std::runtime_error if this locale is not installed
    std::locale loc("en_US.UTF-8");
    std::string upper = text; // Copy for upper case
    std::transform(upper.begin(), // Transform string
                   upper.end(),   // from start to end
                   upper.begin(), // store from beginning
                   [&loc](char c) { // Lambda for conversion
                       return std::toupper(c, loc); // Converts one char (one byte) at a time
                   });
    std::cout << "Upper: " << upper << '\n';
    return 0;
}
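The std::locale constructor throws std::runtime_error when the named locale is not installed, so the program above can terminate on a system without en_US.UTF-8. A minimal sketch of a guard follows. The names localeOrClassic and toUpperWithLocale are illustrative, not standard functions, and falling back to the classic "C" locale is one possible policy, not the only reasonable one.

```cpp
#include <algorithm>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: try the named locale, fall back to the classic "C" locale.
std::locale localeOrClassic(const char* name) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Hypothetical helper: upper-case each char using the given locale's ctype facet.
std::string toUpperWithLocale(std::string str, const std::locale& loc) {
    std::transform(str.begin(), str.end(), str.begin(),
                   [&loc](char c) { return std::toupper(c, loc); });
    return str;
}
```

A call such as toUpperWithLocale(text, localeOrClassic("en_US.UTF-8")) then works whether or not the locale exists, although with the fallback only ASCII letters are converted.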
Key Challenges in Unicode Case Conversion
Important: Simple case conversion methods in C++ often fall short when dealing with Unicode text. Here's why:
- Standard C++ string operations are byte-oriented, not Unicode-aware.
- Simple case conversion fails with multi-character mappings like 'ß' (which should become 'SS').
- Context-sensitive characters like Greek 'Σ' require different lowercase forms ('σ' or 'ς') based on their position in the word.
- Locale-specific rules (e.g., Turkish 'I' → 'ı') aren't handled properly.
- UTF-8 strings can be corrupted by naive substring operations.
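Several of these points can be reproduced with the standard library alone. The sketch below assumes UTF-8 text and the default "C" locale, and writes 'ß' as its two UTF-8 bytes (0xC3 0x9F) so the result does not depend on the source file's encoding. The name byteWiseUpper is illustrative.

```cpp
#include <cctype>
#include <string>

// Illustrative helper: byte-by-byte upper-casing, as in the earlier examples.
std::string byteWiseUpper(std::string str) {
    for (char& c : str) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return str;
}

// "straße" in UTF-8. The literal is split so that 'e' is not parsed
// as part of the preceding \x9F hex escape.
const std::string strasse = "stra\xC3\x9F" "e";

// strasse.size() is 7, not 6, because 'ß' occupies two bytes.
// byteWiseUpper(strasse) leaves the two 'ß' bytes untouched rather than
// producing "STRASSE".
// strasse.substr(0, 5) ends in the middle of 'ß' and is no longer valid UTF-8.
```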
The ICU Library Solution
For proper Unicode case conversion, use the ICU (International Components for Unicode) library:
#include <unicode/unistr.h>
#include <unicode/ustream.h>
#include <unicode/locid.h>
#include <iostream>

int main() {
    // Greek text "Οδυσσέας" (Ulysses)
    const char* text = u8"Οδυσσέας"; // UTF-8 encoded string

    // Convert to ICU's Unicode string
    icu::UnicodeString ustr1(text, "UTF-8");
    icu::UnicodeString ustr2(text, "UTF-8");

    // Greek locale
    icu::Locale greekLocale("el");

    // Convert to upper case with Greek locale
    std::cout << ustr1.toUpper(greekLocale) << "\n";

    // Convert to lower case with Greek locale
    std::cout << ustr2.toLower(greekLocale) << "\n";

    return 0;
}
οδυσσέας
Why ICU is Better:
- Handles multi-character mappings (ß → SS)
- Supports context-sensitive conversions (Σ → σ/ς)
- Proper locale handling (Turkish I → ı)
- Full Unicode normalization support
- Reliable across all platforms
Installing ICU4C
To use the ICU library, you'll need to install ICU4C. Follow the instructions below for your platform:
Windows
- Download the latest ICU4C precompiled binaries for Windows from the ICU GitHub Releases.
- Extract the downloaded ZIP file to a directory, e.g., C:\icu.
- Add C:\icu\bin to your system's PATH environment variable.
- Link the library files (e.g., icuuc.lib, icuin.lib) in your project's build settings.
Linux
sudo apt update
sudo apt install libicu-dev
This installs the development headers and libraries required for building projects with ICU support on Debian-based systems (e.g., Ubuntu).
For Fedora/CentOS-based systems, use:
sudo dnf install libicu-devel
Or, if using yum:
sudo yum install libicu-devel
Compile on Linux
g++ -Wall -std=c++17 main.cpp -I/usr/include -L/usr/lib -licuuc -licuio
This command compiles the program with ICU headers and libraries located in the default system paths. Ensure you have libicu-dev installed.
Explanation of the flags:
- -Wall: Enables a broad set of common warnings to help identify potential issues.
- -std=c++17: Selects the C++17 standard, which recent ICU releases require.
- -I/usr/include: Specifies the path to ICU header files (default for package manager installations).
- -L/usr/lib: Specifies the path to ICU library files (default for package manager installations).
- -licuuc -licuio: Links the Unicode (icuuc) and I/O (icuio) libraries from ICU.
Custom ICU Installation on Linux
If you built ICU from source or installed it in a custom directory, update the include and library paths accordingly. For example:
g++ -Wall -std=c++17 main.cpp -I/path/to/icu/include -L/path/to/icu/lib -licuuc -licuio
Replace /path/to/icu with the actual installation directory of your ICU library, which you can find with:
find / -name "icu" 2>/dev/null
Verification
icuinfo
This command outputs information about the installed ICU version and its configuration. If it is not available, ensure libicu-dev or libicu-devel is correctly installed.
macOS
brew install icu4c
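After installation, pass the ICU include and library paths when compiling. The paths below match Homebrew's default prefix on Intel Macs; on Apple Silicon, Homebrew installs under /opt/homebrew rather than /usr/local: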
g++ -Wall -std=c++17 main.cpp -I/usr/local/opt/icu4c/include -L/usr/local/opt/icu4c/lib -licuuc -licuio
Verifying Installation
icuinfo
If installed correctly, this command will display detailed information about your ICU library installation.
Best Practices for String Case Conversion
When working with string case conversion in C++, consider these best practices:
- Use the ICU library for robust and reliable Unicode-aware transformations. This is essential when dealing with international text, locale-specific rules, or context-sensitive characters.
- Leverage STL's std::transform for clean, modern code when working with single-byte character sets.
- Create wrapper functions to simplify case conversion logic and make your code reusable.
- Prefer locale-aware methods (e.g., ICU's toLower and toUpper) over basic functions like toupper and tolower to handle special cases like Turkish 'I' → 'ı'.
- For small-scale projects or performance-critical scenarios, basic C-style functions may suffice, but ensure proper validation for input encoding.
- Always validate and normalize Unicode input to prevent inconsistencies in text processing.
- When working with UTF-8 encoded strings, avoid direct manipulation (e.g., substring operations) and instead use libraries like ICU to preserve string integrity.
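The input validation mentioned above can be as simple as checking that a string is pure ASCII before applying the byte-wise functions. A minimal sketch follows. The names isAscii and tryAsciiUpper are illustrative, and rejecting non-ASCII input is just one possible policy.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Illustrative check: true when every byte is in the 7-bit ASCII range.
bool isAscii(const std::string& str) {
    return std::all_of(str.begin(), str.end(),
                       [](unsigned char c) { return c < 0x80; });
}

// Illustrative wrapper: upper-case in place only when byte-wise conversion is safe.
// Returns false and leaves str untouched for non-ASCII input, so the caller
// can pass it to a Unicode-aware library such as ICU instead.
bool tryAsciiUpper(std::string& str) {
    if (!isAscii(str)) {
        return false;
    }
    std::transform(str.begin(), str.end(), str.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return true;
}
```

This keeps the fast path for plain ASCII data while making it explicit when a string needs the heavier Unicode treatment.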
Conclusion
In this tutorial, we've explored a variety of methods for string case conversion in C++. From basic techniques to advanced Unicode handling, here's a recap of what we've covered:
- Challenges with Unicode Case Conversion: Recognizing the limitations of standard C++ functions and the importance of locale and Unicode-aware methods.
- ICU Library Solution: Leveraging the ICU library for robust, locale-aware string transformations, with installation and usage guidance for Windows, Linux, and macOS.
- Basic Character Manipulation: Using toupper and tolower for simple, single-byte character conversions.
- STL Algorithms: Applying std::transform for clean, modern C++ code and functional-style transformations.
- Custom Functions: Writing reusable wrapper functions to streamline case conversion logic in your applications.
- Locale-Aware Methods: Ensuring international text is handled correctly by using locale-specific solutions.
- Best Practices: Choosing the right approach based on project requirements, performance considerations, and Unicode needs.
Choose the method that best fits your needs, whether it's simple loops for basic cases or the ICU library for advanced, internationalized applications.
Congratulations on reading to the end of this tutorial! We hope you now have a better understanding of string case conversion in C++. For further exploration of string manipulation, Unicode, and related documentation, check out the resources in our Further Reading section.
Have fun and happy coding!
Further Reading
- C++ Reference: toupper
  A comprehensive guide to the toupper function in C++, used for basic character case conversion.
- C++ Reference: std::transform
  Documentation on the std::transform algorithm, ideal for applying transformations like case conversion to entire strings.
- C++ Reference: std::locale
  An overview of the std::locale class, which enables locale-aware string processing and character transformations.
- ICU User Guide: Strings
  An essential resource for understanding how the ICU library handles Unicode strings, including case conversion and normalization.
- Unicode FAQ: Case Mapping and Character Properties
  Answers to frequently asked questions about Unicode case mapping and character properties, including examples of complex transformations.
- Try Your Code: Online C++ Compiler
  Experiment with the code examples from this post using a convenient online C++ compiler hosted by Research Data Pod.
- Wikipedia: Unicode Overview
  A general introduction to Unicode, its history, and its importance in modern text processing and internationalization.
Attribution and Citation
If you found this guide and tools helpful, feel free to link back to this page or cite it in your work!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.