utl::stre

utl::stre (aka string expansions) header contains implementations of most commonly used string utils.

Motivation: Despite the seeming triviality of the topic a lot of implementations found online are either horribly inefficient or contain straight up bugs in some edge cases. Here, the goal is to “get it right” once so no time would be spent reinventing the wheel in the future.

Definitions

// Trimming
std::string trim_left (std::string str, char trimmed_char = ' ');
std::string trim_right(std::string str, char trimmed_char = ' ');
std::string trim      (std::string str, char trimmed_char = ' ');

// Padding
std::string pad_left (std::string_view str, std::size_t length, char padding_char = ' ');
std::string pad_right(std::string_view str, std::size_t length, char padding_char = ' ');
std::string pad      (std::string_view str, std::size_t length, char padding_char = ' ');

std::string pad_with_leading_zeroes(unsigned int number, std::size_t length = 10);

// Case conversions
std::string to_lower(std::string str);
std::string to_upper(std::string str);

// Substring checks
bool starts_with(std::string_view str, std::string_view substr);
bool ends_with  (std::string_view str, std::string_view substr);
bool contains   (std::string_view str, std::string_view substr);

// Token manipulation
std::string replace_all_occurrences(std::string str, std::string_view from, std::string_view to);

std::vector<std::string> split_by_delimiter(std::string_view str, std::string_view delimiter, bool keep_empty_tokens = false);

// Other utils
std::string repeat_char  (            char  ch, std::size_t repeats);
std::string repeat_string(std::string_view str, std::size_t repeats);

std::string escape_control_chars(std::string_view str);

std::size_t index_of_difference(std::string_view str_1, std::string_view str_2);

[!Note] Functions that can utilize mutable input string for a more efficient implementation take std::string by value. This means r-value arguments are moved & reused, while l-values are copied.

Methods

Trimming

std::string trim_left (std::string str, char trimmed_char = ' ');
std::string trim_right(std::string str, char trimmed_char = ' ');
std::string trim      (std::string str, char trimmed_char = ' ');

Trims characters equal to trimmed_char from the left / right / both sides of the string str.

Padding

std::string pad_left (std::string_view str, std::size_t length, char padding_char = ' ');
std::string pad_right(std::string_view str, std::size_t length, char padding_char = ' ');
std::string pad      (std::string_view str, std::size_t length, char padding_char = ' ');

Pads string str with character padding_char from left / right / both sides until it reaches given length.

Note: If str.size() >= length the string is left unchanged.

std::string pad_with_leading_zeroes(unsigned int number, std::size_t length = 10);

Pads a number with leading zeroes until it reaches given length. Useful for numbering files/data entries are labeled in lexicographic order.

Note: If number has more than length digits, resulting string is the same as std::to_string(number).

Case conversions

std::string to_lower(std::string str);

Replaces all uppercase letters ABCDEFGHIJKLMNOPQRSTUVWXYZ in the string str with corresponding lowercase letters abcdefghijklmnopqrstuvwxyz.

std::string to_upper(std::string str);

Replaces all lowercase letters abcdefghijklmnopqrstuvwxyz in the string str with corresponding uppercase letters ABCDEFGHIJKLMNOPQRSTUVWXYZ.

Substring checks

bool starts_with(std::string_view str, std::string_view substr);
bool ends_with  (std::string_view str, std::string_view substr);
bool contains   (std::string_view str, std::string_view substr);

Returns true if string str starts with / ends with / contains the substring substr.

Token manipulation

std::string replace_all_occurrences(std::string str, std::string_view from, std::string_view to);

Scans through the string str and replaces all occurrences of substring from with a string to.

std::vector<std::string> split_by_delimiter(std::string_view str, std::string_view delimiter, bool keep_empty_tokens = false);

Splits string str into a vector of std::string tokens based on delimiter.

By default keep_empty_tokens is false and "" is not considered to be a valid token — in case of leading / trailing / repeated delimiters, only non-empty tokens are going to be inserted into the resulting vector. Setting keep_empty_tokens to true overrides this behavior and keeps all the empty tokens intact.

Other utils

std::string repeat_char  (            char  ch, std::size_t repeats);
std::string repeat_string(std::string_view str, std::size_t repeats);

Repeats given character or string a given number of times and returns the resulting string.

std::string escape_control_chars(std::string_view str);

Escapes all control & non-printable characters in the string str using standard C++ notation (see corresponding example for a better idea).

Useful when printing strings to the terminal during logging & debugging.

std::size_t index_of_difference(std::string_view str_1, std::string_view str_2);

Returns the index of the first character that is different between string str_1 and str_2.

When both strings are the same, returns str_1.size().

Throws std::logical_error if str_1.size() != str_2.size().

Examples

Trimming strings

[ Run this code ]

using namespace utl;

assert(stre::trim_left ("   lorem ipsum   ") ==    "lorem ipsum   ");
assert(stre::trim_right("   lorem ipsum   ") == "   lorem ipsum"   );
assert(stre::trim      ("   lorem ipsum   ") ==    "lorem ipsum"   );

assert(stre::trim("__ASSERT_MACRO__", '_') == "ASSERT_MACRO");

Padding strings

[ Run this code ]

using namespace utl;

assert(stre::pad_left ("value", 9) == "    value" );
assert(stre::pad_right("value", 9) == "value    " );
assert(stre::pad      ("value", 9) == "  value  " );

assert(stre::pad(" label ", 15, '-') == "---- label ----");

assert(stre::pad_with_leading_zeroes(17, 5) == "00017");

Converting string case

[ Run this code ]

using namespace utl;

assert(stre::to_lower("Lorem Ipsum") == "lorem ipsum");
assert(stre::to_upper("lorem ipsum") == "LOREM IPSUM");

Substring checks

[ Run this code ]

using namespace utl;

assert(stre::starts_with("lorem ipsum", "lorem"));
assert(stre::ends_with  ("lorem ipsum", "ipsum"));
assert(stre::contains   ("lorem ipsum", "em ip"));

Token manipulation

[ Run this code ]

using namespace utl;

// Replacing tokens
assert(stre::replace_all_occurrences("xxxAAxxxAAxxx",  "AA",  "BBB") == "xxxBBBxxxBBBxxx" );

// Splitting by delimiter
auto tokens = stre::split_by_delimiter("aaa,bbb,ccc,", ",");
assert(tokens.size() == 3);
assert(tokens[0] == "aaa");
assert(tokens[1] == "bbb");
assert(tokens[2] == "ccc");

// Splitting by complex delimiter while keeping the empty tokens
tokens = stre::split_by_delimiter("(---)lorem(---)ipsum", "(---)", true);
assert(tokens.size() == 3);
assert(tokens[0] == ""     );
assert(tokens[1] == "lorem");
assert(tokens[2] == "ipsum");

Other utilities

[ Run this code ]

using namespace utl;

// Repeating chars/strings
assert(stre::repeat_char  (  'h', 7) == "hhhhhhh"        );
assert(stre::repeat_string("xo-", 5) == "xo-xo-xo-xo-xo-");

// Escaping control chars in a string   
const std::string text = "this text\r will get messed up due to\r carriage returns.";
std::cout
    << "Original string prints like this:\n" <<                            text  << "\n\n"
    << "Escaped  string prints like this:\n" << stre::escape_control_chars(text) << "\n\n";

// Getting index of difference
assert(stre::index_of_difference("xxxAxx", "xxxxxx") == 3);

Output:

Original string prints like this:
 carriage returns.p due to

Escaped  string prints like this:
this text\r will get messed up due to\r carriage returns.

UTL

Collection of self-contained header-only libraries for C++17

utl::stre

Definitions

Methods

Trimming

Padding

Case conversions

Substring checks

Token manipulation

Other utils

Examples

Trimming strings

Padding strings

Converting string case

Substring checks

Token manipulation

Other utilities