Simple Strings (ss) is a string library for C designed to augment the libc string functionalities by adding dynamic and heap allocated strings. Those strings are easier to use and resemble the dynamic string types of other programming languages.
At the core of the ss strings library there is the struct showed below. This struct represents a dynamic
and heap allocated string. The string buffer itself is pointed to by the buf
pointer, has length len
(the len
-th char is the null terminator) and has additional free
bytes allocated and ready to be used
in further string manipulation. Functions that instantiate strings return pointers to them (the ss
type).
Both the struct and the string buffer are heap allocated. From now, we will use the term string to define
the ss
pointer type.
typedef struct ss {
size_t len;
size_t free;
char *buf;
} *ss;
The ss strings are generated and returned via dedicated constructor functions and they are usually
manipulated using the functions of the library. As a unique and obvious exception, the buf
field is
safer to be read and write until the position len-1
, i.e. one can write in the string buffer starting
at the position ss->buf[0]
until the position ss->buf[ss->len-1]
. If you need to write directly
into the string buffer make sure the buffer itself has enough space (otherwise grow it with the dedicated
ss_grow
function). The functions of the library automatically handle the allocated space and the mentioned
fields to both perform the needed operations and maintain the string state consistent. In any case, the
len
and free
fields must be considered read-only.
Both the string struct pointed to by the ss
pointer and the string buffer itself are heap allocated. After
use, they must be freed passing the ss
string to the ss_free
function. If a function of the library
creates a new string, also the new string must be freed after use. Similarly, string lists must be freed
after use with the dedicated ss_list_free
function.
This is the most basic example using ss strings:
ss name = ss_new_from_raw("John");
if (!name) {
// ... handle error
}
ss_concat_raw(name, " Dover");
ss_prepend_raw("My name is: ", name);
printf("len: %d buf: %s\n", name->len, name->buf);
ss_free(name);
// Output: len: 22 buf: 'My name is: John Dover'
The library offers a wide range of functions, both higher level functions for easy string manipulation but also a set of low level utilities that make possible to optimize code for high performance and reduce drastically the penalty for using a higher level string library. In general, the functions of the library preallocate additional memory to reduce the probability of future reallocations when the strings are manipulated.
The low level details can be tuned using specific functions. The ss_new_from_raw_len_free
constructor,
as an example, allows to specify the amount of available but still unused memory, while ss_reserve_free_space
controls the same parameter but for strings already created. The general line followed is to reduce the
number of allocations not the quantity of memory allocated for performance reasons. Still, programs that
need a minimal memory footprint can use those low-level functions to control the memory usage in very detail.
Some operations on strings can fail due to allocations errors. These functions could return an error in
the form of a ss_err
type, defined as the enum below. If the function is successful it returns err_none
,
which has value zero and could be conveniently tested with a if (err)
statement. Note that if the library
is compiled with the --with-exit
flag, memory allocation errors abort the program and there's no need
to check these errors and to check for NULL values returned from string constructors. The ss_err_str
function
could be used to get a static read-only string containing a textual representation of an error code.
typedef enum ss_err {
err_none = 0,
err_alloc = 1,
err_format = 2
} ss_err;
Library installation requires CMake version 3.10 or higher. A convenient setup script could be used to
easily install the library. The script is the setup.sh
file in project root. The script could also
be used to test the library before installing it (recommended). To install the library run the following
command in the root of the project.
./setup.sh install
The options --with-exit
could be used to customize the installation. If used, the functions of the
library abort the program in case of allocation errors (mainly malloc/realloc failures). In this case
there's no need to check for ss_err
errors or NULL values returned from the library functions.
./setup.sh install --with-exit
Finally, it is recommended to run the unit tests before installing the library. This can be done with the following command.
./setup.sh test --with-exit
ss_new_from_raw_len_free
ss_new_from_raw_len
ss_new_from_raw
ss_new_empty_with_free
ss_new_empty
ss_clone
ss_set_free_space
ss_reserve_free_space
ss_free
ss_grow
ss_shrink
ss_clear
ss_index
ss_index_last
ss_concat_raw_len
ss_concat_raw
ss_concat_str
ss_prepend_raw_len
ss_prepend_raw
ss_prepend_str
ss_slice
ss_trim
ss_trim_left
ss_trim_right
ss_to_lower
ss_to_upper
ss_split_raw
ss_split_str
ss_join_raw_cat
ss_join_raw
ss_join_str_cat
ss_join_str
ss_list_free
ss_sprintf_va_cat
ss_sprintf_va
ss_sprintf_cat
ss_sprintf
Build a new string copying the provided init
C string of length len
(the length argument doesn't
include the null terminator) and allocating additional avail
bytes. If the length of the initial
string is greater than len
, the exceeding bytes are discarded. Even if len
and avail
are both
equal to zero, one byte is allocated (for the null terminator). The caller is responsible for
providing the correct values of the three arguments. If the init
string is NULL a new empty ss
string is returned. The newly created string is heap allocated and a pointer to it is returned.
Useful shorthands are the ss_new_from_raw_len
and ss_new_from_raw
functions, which are more
ergonomic and easier to use. The string must be freed after use with the provided ss_free
function.
The returned string has length len
, but additional avail
bytes are allocated. This overallocation
is often useful because it's reduces the probability of future reallocations when the string is
manipulated.
Returns the newly generated string or NULL if the allocation fails.
ss ss_new_from_raw_len_free(const char *init, size_t len, size_t avail);
Build a new string copying the provided init
C string of length len
(the length argument
doesn't include the null terminator). If the length of the initial string is greater than the
provided len
, the exceeding bytes are discarded. The caller is responsible for providing
valid values for the arguments. If the init
string is NULL a new empty ss string is returned.
The new ss string is heap allocated and a pointer to it is returned. The function is basically
a shorthand for ss_new_raw_len_cap(init, len, len)
. The string must be freed after use with
the provided ss_free
function.
The returned string has length len
, but additional len
bytes are allocated. This overallocation
is often useful because it's reduces the probability of future reallocations when the string is
manipulated.
Returns the newly generated string or NULL if the allocation fails.
ss ss_new_from_raw_len(const char *init, size_t len);
Build a new string copying the provided null terminated init
C string. This function is a
shorthand for str_new_raw_len(init, strlen(init), strlen(init))
. If the init
string is
NULL a new empty ss string is returned. The returned string must be freed after use with the
provided ss_free
function.
The returned string has length strlen(init)
, but additional strlen(init)
bytes are allocated.
This overallocation is often useful because it's reduces the probability of future reallocations
when the string is manipulated.
Returns the newly generated string or NULL if the allocation fails.
ss ss_new_from_raw(const char *init);
Build and returns a new empty ss string with length zero and avail
bytes available. In any
case the string has an implicit null term, so 1 byte is allocated anyway. The ss must be freed
after use with the provided ss_free
function.
Returns the newly generated string or NULL if the allocation fails.
ss ss_new_empty_with_free(size_t avail);
Build and returns a new empty ss string with length zero and no further available space. Even in
this case the string always has an implicit null term, so 1 byte is allocated anyway. The ss must
be freed after use with the provided ss_free
function. Note that usually an empty string is built
to be filled, so it could be more convenient to directly preallocate some space via the
ss_new_empty_with_cap
function.
Returns the newly generated string or NULL if the allocation fails.
ss ss_new_empty(void);
Build and return a clone of the provided ss string s
. The new string and the old one are
independent and both of them must be freed after use with the ss_free
function. Cloning a
string is useful when, for example, we want to mutate a string while also retaining the
original content. The string s
is not modified.
Returns the cloned string or NULL if the allocation fails.
ss ss_clone(ss s);
Enlarge or shrink the allocated and available space not already used by the string s
to be equal
to avail
bytes. The operation doesn't change the stored string itself, both in the content and the
length. It only changes the allocated available space to be used for future operations on the string.
The function is useful for example to reserve more space earlier in order to avoid frequent
reallocations later. The string s
is modified in place.
Returns err_none
(zero) in case of success or a specific error if any reallocation fails. In case
of failure the ss string s
is still valid and must be freed after use.
ss_err ss_set_free_space(ss s, size_t avail);
Enlarge the allocated and available space not already used by the string s
to be at least avail
bytes long. The operation doesn't change the stored string, it only changes the available space beyond
the string end. The function is useful to reserve more space earlier in order to avoid frequent
reallocations later. If enough space is already present the function is a no-op. The string s
is modified in place.
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of
failure the ss string s
is still valid and must be freed after use.
ss_err ss_reserve_free_space(ss s, size_t avail);
Deallocate the memory used by the ss string s
. The string can't be used after being freed
since it will point to deallocated memory.
void ss_free(ss s);
Grow the s
string to have the specified length len
. Note that here the function enlarges
the string buffer itself, eventually allocates more space. New bytes inserted will be set
to zero and they will be safe to be written. If the specified length len
is smaller than the
current length, the function is a no-op. The string s
is modified in place.
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of
failure the ss string s
is still valid and must be freed after use.
ss_err ss_grow(ss s, size_t len);
Shrink the string s
to have the provided length len
. The string is shortened to contain only the
first len
bytes and a null termination character is written at the position len
of the string buffer.
The total allocated space is left untouched, while the free space grows. The bytes in the string buffer
after the len
-th one are not cleaned, they are just considered unused and not valid to be read or
written. If the length len
is greater than the current string length the function is a no-op.
The string s
is modified in place.
void ss_shrink(ss s, size_t len);
Erase the content of the string s
. The length of the string is set to 0 and a null termination
character is written in the first position of the string buffer. The total allocation space is
left untouched, while the free space grows. The old bytes are not cleaned, they are just considered
unused and cannot be read or written. It is a shorthand for ss_shrink(s, 0)
. The string s
is
modified in place.
void ss_clear(ss s);
Returns the position (0-indexed) of the starting position of the first occurrence of the substring
needle
in the ss string haystack
provided as first argument. Returns -1 if no occurrence is
found or if needle
is NULL or an empty string. The string haystack
is not modified.
size_t ss_index(ss haystack, const char *needle);
Returns the position (0-indexed) of the starting position of the last occurrence of the substring
needle
in the ss string haystack
provided as first argument. Returns -1 if no occurrence is
found or if needle
is NULL or an empty string. The string haystack
is not modified.
size_t ss_index_last(ss haystack, const char *needle);
Concatenate the ss string s1
with a C string s2
of length s2_len
. If the length of the C string
is greater than s2_len
, exceeding bytes are discarded. The s2
C string is appended to the string
s1
, eventually growing the allocated space for s1
. The strategy used in concat functions is the
following: if the string s1
has enough allocated space to contain also the string s2
the content
of the latter is simply appended, otherwise the s1
string will be grown in order to have total
allocated space equal to (2*n + 1 for the null terminator) bytes, where n is the resulting (concatenated)
string length. In this case, both the final len
and the free space free
will be equal to n. The
string s1
is modified in place.
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of
failure the ss string s1
is still valid and must be freed after use.
ss_err ss_concat_raw_len(ss s1, const char *s2, size_t s2_len);
Concatenate the ss string s1
with a null terminated C string s2
. The s2
string is appended
to s1
, eventually growing the allocated space for s1
. If the s2
string is NULL the function
is a no-op. Basically, it is a shorthand for str_concat_raw_len(s1, s1, strlen(s2))
(see that
function to have more info about the concat mechanism). The string s1
is modified in-place.
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of
failure the ss string s1
is still valid and must be freed after use.
ss_err ss_concat_raw(ss s1, const char *s2);
Concatenate the ss string s1
with a second ss string s2
. The s2
string is appended to s1
,
eventually growing the allocated space for s1
. The string s1
is modified in-place. Basically
it is a shorthand for ss_concat_raw_len(s1, s2->buf, s2->len)
(see that function to have more
info about the concat mechanism). Both strings are still valid after the function call and must
be freed separately.
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of
failure both strings are still valid and must be freed after use.
ss_err ss_concat_str(ss s1, ss s2);
Prepend a C string s1
of length s1_len
to a ss string s2
. If the length of the C string is
greater than s1_len
, exceeding bytes are discarded. The s1
C string is prepended to the ss
string s2
, eventually growing the allocated space for the latter. The strategy used in prepend
functions is the following: if the string s2
has enough allocated space to contain also the string
s1
the content is simply prepended, otherwise the s2
string will be grown in order to have
allocated space of (2*n + 1 for the null terminator) bytes, where n is the resulting string length.
In this case, both the final len
and the free space free
will be equal to n. The string s2
is modified in place.
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of
failure the string s2
is still valid and must be freed after use.
ss_err ss_prepend_raw_len(const char *s1, ss s2, size_t s1_len);
Prepend the null terminated C string s1
to a ss string s2
. The s1
string is prepended to s2
,
eventually growing the allocated space of s2
. The string s2
is modified in-place. Basically, it
is a shorthand for ss_prepend_raw_len(s1, s2, strlen(s1))
(see that function to have more info
about the concat mechanism). If the s1
string is NULL the function is a no-op.
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of failure
the string s2
is still valid and must be freed after use.
ss_err ss_prepend_raw(const char *s1, ss s2);
Prepend the ss string s1
to another ss string s2
. The s1
string is appended to s2
, eventually
growing the allocated space of s2
. The string s2
is modified in-place. Basically, it is a shorthand
for ss_prepend_raw_len(s1->buf, s2, s1->len)
(see that function to have more info about the concat
mechanism).
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of failure
both strings are still valid and must be freed after use.
ss_err ss_prepend_str(ss s1, ss s2);
Obtain a substring of the ss string s
slicing it with the provided indexes. The slicing boundaries
are provided via the str_index
and end_index
arguments (0-indexed) and the resulting substring
starts from the position str_index
(inclusive) and ends at end_index
(not inclusive). If the
str_index
is >= of the original string length no changes are made. If the end index is < of str_index
no changes are made. If end_index
> of the string length, end_index
is reduced to be equal to the
string length before slicing the string. The string s
is modified in place.
void ss_slice(ss s, size_t str_index, size_t end_index);
Removes characters contained in the cutset
string from both the start and the end of the
ss string s
. After the trimming operation, the string length is reduced while the allocation
size is left untouched, but more free space will be available. If all characters are trimmed
the result is a valid but empty string. The string s
is modified in place.
void ss_trim(ss s, const char *cutset);
Removes characters contained in the cutset
string from the start of the ss string s
.
After the trimming operation, the string length is reduced while the allocation size
is left untouched, but more free space will be available. If all characters are trimmed
the result is a valid but empty string. The string s
is modified in place.
void ss_trim_left(ss s, const char *cutset);
Removes characters contained in the cutset
string from the end of the ss string s
.
After the trimming operation, the string length is reduced while the allocation size
is left untouched, but more free space will be available. If all characters are trimmed
the result is a valid but empty string. The string s
is modified in place.
void ss_trim_right(ss s, const char *cutset);
Modify the ss string s
by turning each character into its lowercase version.
The string s
is modified in place.
void ss_to_lower(ss s);
Modify the ss string s
by turning each character into its uppercase version.
The string s
is modified in place.
void ss_to_upper(ss s);
Return all the ss substrings generated from splitting the C string s
with the delimiter string del
.
All the returned substrings are heap allocated and returned as an array (*ss
) of length n
. The array
of strings must be freed after use with the dedicated ss_list_free
function. If the delimiter del
is
empty or it didn't match anywhere in the string s
, the function returns only the original string.
Consecutive matches of the delimiter are treated as a single delimiter so no empty strings "in-between"
are returned in this case. Similarly, if the delimiter matches at the end of the string s
no empty
strings are returned. The value pointed to by the n
pointer will be the number of strings in the
returned array.
Returns an array of strings of length n
in case of success or NULL in case of allocation failures.
ss *ss_split_raw(const char *s, const char *del, int *n);
Return all the ss substrings generated from splitting the ss string s
with the delimiter string del
.
All the returned substrings are heap allocated and returned as an array (*ss
) of length n
. The array
of strings must be freed after use with the dedicated ss_list_free
function. If the delimiter del
is
empty or it didn't match anywhere in the string s
, the function returns only the original string.
Consecutive matches of the delimiter are treated as a single delimiter so no empty strings are returned
in this case. Also, if the delimiter matches at the end of the string s
no empty strings are returned.
The value pointed to by the n
pointer will be the number of strings in the returned array. The s
string is not modified.
Returns an array of strings of length n
in case of success or NULL in case of allocation failures.
ss *ss_split_str(ss s, const char *del, int *n);
Join an array of C strings str
of length n
using the provided string separator sep
between them
then concatenate this string to the provided s
string. The s
string is modified in place.
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of failure
the s
string is still valid and must be freed after use.
ss_err ss_join_raw_cat(ss s, const char **str, int n, const char *sep);
Join an array of C strings str
of length n
using the provided string separator sep
between them.
The resulting (joined) string is then returned as a new ss string. The returned string must be freed
after use with the provided ss_free
function.
Returns the joined string in case of success or NULL in case of allocation errors.
ss ss_join_raw(const char **str, int n, const char *sep);
Join an array of ss strings str
of length n
using the provided string separator sep
between them
then concatenate this string to the provided s
string. The s
string is modified in place.
Returns err_none
(zero) in case of success or an error if any reallocation fails. In case of failure
the s
string is still valid and must be freed after use.
ss_err ss_join_str_cat(ss s, ss *str, int n, const char *sep);
Join an array of ss strings str
of length n
using the provided string separator sep
between them.
The resulting (joined) string is then returned as a new ss string. The returned string must be freed
after use with the provided ss_free
function.
Returns the joined string in case of success or NULL in case of allocation errors.
ss ss_join_str(ss *str, int n, const char *sep);
Deallocate the memory used by a ss string array list
. The string array and all the
contained strings can't be used after being freed.
void ss_list_free(ss *list, const int n);
Formats the string and concatenates it to the s
string. Formatting is performed using the usual
C formatting directive. The function accepts a va_list
to accommodate a variable number of arguments.
The argument list should be started (va_start
) before providing it to this function and must be ended
(va_end
) after the function call. The s
string is modified in place.
Returns err_none
(zero) in case of success or an error if case of reallocation or formatting errors.
In case of failure the s
is still valid and must be freed after use.
ss_err ss_sprintf_va_cat(ss s, const char *format, va_list arg_list);
Formats and returns a string using the usual C formatting directives. The function accepts
a va_list
to accommodate a variable number of arguments. In case of success the returned
string must be freed after use with the dedicated ss_free
function. The argument list
should be started (va_start
) before providing it to this function and must be ended
(va_end
) after the function call.
Returns the formatted string in case of success or NULL in case of allocations errors.
ss ss_sprintf_va(const char *format, va_list arg_list);
Formats the string and concatenates it to the s
string. Formatting is performed using the usual
C formatting directive. The s
string is modified in place.
Returns err_none
(zero) in case of success or an error in case of reallocation or formatting errors.
In case of failure the s
string is still valid and must be freed after use.
ss_err ss_sprintf_cat(ss s, const char *format, ...);
Formats and returns a new string using the usual C formatting directive. The returned string
must be freed after use as usual with the dedicated ss_free
function.
Returns the formatted string in case of success or NULL in case of allocations errors.
ss ss_sprintf(const char *format, ...);
Return a pointer to a static read-only string containing a textual
representation of the error code err
passed in.
const char *ss_err_str(ss_err err);