String functions are used in computer programming languages to manipulate a string or query information about a string (some do both).
Most programming languages that have a string datatype will have some string functions, although there may be other low-level ways within each language to handle strings directly. In object-oriented languages, string functions are often implemented as properties and methods of string objects. In functional and list-based languages a string is often represented as a list (of character codes), so all list-manipulation procedures can be considered string functions. However, such languages may implement a subset of explicit string-specific functions as well.
For functions that manipulate strings, modern object-oriented languages such as C# and Java have immutable strings and return a copy (in newly allocated dynamic memory), while others, such as C, manipulate the original string unless the programmer copies the data to a new string. See, for example, Concatenation below.
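The difference between the two behaviours can be sketched in Python. Python's <code>str</code> is immutable, like the C# and Java strings mentioned above, so <code>upper()</code> returns a new object. The <code>bytearray</code> is used here only as a stand-in for a mutable C-style character buffer; that pairing is an assumption made for illustration, not a description of how C works internally.

```python
# Immutable string: the method returns a new string; the original is untouched.
original = "hello"
shouted = original.upper()

# Mutable buffer (illustrative stand-in for a C char array).
buffer = bytearray(b"hello")
saved = bytes(buffer)      # explicit copy taken before modifying the buffer
buffer[0] = ord("j")       # overwrites the first byte in place

print(original, shouted)   # the immutable original is unchanged
print(saved, buffer)       # only the explicit copy keeps the old contents
```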
The most basic example of a string function is the <code>length(string)</code> function, which returns the length of the string passed to it.
Other languages may have string functions with similar or exactly the same syntax, parameters, or outcomes. For example, in many languages the length function is written <code>len(string)</code>. The list of common functions below aims to limit this confusion.
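As a small illustration, Python spells the length function <code>len</code>. The sketch below also shows that "length" can mean different things: Python counts characters (code points), while the byte length of an encoded string may differ. The variable names are chosen only for this example.

```python
greeting = "Hello, World"
n_chars = len(greeting)            # number of characters in the string

word = "na\u00efve"                # "naive" written with a diaeresis on the i
n_code_points = len(word)          # characters (code points)
n_bytes = len(word.encode("utf-8"))  # bytes after UTF-8 encoding

print(n_chars, n_code_points, n_bytes)
```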
String functions common to many languages are listed below, including the different names used. The list aims to help programmers find the equivalent function in a language. String concatenation and regular expressions are handled in separate pages. Statements in guillemets (« … ») are optional.
<pre>
"Hello, World"[2]; // 'e' </pre>
<pre>
¢ Example in ALGOL 68 ¢
string in string("e", loc int, "Hello mate");      ¢ returns true ¢
string in string("z", loc int, "word");            ¢ returns false ¢
</pre>
Tests if two strings are equal. See also #Compare. Performing equality checks through a generic Compare function that returns an integer is not only confusing for the programmer but is often a significantly more expensive operation; this is especially true when using "C-strings".
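The two styles can be contrasted with a short Python sketch. The function <code>compare</code> below is hypothetical and written only for this example; it mimics a three-way comparison that returns a negative, zero, or positive integer. A dedicated equality test can in principle reject strings of different lengths without examining their contents, whereas an ordering comparison has to locate the first differing character.

```python
def compare(a: str, b: str) -> int:
    """Hypothetical three-way comparison: -1, 0 or 1."""
    return (a > b) - (a < b)


def equal_via_compare(a: str, b: str) -> bool:
    # Indirect style: the reader must know that 0 means "equal".
    return compare(a, b) == 0


def equal_direct(a: str, b: str) -> bool:
    # Direct style: the intent is stated by the operator itself.
    return a == b


print(compare("apple", "banana"), equal_via_compare("abc", "abc"), equal_direct("abc", "abd"))
```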
Examples
Given a set of characters, SCAN returns the position of the first character found, while VERIFY returns the position of the first character that does not belong to the set.
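The behaviour of the two functions can be sketched in Python. The functions <code>scan</code> and <code>verify</code> below are written for this example and assume Fortran-style conventions: positions are 1-based, and 0 is returned when no qualifying character is found.

```python
def scan(string: str, charset: str) -> int:
    """1-based position of the first character that IS in charset, else 0."""
    for pos, ch in enumerate(string, start=1):
        if ch in charset:
            return pos
    return 0


def verify(string: str, charset: str) -> int:
    """1-based position of the first character that is NOT in charset, else 0."""
    for pos, ch in enumerate(string, start=1):
        if ch not in charset:
            return pos
    return 0


print(scan("FORTRAN", "AO"))     # 'O' is the first character from the set
print(verify("FORTRAN", "FOR"))  # 'T' is the first character outside the set
```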
Tests if two strings are not equal. See also #Equality.
see #Find
see #Find
see #Find
see #rfind
see #rfind
see #length
see #Find
see #substring
see #substring
see #Format
see #trim
<code>trim</code> or <code>strip</code> is used to remove whitespace from the beginning, end, or both beginning and end, of a string.
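For instance, Python provides the three variants as the string methods <code>strip</code>, <code>lstrip</code>, and <code>rstrip</code>. The sample string is chosen only for illustration.

```python
padded = "  \t hello world \n"

both = padded.strip()    # whitespace removed from both ends
left = padded.lstrip()   # whitespace removed from the beginning only
right = padded.rstrip()  # whitespace removed from the end only

print(repr(both), repr(left), repr(right))
```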
Other languages
In languages without a built-in trim function, it is usually simple to create a custom function which accomplishes the same task.
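A minimal sketch of such a custom function, written in Python for readability even though Python already has a built-in. The helper names <code>ltrim</code> and <code>rtrim</code> and the <code>WHITESPACE</code> set are assumptions made for this example; a real implementation would choose its own definition of whitespace.

```python
WHITESPACE = " \t\n\r"  # assumption: the characters treated as whitespace


def ltrim(s: str) -> str:
    """Skip forward over leading whitespace and return the rest."""
    start = 0
    while start < len(s) and s[start] in WHITESPACE:
        start += 1
    return s[start:]


def rtrim(s: str) -> str:
    """Skip backward over trailing whitespace and return what precedes it."""
    end = len(s)
    while end > 0 and s[end - 1] in WHITESPACE:
        end -= 1
    return s[:end]


def trim(s: str) -> str:
    """Trim both ends by composing the two one-sided helpers."""
    return ltrim(rtrim(s))


print(repr(trim("  hi there \n")))
```

Composing two one-sided helpers keeps each loop simple and gives the left-only and right-only variants for free.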
APL can use regular expressions directly:
Alternatively, a functional approach combining Boolean masks that filter away leading and trailing spaces:
Or reverse and remove leading spaces, twice:
In AWK, one can use regular expressions to trim:
or:
There is no standard trim function in C or C++. Most of the available string libraries for C contain code that implements trimming, or functions that significantly ease an efficient implementation. Some non-standard C libraries call the function EatWhitespace.
In C, programmers often combine an <code>ltrim</code> and an <code>rtrim</code> function to implement trim:
The open source C++ library Boost has several trim variants, including a standard one:
Boost's function named simply <code>trim</code> modifies the input sequence in place and returns no result.
Another open source C++ library, Qt, has several trim variants, including a standard one:
The Linux kernel also includes a strip function, <code>strstrip()</code>, since 2.6.18-rc1, which trims the string "in place". Since 2.6.33-rc1, the kernel uses <code>strim()</code> instead of <code>strstrip()</code> to avoid false warnings.
A trim algorithm in Haskell:
may be interpreted as follows: f drops the preceding whitespace, and reverses the string. f is then again applied to its own output. The type signature (the second line) is optional.
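The same idea can be sketched in Python. The names <code>f</code> and <code>trim_by_reversal</code> are chosen for this example, and Python's <code>lstrip</code> is assumed to be a close enough match for the whitespace test used in the Haskell version.

```python
def f(s: str) -> str:
    # Drop the leading whitespace, then reverse the string.
    return s.lstrip()[::-1]


def trim_by_reversal(s: str) -> str:
    # First pass: strips the front and reverses, so the old tail is now in front.
    # Second pass: strips the old tail and restores the original order.
    return f(f(s))


print(repr(trim_by_reversal("  ab c  ")))
```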
The trim algorithm in J is a functional description:
That is: filter (<code>#~</code>) for non-space characters (<code>' '&~:</code>) between leading (<code>+./\</code>) and (<code>*.</code>) trailing (<code>+./\.</code>) spaces.
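The mask-based description above can be approximated in Python with running OR scans. This is a sketch of the technique rather than a translation of the J code; the function name <code>trim_with_masks</code> is an assumption, and only the space character is treated as whitespace, as in the description.

```python
from itertools import accumulate
from operator import or_


def trim_with_masks(s: str) -> str:
    # True where the character is not a space.
    non_space = [ch != " " for ch in s]
    # Running OR from the left: True from the first non-space onward.
    after_leading = list(accumulate(non_space, or_))
    # Running OR from the right: True up to the last non-space.
    before_trailing = list(accumulate(reversed(non_space), or_))[::-1]
    # Keep characters where both masks are True.
    return "".join(
        ch for ch, a, b in zip(s, after_leading, before_trailing) if a and b
    )


print(repr(trim_with_masks("  a b  ")))
```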
There is a built-in trim function in JavaScript 1.8.1 (Firefox 3.5 and later), and the ECMAScript 5 standard. In earlier versions it can be added to the String object's prototype as follows:
Perl 5 has no built-in trim function. However, the functionality is commonly achieved using regular expressions.
Example:
or:
These examples modify the value of the original variable <code>$string</code>.
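The regular-expression approach can also be sketched in Python. The function name <code>trim_regex</code> is chosen for this example. Unlike code that modifies <code>$string</code> in place, this version returns a new string, because Python strings are immutable.

```python
import re


def trim_regex(s: str) -> str:
    # Delete a run of whitespace anchored at the start, or one anchored at the end.
    return re.sub(r"^\s+|\s+$", "", s)


print(repr(trim_regex("  hi there \n")))
```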
Also available for Perl is StripLTSpace in <code>String::Strip</code> from CPAN.
There are, however, two functions that are commonly used to strip whitespace from the end of strings, <code>chomp</code> and <code>chop</code>:
In Raku, the sister language of Perl, strings have a <code>trim</code> method.
Example:
The Tcl <code>string</code> command has three relevant subcommands: <code>trim</code>, <code>trimright</code> and <code>trimleft</code>. For each of those commands, an additional argument may be specified: a string that represents a set of characters to remove. The default is whitespace (space, tab, newline, carriage return).
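Python's <code>strip</code>, <code>lstrip</code>, and <code>rstrip</code> accept a comparable optional argument naming the set of characters to remove, so the idea can be sketched there. This shows the Python behaviour only and is not Tcl code; the sample word is arbitrary.

```python
word = "eagle"
vowels = "aeiou"  # treated as a set of characters, not as a substring

both = word.strip(vowels)    # vowels removed from both ends
left = word.lstrip(vowels)   # vowels removed from the beginning only
right = word.rstrip(vowels)  # vowels removed from the end only

print(both, left, right)
```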
Example of trimming vowels:
XSLT includes the function <code>normalize-space(string)</code> which strips leading and trailing whitespace, in addition to replacing any whitespace sequence (including line breaks) with one space.
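The effect of this function can be approximated in Python. This is a sketch rather than a conforming implementation; the function name <code>normalize_space</code> mirrors the XSLT name for readability, and whitespace is assumed to mean the four XML whitespace characters (space, tab, carriage return, line feed).

```python
import re


def normalize_space(s: str) -> str:
    # Collapse every run of XML whitespace to a single space...
    collapsed = re.sub(r"[ \t\r\n]+", " ", s)
    # ...then drop the single space that may remain at either end.
    return collapsed.strip(" ")


print(repr(normalize_space("  a \n\t b   c ")))
```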
Example:
XSLT 2.0 includes regular expressions, providing another mechanism to perform string trimming.
Another XSLT technique for trimming is to utilize the XPath 2.0 <code>substring()</code> function.