Manipulating strings

You can apply string operations to the whole match (\0), to the subexpressions (\1, \2, etc.), or to the filename.

String literals

Strings should be enclosed in double or single quotes. The following escape sequences are supported:

There must be exactly 2 hexadecimal digits after \x, 4 digits after \u, and 6 digits after \U. For example, the code point U+1F642 has only five digits, so you have to write it with a leading zero: \U01F642. You can also insert a Unicode character into a string directly; the escape sequences are optional.

Search for | Replace to | Explanation
anything | \("'test'") | 'test' in single quotes
anything | \('\'test\'') | Same; inside a single-quoted string, the single quotes have to be escaped
anything | \('"test"') | "test" in double quotes (enclosed in single quotes to avoid escaping)
anything | \('1\xA0234') | 1, then U+00A0 no-break space, then 234 (using the no-break space as a thousands separator)
anything | \('\u2014') | U+2014 em dash
anything | \('—') | The same character inserted literally
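The digit-count rule for \x, \u, and \U can be illustrated with a short Python sketch. The helper name aba_escape is hypothetical and not part of Aba; it only formats a code point with the zero-padding described above.

```python
def aba_escape(code_point: int) -> str:
    """Format a code point as an escape with 2, 4, or 6 hex digits.

    Hypothetical helper for illustration only; it is not part of Aba.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point <= 0xFF:
        return f"\\x{code_point:02X}"  # exactly 2 hex digits
    if code_point <= 0xFFFF:
        return f"\\u{code_point:04X}"  # exactly 4 hex digits
    return f"\\U{code_point:06X}"      # exactly 6 hex digits, zero-padded


print(aba_escape(0xA0))     # \xA0
print(aba_escape(0x2014))   # \u2014
print(aba_escape(0x1F642))  # \U01F642
```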

Concatenation

There is no explicit concatenation operator. To concatenate several strings, write them next to one another:

Search for | Replace to | Explanation
anything | \('be' 'come') | The result is become
\<([0-9]{1,2})/([0-9]{1,2})/((?:20)?[0-9]{2})\> | \(\2 '.' \1 '.' \3) | Replace a date in the US format (month/day/year) with the same date in the European format (day.month.year)
\<([0-9]{1,2})/([0-9]{1,2})/((?:20)?[0-9]{2})\> | \2.\1.\3 | Same, using simple replacements
\<function (\w{2,}) | function \(\1[0].toUpper() \1[1:]) | Make the first letter of a function name uppercase
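For comparison, the date and function-name replacements can be approximated in Python. This is a rough analogue, not Aba's implementation: it assumes that Python's \b word boundary is close enough to \< and \> for these inputs, and the function names are made up for the example.

```python
import re

# \b stands in for Aba's \< and \> (an assumption for this sketch).
US_DATE = re.compile(r"\b([0-9]{1,2})/([0-9]{1,2})/((?:20)?[0-9]{2})\b")


def us_to_european(text: str) -> str:
    """Swap month and day and join the parts with dots."""
    return US_DATE.sub(r"\2.\1.\3", text)


def capitalize_function_names(text: str) -> str:
    """Uppercase the first letter of each name that follows 'function '."""
    def fix(match: re.Match) -> str:
        name = match.group(1)
        # Concatenate the uppercased first character with the rest.
        return "function " + name[0].upper() + name[1:]

    return re.sub(r"\bfunction (\w{2,})", fix, text)


print(us_to_european("Due 12/31/2024."))
print(capitalize_function_names("function getName()"))
```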

You can mix concatenation with math operators, but if the expression contains a unary minus or a unary not, add parentheses around it:

Search for | Replace to | Explanation
anything | \(5 ' ' -20+5) | The result is 5-15; the space is gone because it is first concatenated with −20, and the result is then converted into a number
anything | \(5 ' ' (-20+5)) | The correct way: use parentheses
anything | \('result: ' -20+5) | An error; the string is first concatenated with −20, then Aba cannot add 5 to it
anything | \('result: ' (-20+5)) | The correct solution

Indexing and slicing

You can extract the n-th character of a string. Indexes are counted from zero; negative indexes are counted from the end of the string:

Search for | Replace to | Explanation
\w+ | \(\0[0]) | The first character of each word
\w{2,} | \(\0[1]) | The second character
\w+ | \(\0[-1]) | The last character
\w{2,} | \(\0[-2]) | The next-to-last character

The index must range from minus the string length to the string length minus one; otherwise, Aba shows an error message.

Aba counts Unicode scalar values (code points), not bytes, so the index of each character is always the index of the previous character plus one. Combining characters and grapheme clusters still need special handling.

You can also extract a part of a string using positive or negative indices:

Search for | Replace to | Explanation
\w{2,} | \(\0[0:2]) | The first two characters of each word that is at least two characters long
\w{2,} | \(\0[:2]) | Same
\w{2,} | \(\0[2:]) | All characters except for the first two
^^ | \(Aba.fileName()[-3:]) | The last three characters of the filename (usually the file extension, but this won't work for .json or .gitignore)

When slicing, both indices can span from minus string length to the string length. Unlike JavaScript or Python, Aba shows an error message if the slice index is out of bounds or if the first index is larger than the second one.
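The bounds rules above differ from Python, which silently clamps slice indices. The sketch below models the described behavior in Python. The names strict_index and strict_slice are hypothetical, and it assumes that the "first index larger than the second" check is made after negative indices are resolved.

```python
from typing import Optional


def strict_index(s: str, i: int) -> str:
    """Return the character at i; allowed range is -len(s) .. len(s) - 1."""
    n = len(s)
    if not -n <= i <= n - 1:
        raise IndexError(f"index {i} is out of bounds")
    return s[i]


def strict_slice(s: str, start: Optional[int] = None,
                 end: Optional[int] = None) -> str:
    """Return s[start:end]; both indices must lie in -len(s) .. len(s)."""
    n = len(s)
    start = 0 if start is None else start
    end = n if end is None else end
    for value in (start, end):
        if not -n <= value <= n:
            raise IndexError(f"slice index {value} is out of bounds")
    # Resolve negative indices before comparing (an assumption of this sketch).
    first = start + n if start < 0 else start
    last = end + n if end < 0 else end
    if first > last:
        raise IndexError("the first index is larger than the second one")
    return s[first:last]


print(strict_slice("report.txt", -3))  # txt
print("hi"[0:3])                       # plain Python clamps: hi
```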

The toLower/toUpper functions

str.toLower()
str.toUpper()

These functions convert the string to lowercase/uppercase including Unicode characters. If the string contains non-alphabetic characters such as digits or spaces, they are not changed. The function returns a lowercase/uppercase equivalent of the string; the original string is not changed.

Search for | Replace to | Explanation
\w+ | \(\0.toLower()) | Convert all words to lowercase
\w{2,} | \(\0[0].toUpper() \0[1:].toLower()) | Start all words that are at least two characters long with a capital letter and make the remaining characters lowercase
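The capitalization replacement can be mirrored in Python as a rough analogue. The function name is made up for the example, and Python's case mapping may differ from Aba's for some characters, so treat this as a sketch rather than a reference.

```python
import re


def capitalize_words(text: str) -> str:
    """Capitalize every word of two or more characters; lowercase the rest."""
    def fix(match: re.Match) -> str:
        word = match.group(0)
        return word[0].upper() + word[1:].lower()

    return re.sub(r"\w{2,}", fix, text)


print(capitalize_words("hELLO wORLD a"))
print("abc 123".upper())  # digits and spaces are left unchanged
```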

The trim functions

str.trimLeft()    str.trimLeft(charset)
str.trimRight()   str.trimRight(charset)
str.trim()        str.trim(charset)

These functions strip whitespace (or other characters) from the beginning or the end of the string. The trimLeft function removes whitespace from the beginning of the string; the trimRight function removes it from the end of the string; and the trim function removes it from both sides of the string.

The charset parameter lists the characters that you want to remove; if it's not specified, the functions remove spaces, tab characters, newlines, and carriage returns. The functions return the trimmed string; the original string is not changed.

Search for | Replace to | Explanation
\N+ | \(\0.trim()) | Remove whitespace from the beginning and the end of each line
\N+ | \(\0.trimRight()) | Remove whitespace from the end of each line
[ \t]+$ | (empty) | Same, without complex replacements
\N+ | \(\0.trimLeft(' ')) | Remove only spaces (not tabs) from the beginning of each line
^ + | (empty) | Same, without complex replacements
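A Python approximation of the trim functions is sketched below. The function names are hypothetical. The default charset is spelled out explicitly because Python's own strip() with no argument removes more kinds of whitespace than the four characters listed above.

```python
# Space, tab, newline, carriage return: the default set described above.
DEFAULT_CHARSET = " \t\n\r"


def trim_left(s: str, charset: str = DEFAULT_CHARSET) -> str:
    return s.lstrip(charset)


def trim_right(s: str, charset: str = DEFAULT_CHARSET) -> str:
    return s.rstrip(charset)


def trim(s: str, charset: str = DEFAULT_CHARSET) -> str:
    return s.strip(charset)


print(repr(trim("  \tabc \r\n")))       # 'abc'
print(repr(trim_left(" \tabc", " ")))   # '\tabc' (the tab is kept)
```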

The indexOf function

str.indexOf(pattern)

This function searches the string and returns the first found index of the pattern in the string, or −1 if it's not found. The pattern is a string; the function returns an index (starting from zero) that can be used for slicing or checking if the string contains a given substring.

If the pattern or the str string is an empty string, the function always returns −1.

Search for | Replace to | Explanation
<a href="(https?://[^"]+)" | <a href="\( if \1.indexOf('example.com') >= 0 {\1.toLower()} else {\1} )" | Convert all links containing example.com to lowercase, but leave the other links unchanged
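The empty-string rule is worth noting because it differs from some other languages. The Python sketch below models the described behavior. The name index_of is hypothetical; Python's own str.find returns 0 for an empty pattern, so the sketch adds an explicit check.

```python
def index_of(s: str, pattern: str) -> int:
    """Return the first index of pattern in s, or -1 if it is not found.

    Returns -1 when either string is empty, as described above.
    """
    if not s or not pattern:
        return -1
    return s.find(pattern)


url = "https://Example.com/Page"
# Lowercase the link only if it contains the substring (a containment check).
result = url.lower() if index_of(url.lower(), "example.com") >= 0 else url
print(result)
```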

The replace function

str.replace(pattern, replacement)
str.replace(pattern, replacement, count)

This function replaces all or some occurrences of the pattern with the replacement string. The pattern and the replacement parameters are strings; the optional count parameter is a number. If it's not specified, all occurrences of the pattern are replaced.

The count must be a positive integer, otherwise an error message is displayed. The function returns a copy of the string with the pattern occurrences replaced with the replacement; the original string is not changed.

Search for | Replace to | Explanation
^^ | <img src="\{Aba.fileName().replace('.html', '.png')}"> | Insert an image with the same name as the current HTML file
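The count rule can be sketched in Python, whose str.replace also takes an optional count. The wrapper below is a hypothetical analogue that adds the positive-integer check described above. How Aba treats an empty pattern is not modeled here.

```python
from typing import Optional


def replace(s: str, pattern: str, replacement: str,
            count: Optional[int] = None) -> str:
    """Replace all occurrences, or only the first `count` of them."""
    if count is None:
        return s.replace(pattern, replacement)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(count, bool) or not isinstance(count, int) or count <= 0:
        raise ValueError("count must be a positive integer")
    return s.replace(pattern, replacement, count)


print(replace("index.html", ".html", ".png"))  # index.png
print(replace("a-b-c", "-", "+", 1))           # a+b-c
```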

The len function

str.len()

The function returns the string length. If the string is empty, it returns zero. Note that Aba counts Unicode scalar values (code points), not bytes or grapheme clusters.

Search for | Replace to | Explanation
$$ | \( if Aba.filePath().len() > 0 {'banner'} else {''} ) | If the file is located in a nested directory, add a banner
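The difference between code points, bytes, and grapheme clusters can be seen in Python, whose len also counts code points. This is only an illustration of the counting rule, not Aba code, and the helper name is made up.

```python
def lengths(s: str) -> tuple:
    """Return (number of code points, number of UTF-8 bytes)."""
    return len(s), len(s.encode("utf-8"))


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one visible character,
# but two code points and three UTF-8 bytes.
print(lengths("e\u0301"))
# U+1F642: one code point, four UTF-8 bytes.
print(lengths("\U0001F642"))
print(lengths(""))
```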

The encodeUrl function

str.encodeUrl()
str.encodeUrl('+')

This function encodes the string using the percent-encoding specified in RFC 3986, section 2.1. The encoded string is returned; the original string is not changed. The encoding replaces reserved characters such as slashes with their hexadecimal ASCII codes preceded by a percent sign, for example, %2F. Non-ASCII characters are replaced with their percent-encoded UTF-8 bytes.

You can pass a string containing a plus sign as a parameter if you want to replace spaces with plus signs instead of %20. This encoding variation is used in the application/x-www-form-urlencoded media type.

Search for | Replace to | Explanation
(https?://en\.wiktionary\.org/wiki/)(\S+) | \1\{ \2.encodeUrl() } | Encode Wiktionary links like https://en.wiktionary.org/wiki/štěstí
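A comparable encoding is available in Python's urllib.parse. The sketch below is an analogue under stated assumptions: the wrapper name encode_url is hypothetical, and safe="" is passed so that slashes are encoded too, as described above. The exact set of characters that Aba leaves unencoded may differ.

```python
from urllib.parse import quote, quote_plus


def encode_url(s: str, mode: str = "") -> str:
    """Percent-encode s; with '+' in mode, spaces become plus signs."""
    if "+" in mode:
        return quote_plus(s, safe="")
    return quote(s, safe="")


print(encode_url("štěstí"))      # %C5%A1t%C4%9Bst%C3%AD
print(encode_url("a/b c"))       # a%2Fb%20c
print(encode_url("a/b c", "+"))  # a%2Fb+c
```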

The decodeUrl function

str.decodeUrl()
str.decodeUrl('+')

This function decodes the URL string using the percent-encoding and returns the decoded string. All sequences starting with a percent sign are replaced with the corresponding UTF-8 characters; for example, %C3%B3 is replaced with ó and %23 is replaced with #.

You can pass a string containing a plus sign as a parameter if you want to replace plus signs with spaces. This encoding variation is used in the application/x-www-form-urlencoded media type.

Search for | Replace to | Explanation
(https?://en\.wiktionary\.org/wiki/)(\S+) | \1\{ \2.decodeUrl() } | Decode Wiktionary links like https://en.wiktionary.org/wiki/%E7%A6%8F
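The decoding direction can be sketched the same way with Python's urllib.parse. The wrapper name decode_url is hypothetical, and how Aba handles malformed percent sequences is not modeled here.

```python
from urllib.parse import unquote, unquote_plus


def decode_url(s: str, mode: str = "") -> str:
    """Decode percent-encoded UTF-8; with '+' in mode, pluses become spaces."""
    if "+" in mode:
        return unquote_plus(s)
    return unquote(s)


print(decode_url("%C3%B3"))    # ó
print(decode_url("%23"))       # #
print(decode_url("a+b", "+"))  # a b
```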

The encodeBase64 function

str.encodeBase64()
str.encodeBase64('nopad')
str.encodeBase64('url')
str.encodeBase64('url,nopad')

This function encodes the string using UTF-8 and Base64. The encoding is specified in RFC 4648; it transforms every three UTF-8 bytes into four printable ASCII characters, so the result is no longer human-readable. The function returns the encoded string; the original string is not changed.

By default, the standard Base64 encoding is used (see section 4 of the RFC), so the last two characters of the Base64 alphabet are + and /. You can pass the string 'url' as a parameter to use the URL-safe alphabet (see section 5 of the RFC), where these two characters are - and _. The last encoded bytes are padded with the equal sign =; to avoid the padding, please pass 'nopad' or 'url,nopad' to the function.

Search for | Replace to | Explanation
<svg (.*?</svg>) | <img src="data:image/svg+xml;base64,\{( '<?xml version="1.0"?><svg xmlns="http://www.w3.org/2000/svg" ' \1).encodeBase64()}"> | Replace svg tags with img tags using data URLs
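The alphabet and padding options can be illustrated with Python's base64 module. This is a sketch of the described options, not Aba's implementation. The wrapper name encode_base64 is hypothetical, and the flag parsing is an assumption modeled on the 'url,nopad' form above.

```python
import base64


def encode_base64(s: str, flags: str = "") -> str:
    """Encode s as UTF-8, then as Base64; flags: 'url', 'nopad', or both."""
    options = {flag for flag in flags.split(",") if flag}
    data = s.encode("utf-8")
    if "url" in options:
        encoded = base64.urlsafe_b64encode(data)  # alphabet ends with - and _
    else:
        encoded = base64.b64encode(data)          # alphabet ends with + and /
    text = encoded.decode("ascii")
    return text.rstrip("=") if "nopad" in options else text


print(encode_base64("hi"))           # aGk=
print(encode_base64("hi", "nopad"))  # aGk
print(encode_base64(">>>"))          # Pj4+
print(encode_base64(">>>", "url"))   # Pj4-
```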

The decodeBase64 function

str.decodeBase64()

This function decodes a UTF-8 string encoded in Base64. The decoded data must be valid UTF-8; otherwise, Aba shows an error message. In the current version, you cannot decode binary data with this function.

Both standard Base64 and URL-safe Base64 (base64url) are supported. The encoded string can be padded with equal signs = or include additional whitespace and newline characters, which are skipped. If the encoded string contains other characters (such as punctuation or non-ASCII characters), Aba shows an error message. An error message is also displayed if the last group of characters consists of only one character.

Search for | Replace to | Explanation
<img src="data:image/svg\+xml;base64,([^"]+)"> | \{\1.decodeBase64().replace('<?xml version="1.0"?><svg xmlns="http://www.w3.org/2000/svg"', '<svg')} | Replace data URLs with svg tags
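The validation rules above can be modeled in Python as follows. This is a minimal sketch under stated assumptions: the name decode_base64 is hypothetical, padding is accepted only at the end of the input, and errors are raised as ValueError where Aba would show an error message.

```python
import base64
import re


def decode_base64(encoded: str) -> str:
    """Decode standard or URL-safe Base64 into a UTF-8 string."""
    compact = "".join(encoded.split())  # skip whitespace and newlines
    compact = compact.rstrip("=")       # padding is optional
    if not re.fullmatch(r"[A-Za-z0-9+/_-]*", compact):
        raise ValueError("unexpected character in Base64 input")
    if len(compact) % 4 == 1:
        raise ValueError("the last group consists of only one character")
    # Map the URL-safe alphabet onto the standard one, then restore padding.
    standard = compact.replace("-", "+").replace("_", "/")
    padded = standard + "=" * (-len(standard) % 4)
    # Invalid UTF-8 raises UnicodeDecodeError, a subclass of ValueError.
    return base64.b64decode(padded).decode("utf-8")


print(decode_base64("aGk="))   # hi
print(decode_base64("aG\nk"))  # hi
print(decode_base64("Pj4-"))   # >>>
```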

The encodeHtml function

str.encodeHtml()
str.encodeHtml('noquotes')
str.encodeHtml('apos')
str.encodeHtml('nodouble')
str.encodeHtml('noquotes,nodouble')
str.encodeHtml('apos,nodouble')

This function encodes special characters like < to HTML entities like &lt;. It's similar to PHP's htmlspecialchars. The function returns the encoded string; the original string is not changed.

The following characters are encoded:

< to &lt;
> to &gt;
& to &amp;
" to &quot;
' to &#39;

If 'noquotes' is passed as a parameter, the quotes are not encoded. Encoding quotes is necessary when you insert the encoded string into an HTML attribute; for a text node, it's not mandatory.

If 'apos' is passed as a parameter, the single quote is encoded to the &apos; entity, which is supported in HTML5 and XML. By default, Aba uses &#39; to remain compatible with legacy Internet Explorer versions.

If 'nodouble' is passed as a parameter, the existing entities are not double-encoded. For example, if the string already contains &lt;, it's not changed to &amp;lt;. This flag can be combined with noquotes and apos.

Search for | Replace to | Explanation
<pre>(.*?)</pre> | <pre>\{ \1.encodeHtml('noquotes,nodouble') }</pre> | Correct < and > characters inside <pre> tags
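The three flags can be modeled with a short Python sketch. It is an approximation, not Aba's implementation: the name encode_html is hypothetical, and the regular expression used to recognize an existing entity for 'nodouble' is an assumption about what counts as an entity.

```python
import re

# An ampersand that does NOT start something shaped like an entity.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);)"
)


def encode_html(text: str, flags: str = "") -> str:
    """Encode <, >, &, and (unless 'noquotes' is given) both quote characters."""
    options = {flag for flag in flags.split(",") if flag}
    if "nodouble" in options:
        text = BARE_AMPERSAND.sub("&amp;", text)  # keep existing entities
    else:
        text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;").replace(">", "&gt;")
    if "noquotes" not in options:
        text = text.replace('"', "&quot;")
        text = text.replace("'", "&apos;" if "apos" in options else "&#39;")
    return text


print(encode_html("a < b & c"))             # a &lt; b &amp; c
print(encode_html("&lt; & x", "nodouble"))  # &lt; &amp; x
print(encode_html("it's", "apos"))          # it&apos;s
```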

The decodeHtml function

str.decodeHtml()

This function decodes HTML entities like &lt;, &#39;, or &#x27; to the corresponding characters. All HTML4 and XML named character entities are supported as well as decimal and hexadecimal entities.

Search for | Replace to | Explanation
<pre>(.*?)</pre> | ```\{ \1.decodeHtml() }``` | Convert <pre> tags to markdown code blocks and replace HTML entities
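For comparison, Python's standard library offers html.unescape, which decodes named, decimal, and hexadecimal entities. It is shown here only as an analogue. It recognizes the HTML5 entity list, which may be broader than the set Aba supports.

```python
import html

print(html.unescape("&lt;pre&gt;"))  # <pre>
print(html.unescape("&#39;"))        # '
print(html.unescape("&#x27;"))       # '
print(html.unescape("&amp;"))        # &
```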

This is a page from the Aba Search and Replace help file.