# An O(n) Python 3 algorithm that halves the number of characters to be removed

First written on March 2, 2018
Last updated on December 29, 2021

It’s been a while.

Some days ago I thought to use this algorithm to remove duplicate escape characters in my md-toc program. I then realized I didn’t need it and also noticed that dealing correctly with escape characters seems very hard. So I generalized the algorithm to get any number of a specified charatcter in a string, except `\` !!, to be halvened.

## Algorithm ¶

Here’s the algorithm along with some unit tests:

``````import math
import unittest

def halve_characters(s, remove_char):
assert isinstance(s, str)
assert isinstance(remove_char, str)
assert len(remove_char) == 1
assert remove_char != '\\'

i = 0
final_string = str()
while i < len(s):
if s[i] == remove_char:
j = i
count = 1
match = True
while j < len(s) - 1 and match:
if s[j] == s[j + 1]:
count += 1
else:
match = False
j += 1
remove_char_count = math.floor(count / 2)
final_string += remove_char_count * remove_char
i += count
else:
final_string += s[i]
i += 1

return final_string

class Test(unittest.TestCase):
def test_halve_characters(self):
self.assertEqual(halve_characters('thiis is a stringiioo. Hiiii', 'i'), 'this s a strngioo. Hii')
self.assertEqual(halve_characters('****\n', '*'), '**\n')
self.assertEqual(halve_characters('\\\n\\\\\n', '\n'), '\\\\\\')
self.assertEqual(halve_characters('', 'a'), '')
self.assertEqual(halve_characters('abcdefffghifffff', 'f'), 'abcdefghiff')

if __name__ == '__main__':
unittest.main()
``````

## Explanation ¶

The idea behind it is that we iterate character by character through the string until we find the an element that needs to be removed: `if s[i] == remove_char`

If this condition is true, we set a counter `count = 1` which is a counter for the number of `remove_char` characters.

We assume that the next character in the string will also be `remove_char`. If it is it then we continue to count the occurrencies of `remove_char`.

Once we reach the end of the string or if there are no more consecutive `remove_chars` we compute the halved number of remove characters with `math.floor(count / 2)` and concatenate `remove_char * math.floor(count / 2)` to the final string. Using the floor function implies that the count is approximated to the nearest smaller integer number.

If the current character is not `if s[i] != remove_char` we simply add it to the final string.

## Complexity ¶

The complexity is O(n) because each character is inspected only once:

``````while i < len(s):
if s[i] == remove_char:
[...]
# Remember what count is.
i += count
else:
i += 1
``````

Enjoy!