This is a short blog post. I wanted to document this recipe for my own benefit, and hopefully it will help others. I was working with a very messy dataset in which some columns contained non-alphanumeric characters such as #, !, $, ^, *, ) and even emojis.

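Python strings have two methods that help here, isalnum and isalpha. (NumPy offers element-wise versions in numpy.char, but the code below uses the built-in str methods.)

isalnum returns True if the string is non-empty and all of its characters are alphanumeric, i.e. letters or numbers.

isalpha returns True if the string is non-empty and all of its characters are alphabetic (letters only, no numbers).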
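As a quick check of how these two methods behave, here is a minimal sketch using plain Python strings. The sample strings are made up for illustration and are not from the dataset.

```python
# str.isalnum / str.isalpha test a whole string at once.
print("abc111".isalnum())   # True
print("abc111".isalpha())   # False: contains digits
print("a_b_c".isalnum())    # False: underscore is not alphanumeric
print("".isalnum())         # False: empty strings never pass

# Applied one character at a time, they work as a filter predicate.
# "\U0001F600" is an emoji (grinning face); it is dropped like any symbol.
kept = "".join(ch for ch in "abc \U0001F600 #1" if ch.isalnum())
print(kept)                 # abc1

# Both methods are Unicode-aware, so accented letters count as letters.
print("\u00e9".isalpha())   # True ("\u00e9" is e with an acute accent)
```

The per-character form is what the recipes in this post rely on: each character is tested on its own, and only the ones that pass are joined back together.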

import numpy as np
import pandas as pd
df = pd.DataFrame({'col':['abc', 'a b c', 'a_b_c', '#$#$abc', 'abc111', 'abc111#@$@', '  abc   !!! 123', 'ABC']})
df
               col
0              abc
1            a b c
2            a_b_c
3          #$#$abc
4           abc111
5       abc111#@$@
6    abc   !!! 123
7              ABC

Remove symbols and return alphanumerics

def alphanum(element):
    # Keep only the characters for which str.isalnum is True
    return "".join(filter(str.isalnum, element))
df.loc[:,'alphanum'] = [alphanum(x) for x in df.col]
df
               col  alphanum
0              abc       abc
1            a b c       abc
2            a_b_c       abc
3          #$#$abc       abc
4           abc111    abc111
5       abc111#@$@    abc111
6    abc   !!! 123    abc123
7              ABC       ABC
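For comparison, a similar result can be had with a regular expression through pandas' vectorized string methods. This is only a sketch of an alternative, not the approach used in this post, and the two are not identical: the character class below is ASCII-only, while str.isalnum also keeps non-ASCII letters. The small frame and the 'café!' row are invented for the demonstration.

```python
import pandas as pd

# Hypothetical sample; 'caf\u00e9!' is "café!" with a precomposed accented e.
df = pd.DataFrame({'col': ['a_b_c', 'abc111#@$@', '  abc   !!! 123', 'caf\u00e9!']})

# Drop every character that is not an ASCII letter or digit.
ascii_only = df['col'].str.replace(r'[^A-Za-z0-9]', '', regex=True)

# Filtering with str.isalnum keeps non-ASCII letters as well.
unicode_aware = df['col'].map(lambda s: "".join(filter(str.isalnum, s)))

print(ascii_only.tolist())     # ['abc', 'abc111', 'abc123', 'caf']
print(unicode_aware.tolist())  # ['abc', 'abc111', 'abc123', 'café']
```

Which one is preferable depends on whether accented or non-Latin letters should survive the cleaning step.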

Remove symbols & numbers and return alphabets only

def alphabets(element):
    # Keep only the characters for which str.isalpha is True
    return "".join(filter(str.isalpha, element))
df.loc[:,'alphabets'] = [alphabets(x) for x in df.col]
df
               col  alphanum  alphabets
0              abc       abc        abc
1            a b c       abc        abc
2            a_b_c       abc        abc
3          #$#$abc       abc        abc
4           abc111    abc111        abc
5       abc111#@$@    abc111        abc
6    abc   !!! 123    abc123        abc
7              ABC       ABC        ABC

Bonus: Remove symbols & letters and return numbers only

def numbers(element):
    # Keep only the characters for which str.isnumeric is True
    return "".join(filter(str.isnumeric, element))
df.loc[:,'num'] = [numbers(x) for x in df.col]
df
               col  alphanum  alphabets  num
0              abc       abc        abc
1            a b c       abc        abc
2            a_b_c       abc        abc
3          #$#$abc       abc        abc
4           abc111    abc111        abc  111
5       abc111#@$@    abc111        abc  111
6    abc   !!! 123    abc123        abc  123
7              ABC       ABC        ABC
df.dtypes
col          object
alphanum     object
alphabets    object
num          object
dtype: object

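Note that the num column has dtype object (i.e. it holds strings), not a numeric dtype, so be sure to convert it to int before doing arithmetic. Rows with no digits hold empty strings, which a plain int conversion will reject.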
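One way to do that conversion is sketched below, assuming missing values are acceptable for rows that had no digits. The standalone Series stands in for the num column, and the choice of the nullable Int64 dtype is an assumption for illustration, not part of the original recipe.

```python
import pandas as pd

# Stand-in for the num column: empty strings where no digits were found.
num = pd.Series(['', '111', '123'], name='num')

# Coerce anything unparseable to a missing value, then switch to the
# nullable Int64 dtype so the column stays integer-typed.
num_int = pd.to_numeric(num, errors='coerce').astype('Int64')
print(num_int.dtype)      # Int64
print(num_int.tolist())   # [<NA>, 111, 123]

# Caveat: str.isnumeric also accepts characters that int() cannot parse,
# such as superscript two ("\u00b2"). str.isdecimal is stricter.
print("\u00b2".isnumeric(), "\u00b2".isdecimal())  # True False
```

If the data may contain such characters, filtering with str.isdecimal instead of str.isnumeric keeps only digits that convert cleanly.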