The ability to clean data is one of the most valuable skills for any data analyst. Below are 3 ways how to clean any special character from a string.

## Using a Formula

**=TEXTJOIN(“”,TRUE,IF(((ABS(CODE(MID(formula!$B3,SEQUENCE(LEN(formula!$B3)),1))-109.5)<=12.5)+(ABS(CODE(MID(formula!$B3,SEQUENCE(LEN(formula!$B3)),1))-77.5)<=12.5)+(ABS(CODE(MID(formula!$B3,SEQUENCE(LEN(formula!$B3)),1))-52.5)<=4.5)),CHAR(CODE(MID(formula!$B3,SEQUENCE(LEN(formula!$B3)),1))),“”))**

if you do not have office365, replace the SEQUENCE function with ROW & INDIRECT functions as shown below

**=TEXTJOIN(“”,TRUE,IF(((ABS(CODE(MID(formula!$B3,ROW(INDIRECT(“1:”&LEN(formula!$B3))),1))-109.5)<=12.5)+ *check for lowercase*(ABS(CODE(MID(formula!$B3,ROW(INDIRECT(“1:”&LEN(formula!$B3))),1))-77.5)<=12.5)+ **

***check for UPPERCASE**(ABS(CODE(MID(formula!$B3,ROW(INDIRECT(“1:”&LEN(formula!$B3))),1))-52.5)<=4.5)), *check for number*CHAR(CODE(MID(formula!$B3,ROW(INDIRECT(“1:”&LEN(formula!$B3))),1))), *return the code and convert to char*“”))

How the Formula Works

NB: If you ever want to check if a string does not contain a special character use the below formula

**=SUM((ABS(CODE(MID(C3,SEQUENCE(LEN(C3)),1))-77.5)<=12.5)+(ABS(CODE(MID(C3,SEQUENCE(LEN(C3)),1))-109.5)<=12.5)+(ABS(CODE(MID(C3,SEQUENCE(LEN(C3)),1))-52.5)<=4.5))=LEN(C3)**

Thanks to a user called Xlambda for Mr. Excel Forum, we can now check if a character is a letter using the EXACT function

**A letter is a character that has different values between its lower- and upper-case values.**

Therefore,** =EXACT( LOWER(string) , UPPER(string) )** will always return FALSE for all letters

**=TEXTJOIN(“”,TRUE, IF( EXACT(UPPER(MID(B3,ROW(INDIRECT(“1:”&LEN(B3))),1)),LOWER(MID(B3,ROW(INDIRECT(“1:”&LEN(B3))),1))) * NOT(ISNUMBER(MID(B3,ROW(INDIRECT(“1:”&LEN(B3))),1)+0)), “”, MID(B3,ROW(INDIRECT(“1:”&LEN(B3))),1) ))**

**How the Formula Works**

**EXACT(UPPER(MID(B3,ROW(INDIRECT(“1:”&LEN(B3))),1)),LOWER(MID(B3,ROW(INDIRECT(“1:”&LEN(B3))),1)))** checks if character is not a letter

**NOT(ISNUMBER(MID(B3,ROW(INDIRECT(“1:”&LEN(B3))),1)+0))** checks if character is not a number

Now with the new knowledge, to check if a string contains a special character,

**= IF(SUM(EXACT(UPPER(MID(B4,ROW(INDIRECT(“1:”&LEN(B4))),1)),LOWER(MID(B4,ROW(INDIRECT(“1:”&LEN(B4))),1)))*NOT(ISNUMBER(MID(B4,ROW(INDIRECT(“1:”&LEN(B4))),1)+0))),1,0)**

You can replace the IF function with th**e FILTER function**

**=TEXTJOIN(“”,TRUE, FILTER( MID(E3,SEQUENCE(LEN(E3)),1),NOT(EXACT(UPPER(MID(E3,SEQUENCE(LEN(E3)),1)),LOWER(MID(E3,SEQUENCE(LEN(E3)),1))))+ISNUMBER(MID(E3,SEQUENCE(LEN(E3)),1)+0)))**

## Use Power Query to Clean Any Special Character

**= Table.AddColumn(Source, “Clean Data”, eachText.Select([DIRTY DATA],{“a”..”z”,”A”..”Z”,”0″..”9″}))**

## Using a Recursive LAMBDA to Clean Any Special Character

**=LAMBDA(ddata, start, IF(start<=LEN(ddata), CLEANDATA( LET(char, MID(ddata,start,1),charcode, CODE(char),IF((ABS(charcode-77.5)<=12.5)+(ABS(charcode-109.5)<=12.5)+(ABS(charcode-52.5)<=4.5), ddata , REPLACE(ddata,start,1,”-“)) ), start+1),SUBSTITUTE(ddata, “-“, “”) ))**

**How it works:**

**start<=LEN(ddata)** always check if we have looped through the whole string. If not yet, it calls the function CLEANDATA

**MID(ddata,start,1)** Retrieves every character in the string one at a time which is later converted into a code **CODE(char)**

**(ABS(charcode-77.5)<=12.5)+(ABS(charcode-109.5)<=12.5)+(ABS(charcode-52.5)<=4.5)** Checks if the extracted character code is either lowercase, UPPERCASE or number, if true, returns the character

** REPLACE(ddata,start,1,”-“)** if a special character is found it is replaced with a hyphen “-“

**SUBSTITUTE(ddata, “-“, “”)** When we have looped through the whole string, substitute “-” with nothing

**See more recursive LAMBDAs on Mr. Excel Forum**

With the new knowledge on how to check for any letter in a string, you can re-write the recursive LAMBDA below

**=LAMBDA(ddata, start,IF(start<=LEN(ddata),CLEANDATA(LET(char, MID(ddata,start,1),IF(EXACT(LOWER(char),UPPER(char))*NOT(ISNUMBER(char+0)),REPLACE(ddata,start,1,”-“), ddata)),start+1),SUBSTITUTE(ddata, “-“, “”)))**

## Using a NON-Recursive LAMBDA to Clean Any Special Character

**=LAMBDA(ddata, TEXTJOIN(“”,TRUE, LET( char, MID(ddata,SEQUENCE(LEN(ddata)),1), charcode, CODE(char), notSpecial, (ABS(charcode-77.5)<=12.5)+(ABS(charcode-109.5)<=12.5)+(ABS(charcode-52.5)<=4.5),IF(notSpecial , char , “”) ) ))**

With the new knowledge on how to check for any letter in a string, you can re-write the non-recursive LAMBDA below

**=LAMBDA(ddata, TEXTJOIN(“”,TRUE, LET(char, MID(ddata,SEQUENCE(LEN(ddata)),1),special, EXACT(LOWER(char),UPPER(char))*NOT(ISNUMBER(char+0)),IF(special, “” , char) ) ))**

## Recent Comments