What is the correct way to use unicode characters in a python regex

Answer by Nozar Safari for What is the correct way to use unicode characters...

August 14, 2018, 4:04 am

I had the same problem. I know this is not an efficient way, but it worked in my case:
result = re.sub(r"\\", ",x,x", result)
result = re.sub(r",x,xu00ad", "", result)
result = re.sub(r",x,xu", "\\u", result)
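For comparison, here is a minimal sketch of doing the same removal in one step. It assumes the text holds the escape as six literal characters (a backslash followed by u00ad) rather than the actual U+00AD soft hyphen; the function name and sample string are hypothetical.

```python
import re

def strip_escaped_soft_hyphens(text):
    # r"\\u00ad" is a regex for a literal backslash followed by "u00ad".
    # Other escapes, such as \u2014, are left untouched.
    return re.sub(r"\\u00ad", "", text)

# Raw string, so the backslashes stay in the text as literal characters.
sample = r"docu\u00adment \u2014 page"
print(strip_escaped_soft_hyphens(sample))
```

Matching the backslash directly avoids the temporary ",x,x" placeholder, so backslashes that are not part of a \u00ad escape are never rewritten.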

Answer by Bohemian for What is the correct way to use unicode characters in a...

September 25, 2013, 8:55 am

Rather than seek out specific unwanted chars, you could remove everything not wanted:
re.sub('[^\\s!-~]', '', my_str)
This throws away all characters not:
whitespace (spaces, tabs, newlines, etc)
printable...
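The whitelist idea can be sketched as below. The function name and sample text are hypothetical, and the sample assumes the unwanted characters are a soft hyphen (U+00AD) and an em dash (U+2014).

```python
import re

def keep_whitespace_and_printable_ascii(text):
    # [^\s!-~] matches any character that is neither whitespace
    # nor in the ASCII range from '!' (0x21) to '~' (0x7E).
    return re.sub(r"[^\s!-~]", "", text)

sample = u"page\u00ad break\t\u2014 end\n"
cleaned = keep_whitespace_and_printable_ascii(sample)
print(repr(cleaned))
```

On Python 3, \s in a str pattern also matches Unicode whitespace such as U+00A0, so those characters are kept; passing flags=re.ASCII restricts \s to ASCII whitespace if that is not wanted.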

What is the correct way to use unicode characters in a python regex

August 14, 2018, 4:04 am

In the process of scraping some documents using Python 2.7, I've run into some annoying page separators, which I've decided to remove. The separators use some funky characters. I already asked one...
