Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex - How to replace many special characters with "something plus special characters" in R

I have this sentence that contains "& / ?".

c = "Do Sam&Lilly like yes/no questions?"

I want to add a whitespace before and after each of the special characters to get

"Do Sam & Lilly like yes / no questions ? "

So far I can only do this the hard way, one character at a time:

c = gsub("[&]", " & ", c)
c = gsub("[/]", " / ", c)
c = gsub("[?]", " ? ", c)

But imagine that I have many of these special characters, which would warrant using a character class such as [:alnum:]. So I am really looking for a solution that looks like this:

gsub("[[:alnum:]]", " [[:alnum:]] ", c)

Unfortunately, I cannot use [:alnum:] as the second argument this way.



1 Answer


You can use a capture group reference:

gsub("([&/])", " \\1 ", c)

Here we replace "&" or "/" with itself ("\\1") padded with spaces. The "\\1" means "use the first captured group from the pattern"; the backslash is doubled because R string literals treat a single backslash as the start of an escape sequence. A captured group is the portion of a regular expression enclosed in parentheses, in our case "([&/])".
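As a minimal sketch of the capture-group approach applied to the sentence from the question, the snippet below adds "?" to the character set so that all three characters are padded. The variable names `sentence` and `padded` are made up for illustration and do not come from the original post.

```r
# Hypothetical variable name; `c` is avoided because it shares its name with base::c().
sentence <- "Do Sam&Lilly like yes/no questions?"

# "([&/?])" captures a single character that is one of &, / or ?.
# " \\1 " writes the captured character back with a space on each side.
padded <- gsub("([&/?])", " \\1 ", sentence)

print(padded)
# [1] "Do Sam & Lilly like yes / no questions ? "
```

Inside a bracket expression "?" is a literal character, so it needs no escaping here.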

You can expand this to cover more symbols / special characters by adding them to the character set, or by using a suitable predefined character class such as [[:punct:]].
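One possible way to generalize this, sketched under a few assumptions: the helper name `pad_punct` is hypothetical, and the POSIX class `[[:punct:]]` is used as the "many special characters" set. Which characters count as punctuation can depend on the regex engine and locale, so it is worth checking against your own data. The helper also collapses repeated spaces and trims the ends, which goes slightly beyond what the question asks for (the trailing space is dropped).

```r
# Hypothetical helper: pad every punctuation character with single spaces.
pad_punct <- function(x) {
  # Capture any one punctuation character and surround it with spaces.
  padded <- gsub("([[:punct:]])", " \\1 ", x)
  # Collapse runs of spaces created by the padding, then trim both ends.
  trimws(gsub(" +", " ", padded))
}

pad_punct("Do Sam&Lilly like yes/no questions?")
# [1] "Do Sam & Lilly like yes / no questions ?"

pad_punct("Hello,world!")
# [1] "Hello , world !"
```

If only a known handful of characters should be padded, an explicit set such as "([&/?])" is the more predictable choice.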

Note: you probably shouldn't use c as a variable name, since it is also the name of a very commonly used function, c(), which combines values into a vector.

