1 vote

Get the last number

How do I get the last integer of each line?

df <- data.frame( col1 = c(1,2,3,4,5), 
                  col2 = c("300 ww 1.4/-: Tb  6b 2170","erty 300 ww 1.4",
                         "sss 2 ss 34"," verde rojo (8383)","er:.56 tomate.455"))

  col1                      col2
1    1 300 ww 1.4/-: Tb  6b 2170
2    2           erty 300 ww 1.4
3    3               sss 2 ss 34
4    4         verde rojo (8383)
5    5         er:.56 tomate.455

The expected result would be:

  col1                      col2          col3
1    1 300 ww 1.4/-: Tb  6b 2170          2170
2    2           erty 300 ww 1.4          (none, because 1.4 has a decimal part and an integer is wanted)
3    3               sss 2 ss 34          34
4    4         verde rojo (8383)          (none, because the number is inside parentheses)
5    5         er:.56 tomate.455          455 (even though a . precedes it, it is the last number)

I would appreciate any help I can get.

1 vote

Patricio Moracho Points 24098

From the example, I understand that you always want the last "word" of each string. A simple way in base R is to split each string into words, try to convert the last word to a number, and finally check whether that number is an integer:

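df$col2num <- sapply(strsplit(as.character(df$col2), "\\D\\.| "),
       FUN = function(x) {
         # Take the last piece; a failed conversion raises a warning, which becomes NA
         num <- tryCatch(as.numeric(rev(x)[1]), warning = function(w) NA)
         # Keep the value only if it is a whole number
         ifelse(num == as.integer(num), num, NA)
       }
)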

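The split pattern "\\D\\.| " matches: 1) any period preceded by a non-digit character (that character is consumed together with the period) and 2) a space. A period that follows a digit, as in 1.4, is not a split point, so decimal numbers stay in one piece and are later rejected by the integer check.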
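
To see what the pattern does, it can help to look at the pieces strsplit produces for the sample strings. This is only an illustrative sketch in base R; the names x, pieces and last_piece are introduced here and are not part of the answer above, and x simply retypes the question's col2 values:

```r
# Sample strings from the question's col2 column (retyped here for illustration).
x <- c("300 ww 1.4/-: Tb  6b 2170", "erty 300 ww 1.4",
       "sss 2 ss 34", " verde rojo (8383)", "er:.56 tomate.455")

# Split on a space, or on a non-digit character followed by a period.
pieces <- strsplit(x, "\\D\\.| ")

# The last piece of each string is the candidate for the "last number".
last_piece <- vapply(pieces, function(p) p[length(p)], character(1))
print(last_piece)
```

For these inputs the last pieces are "2170", "1.4", "34", "(8383)" and "455". Only "2170", "34" and "455" then pass the numeric conversion and the integer check, which matches the expected result in the question.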
