Extracting the nth decimal from a string

Question


I've searched through half of Stack Overflow for this, but nothing I found quite fits. Apologies if this is a duplicate.

I have a string with the format:

fname <-'FS1_SCN0.83_axg3.csv'

I would like to extract the second number, which is a decimal here but could also be an integer, and get 0.83 as the result (or, for example, 3 if it is an integer). The closest I got was:

gsub("[^0-9.]","\\2",fname)

which returns every digit and decimal point in fname, but concatenated into a single string (10.833.).

Thanks in advance, p.

+3

regex r

user3310782 May 18 '15 at 9:07


5 answers

Regex

.+_SCN(\d+(?:\.\d+)?)_.+\.csv


Sample code

sub(".+_SCN(\\d+(?:\\.\\d+)?)_.+\\.csv", "\\1", fname)
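As a quick check that the same pattern also covers the integer case from the question, here is a small sketch. The second file name is a made-up example, and perl = TRUE is my own addition so the non-capturing group is handled by PCRE; neither comes from the answer above.

```r
# Two file names: the one from the question and a hypothetical integer variant
fnames <- c("FS1_SCN0.83_axg3.csv", "FS1_SCN3_axg3.csv")

# Same pattern as in the sample code; group 1 captures the number after "_SCN"
pattern <- ".+_SCN(\\d+(?:\\.\\d+)?)_.+\\.csv"

# sub() is vectorised, so both names are processed in one call
scn <- sub(pattern, "\\1", fnames, perl = TRUE)

# Convert to numeric if the value is needed for calculations
scn_num <- as.numeric(scn)
print(scn)
print(scn_num)
```

Note that sub() returns the input unchanged when the pattern does not match, so a file name without the _SCN part would come back as-is rather than as NA.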

+3

Stephan May 18 '15 at 9:12


^.*?(?:\\d+(?:\\.\\d+)?).*?\\K\\d+(?:\\.\\d+)?

You can use this pattern with the parameter perl=TRUE and get the match directly. See the demo:

https://www.regex101.com/r/fJ6cR4/8

or

gsub("^.*?(?:\\d+(?:\\.\\d+)?).*?(\\d+(?:\\.\\d+)?).*$","\\1",fname,perl=TRUE)

+2

vks May 18 '15 at 9:12


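You can use str_extract_all() from the stringr package to match all numbers in the input, and then take the second element of the result:

library(stringr)

# str_extract_all() returns a list with one character vector per input string,
# so [[1]] selects the matches for fname and [2] picks the second number
str_extract_all(fname, "([0-9]+(?:\\.[0-9]+)?)")[[1]][2]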

+2

anubhava May 18 '15 at 9:13


As per your comment, you can use this: _[A-Z]+(\d+(\.\d+)?)

As a minor note, this answer does nothing that hasn't already been posted. I just think it's a bit more readable and easier to follow.

If you know the exact characters, it might make sense to replace the [A-Z] section with those specific characters. This would make the expression even more intuitive.
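Since the pattern above is given as a bare regex, here is a rough sketch of one way it could be applied in R. Using regexec() with regmatches() and perl = TRUE is my own choice for illustration, not something taken from the answer.

```r
fname <- "FS1_SCN0.83_axg3.csv"

# regexec() returns the full match plus every capture group;
# element 1 is the whole match, element 2 is the first group (the number)
m <- regmatches(fname, regexec("_[A-Z]+(\\d+(\\.\\d+)?)", fname, perl = TRUE))
value <- m[[1]][2]

# Variant with the known literal prefix instead of [A-Z]+
m_lit <- regmatches(fname, regexec("_SCN(\\d+(\\.\\d+)?)", fname, perl = TRUE))
value_lit <- m_lit[[1]][2]

print(value)
print(value_lit)
```

Both variants pick out the number directly after the uppercase prefix, so they rely on that prefix being present rather than on the position of the number in the string.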

+1

npinti May 18 '15 at 9:45


Avinash Raj · Accepted Answer · 2015-05-18T09:11:01+0000

To get the second number,

regmatches(x, regexpr("^\\D*\\d+\\D*\\K\\d+(?:\\.\\d+)?", x, perl=TRUE))


or

sub("^\\D*\\d+\\D*(\\d+(?:\\.\\d+)?).*", "\\1", x, perl=TRUE)

Example:

> x <-'FS1_SCN0.83_axg3.csv'
> regmatches(x, regexpr("^\\D*\\d+\\D*\\K\\d+(?:\\.\\d+)?", x, perl=TRUE))
[1] "0.83"
> sub("^\\D*\\d+\\D*(\\d+(?:\\.\\d+)?).*", "\\1", x, perl=TRUE)
[1] "0.83"

For a more general case, where the first number may itself be a decimal:

regmatches(x, regexpr("^\\D*\\d+(?:\\.\\d+)?\\D*\\K\\d+(?:\\.\\d+)?", x, perl=TRUE))
sub("^\\D*\\d+(?:\\.\\d+)?\\D*(\\d+(?:\\.\\d+)?).*", "\\1", x, perl=TRUE)

OR

Just provide the index to get the number you want.

> regmatches(fname, gregexpr("\\d+(?:\\.\\d+)?", fname))[[1]][2]
[1] "0.83"
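The index-based approach can be wrapped in a small helper so that n becomes a parameter. This is only a sketch: the function name extract_nth_number, the NA fallback, and the use of perl = TRUE are my own assumptions, not part of the answer above.

```r
# Hypothetical helper: return the n-th number (integer or decimal) in each string,
# or NA when a string contains fewer than n numbers
extract_nth_number <- function(x, n) {
  # gregexpr() finds every match; regmatches() returns one character vector per string
  matches <- regmatches(x, gregexpr("\\d+(?:\\.\\d+)?", x, perl = TRUE))
  vapply(
    matches,
    function(m) if (length(m) >= n) m[n] else NA_character_,
    character(1)
  )
}

fname <- "FS1_SCN0.83_axg3.csv"
second <- extract_nth_number(fname, 2)
missing <- extract_nth_number(fname, 5)
print(second)
print(missing)
```

The result is still a character vector, so wrap it in as.numeric() if you need the value as a number.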
