Regex match for a G+ profile URL

Question

I am trying to match only the user ID or vanity name in Google+ profile URLs. I am using GAS (Google Apps Script), into which I have loaded XRegExp to match Unicode characters.

So far I have this: ((https?://)?(plus\.)?google\.com/)?(.*/)?([a-zA-Z0-9._]*)($|\?.*)

You can see it on a regex tester (external site), but it still doesn't match just the right-hand part of the URL.

I tried using \p{L} inside [a-zA-Z0-9._], but had no luck with that. Also, I end up with an extra slash at the end of the profile name when it matches.

UPDATE #1: I am trying to fix every G+ URL in a spreadsheet copied from a Google Form. The links are not all in the same format; the simplest profile link is "https://plus.google.com/" + user ID or vanity name.

UPDATE #2: So far I have ([+]\w+|[0-9]{21})(?:\/)?(?:\w+)?$

using @demrks's simplified version of @guest271314's answer. However, there are two problems:

1) Google vanity URLs can contain Unicode characters. Example: https://plus.google.com/u/0/+JoseManuelGarcía_ertatto

That one fails. I tried using \p{L} but couldn't figure it out.

2) GAS doesn't seem to like this expression, although it passes the tests on the regex testing site. =(

UPDATE #3: It seems GAS just hates \w, so I had to expand it. This is what I have so far:

/([+][A-Za-z0-9-_]+|[0-9]{21})(?:\/)?(?:[A-Za-z0-9-_]+)?$/

It even handles "/about" or "/posts" at the end of the URL. However, it is still not Unicode compliant. =( I am still working on this.
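To make the behaviour described in UPDATE #3 concrete, here is a minimal sketch in plain JavaScript (not GAS with XRegExp). The helper name `matchAscii` and the sample URLs are only illustrative assumptions. It shows the ASCII-only pattern handling trailing path segments and failing on an accented vanity name.

```javascript
// ASCII-only pattern from UPDATE #3, written as a regex literal.
const asciiPattern = /([+][A-Za-z0-9-_]+|[0-9]{21})(?:\/)?(?:[A-Za-z0-9-_]+)?$/;

// Hypothetical helper: returns capture group 1, or null when nothing matches.
function matchAscii(url) {
  const m = asciiPattern.exec(url);
  return m ? m[1] : null;
}

const asciiResults = [
  "https://plus.google.com/+google/about",
  "https://plus.google.com/104645458102703754878/posts",
  "https://plus.google.com/u/0/+JoseManuelGarc\u00eda_ertatto", // "í" is outside A-Za-z
].map(matchAscii);

console.log(asciiResults);
```

The third URL yields null because the "í" is not in the character class and the pattern is anchored with $, so the match cannot reach the end of the string.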

UPDATE #4: This seems to work: /([+][\\w-_\\p{L}]+|[\\d]{21})(?:\/)?(?:[\\w-_]+)?$/

Looks like I needed to double the backslashes in the character classes. So it works for now. Not sure if there is a shorter way to write this.

+3

javascript regex

flamusdiu 30 Aug 14 at 15:02


4 answers

Edit, updated

Try (v4)

document.URL.match(/\++\w+.*|\d+\d|\/+\w+$/).toString()
.replace(/\/+|posts|about|photos|videos|plusones|reviews/g, "")

e.g.

var urls = ["https://plus.google.com/+google/posts"
            , "https://plus.google.com/+google/about"
            , "https://plus.google.com/+google/photos"
            , "https://plus.google.com/+google/videos"
            , "https://plus.google.com/+google/plusones"
            , "https://plus.google.com/+google/reviews"
            , "https://plus.google.com/communities/104645458102703754878"
            , "https://plus.google.com/u/0/LONGIDHERE"
            , "https://plus.google.com/u/0/+JoseManuelGarcía_ertatto"];
var _urls = [];

// Match the profile part of each URL, then strip slashes and known tab names.
urls.forEach(function(item) {
  _urls.push(item.match(/\++\w+.*|\d+\d|\/+\w+$/).toString()
            .replace(/\/+|posts|about|photos|videos|plusones|reviews/g, ""));
});

// Append each extracted id to the page.
_urls.forEach(function(id) {
    var _id = document.createElement("div");
    _id.innerHTML = id;
    document.body.appendChild(_id);
});

jsfiddle http://jsfiddle.net/guest271314/o4kvftwh/
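One thing to watch with the snippet above: `match` returns null when nothing matches, so calling `.toString()` on the result throws. A guarded variant might look like the sketch below; the function name `profilePart` is a hypothetical wrapper, not part of the original answer.

```javascript
// Hypothetical wrapper around the match-then-strip approach shown above.
function profilePart(url) {
  const m = url.match(/\++\w+.*|\d+\d|\/+\w+$/);
  if (m === null) {
    return null; // nothing recognizable in this URL
  }
  // Remove slashes and the known tab names from the matched text.
  return m[0].replace(/\/+|posts|about|photos|videos|plusones|reviews/g, "");
}

console.log(profilePart("https://plus.google.com/+google/posts"));
console.log(profilePart("https://plus.google.com/"));
console.log(profilePart("https://plus.google.com/+aboutme"));
```

Note that the replace step removes the tab names wherever they occur, so a vanity name that happens to contain one of them (the last call above) gets shortened too.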

+3

guest271314 30 Aug 14 at 15:10


Here is a possible solution:

(?:\+)(\w+)|(?:\/)(\w+)$

Explanation:

1st alternative: (?:\+)(\w+)

(?:\+) is a non-capturing group; \+ matches the character + literally.

(\w+) is a capturing group; \w+ matches any word character [a-zA-Z0-9_]. Quantifier: between one and unlimited times.

2nd alternative: (?:\/)(\w+)$

(?:\/) is a non-capturing group; \/ matches the character / literally.

(\w+) is a capturing group; \w+ matches any word character [a-zA-Z0-9_]. Quantifier: between one and unlimited times.

$ asserts the position at the end of the line.
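As a rough illustration of how the two alternatives fill different capture groups, here is a small sketch. The helper name `capturedName` is made up for this example.

```javascript
const groupedPattern = /(?:\+)(\w+)|(?:\/)(\w+)$/;

// Hypothetical helper: returns whichever capture group took part in the match.
function capturedName(url) {
  const m = groupedPattern.exec(url);
  if (m === null) {
    return null;
  }
  // Group 1 is set by the "+name" alternative, group 2 by the trailing "/segment" one.
  return m[1] !== undefined ? m[1] : m[2];
}

console.log(capturedName("https://plus.google.com/+google/posts"));
console.log(capturedName("https://plus.google.com/u/0/LONGIDHERE"));
console.log(capturedName("https://plus.google.com/u/0/+JoseManuelGarc\u00eda_ertatto"));
```

Because \w only covers [a-zA-Z0-9_] here, the last call stops at the "í" and returns a truncated name, which is the Unicode problem raised in the question.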

Hope this is helpful!

0

Academia 30 Aug 14 at 18:16


So this works: /([+][\\w-_\\p{L}]+|[\\d]{21})(?:\/)?(?:[\\w-_]+)?$/

Looks like I needed to double the backslashes in the character classes. So it works for now. Not sure if there is a shorter way to write this.
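The doubled backslashes are what you would expect when a pattern is passed as a string: the string parser consumes one backslash before the regex engine sees the pattern. The sketch below illustrates this with the native RegExp constructor and the "u" flag, which is an assumption on my part (a modern engine with \p{L} support) and not the XRegExp setup used in GAS. The pattern is also a slightly tidied variant, not the exact one above.

```javascript
// Assumption: a modern engine with the "u" flag and \p{L}, not XRegExp in GAS.
// In a string, each regex backslash is written twice so that one survives string parsing.
const fromString = new RegExp("([+][\\w\\p{L}-]+|\\d{21})/?(?:[\\w-]+)?$", "u");

// The same pattern as a regex literal needs only single backslashes.
const fromLiteral = /([+][\w\p{L}-]+|\d{21})\/?(?:[\w-]+)?$/u;

const unicodeUrl = "https://plus.google.com/u/0/+JoseManuelGarc\u00eda_ertatto";
const idUrl = "https://plus.google.com/104645458102703754878/about";

console.log(fromString.exec(unicodeUrl)[1]);
console.log(fromLiteral.exec(unicodeUrl)[1]);
console.log(fromString.exec(idUrl)[1]);
```

\w already includes the underscore, so the class here lists only \w, \p{L} and a trailing hyphen. In "u" mode a hyphen placed between \w and another character is rejected as an invalid range, which is why it sits at the end.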

0

flamusdiu 30 Aug 14 at 20:19


Daniel · Accepted Answer · 2014-08-30T23:02:07+0000

This solution should match both IDs and usernames (including Unicode characters):

/\+[^/]+|\d{21}/

http://regexr.com/39ds0

Explanation: Instead of \w (which does not match Unicode characters), I have used a negated character class, [^/] (which matches anything other than "/").
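A minimal sketch of how this pattern could be applied to a list of URLs. The function names `extractProfileId` and `extractProfileIdStrict` are hypothetical, and the stricter variant is my own assumption about how to handle query strings, not part of the answer.

```javascript
// Hypothetical helper using the pattern above: returns "+name", a 21-digit id, or null.
function extractProfileId(url) {
  const m = url.match(/\+[^/]+|\d{21}/);
  return m ? m[0] : null;
}

// Assumed variant that also stops at "?" and "#", so query strings are not captured.
function extractProfileIdStrict(url) {
  const m = url.match(/\+[^/?#]+|\d{21}/);
  return m ? m[0] : null;
}

console.log(extractProfileId("https://plus.google.com/+google/posts"));
console.log(extractProfileId("https://plus.google.com/u/0/+JoseManuelGarc\u00eda_ertatto"));
console.log(extractProfileId("https://plus.google.com/104645458102703754878/about"));
console.log(extractProfileId("https://plus.google.com/+google?hl=en"));
console.log(extractProfileIdStrict("https://plus.google.com/+google?hl=en"));
```

Since [^/] accepts everything except a slash, a query string attached directly to the name is included in the match, as the fourth call shows. The pattern also accepts any 21-digit run, so a community URL with a numeric ID would match as well.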
