Regex for matching Google+ profile URLs

I am trying to match only the user ID or vanity name in Google+ profile URLs. I am using GAS (Google Apps Script), into which I have loaded XRegExp to match Unicode characters.

So far I have this: ((https?://)?(plus\.)?google\.com/)?(.*/)?([a-zA-Z0-9._]*)($|\?.*)

which you can see on a regex tester (external site), but it still doesn't match just the right-hand part.

I tried using \p{L} inside [a-zA-Z0-9._], but no luck with that. Also, I end up with an extra slash at the end of the profile name when it matches.

UPDATE #1: I am trying to clean up the G+ URLs in a spreadsheet populated from a Google Form. The links are not all in the same format; the simplest profile link is "https://plus.google.com/" + user ID or vanity name.

UPDATE #2: So far I have ([+]\w+|[0-9]{21})(?:\/)?(?:\w+)?$

using @demrks's simplified version of @guest271314's answer. However, there are two problems:

1) Google vanity URLs can contain Unicode characters. Example: https://plus.google.com/u/0/+JoseManuelGarcía_ertatto

That fails. I tried using \p{L} but couldn't figure it out.

2) GAS doesn't seem to like this expression, although it works on the regex tester site. =(

UPDATE #3: It seems GAS just doesn't like \w, so I had to expand it. This is what I have so far:

/([+][A-Za-z0-9-_]+|[0-9]{21})(?:\/)?(?:[A-Za-z0-9-_]+)?$/

It even matches "/about" or "/posts" at the end of the URL. However, it is still not Unicode compliant. =( I am still working on this.

UPDATE #4: This seems to work: /([+][\\w-_\\p{L}]+|[\\d]{21})(?:\/)?(?:[\\w-_]+)?$/

It looks like I needed to double the backslashes in the character classes. So it works for now. I am not sure whether there is a shorter way to write this.
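
The doubled backslashes are most likely needed because the pattern is handed over as a string, where `\\` is the string escape for a single backslash; inside a regex literal, `\\w` would instead match a literal backslash followed by `w`. A minimal sketch of the difference, assuming a modern JavaScript engine with the native `u` flag and `\p{L}` support (ES2018) standing in for XRegExp on GAS; the variable names are illustrative, and the hyphen is moved to the end of each class because `u` mode rejects `\w-_` as a range:

```javascript
// Assumption: native RegExp with the `u` flag stands in for XRegExp here.
// In a string, "\\p{L}" becomes the five characters \p{L} once parsed.
const fromString = new RegExp("(\\+[\\w\\p{L}-]+|\\d{21})(?:/[\\w-]+)?$", "u");

// The same pattern as a regex literal needs only single backslashes.
const fromLiteral = /(\+[\w\p{L}-]+|\d{21})(?:\/[\w-]+)?$/u;

const url = "https://plus.google.com/u/0/+JoseManuelGarcía_ertatto";
const a = url.match(fromString)[1];
const b = url.match(fromLiteral)[1];
console.log(a === b, a);
```

Both forms compile to the same expression, so the choice only depends on whether the pattern is written as a literal or built from a string.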

4 answers


This solution should match both IDs and usernames (including Unicode characters):

/\+[^/]+|\d{21}/

      



http://regexr.com/39ds0

Explanation: Instead of \w (which does not match Unicode characters), I used a negated character class, [^/] (which matches anything other than "/").
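
As an illustration of how this pattern might be applied, here is a small sketch. The helper name `extractProfileId` is hypothetical, and excluding `?` and `#` alongside `/` is an added assumption so that a trailing query string or fragment is not captured:

```javascript
// Hypothetical helper: returns the vanity name (with its leading "+")
// or a 21-digit numeric ID, or null when neither is found.
function extractProfileId(url) {
  // [^/?#]+ stops at the next path separator, query string, or fragment,
  // so Unicode letters in vanity names are accepted without \p{L}.
  const match = url.match(/\+[^/?#]+|\d{21}/);
  return match ? match[0] : null;
}

const samples = [
  "https://plus.google.com/+google/posts",
  "https://plus.google.com/u/0/+JoseManuelGarcía_ertatto",
  // Note: any 21-digit run matches, including this community ID.
  "https://plus.google.com/communities/104645458102703754878",
  "https://plus.google.com/+google?hl=en",
  "https://plus.google.com/",
];
const ids = samples.map(extractProfileId);
console.log(ids);
```

Returning `null` for a URL with no recognizable ID lets the caller decide how to flag rows that need manual review.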


Edit (updated)

Try (v4):

document.URL.match(/\++\w+.*|\d+\d|\/+\w+$/).toString()
.replace(/\/+|posts|about|photos|videos|plusones|reviews/g, "")

      



For example:

var urls = ["https://plus.google.com/+google/posts"
            , "https://plus.google.com/+google/about"
            , "https://plus.google.com/+google/photos"
            , "https://plus.google.com/+google/videos"
            , "https://plus.google.com/+google/plusones"
            , "https://plus.google.com/+google/reviews"
            , "https://plus.google.com/communities/104645458102703754878"
            , "https://plus.google.com/u/0/LONGIDHERE"
            , "https://plus.google.com/u/0/+JoseManuelGarcía_ertatto"];
var _urls = [];

// Extract the ID or vanity name from each URL, then strip slashes and tab names.
urls.forEach(function(item) {
  _urls.push(item.match(/\++\w+.*|\d+\d|\/+\w+$/).toString()
            .replace(/\/+|posts|about|photos|videos|plusones|reviews/g, ""));
});

// Show each result in its own <div>.
_urls.forEach(function(id) {
    var _id = document.createElement("div");
    _id.textContent = id;
    document.body.appendChild(_id);
});

      

jsfiddle http://jsfiddle.net/guest271314/o4kvftwh/


Here is a possible solution:

(?:\+)(\w+)|(?:\/)(\w+)$

      

Explanation:

  • 1st alternative: (?:\+)(\w+)

    (?:\+) is a non-capturing group; \+ matches the character + literally. (\w+) is a capturing group; \w+ matches any word character [a-zA-Z0-9_]. Quantifier +: between one and unlimited times.

  • 2nd alternative: (?:\/)(\w+)$

    (?:\/) is a non-capturing group; \/ matches the character / literally. (\w+) is a capturing group; \w+ matches any word character [a-zA-Z0-9_]. Quantifier +: between one and unlimited times. $ asserts the position at the end of the string (or of the line in multiline mode).
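
A short sketch of how the two capture groups behave, assuming plain JavaScript `RegExp` semantics; the helper name `firstCapture` is made up for illustration. Because `\w` is ASCII-only, a name containing accented letters is cut off at the first non-ASCII character:

```javascript
const pattern = /(?:\+)(\w+)|(?:\/)(\w+)$/;

// Hypothetical helper: returns whichever capture group took part in the match.
function firstCapture(url) {
  const m = url.match(pattern);
  if (!m) return null;
  return m[1] !== undefined ? m[1] : m[2];
}

// Group 1: the name after "+", without the "+" itself.
const vanity = firstCapture("https://plus.google.com/+google/posts");
// Group 2: the last path segment, here a numeric ID.
const numeric = firstCapture("https://plus.google.com/communities/104645458102703754878");
// \w stops at "í", so only the ASCII prefix of the name is captured.
const accented = firstCapture("https://plus.google.com/u/0/+JoseManuelGarcía_ertatto");
console.log(vanity, numeric, accented);
```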

Hope this is helpful!


So this works: /([+][\\w-_\\p{L}]+|[\\d]{21})(?:\/)?(?:[\\w-_]+)?$/

It looks like I needed to double the backslashes in the character classes. So it works for now. I am not sure whether there is a shorter way to write this.
