Force first letter of regex matched value to uppercase

I am trying to get better at regular expressions. I am using regex101.com. I have a regex that has two capture groups. Then I use substitution to include my captured values elsewhere.

For example, I have a list of values:

fat dogs 
thin cats
skinny cows
purple salamanders
etc...

      

and this captures them into two groups:

^([^\s]+)\s+([^\s;]+)?.*

      

which I then substitute into new sentences using $1 and $2. For example:

$1 animals like $2 are a result of poor genetics.

      

(obviously this is a stupid example)

This works and I get my sentences, but I'm stumped trying to get $1 to start with an uppercase letter. I can find all sorts of examples of MATCHING upper or lower case, but none of converting to upper case.

It seems I need to do some "functional" processing. I need to pass $1 to something that will split it in two (the first letter and all the other letters), convert the first part to uppercase, then join the parts back together and return the result.

On top of that comes error checking: although it is unlikely that $1 will contain numeric values, we still have to perform a sanity check.

So, if someone can just point me to reading material, I would appreciate it.

+3




4 answers


So at the end of the day, the answer is that you CANNOT use a regex to do the conversion; it just doesn't work that way. With input from others, I was able to fine-tune my approach and still fulfill the purpose of this self-imposed academic exercise.

First, from the OP, you will recall that I had a list, and I was capturing two words from each line of that list into regex groups. OK, I modified the regex to use three capture groups. For example:

^(\S)(\S+)\s+(\S+)?.*
//would turn fat dogs into
//$1 = f, $2 = at, $3 = dogs

      

So, then using Notepad++, I replaced it with this:



\u$1$2 animals like $3 are the result of bad genetics.

So I was able to convert the first letter to uppercase, but as others have pointed out, it is NOT the regex doing the conversion; it is a separate process (Notepad++ in this case, but it could be your C#, Perl, etc.).
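The same three-group idea can be tried outside Notepad++. Below is a minimal Python sketch, assuming the list is held in one multi-line string. Python's re module does not support \u in replacement templates, so the uppercase step is done by a function passed to the substitution. The names PATTERN, build_sentence and text are illustrative only and do not come from the original post; the pattern also uses [ \t]+ instead of \s+ so that a match cannot spill onto the next line.

```python
import re

# Three groups: first letter, rest of the first word, optional second word.
# [ \t]+ (rather than \s+) keeps each match on a single line.
PATTERN = re.compile(r"^(\S)(\S+)[ \t]+(\S+)?.*$", re.MULTILINE)

def build_sentence(m):
    # group(3) is None when the optional second word is missing.
    animal = m.group(3) or ""
    # The uppercase step happens here, in Python, not in the regex.
    return f"{m.group(1).upper()}{m.group(2)} animals like {animal} are the result of bad genetics."

text = "fat dogs\nthin cats\nskinny cows\npurple salamanders"
result = PATTERN.sub(build_sentence, text)
print(result)
```

As in the Notepad++ case, the regex only finds and splits the text; the host language does the case conversion.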

Thanks everyone for helping the newbie.

0




I think it can be very simple, depending on your language of choice. You can iterate over the list of values, find a match for each one, and then put the groups into your string, using the capitalize method on the first group:

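import re

my_list = ["fat dogs", "thin cats", "skinny cows", "purple salamanders"]
for val in my_list:
    m = re.match(r'^([^\s]+)\s+([^\s;]+)?.*', val)
    # group(1) is the adjective, group(2) is the animal
    print("%s animals like %s are a result of poor genetics." % (m.group(1).capitalize(), m.group(2)))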
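
One caveat with the loop above: re.match returns None for a line that does not match, and because the second group is optional, group(2) can be None as well. A slightly more defensive sketch follows; the helper name describe and the choice to return None for unusable lines are assumptions made for illustration.

```python
import re

PATTERN = re.compile(r"^([^\s]+)\s+([^\s;]+)?.*")

def describe(val):
    """Return the sentence for one line, or None if the line cannot be used."""
    m = PATTERN.match(val)
    if m is None or m.group(2) is None:
        return None  # no match at all, or the second word is missing
    return "%s animals like %s are a result of poor genetics." % (
        m.group(1).capitalize(), m.group(2))

for val in ["fat dogs", "thin cats", "lonely", ""]:
    print(describe(val))
```

Note that capitalize() also lowercases the rest of the word, which is harmless for this input but may matter for mixed-case values.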

      



But if you want to do the whole thing with a regex alone, that is hardly possible: you need to modify your string, and modifying text is generally not a task a regex is suited for.

+2




Simply put, a regex replacement can only use what is in the original string. There is no capital F in fat dogs, so you cannot get Fat dogs as output.

However, this is possible in Perl, but only because Perl itself processes the replacement text; it is not a feature of the regex. Below is a short Perl program (no regex) that performs the case conversion when run from the command line:

#!/usr/bin/perl -w
use strict;

print "fat dogs\n";   # fat dogs
print "\ufat dogs\n"; # Fat dogs
print "\Ufat dogs\n"; # FAT DOGS

      

The same escape sequences also work in the replacement part of a regex substitution:

#!/usr/bin/perl -w
use strict;

my $animal = "fat dogs";
$animal =~ s/(\w+) (\w+)/\u$1 \U$2/;
print $animal;  # Fat DOGS

      

Let me repeat: it is Perl that does this, not the regular expression.

Depending on your real-world application, you may not need to change the case of the letter at all. If your input is Fat dogs, then you will get the desired result. Otherwise, you will have to process $1 yourself.

In PHP you can use preg_replace_callback() to process the entire match, including the captured groups, before returning the replacement string. Here's a similar PHP program:

<?php
$animal = "fat dogs";
print(preg_replace_callback('/(\w+) (\w+)/', 'my_callback', $animal));  // Fat DOGS

function my_callback($match) {
  return ucfirst($match[1]) . ' ' . strtoupper($match[2]);
}
?>

      

+2




A regex will only match what is there. What you are doing is essentially:

  • Match elements
  • Display matches

but you want:

  • Match elements
  • Modify matches
  • Display modified matches

A regex doesn't do any "processing" on matches; it's just a syntax for finding matches in the first place.

In most string-processing languages, if you have the matches in variables $1 and $2 as above, you would do something along the lines of:

$1 = upper(substring($1, 0, 1)) + substring($1, 1)

assuming upper() is the language's uppercase function and substring() returns a substring (zero-indexed).
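
As a concrete illustration, the pseudocode above might look like this in Python. This is only a sketch, and the function name capitalize_first is made up for the example. It uppercases just the first character and leaves the rest untouched, unlike the built-in str.capitalize(), which also lowercases the remainder. Empty strings and strings that start with a digit pass through unchanged, because upper() has no effect on digits.

```python
def capitalize_first(s):
    """Uppercase the first character of s and keep the rest as-is."""
    if not s:
        return s  # nothing to capitalize
    # Equivalent of upper(substring(s, 0, 1)) + substring(s, 1)
    return s[:1].upper() + s[1:]

print(capitalize_first("fat"))       # Fat
print(capitalize_first("mcDonald"))  # McDonald
print(capitalize_first("3dogs"))     # 3dogs
```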

+1

