Extracting a substring from a weird string (repeated characters)
I have a series of address lines in the format: 12345 Some Address, Some Square | phone number | surname name
For example:
40327 Ocie Camp Apt. 117, Maywood | 1-155-932-2562 x738 | Sauer Meredith
76106 Tomas Highway, Santa Ana | 722.884.5632 | Roberts Westley
19056 Jamarcus Lane, Lawndale | (151) 847-7455 x133 | Haag Camille
66724 Slip 12-C, Hoover | 841.047.3195 x69422 | Trantow Danielle
99824 Fisher Locks # 247, Akron | (565) 132-9970 x93939 | Wiza bell
I am trying to extract only the "surname name" part.
I've tried a typical str.substring(str.indexOf("|"), str.indexOf("")), but obviously this produces the wrong string.
Any ideas as to how I would get the last name from strings like this?
If your data is consistent, i.e. you ALWAYS have a structure like:
12345 Some Address, Some Square | phone number | surname name
then you can split each string on the pipe character and take the element at index 2:
String myString = "12345 Some Address, Some Square|phone number|surname name";
String[] x = myString.split("\\|");
System.out.println(x[2]);
Edit:
If some elements change their order or are missing, this approach will not work, so you need to validate the input beforehand.
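As a rough illustration of that validation, here is a minimal sketch. The class name PipeSplitExample and the helper nameField are made up for this example, and it assumes that a valid line has exactly three pipe-separated parts:

```java
import java.util.Optional;

class PipeSplitExample {
    // Hypothetical helper: returns the trimmed third field, or empty
    // if the line does not have exactly three pipe-separated parts.
    static Optional<String> nameField(String line) {
        // A limit of -1 keeps trailing empty fields, so "a|b|" still yields three parts.
        String[] parts = line.split("\\|", -1);
        if (parts.length != 3) {
            return Optional.empty();
        }
        String name = parts[2].trim();
        return name.isEmpty() ? Optional.empty() : Optional.of(name);
    }

    public static void main(String[] args) {
        System.out.println(nameField(
                "40327 Ocie Camp Apt. 117, Maywood | 1-155-932-2562 x738 | Sauer Meredith"));
        System.out.println(nameField("no pipes here"));
    }
}
```

Whether a malformed line should be skipped, logged, or rejected depends on your data; the Optional here only makes the "no valid name" case explicit.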
Edit2:
Another approach is to get the index of the last pipe character | and cut the string there using String#substring():
int c = myString.lastIndexOf("|");
System.out.println(myString.substring(c + 1));
You can do this with a regular expression.
^.*\|([^\d]+)[^|]*$
Code:
System.out.println(s.replaceAll("^.*\\|([^\\d]+)[^|]*$", "$1"));
Output:
Sauer Meredith
Roberts Westley
Haag Camille
Trantow Danielle
Wiza Bell
Complete code: https://ideone.com/uON0BP
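One thing to keep in mind is that replaceAll returns the input unchanged when the expression does not match. The sketch below is one possible variant, not the code from the link: it uses the same expression with Matcher.matches() so that a non-matching line can be detected, and trims the captured group. The names LastFieldRegexExample and nameField are made up for this example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastFieldRegexExample {
    // Same expression as above: skip to the last pipe, then capture everything
    // up to the first digit (or the end of the line).
    private static final Pattern LAST_FIELD = Pattern.compile("^.*\\|([^\\d]+)[^|]*$");

    // Hypothetical helper: empty if the line does not match at all.
    static Optional<String> nameField(String line) {
        Matcher m = LAST_FIELD.matcher(line);
        return m.matches() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(nameField(
                "76106 Tomas Highway, Santa Ana | 722.884.5632 | Roberts Westley 19056"));
        System.out.println(nameField("no pipes here"));
    }
}
```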
I used Regular Expressions for this.
Code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.junit.Test; // JUnit 4; with JUnit 5 use org.junit.jupiter.api.Test

@Test
public void test() {
    String[] lines
            = ("40327 Ocie Camp Apt. 117, Maywood|1-155-932-2562 x738|Sauer Meredith\n" +
               "76106 Tomas Highway, Santa Ana|722.884.5632|Roberts Westley")
            .split("\n");
    // Named groups make each part of the line addressable by name.
    Pattern pattern = Pattern.compile("^(?<address>.*?)\\|(?<number>.*?)\\|(?<surname>.*?) (?<name>.*?)$");
    for (String line : lines) {
        Matcher matcher = pattern.matcher(line);
        if (matcher.find()) {
            String surname = matcher.group("surname");
            System.out.println(surname);
        }
    }
}
Output:
Sauer
Roberts
The expression matches one line in the format you specify, and you can easily access the individual parts of the desired line.
It's also easier to maintain if you want to access different parts in the future.
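To illustrate the maintainability point, here is a small sketch that reads all four named groups from the same expression. The class NamedGroupsExample and the helper parse are made up for this example, and like the test above it assumes there are no spaces around the pipes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupsExample {
    // Same expression as in the test above.
    private static final Pattern LINE = Pattern.compile(
            "^(?<address>.*?)\\|(?<number>.*?)\\|(?<surname>.*?) (?<name>.*?)$");

    // Hypothetical helper: returns all four parts keyed by group name,
    // or an empty map if the line does not match.
    static Map<String, String> parse(String line) {
        Map<String, String> parts = new LinkedHashMap<>();
        Matcher m = LINE.matcher(line);
        if (m.matches()) {
            for (String group : new String[] {"address", "number", "surname", "name"}) {
                parts.put(group, m.group(group));
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        Map<String, String> parts =
                parse("40327 Ocie Camp Apt. 117, Maywood|1-155-932-2562 x738|Sauer Meredith");
        System.out.println(parts.get("number"));
        System.out.println(parts.get("name"));
    }
}
```

If your lines do have spaces around the pipes, the surname group would come out empty, so the expression would need something like \\s* after each pipe.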
Use the lastIndexOf method.
This method returns the index within the string of the last occurrence of the specified character, or -1 if the character does not occur.
Example:
String data = "40327 Ocie Camp Apt. 117, Maywood|1-155-932-2562 x738|Sauer Meredith";
System.out.println(data.substring(data.lastIndexOf('|') + 1));
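Note that when the line contains no pipe at all, lastIndexOf returns -1, so substring(-1 + 1) silently returns the whole line. A minimal sketch of a guarded version follows; the class LastIndexOfExample and the helper afterLastPipe are made up for this example, and throwing an exception is just one way to handle the missing delimiter.

```java
class LastIndexOfExample {
    // Hypothetical helper: returns the trimmed text after the last pipe.
    static String afterLastPipe(String data) {
        int pipe = data.lastIndexOf('|');
        if (pipe < 0) {
            // Without this check, substring(0) would return the entire line.
            throw new IllegalArgumentException("No '|' found in: " + data);
        }
        return data.substring(pipe + 1).trim();
    }

    public static void main(String[] args) {
        System.out.println(afterLastPipe(
                "40327 Ocie Camp Apt. 117, Maywood | 1-155-932-2562 x738 | Sauer Meredith"));
    }
}
```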
This is a job for regular expressions:
Pattern rx = Pattern.compile("[^\\|]*\\|[^\\|]*\\|\\s*([^0-9]+)");
String line = "76106 Tomas Highway, Santa Ana|722.884.5632|Roberts Westley 19056";
Matcher m = rx.matcher(line);
if(m.find()){
String surname = m.group(1).trim();
System.out.println(surname);
}
This will print:
Roberts Westley