Combining two regular expressions in Java to extract each user together with all of its info elements
I need to combine two regular expressions into one. The text (a userdoc) looks like this.
INPUT:
<user>textxtxtxtx</user>
<unnecessarytag>unwanted info</unnecessarytag>
<info>infoinfoinfo. part 1.....multiline</info>
<unnecessarytag>unwanted info</unnecessarytag>
<info>infoinfoinfo. part 2.....multiline</info>
There will be many similar blocks in the file.
OUTPUT:
<user>textxtxtxtx</user>
<info>infoinfoinfo. part 1.....multiline</info>
<info>infoinfoinfo. part 2.....multiline</info>
The order must be preserved.
One user can have many info elements, and the file contains many userdocs.
My code so far:
String out = String.join("\n", Files.readAllLines(Paths.get("text.txt")));
Pattern p = Pattern.compile("<user>(.*?)</user>");
Matcher m = p.matcher(out);
Pattern p1 = Pattern.compile("<info>([^<]*)</info>", Pattern.MULTILINE);
Matcher m1 = p1.matcher(out);
I was planning to write:
while (m.find() && m1.find())
{
    String cp = m.group();
    String cp1 = m1.group();
    System.out.println( cp + cp1 );
}
But this produces output in which each user gets only one info element. How do I combine these two regular expressions into a pattern that supports the ab^n format, that is, one user followed by any number of info elements?
Hello, why not treat this as XML and parse it with JDOM2 or any other DOM implementation in Java? Your current approach may be error-prone. Querying the XML will also be simpler, more readable (from a code point of view), and generally more elegant.
To do this, you will need something like the following (I am using JDOM2):
SAXBuilder saxBuilder = new SAXBuilder();
// modelPath is a String derived from the IPath of the file that stores the data
Document originalDoc = saxBuilder.build(new File(modelPath));
Handling the nodes is then pretty straightforward. You can either use the traditional parent -> children traversal, or a slightly more general approach based on XPath expressions, which holds up better when the model structure changes. Both approaches have pros and cons that I suggest you research and evaluate.
For this to work, your structure must change to something like this:
<?xml version="1.0" encoding="UTF-8"?>
<userdocs>
<user name="textxtxtxtx">
<info>...</info>
<info>...</info>
<info>...</info>
</user>
<user name="test2">
<info>...</info>
<info>...</info>
<info>...</info>
</user>
<!-- etc... -->
</userdocs>
You can then extract elements from the document like this:
// Evaluates an XPath expression against the document and returns the matching elements.
public static List<Element> getElements(String xpath, Document doc, Namespace ns) {
    XPathFactory xFactory = XPathFactory.instance();
    XPathExpression<Element> expr = xFactory.compile(xpath, Filters.element(), null, ns);
    return expr.evaluate(doc);
}
// a sample caller of the method
getElements("//user", doc, namespace)
    .forEach(el -> {
        // your processing
    });
// all it takes to retrieve a given user (here textxtxtxtx) with all of its
// info children is the expression //user[@name='textxtxtxtx']
Lists of XPath expressions and their meanings are available online, as are XPath testers and evaluators with examples.
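The same kind of query can also be sketched with the DOM and XPath APIs that ship with the JDK, for readers who prefer not to add JDOM2. This is only an illustration under assumptions: the input already has the restructured form shown above, the class and method names (UserInfoXPath, infosFor) are made up for the example, and the user name contains no single quote.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original answer.
final class UserInfoXPath {

    // Returns the text of every <info> child of the <user> with the given name,
    // in document order. Assumes userName contains no single quote.
    static List<String> infosFor(String xml, String userName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//user[@name='" + userName + "']/info", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            // Parsing and XPath errors are checked exceptions; wrap them for brevity.
            throw new IllegalStateException("Could not query the document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<userdocs>"
                + "<user name=\"textxtxtxtx\"><info>part 1</info><info>part 2</info></user>"
                + "<user name=\"test2\"><info>other</info></user>"
                + "</userdocs>";
        infosFor(xml, "textxtxtxtx").forEach(System.out::println);
    }
}
```

Because the info elements are real children of user in this structure, one user can carry any number of them and the order comes for free from the document.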
Wrap your search for info inside your search for user.
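// A block is one <user> element plus everything up to the next <user> (or the end of input).
Pattern p = Pattern.compile("<user>(.*?)</user>(.*?)(?=<user>|\\z)", Pattern.DOTALL);
// [^<]* already matches newlines, so no flag is needed here.
Pattern p1 = Pattern.compile("<info>([^<]*)</info>");
Matcher m = p.matcher(out);
while ( m.find() ){
    String user = m.group(1);     // text inside <user>...</user>
    String content = m.group(2);  // everything that follows, up to the next <user>
    Matcher m2 = p1.matcher(content);
    while ( m2.find() ){
        // do what needs to be done with m2.group()
    }
}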
Set the Pattern.DOTALL flag on any pattern where . has to match across line breaks, as with multiline blocks.
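If the goal is only to keep the user and info elements in their original order, a single alternation pattern may be enough. This is an alternative to the nested matchers, not the approach described above. The following is a minimal sketch, assuming the flat input from the question and that info text never contains a < character; the class and method names (UserInfoFilter, filter) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class UserInfoFilter {

    // Matches either a whole <user> element or a whole <info> element.
    // DOTALL lets the user text span several lines; [^<]* already does.
    private static final Pattern USER_OR_INFO =
            Pattern.compile("<user>.*?</user>|<info>[^<]*</info>", Pattern.DOTALL);

    // Keeps only <user> and <info> elements, one per line, in input order.
    static String filter(String text) {
        StringBuilder sb = new StringBuilder();
        Matcher m = USER_OR_INFO.matcher(text);
        while (m.find()) {            // find() scans left to right, so order is preserved
            sb.append(m.group()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = String.join("\n",
                "<user>textxtxtxtx</user>",
                "<unnecessarytag>unwanted info</unnecessarytag>",
                "<info>infoinfoinfo. part 1.....multiline</info>",
                "<unnecessarytag>unwanted info</unnecessarytag>",
                "<info>infoinfoinfo. part 2.....multiline</info>");
        System.out.print(filter(input));
    }
}
```

The trade-off is that this version does not group info elements under their user; when each user needs separate processing, the nested matchers are the better fit.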