Combining two regular expressions to extract text in Java

I need to combine two regular expressions into one. Sample input text (a userdoc; each block sits between <user> and </user>):



<unnecessarytag>unwanted info</unnecessarytag>

<info>infoinfoinfo. part 1.....multiline</info>

<unnecessarytag>unwanted info</unnecessarytag>

<info>infoinfoinfo. part 2.....multiline</info>


There will be many similar blocks in the file. The desired output is:



<info>infoinfoinfo. part 1.....multiline</info>

<info>infoinfoinfo. part 2.....multiline</info>


The order must be preserved.

One user can have many info entries, and the file contains many userdocs.

Code for this:

String out = String.join("\n", Files.readAllLines(Paths.get("text.txt")));

Pattern p = Pattern.compile("<user>(.*?)</user>");
Matcher m = p.matcher(out);

Pattern p1 = Pattern.compile("<info>([^<]*)</info>", Pattern.MULTILINE);
Matcher m1 = p1.matcher(out);


I was planning to write

while (m.find() && m1.find()) {
    String cp = m.group(1);
    String cp1 = m1.group(1);
    System.out.println(cp + cp1);
}


But this produces text in which each user has only one info entry. How do I combine these two regular expressions into a pattern that supports the ab^n form, that is, one user followed by any number of info entries?



2 answers

Hello, why don't you treat this as XML and parse it with JDOM2 or any other DOM implementation in Java? Your current approach may be error-prone. An XML query will also be simpler, more readable (from a code point of view), and generally more elegant.

To do this, you will need something like the following (I am using JDOM2):

SAXBuilder saxBuilder = new SAXBuilder();
// modelPath is a String derived from the IPath of the file that stores the data
Document originalDoc = saxBuilder.build(new File(modelPath));


After that, handling the nodes is pretty straightforward. You can either use the traditional parent -> children traversal, or a slightly more general approach based on XPath expressions, which holds up better when the model structure changes. Both approaches have pros and cons that I suggest you research and evaluate.

For this to work, your structure must change to something like this:

<?xml version="1.0" encoding="UTF-8"?>
    <user name="textxtxtxtx">
    <user name="test2">
    <!-- etc... -->


You can then use something like this to extract elements from the document:

// Despite its name, the regex parameter holds an XPath expression
public static List<Element> getElements(String regex, Document doc, Namespace ns) {
    XPathFactory xFactory = XPathFactory.instance();
    // Filters.element() restricts the results to Element nodes
    XPathExpression<Element> expr = xFactory.compile(regex, Filters.element(), null, ns);
    return expr.evaluate(doc);
}

// a sample caller of the method
// your processing

// all it takes to retrieve the user `textxtxtxtx` with all of its info children
// is this expression: //user[@name='textxtxtxtx']


A list of XPath expressions and their meaning can be found here: Tester / Evaluator / Examples
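To make the XPath idea concrete, here is a minimal sketch that runs without extra dependencies. It swaps JDOM2 for the JDK's built-in DOM parser and javax.xml.xpath, and it assumes a layout the answer does not spell out: a <users> root element with <info> children nested under each <user>. The class name UserInfoXPath and the method infosFor are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UserInfoXPath {
    /** Returns the text of every <info> child of the named user. */
    static List<String> infosFor(String xml, String userName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Plain concatenation is enough for a sketch; a name containing
            // a quote character would need escaping.
            String expression = "//user[@name='" + userName + "']/info";
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> infos = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                infos.add(nodes.item(i).getTextContent());
            }
            return infos;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed layout: <users> root, <info> elements nested in each <user>.
        String xml = "<users>"
                + "<user name=\"textxtxtxtx\"><info>part 1</info><info>part 2</info></user>"
                + "<user name=\"test2\"><info>part 3</info></user>"
                + "</users>";
        System.out.println(infosFor(xml, "textxtxtxtx"));
    }
}
```

Because the parser understands nesting, unwanted sibling tags inside a user are simply never selected, so no extra filtering is needed.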



Nest the search for info inside the search for user:


Pattern p = Pattern.compile("<user>(.*?)</user>");
Pattern p1 = Pattern.compile("<info>([^<]*)</info>", Pattern.MULTILINE);
Matcher m = p.matcher(out);
while (m.find()) {
    String content = m.group(1); // the text between <user> and </user>
    Matcher m2 = p1.matcher(content);
    while (m2.find()) {
        // do what needs to be done, e.g. with m2.group(1)
    }
}


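You can also set the Pattern.DOTALL flag on the user pattern. By default . does not match line terminators, so without this flag <user>(.*?)</user> cannot match a user block that spans several lines.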
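Putting the nested search and the DOTALL flag together, a self-contained sketch might look like the following. The class name UserInfoExtractor, the method extract, and the sample text are made up for illustration. Pattern.MULTILINE is left out because it only changes how ^ and $ behave, and neither pattern uses them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UserInfoExtractor {
    // DOTALL lets '.' cross line breaks, so a <user> block may span lines.
    private static final Pattern USER =
            Pattern.compile("<user>(.*?)</user>", Pattern.DOTALL);
    // [^<]* already matches line breaks, so no flag is needed here.
    private static final Pattern INFO =
            Pattern.compile("<info>([^<]*)</info>");

    /** Returns one list of info texts per <user> block, in document order. */
    static List<List<String>> extract(String text) {
        List<List<String>> users = new ArrayList<>();
        Matcher userMatcher = USER.matcher(text);
        while (userMatcher.find()) {
            List<String> infos = new ArrayList<>();
            // Search for <info> only inside the current <user> block.
            Matcher infoMatcher = INFO.matcher(userMatcher.group(1));
            while (infoMatcher.find()) {
                infos.add(infoMatcher.group(1));
            }
            users.add(infos);
        }
        return users;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "<user>",
                "<unnecessarytag>unwanted info</unnecessarytag>",
                "<info>part 1</info>",
                "<unnecessarytag>unwanted info</unnecessarytag>",
                "<info>part 2</info>",
                "</user>",
                "<user>",
                "<info>part 3</info>",
                "</user>");
        for (List<String> infos : extract(sample)) {
            System.out.println(String.join("\n", infos));
        }
    }
}
```

Each inner list holds the info entries of one user, so the ordering across and within users follows the order in the file.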


