Removing spaces in XML fields with Java

I need to remove the leading and trailing spaces from element values in XML data.

For example:

Input:

<?xml version="1.0"?>
<ns:myOrder xmlns:ns="http://w3schools.com/BusinessDocument" xmlns:ct="http://something.com/CommonTypes">
  <MessageHeader>
     <ct:ID>i7                           </ct:ID>
     <ct:ID>i7                           </ct:ID>
     <ct:ID>i7                           </ct:ID>
     <ct:ID>i7                           </ct:ID>
     <ct:Name> Company Name           </ct:Name>
 </MessageHeader>
</ns:myOrder>

      

Expected Result:

<?xml version="1.0"?>
  <ns:myOrder xmlns:ns="http://w3schools.com/BusinessDocument" xmlns:ct="http://something.com/CommonTypes">
    <MessageHeader>
       <ct:ID>i7</ct:ID>
       <ct:ID>i7</ct:ID>
       <ct:ID>i7</ct:ID>
       <ct:ID>i7</ct:ID>
       <ct:Name>Company Name</ct:Name>
    </MessageHeader>
  </ns:myOrder>

      

I tried the following code:

public static String getTrimmedXML(String rawXMLFilename) throws Exception
{
    BufferedReader in = new BufferedReader(new FileReader(rawXMLFilename));
    String str;
    String trimmedXML = null;
    while ((str = in.readLine()) != null)
    {
        String str1 = str;
        if (str1.length() > 0)
        {
            str1 = str1.trim();
            if (str1.charAt(str1.length() - 1) == '>')
            {
                trimmedXML = trimmedXML + str.trim();
            }
            else
            {
                trimmedXML = trimmedXML + str;
            }
        }
    }
    in.close();
    return trimmedXML.substring(4);
}

      

I am unable to remove these spaces. Please let me know where I am going wrong.

Regards, Moniz

+3




5 answers


You probably don't want to use replace or replaceAll, because they would remove every space in your XML data, including the spaces inside values. To trim only the start and end of the element content, parse the XML, select the nodes with XPath, trim their text, and convert the document back to a string. Use the code below.



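import java.io.BufferedReader;
import java.io.FileReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public static String getTrimmedXML(String rawXMLFilename, String tagName) throws Exception {
    // Create xml document object; try-with-resources closes the reader
    Document document;
    try (BufferedReader in = new BufferedReader(new FileReader(rawXMLFilename))) {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        document = db.parse(new InputSource(in));
    }
    XPath xpath = XPathFactory.newInstance().newXPath();

    // Path to the nodes that you want to trim; name() matches the qualified name, e.g. "ct:ID"
    NodeList nodeList = (NodeList) xpath.compile("//*[name()='" + tagName + "']").evaluate(document, XPathConstants.NODESET);
    for (int index = 0; index < nodeList.getLength(); index++) { // Loop through all nodes that match the xpath
        Node node = nodeList.item(index);
        // Actual trim process. Note: setTextContent replaces any child elements,
        // so only pass the names of elements that contain text only.
        node.setTextContent(node.getTextContent().trim());
    }

    // Transform the document back to string format.
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    StringWriter writer = new StringWriter();
    transformer.transform(new DOMSource(document), new StreamResult(writer));
    // Strip line breaks so the result is a single line
    return writer.toString().replaceAll("\n|\r", "");
}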

      

+2




IMHO you should use an XML library, select the affected nodes via XPath, and then trim the text of each node:



String value = node.getTextContent();
node.setTextContent(value.trim());
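
The two lines above can be expanded into a complete sketch. This one is only an illustration under some assumptions: the XML arrives as a string, every text node with real content should be trimmed, and the class and method names (XmlTextTrimmer, trimTextNodes) are made up for the example. It selects text nodes rather than elements, so elements with child elements are left intact.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are not from the original answer.
final class XmlTextTrimmer {

    // Parses the XML, trims every text node that has non-whitespace content,
    // and serializes the document back to a string.
    static String trimTextNodes(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // the sample input uses namespace prefixes
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // text()[normalize-space()] skips whitespace-only nodes (indentation).
        NodeList texts = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space()]", doc, XPathConstants.NODESET);
        for (int i = 0; i < texts.getLength(); i++) {
            Node text = texts.item(i);
            text.setNodeValue(text.getNodeValue().trim());
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ns:myOrder xmlns:ns=\"http://w3schools.com/BusinessDocument\""
                + " xmlns:ct=\"http://something.com/CommonTypes\">"
                + "<MessageHeader><ct:ID>i7      </ct:ID>"
                + "<ct:Name> Company Name      </ct:Name></MessageHeader></ns:myOrder>";
        System.out.println(trimTextNodes(xml));
    }
}
```

Because String.trim() only touches the ends of each text node, the space inside "Company Name" is preserved.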

      

0




Removing all spaces from a string can be done using the replace method of the String class as follows:

String str = " random    message withlots   of white  spaces     ";
str = str.replace(" ", "");
System.out.println(str);

      

The above prints str without any spaces. The replace method takes two arguments: the first is the string to search for, and the second is the string to replace it with. Neither argument is limited to single-character strings.
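
To make the difference between the String methods concrete, here is a small comparison, offered as a sketch only. The class and method names (SpaceRemovalDemo, removeSpaces, removeAllWhitespace) are invented for the example. It shows that replace(" ", "") removes only space characters, that a regex with \s also removes tabs and line breaks, and that trim() leaves interior spaces alone.

```java
// Hypothetical demo class; the names are not from the original answer.
final class SpaceRemovalDemo {

    // Removes every space character, including those inside the value.
    static String removeSpaces(String s) {
        return s.replace(" ", "");
    }

    // Removes every whitespace character (spaces, tabs, line breaks).
    static String removeAllWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        String value = " Company Name           ";
        System.out.println("[" + removeSpaces(value) + "]");        // [CompanyName]
        System.out.println("[" + removeAllWhitespace(value) + "]"); // [CompanyName]
        System.out.println("[" + value.trim() + "]");               // [Company Name]
    }
}
```

For the XML in the question, only the trim() behavior matches the expected result, since "Company Name" must keep its interior space.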

0




Below is code that trims the whitespace around text nodes using VTD-XML.

import com.ximpleware.*;
public class removeWS {

    public static void main(String[] s) throws VTDException, Exception{
        VTDGen vg = new VTDGen();
        AutoPilot ap = new AutoPilot();
        XMLModifier xm = new XMLModifier();
        if (vg.parseFile("d:\\xml2\\ws.xml", true)){
            VTDNav vn = vg.getNav();
            ap.bind(vn);
            xm.bind(vn);
            ap.selectXPath("//text()");
            int i=-1;
            while((i=ap.evalXPath())!=-1){
                int offset = vn.getTokenOffset(i);
                int len = vn.getTokenLength(i);

                long l = vn.trimWhiteSpaces((((long)len)<<32)|offset );
                System.out.println(" ===> "+vn.toString(i));
                System.out.println("len ==>"+len+" new len==>"+ (l>>32));
                int nlen = (int)(l>>32);
                int nos= (int) l;
                xm.updateToken(i,vn,nos,nlen);
            }
            xm.output("d:\\xml2\\new.xml");

        }
    }
}

      

0




Use the replaceAll method in Java, for example:

String s1 = "<ct:ID>i7                           </ct:ID>";
System.out.println(s1.replaceAll(" ","").trim());
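
If replaceAll is used, a narrower pattern can limit the removal to whitespace that sits directly next to a tag, so spaces inside a value survive. The following is a rough sketch, not a substitute for an XML parser: the names (TagAdjacentTrim, trimAroundTags) are made up, and the regex assumes plain element content with no CDATA sections, comments, or attribute values containing '>' or '<'. It also removes the indentation between tags.

```java
// Hypothetical demo class; the names are not from the original answer.
final class TagAdjacentTrim {

    // Removes whitespace directly after '>' and directly before '<'.
    static String trimAroundTags(String xml) {
        return xml.replaceAll(">\\s+", ">").replaceAll("\\s+<", "<");
    }

    public static void main(String[] args) {
        String s1 = "<ct:Name> Company Name           </ct:Name>";
        System.out.println(trimAroundTags(s1)); // <ct:Name>Company Name</ct:Name>
    }
}
```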

      

-2








