Removing spaces in XML fields with Java
I need to remove leading and trailing spaces from the element values in XML data.
For example:
Input:
<?xml version="1.0"?>
<ns:myOrder xmlns:ns="http://w3schools.com/BusinessDocument" xmlns:ct="http://something.com/CommonTypes">
<MessageHeader>
<ct:ID>i7 </ct:ID>
<ct:ID>i7 </ct:ID>
<ct:ID>i7 </ct:ID>
<ct:ID>i7 </ct:ID>
<ct:Name> Company Name </ct:Name>
</MessageHeader>
</ns:myOrder>
Expected Result:
<?xml version="1.0"?>
<ns:myOrder xmlns:ns="http://w3schools.com/BusinessDocument" xmlns:ct="http://something.com/CommonTypes">
<MessageHeader>
<ct:ID>i7</ct:ID>
<ct:ID>i7</ct:ID>
<ct:ID>i7</ct:ID>
<ct:ID>i7</ct:ID>
<ct:Name>Company Name</ct:Name>
</MessageHeader>
</ns:myOrder>
I tried the code below:
public static String getTrimmedXML(String rawXMLFilename) throws Exception
{
    BufferedReader in = new BufferedReader(new FileReader(rawXMLFilename));
    String str;
    String trimmedXML = null;
    while ((str = in.readLine()) != null)
    {
        String str1 = str;
        if (str1.length()>0)
        {
            str1 = str1.trim();
            if(str1.charAt(str1.length()-1) == '>')
            {
                trimmedXML = trimmedXML + str.trim();
            }
            else
            {
                trimmedXML = trimmedXML + str;
            }
        }
    }
    in.close();
    return trimmedXML.substring(4);
}
I am unable to remove these spaces. Please let me know where I am going wrong.
Regards, Moniz
You probably do not want to use replace or replaceAll, because that would remove every space in your XML data, including the spaces inside values. To trim only the start and end of an element's content, parse the XML into a document, select the nodes with XPath, trim their text, and convert the document back to a string. Use the code below.
import java.io.File;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public static String getTrimmedXML(String rawXMLFilename, String tagName) throws Exception {
    // Parse straight from the file, so the parser detects the encoding
    // and no reader is left unclosed
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document document = db.parse(new File(rawXMLFilename));
    XPathFactory xpathFactory = XPathFactory.newInstance();
    XPath xpath = xpathFactory.newXPath();
    // Path to the nodes that you want to trim; tagName is the qualified name, e.g. "ct:ID"
    NodeList nodeList = (NodeList) xpath.compile("//*[name()='" + tagName + "']").evaluate(document, XPathConstants.NODESET);
    for (int index = 0; index < nodeList.getLength(); index++) { // Loop through all nodes that match the xpath
        Node node = nodeList.item(index);
        String newTextContent = node.getTextContent().trim(); // Actual trim process
        node.setTextContent(newTextContent);
    }
    // Transform the document back to string format
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer transformer = tf.newTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    StringWriter writer = new StringWriter();
    transformer.transform(new DOMSource(document), new StreamResult(writer));
    // Strip line breaks from the serialized output
    return writer.toString().replaceAll("\n|\r", "");
}
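As a rough usage sketch of the same approach, the variant below reads the XML from an in-memory string instead of a file, so it can be run without any input file. The class name TrimByTagDemo, the method name trimTag, and the shortened sample document are made up for illustration; only the parse, XPath, trim, and serialize steps come from the answer above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names and the in-memory input are illustrative.
class TrimByTagDemo {

    // Parses the XML string, trims the text of every element whose
    // qualified name equals tagName, and serializes the result.
    static String trimTag(String xml, String tagName) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = db.parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//*[name()='" + tagName + "']", document, XPathConstants.NODESET);
        for (int i = 0; i < nodes.getLength(); i++) {
            nodes.item(i).setTextContent(nodes.item(i).getTextContent().trim());
        }
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ns:myOrder xmlns:ns=\"http://w3schools.com/BusinessDocument\""
                + " xmlns:ct=\"http://something.com/CommonTypes\">"
                + "<MessageHeader><ct:ID>i7 </ct:ID><ct:Name> Company Name </ct:Name></MessageHeader>"
                + "</ns:myOrder>";
        // One call per tag name; the second call re-parses the output of the first.
        String result = trimTag(trimTag(xml, "ct:ID"), "ct:Name");
        System.out.println(result);
    }
}
```

Because the method handles one tag name per call, trimming several different elements means either calling it once per name, as in main, or widening the XPath expression.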
IMHO you should use an XML library, select the affected nodes via XPath, and then trim each one:
String value = node.getTextContent();
node.setTextContent(value.trim());
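A minimal sketch of how those two lines might fit into a complete program, assuming the JDK's built-in DOM and XPath APIs are the "XML library". The class name, the method name, and the XPath expression //*[not(*)] (every element without child elements) are illustrative choices, not something stated in the answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; names and sample data are illustrative.
class TrimLeafElementsDemo {

    // Trims the text of every element that has no child elements.
    static void trimLeafElements(Document document) throws Exception {
        NodeList leaves = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//*[not(*)]", document, XPathConstants.NODESET);
        for (int i = 0; i < leaves.getLength(); i++) {
            Node node = leaves.item(i);
            String value = node.getTextContent();
            node.setTextContent(value.trim());
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<MessageHeader><ID>i7 </ID><Name> Company Name </Name></MessageHeader>";
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        trimLeafElements(document);
        NodeList children = document.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            // Prints ID=[i7] and then Name=[Company Name]
            System.out.println(children.item(i).getNodeName()
                    + "=[" + children.item(i).getTextContent() + "]");
        }
    }
}
```

Restricting the selection to leaf elements matters because setTextContent replaces all children of a node, so calling it on a container element would discard its child elements.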
Removing all spaces in a string can be done using the replace method of the String class as follows:
String str = " random message withlots of white spaces ";
str = str.replace(" ", "");
System.out.println(str);
The above prints str without any spaces, including the spaces between words. The replace method takes two arguments: the first is the substring to search for, and the second is the string that replaces each occurrence. Neither argument is limited to a single character.
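To make the difference between removing every space and trimming only the ends concrete, here is a small sketch applied to the "Company Name" value from the question. The class and method names are hypothetical; replace, replaceAll, and trim are standard String methods.

```java
// Hypothetical demo class; names are illustrative.
class ReplaceVersusTrimDemo {

    // Removes every space character, including those between words.
    static String removeSpaces(String s) {
        return s.replace(" ", "");
    }

    // Removes all whitespace (spaces, tabs, line breaks) using a regex.
    static String removeAllWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    // Removes only leading and trailing whitespace.
    static String trimEnds(String s) {
        return s.trim();
    }

    public static void main(String[] args) {
        String value = " Company Name ";
        System.out.println("[" + removeSpaces(value) + "]");        // prints [CompanyName]
        System.out.println("[" + removeAllWhitespace(value) + "]"); // prints [CompanyName]
        System.out.println("[" + trimEnds(value) + "]");            // prints [Company Name]
    }
}
```

Only the last variant gives the result the question asks for, since the expected output keeps the space inside "Company Name".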
Below is code that removes the leading and trailing whitespace of every text node using VTD-XML.
import com.ximpleware.*;

public class removeWS {
    public static void main(String[] s) throws VTDException, Exception{
        VTDGen vg = new VTDGen();
        AutoPilot ap = new AutoPilot();
        XMLModifier xm = new XMLModifier();
        if (vg.parseFile("d:\\xml2\\ws.xml", true)){
            VTDNav vn = vg.getNav();
            ap.bind(vn);
            xm.bind(vn);
            ap.selectXPath("//text()");
            int i=-1;
            while((i=ap.evalXPath())!=-1){
                int offset = vn.getTokenOffset(i);
                int len = vn.getTokenLength(i);
                long l = vn.trimWhiteSpaces((((long)len)<<32)|offset );
                System.out.println(" ===> "+vn.toString(i));
                System.out.println("len ==>"+len+" new len==>"+ (l>>32));
                int nlen = (int)(l>>32);
                int nos= (int) l;
                xm.updateToken(i,vn,nos,nlen);
            }
            xm.output("d:\\xml2\\new.xml");
        }
    }
}