Need help in parsing text file with list to XML

Posted By:   Rampriya_Balakrishna
Posted On:   Thursday, October 23, 2008 05:50 PM


I have a tricky situation: I have a text file that contains thousands of lines, and each line uses a tab as its delimiter. Here is a snapshot of the text:



line 1
  line2
  line3
      line4
  line5
  line6
      line7
          line8
line9




I want the xml as


  

  

      

  

  

  

      

          

      

  







Any help on this is greatly appreciated.


Re: Need help in parsing text file with list to XML

Posted By:   Robert_Lybarger  
Posted On:   Thursday, October 23, 2008 08:10 PM

This should be fairly close:



package org.roblybarger;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class Main {

    public static void main(String[] args) throws Exception {
        File dir = new File(System.getProperty("user.home"));
        File file = new File(dir, "input.txt");

        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        // Wrap everything in a single "root" element so the result is well-formed.
        Element root = doc.createElement("root");
        doc.appendChild(root);

        int lineNum = 0;
        int lastDepth = -1;   // depth of the previous element; root counts as -1
        Node lastNode = root; // the element added most recently

        // try-with-resources closes the reader even if parsing fails
        try (BufferedReader br = new BufferedReader(new FileReader(file)))
        {
            String line;
            while ((line = br.readLine()) != null)
            {
                lineNum++;
                String elementName = line.trim();
                if (elementName.isEmpty())
                {
                    continue; // a blank line cannot become an element
                }
                // Depth is the number of leading tabs. This assumes tabs
                // appear only as indentation at the start of the line.
                int depth = line.lastIndexOf('\t') + 1;
                if (depth > (lastDepth + 1))
                {
                    throw new RuntimeException("Bad input file: "
                            + "Too many additional tabs at: ["
                            + elementName + "] on line " + lineNum);
                }
                Element element = doc.createElement(elementName);
                if (depth > lastDepth)
                {
                    // One level deeper: child of the previous element.
                    lastNode.appendChild(element);
                }
                else if (depth == lastDepth)
                {
                    // Same level: sibling of the previous element.
                    lastNode = lastNode.getParentNode();
                    lastNode.appendChild(element);
                }
                else
                {
                    // Shallower: climb up to the parent at the target depth.
                    for (int i = 0; i <= (lastDepth - depth); i++)
                    {
                        lastNode = lastNode.getParentNode();
                    }
                    lastNode.appendChild(element);
                }
                lastNode = element;
                lastDepth = depth;
            }
        }

        Transformer tf = TransformerFactory.newInstance().newTransformer();
        tf.setOutputProperty(OutputKeys.INDENT, "yes");
        tf.transform(new DOMSource(doc), new StreamResult(System.out));
    }
}

On my system, at least, the 'indent' output property produces new lines but no actual indentation in the output. If you need it, you should be able to load the result into an XML editor and have it add the pretty-printing tabs and whitespace. I also wrapped the entire file in a "root" node so that the result is well-formed XML with a single top-level element. HTH.
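
If the missing indentation matters, here is a rough sketch of one way to get it without an external editor. It assumes the transformer is the Xalan-derived one bundled with the JDK, which recognizes the implementation-specific output property `{http://xml.apache.org/xslt}indent-amount`; other transformers may not honor it. The class name `IndentSketch` and the helpers `leadingTabs` and `toXml` are made up for illustration and are not part of the code above. The sketch also keeps the open elements on a stack instead of walking `getParentNode()`, and counts only leading tabs, so a stray tab later in a line does not change the depth.

```java
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, for illustration only.
final class IndentSketch {

    // Count the tabs at the start of the line; tabs elsewhere are ignored.
    public static int leadingTabs(String line) {
        int n = 0;
        while (n < line.length() && line.charAt(n) == '\t') {
            n++;
        }
        return n;
    }

    // Build a DOM from tab-indented lines and serialize it with indentation.
    public static String toXml(List<String> lines) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        doc.appendChild(root);

        // Stack of currently open elements; root stays at the bottom,
        // so an element at depth d has exactly d + 1 entries below it.
        Deque<Element> open = new ArrayDeque<>();
        open.push(root);

        int lineNum = 0;
        for (String line : lines) {
            lineNum++;
            String name = line.trim();
            if (name.isEmpty()) {
                continue; // skip blank lines
            }
            int depth = leadingTabs(line);
            if (depth > open.size() - 1) {
                throw new IllegalArgumentException(
                        "Too many tabs at [" + name + "] on line " + lineNum);
            }
            // Close elements until the parent for this depth is on top.
            while (open.size() - 1 > depth) {
                open.pop();
            }
            // The name must be a valid XML element name, or this call throws.
            Element element = doc.createElement(name);
            open.peek().appendChild(element);
            open.push(element);
        }

        Transformer tf = TransformerFactory.newInstance().newTransformer();
        tf.setOutputProperty(OutputKeys.INDENT, "yes");
        tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific property (assumption: Xalan-derived transformer).
        tf.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

        StringWriter out = new StringWriter();
        tf.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        List<String> sample = Arrays.asList(
                "line1", "\tline2", "\tline3", "\t\tline4", "line9");
        System.out.println(toXml(sample));
    }
}
```

Because the property name is qualified with a namespace, a transformer that does not recognize it should ignore it rather than reject it, so the worst case is the same unindented output as before.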
