[KLUG Members] Python XML to CSV

Adam Tauno Williams adam at morrison-ind.com
Tue Jan 31 10:19:23 EST 2006


On Mon, 2006-01-30 at 16:09 -0500, Adam Tauno Williams wrote:
> Does anyone have an example for converting an XML document to a CSV, or
> something similar, in Python?
> For instance, I want to turn -
> <row><c01>2</c01><c02>22302101</c02><c03>022</c03><c04>00</c04><c05>01/25/2006</c05><c06>5</c06><c07/><c08/><c09>2.00</c09><c10>277202</c10><c11/><c12>277202 WHEEL,LOAD</c12><c13>65.03</c13><c14>130.06</c14><c15/><c16/><c17>Y</c17><c18>NPP40</c18><c19>2CL02277</c19><c20/><c21/><c22/><c23/><c24/></row>
> -
> into a 24 field CSV file (with a couple of caveats, but I think I can
> figure those out).  I want to do it in Python as this example exists
> inside BIE which has an internal Python-script action.

For the sake of the archives, here is my solution. I cheated a bit and
imported the Java JDOM classes into the Python namespace; to me that is
far more straightforward than PyXML, as I am not terribly used to
Python's freakish syntax or documentation style:
------------------------------
import org.jdom as jdom
import java.io as io

output_raw = ""
# counter must start as an integer: it is incremented with "+ 1" below
counter = 0
builder = jdom.input.SAXBuilder()
doc = builder.build(io.FileInputStream("TranslatedDocument.xml"))
rootElement = doc.getRootElement()
for node in rootElement.getChildren("row"):
   output_raw = output_raw + (node.getChild("c01").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c02").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c03").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c04").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c05").getTextTrim()) + ","
   if ( (node.getChild("c01").getTextTrim()) == "2" ):
     counter = counter + 1
     output_raw = output_raw + str(counter) + ","
   else:
     counter = 0
     output_raw = output_raw + (node.getChild("c06").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c07").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c08").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c09").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c10").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c11").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c12").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c13").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c14").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c15").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c16").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c17").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c18").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c19").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c20").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c21").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c22").getTextTrim()) + ","
   output_raw = output_raw + (node.getChild("c23").getTextTrim()) + ","
   # last field: end the line without a trailing comma, so each row has 24 fields
   output_raw = output_raw + (node.getChild("c24").getTextTrim()) + "\n"
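
For anyone finding this in the archives without Jython or JDOM to hand,
roughly the same conversion can be sketched using only the standard
library of a current CPython 3 (xml.etree.ElementTree and csv). This is
an illustration, not what runs inside BIE: the names rows_to_csv and
FIELDS are made up for the example, and it assumes the <row> elements
are direct children of the root element, as in the script above. It
keeps the c06 renumbering rule from the script. Unlike the script, it
lets the csv module quote fields that contain a comma (such as
"277202 WHEEL,LOAD" in c12), and it treats a missing element as an
empty field.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Element names c01 .. c24, in output order (hypothetical helper list).
FIELDS = ["c%02d" % i for i in range(1, 25)]


def rows_to_csv(xml_text):
    """Return CSV text with one 24-field line per <row> under the root."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    counter = 0
    for row in root.findall("row"):
        # Empty (<c07/>) or missing elements both become empty strings.
        values = [(row.findtext(name) or "").strip() for name in FIELDS]
        if values[0] == "2":
            # Same rule as the script above: rows whose c01 is "2" get a
            # running counter in place of c06.
            counter += 1
            values[5] = str(counter)
        else:
            # Any other row resets the counter and keeps its own c06.
            counter = 0
        writer.writerow(values)
    return out.getvalue()
```

To use it, read the contents of TranslatedDocument.xml, pass them to
rows_to_csv(), and write the returned string to the output file.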