JSON -> CSV using in2csv - pointer keys do not return values

I converted an XML file to JSON using xml2json.

A small section of the result is shown below. I want to convert it to CSV, and I am using in2csv from csvkit.

The basic syntax produces an error, which is simple enough to understand:

C:\Users\Renshaw\Documents\Sayth\XML>in2csv test2.json > test2.csv
When converting a JSON document with a top-level dictionary element, a key must be specified.

      

After adding a key I no longer get an error, but I get no output either; the command only echoes the key back:

C:\Users\Renshaw\Documents\Sayth\XML>in2csv test2.json -k "//Meeting/Races" > test2.csv
'//Meeting/Races'

C:\Users\Renshaw\Documents\Sayth\XML>in2csv test2.json -k "//Meeting/Races/RaceEntries/RaceEntry" > test2.csv
'//Meeting/Races/RaceEntries/RaceEntry'

      

I've now tried a wide range of keys and get no errors, but also no output. Any idea how to get CSV output from this source?

{
  "Meeting": {
    "NumOfRaces": {
      "#tail": "\n  ",
      "#text": "9"
    },
    "WeightsPublishing": {
      "#tail": "\n  ",
      "#text": "2014-09-30T00:00:00+10:00"
    },
    "NominationsClose": {
      "#tail": "\n  ",
      "#text": "2014-09-29T12:00:00+10:00"
    },
    "CodeType": {
      "#tail": "\n  ",
      "#text": "GALLOPS"
    },
    "Track": {
      "Rainfall": {
        "#tail": "\n    ",
        "#text": "Nil last 24hrs, 4.2mm last 7 days"
      },
      "Irrigation": {
        "#tail": "\n    ",
        "#text": "Nil last 24hrs, 25mm last 7 days"
      },
      "RailPosition": {
        "#tail": "\n    ",
        "#text": "+9m Entire Circuit"
      },
      "#tail": "\n  ",
      "TrackSurface": {
        "#tail": "\n    ",
        "#text": "Turf"
      },
      "Comments": {
        "#tail": "\n    ",
        "#text": "Finalised 4\/10 - 7:45am  Late Scratching Race 3 No. 4"
      },
      "Weather": {
        "#tail": "\n    ",
        "#text": "Fine"
      },
      "Penetrometer": {
        "#tail": "\n    ",
        "#text": "4.83"
      },
      "RailPositionLastMeeting": {
        "#tail": "\n    ",
        "#text": "True Position Entire Circuit"
      },
      "TrackInfo": {
        "#tail": "\n  ",
        "#text": "Penetrometer: Inside 4.85, Outside 4.85"
      },
      "TrackRating": {
        "#tail": "\n    ",
        "#text": "Good"
      },
      "#text": "\n    ",
      "RacingDirection": {
        "#tail": "\n    ",
        "#text": "AntiClockwise"
      }
    },
    "MeetingStage": {
      "#tail": "\n  ",
      "#text": "Acceptances"
    },
    "Races": {
      "#tail": "\n",
      "#text": "\n    ",
      "Race": [
        {
          "Comments": {
            "#tail": "\n    "
          },
          "NominationsDivisor": {
            "#tail": "\n      ",
            "#text": "0"
          },
          "Starters": {
            "#tail": "\n      ",
            "#text": "11"
          },
          "TrackRecords": {
            "#tail": "\n      ",
            "TrackRecord": {
              "TrackRecordHorse": {
                "#tail": "\n        "
              },
              "#text": "\n          ",
              "#tail": "\n      ",
              "DistanceRace": {
                "#tail": "\n          ",
                "#text": "1000"
              },
              "Time": {
                "#tail": "\n          ",
                "#text": "00:00:55.420"
              },
              "RaceNumber": {
                "#tail": "\n          ",
                "#text": "7"
              },
              "RaceDate": {
                "#tail": "\n          ",
                "#text": "2013-02-16"
              }
            },
            "#text": "\n        "
          },
          "RaceDistance": {
            "#tail": "\n      ",
            "#text": "1000"
          },
          "NominationsRaceNumber": {
            "#tail": "\n      ",
            "#text": "1"
          },
          "ApprenticeCanClaim": {
            "#tail": "\n      ",
            "#text": "false"
          },
          "SizeField": {
            "#tail": "\n      ",
            "#text": "16"
          },
          "NameRaceForm": {
            "#tail": "\n      ",
            "#text": "MARIBYRNONG TRL"
          },
          "RaceType": {
            "#tail": "\n      ",
            "#text": "Flat"
          },
          "SizeEmergency": {
            "#tail": "\n      ",
            "#text": "4"
          },
          "DistanceApprox": {
            "#tail": "\n      ",
            "#text": "false"
          },
          "#text": "\n      ",
          "BallotedOutEntries": {
            "#tail": "\n      "
          },
          "Logos": {
            "#tail": "\n      ",
            "Logo": {
              "#tail": "\n      "
            },
            "#text": "\n        "
          },
          "#tail": "\n    ",
          "TrackCircumference": {
            "#tail": "\n      ",
            "#text": "2313"
          },
          "NameRaceNews": {
            "#tail": "\n      ",
            "#text": "Maribyrnong Trial Stakes"
          },
          "WeightChange": {
            "#tail": "\n      ",
            "#text": "0.00"
          },
          "Accepters": {
            "#tail": "\n      ",
            "#text": "12"
          },
          "RaceEntries": {
            "RaceEntry": [
              {
                "Trainer": {
                  "Location": {
                    "#tail": "\n            ",
                    "#text": "Cranbourne"
                  },
                  "#text": "\n            ",
                  "Surname": {
                    "#tail": "\n            ",
                    "#text": "Laing"

      



2 answers


There are two problems with what you are doing.

First, you are specifying the key incorrectly: you wrote it XML/XPath style, with slashes, but here you are dealing with JSON. The key should simply be the name of an element (e.g. Meeting).
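
To see why the slash-separated string finds nothing, it can help to try the same lookup in plain Python. This is only a sketch: `sample` is a cut-down stand-in for the question's file, and it illustrates ordinary dictionary lookup, not in2csv's internals.

```python
import json

# A cut-down stand-in for the question's document (an assumption, not the full file).
sample = json.loads("""
{
  "Meeting": {
    "CodeType": {"#text": "GALLOPS"},
    "Races": {"Race": [{"RaceDistance": {"#text": "1000"}}]}
  }
}
""")

# An XPath-style string is just an unknown key name to a JSON dictionary.
print("//Meeting/Races" in sample)   # False
print("Meeting" in sample)           # True

# Reaching the list of races means walking one key at a time.
races = sample["Meeting"]["Races"]["Race"]
print(type(sample["Meeting"]).__name__)  # dict
print(type(races).__name__)              # list
```

Note that the value under "Meeting" is itself a dictionary rather than a list of records, which leads to the second problem.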



However, the main problem is the shape of the JSON itself: it consists of several levels of nested dictionaries, which in2csv cannot handle (with multiple levels, how would it know which columns to use?). You need to flatten your data somehow so that the fields can be clearly identified.

You can look into this question for ideas on how to convert JSON to CSV, because I don't think in2csv is going to cut it in your case.
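
One possible way to flatten the data, sketched here under some assumptions: each entry of the "Race" list becomes one CSV row, nested keys are joined with "/" to form column names, "#text" is treated as the element's value, and "#tail" is discarded as leftover XML whitespace. The helper names `join` and `flatten` are made up for this example, and the second race in `sample` is invented so that there is more than one row.

```python
import csv
import io

def join(prefix, part):
    """Build a slash-separated column name without a leading slash."""
    return "{}/{}".format(prefix, part) if prefix else str(part)

def flatten(node, prefix=""):
    """Collapse nested dicts and lists into one flat dict keyed by path."""
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "#tail":      # whitespace left over from the XML
                continue
            # "#text" holds the element's value, so it keeps the parent's name.
            child = prefix if key == "#text" else join(prefix, key)
            flat.update(flatten(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            flat.update(flatten(item, join(prefix, index)))
    elif isinstance(node, str) and node.strip():
        flat[prefix] = node         # keep only non-blank leaf values
    return flat

# Stand-in for the parsed file; the second race is hypothetical.
sample = {"Meeting": {"Races": {"#text": "\n    ", "Race": [
    {"RaceDistance": {"#tail": "\n      ", "#text": "1000"},
     "NameRaceForm": {"#tail": "\n      ", "#text": "MARIBYRNONG TRL"},
     "TrackRecords": {"TrackRecord": {"Time": {"#text": "00:00:55.420"}}}},
    {"RaceDistance": {"#text": "1200"},
     "NameRaceForm": {"#text": "EXAMPLE RACE"}},
]}}}

rows = [flatten(race) for race in sample["Meeting"]["Races"]["Race"]]
# Columns are the union of all paths seen, so rows may have gaps.
fieldnames = sorted({name for row in rows for name in row})

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Rows that lack a column (the second race has no track record here) are written with an empty field, which is the default behaviour of csv.DictWriter.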



If you want to turn each path through the data into a path expression for column 1 of your CSV, and use the leaf value at the end of that path for column 2, the following code might solve your problem:

import json

json_input = """{
  "Meeting": {
    "NominationsClose": {
      "#tail": "\\n  ",
      "#text": "2014-09-29T12:00:00+10:00"
    },
    "CodeType": {
      "#tail": "\\n  ",
      "#text": "GALLOPS"
    },
    "Track": {
      "Rainfall": {
        "#tail": "\\n    ",
        "#text": "Nil last 24hrs, 4.2mm last 7 days"
      },
      "Irrigation": {
        "#tail": "\\n    ",
        "#text": "Nil last 24hrs, 25mm last 7 days"
      }
    }
  }
}"""

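def print_csv_depth_first(tree, path=""):
    """Print one 'path,value' line for every string leaf, depth first."""
    if isinstance(tree, dict):
        # Descend into each key, extending the path with the key name.
        for key, value in tree.items():
            print_csv_depth_first(value, "{}/{}".format(path, key))
    elif isinstance(tree, list):
        # List items have no name, so their index becomes the path segment.
        for i, item in enumerate(tree):
            print_csv_depth_first(item, "{}/{}".format(path, i))
    elif isinstance(tree, str):
        # A leaf: repr() keeps whitespace such as '\n' visible in the output.
        print('{},{}'.format(path, repr(tree)))

# Use a name other than "json" so the module is not shadowed.
data = json.loads(json_input)
print_csv_depth_first(data)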

      

I've included only a small portion of your sample JSON data. At the very bottom, your data also contains the beginning of the list "RaceEntry": [, but it is incomplete, so I had to extrapolate how lists should be handled. The code above produces the following output:



/Meeting/NominationsClose/#tail,'\n  '
/Meeting/NominationsClose/#text,'2014-09-29T12:00:00+10:00'
/Meeting/CodeType/#tail,'\n  '
/Meeting/CodeType/#text,'GALLOPS'
/Meeting/Track/Rainfall/#tail,'\n    '
/Meeting/Track/Rainfall/#text,'Nil last 24hrs, 4.2mm last 7 days'
/Meeting/Track/Irrigation/#tail,'\n    '
/Meeting/Track/Irrigation/#text,'Nil last 24hrs, 25mm last 7 days'

      

You will need to adapt the format string in the print statement to suit your needs.
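
As one example of such an adaptation, here is a sketch that writes the same path/value pairs through Python's csv module instead of print, so that values containing commas are quoted properly. It assumes you want to drop the "#tail" entries and add a header row; the generator name `iter_leaves` and the column names "path" and "value" are made up for this example.

```python
import csv
import io

def iter_leaves(tree, path=""):
    """Yield (path, value) pairs for every string leaf, depth first."""
    if isinstance(tree, dict):
        for key, value in tree.items():
            yield from iter_leaves(value, "{}/{}".format(path, key))
    elif isinstance(tree, list):
        for index, item in enumerate(tree):
            yield from iter_leaves(item, "{}/{}".format(path, index))
    elif isinstance(tree, str):
        yield path, tree

# A small excerpt of the question's data, already parsed.
data = {"Meeting": {
    "Track": {"Rainfall": {"#tail": "\n    ",
                           "#text": "Nil last 24hrs, 4.2mm last 7 days"}},
    "CodeType": {"#tail": "\n  ", "#text": "GALLOPS"},
}}

buffer = io.StringIO()          # swap for open("test2.csv", "w", newline="")
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["path", "value"])
for path, value in iter_leaves(data):
    if path.endswith("/#tail"):  # skip whitespace left over from the XML
        continue
    writer.writerow([path, value])

print(buffer.getvalue())
```

With the default quoting rules, only the rainfall value is wrapped in double quotes, because it is the only field that contains a comma.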
