Split JSON File Objects Into Multiple Files

Posted 2020-05-09 22:32

I have a file with too many data objects in JSON of the following form:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              -37.880859375,
              78.81903553711727
            ],
            [
              -42.01171875,
              78.31385955743478
            ],
            [
              -37.6171875,
              78.06198918665974
            ],
            [
              -37.880859375,
              78.81903553711727
            ]
          ]
        ]
      }
    },
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              -37.6171875,
              78.07107600956168
            ],
            [
              -35.48583984375,
              78.42019327591201
            ],
            [
              -37.880859375,
              78.81903553711727
            ],
            [
              -37.6171875,
              78.07107600956168
            ]
          ]
        ]
      }
    }
  ]
}

I would like to split the large file so that each element of features gets its own file, containing the top-level type and a features array that holds just that one feature (with its coordinates). So essentially, I am trying to get many of these:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              -37.6171875,
              78.07107600956168
            ],
            [
              -35.48583984375,
              78.42019327591201
            ],
            [
              -37.880859375,
              78.81903553711727
            ],
            [
              -37.6171875,
              78.07107600956168
            ]
          ]
        ]
      }
    }
  ]
}

3 Answers
做个烂人
#2 · 2020-05-09 22:46

PowerShell solution (requires PowerShell v3 or newer):

$i = 0
Get-Content 'C:\path\to\input.json' -Raw |
  ConvertFrom-Json |
  Select-Object -Expand features |
  ForEach-Object {
    $filename = 'C:\path\to\feature{0:d5}.json' -f ($i++)

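    $properties = [ordered]@{
      type     = 'FeatureCollection'
      # Wrap the feature in @() so it is serialized as a one-element array,
      # matching the "features": [ ... ] shape of the desired output.
      features = @($_)
    }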

    New-Object -Type PSObject -Property $properties |
      ConvertTo-Json -Depth 10 |
      Set-Content $filename
  }
乱世女痞
#3 · 2020-05-09 22:50

Here's a solution requiring just one invocation of jq and one of awk, assuming the input is in a file (input.json) and that the N-th component should be written to a file /tmp/file$N.json beginning with N=1:

jq -c '.features = (.features[] | [.]) ' input.json |
  awk '{ print > "/tmp/file" NR ".json"}'

An alternative to awk here would be split -l 1.
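
To make the jq filter less opaque: `.features[]` emits each feature in turn, `[.]` wraps it in a one-element array, and the assignment produces one copy of the whole document per feature, with every other top-level key left intact. A rough JavaScript equivalent of that transformation is sketched below; the function name `explodeFeatures` is made up for illustration and is not part of jq or any library.

```javascript
// Sketch of what `.features = (.features[] | [.])` computes.
// `explodeFeatures` is a hypothetical name used only for this example.
function explodeFeatures(doc) {
  return doc.features.map(function (feature) {
    // Copy every top-level key, then replace `features` with a
    // one-element array holding just this feature.
    var single = { ...doc, features: [feature] };
    // One compact JSON text per feature, like the lines `jq -c` prints.
    return JSON.stringify(single);
  });
}
```

Each returned string corresponds to one line of the jq output, which is why awk (or split -l 1) can then send each line to its own file.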

If you want each of the output files to be "pretty-printed", then using a shell such as bash, you could (at the cost of n additional calls to jq) write:

N=0
jq -c '.features = (.features[] | [.])' input.json |
  while read -r json ; do
  N=$((N+1))
  jq . <<< "$json"  > "/tmp/file${N}.json"
done

Each of the additional calls to jq will be fast, so this may be acceptable.

家丑人穷心不美
#4 · 2020-05-09 22:57

I haven't tested this code properly, but it should give you an idea of how to solve the problem in the browser: it builds one download link per feature.

var json = {
        "type": "FeatureCollection",
        "features": [
          {
            "type": "Feature",
            "properties": {},
            "geometry": {
              "type": "Polygon",
              "coordinates": [
                [
                  [
                    -37.880859375,
                    78.81903553711727
                  ],
                  [
                    -42.01171875,
                    78.31385955743478
                  ],
                  [
                    -37.6171875,
                    78.06198918665974
                  ],
                  [
                    -37.880859375,
                    78.81903553711727
                  ]
                ]
              ]
            }
          },
          {
            "type": "Feature",
            "properties": {},
            "geometry": {
              "type": "Polygon",
              "coordinates": [
                [
                  [
                    -37.6171875,
                    78.07107600956168
                  ],
                  [
                    -35.48583984375,
                    78.42019327591201
                  ],
                  [
                    -37.880859375,
                    78.81903553711727
                  ],
                  [
                    -37.6171875,
                    78.07107600956168
                  ]
                ]
              ]
            }
          }
        ]
      }
      // Build one download link per feature once the DOM is ready.
      $(document).ready(function(){
        var counter = 1;
        json.features.forEach(function(feature){
          // Wrap the single feature in its own FeatureCollection.
          var data = {type: json.type, features: [feature]};
          var newJson = JSON.stringify(data);
          // Expose the JSON text as a downloadable blob URL.
          var blob = new Blob([newJson], {type: "application/json"});
          var url  = URL.createObjectURL(blob);
          var a = document.createElement('a');
          a.download    = "feature_" + counter + ".json";
          a.href        = url;
          a.textContent = "Download feature_" + counter + ".json";
          counter++;
          document.getElementById('feature').appendChild(a);
          document.getElementById('feature').appendChild(document.createElement('br'));
        });
      });
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="feature"></div>
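
The browser snippet above only offers download links. If Node.js is available, the same loop can write the files directly. The following is a minimal sketch under a few assumptions: the whole input fits in memory (it is parsed with JSON.parse in one go), and the helper name `writeFeatureFiles`, the output directory, and the `feature_N.json` naming are placeholders chosen for this example.

```javascript
var fs = require("fs");
var path = require("path");

// Hypothetical helper: writes one FeatureCollection file per feature
// into outDir and returns the list of paths that were written.
function writeFeatureFiles(collection, outDir) {
  // Create the output directory if it does not exist yet.
  fs.mkdirSync(outDir, { recursive: true });
  return collection.features.map(function (feature, index) {
    // Same wrapping as in the browser version: one feature per collection.
    var data = { type: collection.type, features: [feature] };
    var file = path.join(outDir, "feature_" + (index + 1) + ".json");
    // The third argument to JSON.stringify pretty-prints with 2 spaces.
    fs.writeFileSync(file, JSON.stringify(data, null, 2) + "\n");
    return file;
  });
}

// Usage (paths are placeholders):
// var input = JSON.parse(fs.readFileSync("input.json", "utf8"));
// writeFeatureFiles(input, "out");
```

For inputs too large to parse at once, a streaming JSON parser would be needed instead; that is outside the scope of this sketch.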
