I need to split a large (500 MB) text file (a log4net exception file) into manageable chunks like 100 5 MB files would be fine.
I would think this should be a walk in the park for PowerShell. How can I do it?
I need to split a large (500 MB) text file (a log4net exception file) into manageable chunks like 100 5 MB files would be fine.
I would think this should be a walk in the park for PowerShell. How can I do it?
Do this:
FILE 1
There's also this quick (and somewhat dirty) one-liner:
You can tweak the number of first lines per batch by changing the hard-coded 3000 value.
FILE 2
FILE 3
etc…
As the lines can be variable in logs I thought it best to take a number of lines per file approach. The following code snippet processed a 4 million line log file in under 19 seconds (18.83.. seconds)splitting it into 500,000 line chunks:
This can easily be turned into a function or script file with parameters to make it more versatile. It uses a
StreamReader
andStreamWriter
to achieve its speed and tiny memory footprintA word of warning about some of the existing answers - they will run very slow for very big files. For a 1.6 GB log file I gave up after a couple of hours, realising it would not finish before I returned to work the next day.
Two issues: the call to Add-Content opens, seeks and then closes the current destination file for every line in the source file. Reading a little of the source file each time and looking for the new lines will also slows things down, but my guess is that Add-Content is the main culprit.
The following variant produces slightly less pleasant output: it will split files in the middle of lines, but it splits my 1.6 GB log in less than a minute:
I often need to do the same thing. The trick is getting the header repeated into each of the split chunks. I wrote the following cmdlet (PowerShell v2 CTP 3) and it does the trick.
There's also this quick (and somewhat dirty) one-liner:
You can tweak the number of first lines per batch by changing the hard-coded 3000 value.
Sounds like a job for the UNIX command split:
Just split my 55 GB csv file in 21k chunks in less than 10 minutes.
It's not native to PowerShell though, but comes with, for instance, the git for windows package https://git-scm.com/download/win