I am not a pro, and I have been scratching my head trying to understand what exactly StringIO is used for. I have looked around the internet for examples, but almost all of them are very abstract: they show how to use it, but none show why, or in which circumstances, one would use it. Thanks in advance.
P.S. Not to be confused with this question on Stack Overflow: "StringIO Usage", which compares string and StringIO.
I've just used StringIO in practice for two things: capturing whatever some code is printing, by redirecting sys.stdout to a StringIO instance for easy analysis; and building a document with ElementTree and then calling write on it to send it via an HTTP connection.
Not that you need StringIO often, but sometimes it's pretty useful.
I've used it in place of text files for unit-testing.
For example, to make a csv 'file' for testing with pandas (Python 3):
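A minimal sketch of the idea, using the standard-library csv module as a stand-in (pandas.read_csv accepts the same kind of in-memory file object). The sample data, column names and the read_rows helper are made up for illustration:

```python
import csv
import io

# Hypothetical CSV content that would normally live in a file on disk.
CSV_TEXT = """name,score
alice,90
bob,72
"""

def read_rows(file_obj):
    """Parse CSV from any file-like object into a list of dicts."""
    return list(csv.DictReader(file_obj))

# StringIO wraps the string so it can be passed wherever an open text file is expected.
rows = read_rows(io.StringIO(CSV_TEXT))
print(rows)
```

The function under test never learns that no real file was involved, so the test needs no temporary files or cleanup.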
From the documentation here:
In cases where you want a file-like object that acts like a file but writes to an in-memory string buffer, StringIO is the tool. If you're building large strings, such as plain-text documents, and doing a lot of string concatenation, you might find it easier to use StringIO instead of a bunch of mystr += 'more stuff\n' operations.
It's used when you have some API that only takes files, but you need to use a string. For example, to compress a string using the gzip module in Python 2:
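A sketch of the same idea for Python 3, where gzip works on bytes, so io.BytesIO plays the role that StringIO played in Python 2. The helper names here are hypothetical:

```python
import gzip
import io

def gzip_compress_string(text, encoding="utf-8"):
    """Compress a string by letting GzipFile write into an in-memory buffer."""
    buffer = io.BytesIO()
    # GzipFile wants a file object; the buffer stands in for a file on disk.
    with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
        gz.write(text.encode(encoding))
    # Closing the GzipFile flushes the gzip trailer but leaves the buffer open.
    return buffer.getvalue()

def gzip_decompress_to_string(data, encoding="utf-8"):
    """Reverse of the above: read compressed bytes through an in-memory file."""
    with gzip.GzipFile(fileobj=io.BytesIO(data), mode="rb") as gz:
        return gz.read().decode(encoding)

compressed = gzip_compress_string("hello, StringIO")
restored = gzip_decompress_to_string(compressed)
```

Python 3 also offers gzip.compress and gzip.decompress for this particular job; the buffer approach is shown because it carries over to any API that insists on a file object.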
Couple of things I personally have used it for:
Whole-file caching. I have a script that reads PDFs and validates various things about them. The PDF library I'm using takes an open file in its document constructor. I originally just opened the PDF I was interested in reading; however, when I changed the script to read the entire file into memory at once and then pass a StringIO object to the PDF library, the running time of my script was cut in half.
Deferred printing. The same script prints a header before every PDF it reads. However, I can specify on the command line whether to ignore certain tests that are in its configuration file, or to include only certain ones. If I ignore all tests for a given PDF, I don't want the header printed, but I won't know how many tests I ran until I'm done running them (the tests can be defined dynamically as well). So I capture the header into a StringIO object by changing sys.stdout to point to it, and each time I run a test I check whether that object has anything in it. If so, I print it and reset it to empty. Voilà, only PDFs that have tests get headers printed.
StringIO gives you file-like access to strings, so you can take an existing module that deals with files and make it work with strings while changing almost nothing.
For example, say you have a logger that writes things to a file and you want to instead send the log output over the network. You can read the file and write its contents to the network, or you can write the log to a StringIO object and ship it off to its network destination without touching the filesystem. StringIO makes it easy to do it the first way then switch to the second way.
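A minimal sketch of the second approach using the standard logging module. The logger name, format and messages are made up, and the actual network send is left out:

```python
import io
import logging

def make_buffered_logger(name):
    """Return a logger that writes into an in-memory buffer instead of a file."""
    buffer = io.StringIO()
    # StreamHandler accepts any object with write(), so the buffer works as-is.
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addHandler(handler)
    return logger, buffer

logger, buffer = make_buffered_logger("stringio_demo")
logger.info("job started")
logger.warning("disk almost full")

# The accumulated text is an ordinary string that could be sent over a socket or HTTP.
payload = buffer.getvalue()
print(payload)
```

Swapping the handler back to a logging.FileHandler is the only change needed to return to the file-based version.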