How to save data with Python?

Posted 2019-02-08 07:30

I am working on a program in Python and want users to be able to save the data they are working on. I have looked into cPickle; it seems like a fast and easy way to save data, but it also seems insecure. Since entire functions, classes, etc. can be pickled, I am worried that a rogue save file could inject harmful code into the program. Is there a way I can prevent that, or should I look into other methods of saving data, such as converting it directly to a string (which also seems insecure) or creating an XML hierarchy and putting the data in that?

I am new to Python, so please bear with me.

Thanks in advance!

EDIT: As for the type of data I am storing, it is mainly dictionaries and lists holding information such as names, speeds, etc. It is fairly simple right now, but may get more complex in the future.

7 answers
祖国的老花朵
#2 · 2019-02-08 07:54

Who -- specifically -- is the sociopath who's going through the effort to break a program by hacking the pickled file?

It's Python. The sociopath has your source. They don't need to fool around hacking your pickle file. They can just edit your source and do all the "damage" they want.

Don't worry about "insecurity" unless you're involved in litigation with organized crime syndicates.

Don't worry about "a rogue save file could inject harmful code into the program". No one will bother with a rogue save file when they have the source.

狗以群分
#3 · 2019-02-08 08:02

*****In this answer, I'm only concerned about accidental corruption of the application's integrity.*****

Pickle is "secure". What might be insecure is accessing code you didn't write, for example in plugins; that is not relevant to pickles though.

When you pickle an object, all its data is saved, but its code and implementation are not. This means that after you update the implementation, an unpickled object may find it has "old-style" data inside. This is something you must be aware of and handle, if applicable.
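
One possible way to handle "old-style" data, shown here as a minimal sketch rather than anything this answer prescribes, is to store a format version next to the data and upgrade older saves on load. The sketch uses plain dicts instead of custom classes, and the version numbers, the `speed` to `speed_kmh` rename, and the helper names are all hypothetical.

```python
import pickle

CURRENT_VERSION = 2  # hypothetical version number of the save format


def dump_state(state):
    """Pickle the state together with the format version that wrote it."""
    return pickle.dumps({"version": CURRENT_VERSION, "state": state})


def load_state(payload):
    """Unpickle a save and upgrade "old-style" data to the current layout."""
    saved = pickle.loads(payload)
    state = dict(saved["state"])
    if saved["version"] < 2:
        # Hypothetical change: version 1 stored "speed", version 2 expects "speed_kmh".
        state["speed_kmh"] = state.pop("speed")
    return state


# A save written by the (hypothetical) older version of the program.
old_save = pickle.dumps({"version": 1, "state": {"name": "car", "speed": 80}})
print(load_state(old_save))
```

With custom classes, the same upgrade step could live in the class's `__setstate__` method, which pickle calls with the saved state when it rebuilds an instance.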

Pickling strings, lists, numbers, and dicts is very easy and works reliably, much as JSON does for the same types. The Pickle magic is that, sometimes without adjustment, even complex Python objects can be pickled. But only data is pickled; the instances are reconstructed simply from the saved module name and type name of the object.
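
As a small illustration of the comparison with JSON, the sketch below round-trips the same structure through both standard-library modules. The sample data is hypothetical, modeled on the names and speeds mentioned in the question.

```python
import json
import pickle

# Hypothetical save data of the kind described in the question.
data = {"names": ["Alice", "Bob"], "speeds": [12.5, 30]}

# Pickle round trip: the serialized form is bytes.
restored_from_pickle = pickle.loads(pickle.dumps(data))

# JSON round trip: the serialized form is text.
restored_from_json = json.loads(json.dumps(data))

print(restored_from_pickle == data, restored_from_json == data)
```

For dicts with string keys, lists, numbers, and strings, both round trips return an equal structure. JSON only covers these basic types, which is also why loading it never reconstructs arbitrary objects.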

你好瞎i
#4 · 2019-02-08 08:02

You should use a database of some kind. Storing in pickle format isn't a good idea (in most cases). You may consider:

  • SQLite - (included in Python 2.5+) fast and simple, but requires knowledge of SQL and DB-API
  • buzhug - non-SQL, file based database with pythonic syntax
  • SQL database - you can use an interface to a DBMS (such as MySQL or PostgreSQL), but this is only worthwhile for larger amounts of data (thousands of records).

You may find some other solutions here.
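
For the SQLite option listed above, a minimal sketch using the standard-library `sqlite3` module might look like the following. The `players` table and its columns are hypothetical, chosen to match the names and speeds mentioned in the question.

```python
import sqlite3

# ":memory:" keeps the example self-contained; pass a file path to persist the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT PRIMARY KEY, speed REAL)")

# Parameter placeholders (?) let the driver handle quoting of the values.
conn.executemany(
    "INSERT INTO players (name, speed) VALUES (?, ?)",
    [("Alice", 12.5), ("Bob", 30.0)],
)
conn.commit()

rows = conn.execute("SELECT name, speed FROM players ORDER BY name").fetchall()
print(rows)
conn.close()
```

Because a database file holds only rows of plain values, loading it does not reconstruct Python objects the way unpickling does.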

贪生不怕死
#5 · 2019-02-08 08:08

You need to give us more context before we can answer: what type of data are you saving, how much is there, how do you want to access it?

As for pickles: they do not store code. When you pickle a function or class, it is the name that is stored, not the actual code itself.
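
A quick way to see this for yourself is to pickle a function and look at the resulting bytes. The sketch below uses `json.dumps` from the standard library only because it is a function written in Python that is always available; any importable function would do.

```python
import json
import pickle

# Pickle a function defined in Python source (json.dumps from the standard library).
payload = pickle.dumps(json.dumps)

# The payload holds the module name and the function name, not the function body.
print(payload)
print(b"json" in payload, b"dumps" in payload)

# Unpickling imports the module and looks the name up again.
print(pickle.loads(payload) is json.dumps)
```

The payload is only a few dozen bytes long. Note that on loading, pickle imports whatever module and name the file refers to, so a file from an untrusted source can still refer to callables you did not intend to use.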

"骚年 ilove
#6 · 2019-02-08 08:13

You could do something like:

to write

  • Pickle
  • Sign pickled file
  • Done

to read

  • Check pickled file's signature
  • Unpickle
  • Use
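
The write and read steps above could be sketched as follows. The answer does not say which signature scheme to use, so this sketch assumes HMAC-SHA256 from the standard library; `SECRET_KEY`, `dumps_signed`, and `loads_signed` are hypothetical names.

```python
import hashlib
import hmac
import pickle

# Hypothetical key; it must be kept away from anyone who can edit the save files.
SECRET_KEY = b"replace-with-a-real-secret"
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def dumps_signed(obj):
    """Pickle obj and prepend an HMAC-SHA256 signature of the pickled bytes."""
    payload = pickle.dumps(obj)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return signature + payload


def loads_signed(blob):
    """Check the signature first; unpickle only if it matches."""
    signature, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("save file signature does not match")
    return pickle.loads(payload)


blob = dumps_signed({"name": "Alice", "speed": 12.5})
print(loads_signed(blob))
```

The check only helps if the key is not available to the person editing the file; a key embedded in source code that ships with the program can be read by anyone who has that source.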

I wonder, though: what makes you think that the data files are going to be tampered with but your application is not?

叛逆
#7 · 2019-02-08 08:13

You might enjoy working with the y_serial module over at http://yserial.sourceforge.net

which reads like a tutorial but operationally offers working code for serialization and persistence. The commentary discusses some of the pros and cons relevant to issues raised here.

It's designed to be a general solution to warehousing compressed Python objects with SQLite (with almost no SQL fuss ;-)

Hope this helps.
