this post was submitted on 02 Nov 2024
35 points (94.9% liked)

No Stupid Questions


Background: I am working on a Python project where, given a set of input files (text/image/audio), it generates an executable game. The text files are there to describe the rules of the game.

Currently, the program reads and parses the files upon each startup, and builds a Python class that contains these rules, as well as links to image/audio files. This is fine for now, but I don't want the end executable to have to bundle these files and re-parse them each time it gets run.

My question: Is there a way to persist the instance of my class to disk, as it exists in memory? Kind of like a snapshot of the object. Since this is a Python project, my question is specific to Python. But, I'd be curious if this concept exists anywhere else. I've never heard of it.

My aim is not to serialize/de-serialize the class to a text file, but instead load the 1's and 0's that existed before into an instance of a class.

top 11 comments
[–] solrize@lemmy.world 39 points 1 week ago

The quick answer is to use a serialization/deserialization library like pickle. You can't just dump a binary image and reload it in any simple way.
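For reference, a minimal sketch of what that round trip could look like. `GameRules`, its fields, and the file name are hypothetical stand-ins for the parsed rules object from the question, not anything from the actual project.

```python
import pickle
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class GameRules:
    """Hypothetical stand-in for the object built by parsing the input files."""
    title: str
    max_players: int
    assets: dict = field(default_factory=dict)  # asset name -> file path


def save_rules(rules: GameRules, path: Path) -> None:
    # Serialize the whole object graph to a binary file.
    with path.open("wb") as fh:
        pickle.dump(rules, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_rules(path: Path) -> GameRules:
    # Only unpickle files you produced yourself.
    with path.open("rb") as fh:
        return pickle.load(fh)


# Usage: write the snapshot once at build time, load it at startup.
with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "rules.pickle"
    original = GameRules("Demo", 4, {"logo": "img/logo.png"})
    save_rules(original, snapshot)
    restored = load_rules(snapshot)
    print(restored == original)
```

Note that this is still serialization: pickle rebuilds a new object from the bytes rather than mapping the old memory back in.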

[–] CameronDev@programming.dev 22 points 1 week ago

You are describing pickle, but it does come with some serious risks, especially if the file can be modified by a third party.

https://arjancodes.com/blog/python-pickle-module-security-risks-and-safer-alternatives/

I'd suggest using protobuf or similar instead, but it's a bit more work.
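To make the risk concrete, here is a small, harmless sketch. The class name is made up, and the payload only evaluates `6 * 7`, but the same mechanism would let a tampered file name any importable callable.

```python
import pickle


class LooksLikeData:
    """Hypothetical class whose pickle runs code when it is loaded."""

    def __reduce__(self):
        # pickle stores "call this callable with these arguments" and the
        # loader obeys. eval("6 * 7") is harmless; an attacker would not be.
        return (eval, ("6 * 7",))


payload = pickle.dumps(LooksLikeData())

# Loading does not give back a LooksLikeData instance. It gives back
# whatever the embedded call returned.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

Data-only formats such as JSON or protobuf avoid this particular problem because the file describes values, not callables to invoke.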

[–] UnfortunateShort@lemmy.world 13 points 1 week ago (1 children)

I think pickle is what you want.

Keep in mind that this might have a huge performance impact if you do it all the time - it's still IO even when it's not parsing.

[–] spacemanspiffy@lemmy.world 4 points 1 week ago (1 children)

My idea would be to load one larger file once, without parsing anything, and keep it in memory the entire time. That's versus what it does now, which is load the files, parse them, and keep everything in memory.

But three people have responded here so far with "pickle", so maybe that is the way.

[–] UnfortunateShort@lemmy.world 1 points 1 week ago

You can stuff all the info into an object and use it this way, no problem. I just wanted to point out that this doesn't have zero performance impact compared to what you currently have.

So (depending on how your OS caches files) you might not want to do this like twice in a lambda that you pass to an iterator over a huge slice or something.

[–] GBU_28@lemm.ee 10 points 1 week ago

Pickle.

But, depending on your needs, writing to SQLite can be blazing fast, and you could store your data as a BLOB.
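A rough sketch of the SQLite route, assuming pickled bytes stored in a BLOB column. The table name, column names, and helper functions are invented for illustration.

```python
import pickle
import sqlite3


def open_store(path=":memory:"):
    # ":memory:" keeps the demo self-contained; pass a file path to persist.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "name TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def put(conn, name, obj):
    blob = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO snapshots (name, data) VALUES (?, ?)",
            (name, blob),
        )


def get(conn, name):
    row = conn.execute(
        "SELECT data FROM snapshots WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return pickle.loads(row[0])


conn = open_store()
put(conn, "rules", {"title": "Demo", "max_players": 4})
print(get(conn, "rules"))
```

The same pickle caveats apply to the blob contents, so this only makes sense for a database you control.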

[–] WolfLink@sh.itjust.works 6 points 1 week ago

What is the “executable” in this context? I’m kinda confused as to what you are looking for.

What’s wrong with parsing the input files at runtime? Is it performance? Do you want one file to load instead of multiple?

Many have suggested pickle, which is kinda what you are asking for, but on some level it’s not much different from parsing the input files. Also, depending on your code, you may have to write custom serialization code as part of getting pickle to work.
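As an example of that custom serialization code: a sketch of a class holding something pickle can't handle (a lock here), using the standard `__getstate__`/`__setstate__` hooks. `AssetCache` and its attributes are hypothetical.

```python
import pickle
import threading


class AssetCache:
    """Hypothetical object with one attribute that cannot be pickled."""

    def __init__(self, paths):
        self.paths = dict(paths)
        self._lock = threading.Lock()  # locks are not picklable

    def __getstate__(self):
        # Copy the instance dict and drop what cannot be serialized.
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the plain data, then rebuild the runtime-only parts.
        self.__dict__.update(state)
        self._lock = threading.Lock()


cache = AssetCache({"logo": "img/logo.png"})
restored = pickle.loads(pickle.dumps(cache))
print(restored.paths)
```

Without the two hooks, `pickle.dumps(cache)` raises a `TypeError` because of the lock.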

Note that pretty much every modern game is a bundle of often multiple pieces of executable code alongside a whole bunch of separate assets.

[–] muntedcrocodile@lemm.ee 2 points 1 week ago

Not anything more efficient than just serialising and deserialising the data you want to load.

I'm sure you could use pickle to store the entire object graph, but that opens you up to all kinds of exploits (arbitrary code execution, etc.).

[–] jerkface@lemmy.ca 2 points 1 week ago

I took a closer look at what you are asking for and no, you cannot hand a reference to a python structure to a library and have it write the binary data from memory out to disk, then read that same binary data back into living Python instances later. That's just not how Python works. For one thing, any such structure is full of pointers which would be invalid unless you re-load to the same address in memory, which is not practical. You have to serialize and de-serialize.

[–] jerkface@lemmy.ca 2 points 1 week ago

The Zope Object Database (aka ZODB) exists for more complex persistence use cases. It's been a long time, though; there are probably more modern options.

[–] kevincox@lemmy.ml 1 points 1 week ago

I don’t want the end executable to have to bundle these files and re-parse them each time it gets run.

No matter how you persist data you will need to re-parse it. The question is really just if the new format is more efficient to read than the old format. Some formats such as FlatBuffers and Cap'n Proto are designed to have very efficient loading processes.

(Well technically you could persist the process image to disk, but this tends to be much larger than serialized data would be and has issues such as defeating ASLR. This is very rarely done.)

Lots of people are talking about pickle, but it isn't particularly fast. That being said, with Python you can't expect much to start with.
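To illustrate the "efficient to read" idea behind formats like FlatBuffers and Cap'n Proto, here is a sketch using only the standard `struct` module. This is not either of those libraries, and the record layout (rule id, max players, time limit) is made up. The point is that with a fixed layout you can jump straight to one record without decoding the rest.

```python
import struct

HEADER = struct.Struct("<I")    # little-endian record count
RECORD = struct.Struct("<HHf")  # hypothetical: rule id, max players, time limit


def pack_records(records):
    # Header followed by fixed-size records, back to back.
    buf = bytearray(HEADER.pack(len(records)))
    for rec in records:
        buf += RECORD.pack(*rec)
    return bytes(buf)


def read_record(buf, index):
    # Compute the offset and decode only the requested record.
    (count,) = HEADER.unpack_from(buf, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    return RECORD.unpack_from(buf, HEADER.size + index * RECORD.size)


blob = pack_records([(1, 4, 30.0), (2, 2, 12.5)])
print(read_record(blob, 1))
```

`unpack_from` accepts any bytes-like buffer, so the same function works on a memory-mapped file, which is about as close as plain Python gets to loading the bytes as they are.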