I have a strange sense of dread when I create data

manpaint

̴̘̈́ ̵̲̾ ̸̯̎ ̴͓̀ ̸̳͝ ̸͈͑ ̴̡̋ ̸̞̂ ̴̰̚ ̵̨̔ ̸̭̎
Gold
Joined
Aug 11, 2022
Messages
899
Reaction score
1,671
Awards
200
Website
manpaint.neocities.org
I think I have found a way to negate my dread to a certain extent.

Earlier this week, I tried some Linux distros and discovered that squashfs compression was a thing. I soon discovered how bloody efficient it was, and I wondered if there was a way to achieve a similar level of compression on Windows.

Thus, I spent all my time researching compression formats, and I made an important discovery that blew my mind.

As it turns out, by default 7-Zip and WinRAR do not account for duplicate files in an archive (regardless of archive format). For instance, if you had an archive with two binary-identical 100 MB files, you would get an archive weighing close to 200 MB.

Before doing my research I kinda assumed that this was the whole point of compressing files, but apparently it does something different.
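
A minimal sketch of that behaviour, using Python's standard `zipfile` module rather than 7-Zip or WinRAR themselves. Zip compresses each member on its own, so a second identical copy costs as much as the first. The file names and the random payload (standing in for an already-compressed asset) are made up for the illustration.

```python
import io
import os
import zipfile


def zip_size(files: dict[str, bytes]) -> int:
    """Return the size in bytes of an in-memory .zip holding the given files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return len(buffer.getvalue())


# Hypothetical asset: random bytes barely compress, like most packed game art.
asset = os.urandom(200_000)

one_copy = zip_size({"v1/asset.bin": asset})
two_copies = zip_size({"v1/asset.bin": asset, "v2/asset.bin": asset})

# The second, byte-identical copy roughly doubles the archive.
print(f"one copy: {one_copy} bytes, two copies: {two_copies} bytes")
```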

After messing around with 7-Zip, I have found the most efficient compression settings, behold:

Optimal compression settings.PNG


As it turns out, the best way to compress files is to make a .7z archive with the parameter "qs". This accounts for duplicate files and thus reduces the data to its simplest expression without any form of data loss.
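
For anyone who prefers the command line to the GUI, here is a hedged sketch of the same idea. It assumes the `7z` executable is on your PATH. The `-mqs=on` switch is the command-line spelling of the "qs" parameter (sort files by type in solid archives). The `-mx=9` and `-ms=on` switches are assumptions about the other settings, since the screenshot is not reproduced here. The archive and folder names are placeholders.

```python
import subprocess


def build_7z_command(archive: str, folder: str) -> list[str]:
    """Build a 7z command line for a solid .7z archive with the qs parameter."""
    return [
        "7z", "a",    # add files to an archive
        "-t7z",       # .7z container
        "-mx=9",      # assumed: ultra compression level
        "-ms=on",     # assumed: solid mode, files share one compressed stream
        "-mqs=on",    # the "qs" parameter from the post
        archive,
        folder,
    ]


def create_archive(archive: str, folder: str) -> None:
    """Run 7z and raise if it reports an error."""
    subprocess.run(build_7z_command(archive, folder), check=True)


print(" ".join(build_7z_command("backup.7z", "game_versions")))
```

Calling `create_archive("backup.7z", "game_versions")` would run the command for real.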

This means that all the compressed files I have ever made were bloody inefficient.

This can be EXTREMELY useful. For instance, on my computer I had a backup of multiple versions of a given game. As the versions only differed by bug fixes, most assets were repeated across multiple folders. Using this compression, the archive went from 3 GB to 1 GB.
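
If you want to know in advance how much a folder could shrink this way, a rough sketch is to hash every file and count each distinct content only once. This only measures whole-file duplicates and says nothing about how well the unique data compresses. The function names are made up for the example.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 of a file, read in 1 MiB chunks."""
    sha = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest()


def duplicate_report(root: Path) -> tuple[int, int]:
    """Return (total_bytes, unique_bytes) for all files under root."""
    total = 0
    unique = 0
    seen: set[str] = set()
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        size = path.stat().st_size
        total += size
        digest = file_digest(path)
        if digest not in seen:  # first time this exact content shows up
            seen.add(digest)
            unique += size
    return total, unique
```

Running `duplicate_report(Path("game_versions"))` on a folder of game versions (a placeholder path) gives the raw size and the size with every repeated file counted once.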

This changes everything for me, as I have a lot of duplicate files that I wish to keep (this essentially allows me to do something similar to version control, but with raw assets). The fact that this setting is not the default baffles me.

The only downside to this method is that adding files to an existing archive will not take duplicates into account. That being said, the nice thing about this method is that the archive is directly readable (unlike something like tar.gz). It is the closest you can get to squashfs compression on Windows (being extremely close in terms of efficiency). There is also the added bonus of it being OS-agnostic.

---------

Here are some results I obtained during this research.

This research was conducted on two versions of a game, with the executable being the only difference between the two folders (both folders had the full art assets, which were hex-identical):

-Individual zipped archives are 575 MB (original folders)

-Combined zipped archive is 575 MB
-Combined tar file is 586 MB
-Combined 7zip file is 566 MB (directly readable)
-Combined rar file is 573 MB
-Combined tar.xz file is 567 MB (directly readable) (normal settings)
-Combined tar.gz file is 574 MB (not directly readable)
-Combined tar.xz file is 567 MB (directly readable) (ultra settings)
-Combined rar5 file is 573 MB (directly readable)
-Combined tar.xz file is 567 MB (directly readable) (128 MB dictionary)
-Combined .squashfs is 402 MB (directly readable)
-Combined .iso file is 587 MB (indirectly readable)
-Combined .tar.gz file is 574 MB (with shared file detection)
-Combined .tar file is 586 MB (with shared file compression)
-Combined 7zip file is 405 MB (directly readable) (with "qs" parameter)

As you can see, the difference between 7-Zip with qs and squashfs versus everything else is like night and day. Using these methods, art assets are only stored once, meaning no file has any duplicates.
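
A possible explanation, offered as an assumption rather than something the tests above prove: an LZMA-style compressor only "sees" a repeat if the earlier copy is still inside its dictionary window, so duplicates have to sit close together in the stream. The sketch below uses Python's standard `lzma` module (not 7-Zip itself) on made-up random data to show a second copy being almost free when the window is large enough, and costing full price when it is not.

```python
import lzma
import os


def xz_size(data: bytes, dict_size: int) -> int:
    """Compress data with LZMA2 using the given dictionary size; return output size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters))


# Hypothetical 256 KiB asset, stored twice back to back.
asset = os.urandom(256 * 1024)
two_copies = asset + asset

# Window smaller than the distance between the copies: the repeat is invisible.
small_window = xz_size(two_copies, dict_size=64 * 1024)
# Window larger than that distance: the second copy becomes a back-reference.
large_window = xz_size(two_copies, dict_size=1024 * 1024)

print(f"64 KiB window: {small_window} bytes, 1 MiB window: {large_window} bytes")
```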

This method is essentially "compiling instructions" that refer to the original files. This means I can essentially command 7-Zip to recreate any version of the game I want using a minimal amount of data.
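
To make the "instructions that refer to the original files" idea concrete, here is a toy sketch. It is not how 7-Zip stores data internally. It is a hypothetical content-addressed store where each version is just a list of paths pointing at blobs, and each distinct blob is kept once. The class and method names are invented for the example.

```python
import hashlib


class VersionStore:
    """Toy store: unique blobs keyed by hash, plus one manifest per version."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}
        self.manifests: dict[str, dict[str, str]] = {}

    def add_version(self, name: str, files: dict[str, bytes]) -> None:
        """Record a version; content already in the store is not stored again."""
        manifest: dict[str, str] = {}
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(digest, data)
            manifest[path] = digest
        self.manifests[name] = manifest

    def restore(self, name: str) -> dict[str, bytes]:
        """Rebuild every file of a version from its manifest."""
        return {path: self.blobs[digest] for path, digest in self.manifests[name].items()}

    def stored_bytes(self) -> int:
        """Total size of the unique blobs actually kept."""
        return sum(len(blob) for blob in self.blobs.values())
```

Adding two versions that share their art assets stores the assets once, and `restore("1.0")` or `restore("1.1")` still gives back each version in full.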

I hope my explanation was clear. I might showcase a concrete example tomorrow because this is an extremely good thing, especially for archivists.
 

7Pebbles

Enemy of the Digital Panopticon
Joined
Jul 25, 2022
Messages
103
Reaction score
285
Awards
48
This is probably going to sound very weird, but I am curious what your thoughts are on this. I have never seen this sentiment described anywhere else, but I am curious if I am truly alone with this.

I have a strange sense of dread when I create data.

For context, I use a computer on a daily basis. I am actively working on multiple personal projects which all include the creation of data to some degree.

Lately, I have been thinking about the concept of creating data and its implications. Every day I work, a lot of data is created (I consider a significant amount to be 500+ MB) and it seems that there is no end to it.

I am not sure why, but I feel all of this is wrong. My mind puts creating data in the same category as "consuming stuff irresponsibly" for some reason.

While the HDD on my main computer is somewhat limited, I have plenty of external disk drives I can use, so I don't think this feeling stems from a lack of space.

I recently tried some Linux distros for funsies (just downloading an ISO is enough to trigger the feeling of dread, as it creates data on my local HDD). I found a distribution called Linux Elive - it is designed to run well on old computers and it has plenty of software installed by default.

Like most Linux distros, it came with a Live mode. For those unfamiliar, it is essentially an extensive demo version that runs entirely in the RAM of your PC. Upon trying this live mode and testing some stuff in it, I got the opposite feeling of "data creation dread", which I will refer to as "Bliss" for simplicity.

It's hard to describe how I felt, but it's like that feeling when you discover the potential of something. When testing features, I got the feeling that it could solve a part of the "creating data" problem if it was given a "catalyst" - something static I could enjoy without creating data. Truth be told, it felt somewhat "holy".

This led to the idea that entertainment inevitably has to create data and thus is "wrong". The catalyst idea harkens back to my sustainable entertainment thread. The consensus was that there is no true way to get infinite entertainment, the closest being creating art and talking to other people.

The problem is that creating things obviously creates data. Artists on the Agora Road likely know that when you create something, a lot of data will likely be created just to make the final product. An obvious example of this is the creation of a video game: the "source folder" containing all the data used in the creation of the game far outweighs the final compiled game.

This goes into the idea that entertainment itself could be wrong. Truth be told, I have avoided having fun for most of my life because having fun is "not contributing to society". I was heavily under the influence that "to have fun is to consume" and that consuming in itself is selfish and sinful by nature unless you take the bare minimum of what you need.

That being said, most of what constitutes my idea of "contributing to society" involves the creation of data, which is apparently wrong.

At first I thought that obtaining sustainable entertainment would be the most "pure and noble" virtue, but since the only real option would be to create art, I am not so sure anymore.

I feel guilty for just using a computer and creating files. It just feels wrong, almost as if I am committing some sort of sin.

I am not sure what to believe anymore.
I find my mind focusing more on the data that I create in terms of what I'm giving to third parties. I don't mind saving data or creating my own for my own personal use, but I do dislike giving my data to third parties that I don't know. I sometimes find myself unsettlingly aware of the amount of information that one gives away in just simple actions online. On a normal Windows computer using something like Google Chrome, websites you use regularly can build a LARGE profile on you even without you ever once creating an account or explicitly telling them about yourself. Where you live, what you like, what you look like, how much money you make, how old you are, when you poop, when you sleep, your politics, all sorts of things. If you really want to get paranoid, read about QubesOS with Whonix, read op-sec guides, read about the Intel Management Engine. So many people give away so much data without realizing.
Creating data, to me at least, isn't a problem. It's fighting entropy. You're organizing the universe and its energy into patterns that are useful. Your DNA is data. The code I wrote yesterday keeps multiple companies in business and helps me provide for myself and the people I love. Information is the base currency of reality. The important part to me is that I am conscious of who is getting what data and what they're doing with it.
A word of caution: You can take the plunge into op-sec. You can use every privacy-oriented piece of software known to man. But the best way to remain private is to not use technology. So I recommend that you use Linux and harden your Firefox with something like the arkenfox user.js and uBlock Origin. Consider a VPN (although be aware that it just shifts the burden of trust). Use KeePassXC. Research privacy-oriented search engines. Prefer open source software. That's probably enough to give most trackers a hard time and dramatically decrease your online data footprint.
 

Andy Kaufman

i know
Joined
Feb 19, 2022
Messages
1,184
Reaction score
4,796
Awards
209
For me it's almost the opposite. I already lost some photos and videos forever because I didn't bother enough to save them properly.
The short shelf life of our storage media was already discussed.
There used to be this new glass laser engraving method which could store huge amounts of data for basically eternity by making tiny indents in a glass cube, even INSIDE the cube.
IIRC Microsoft bought the firm developing this and it's been in limbo for a few years.

edit:
I think there's one instance where I can relate to OP: I hate taking screenshots with my phone because they just clog up space after a while and usually only 20% of the screenshot is what I wanted to save. Cropping it creates a copy of the image, smaller but still more data, instead of overwriting it (Android).
On my PC I have a folder called "temporary" where I have a task running that checks for files older than a month and deletes them. This is good for stuff you know you won't need later and helps cut back on "waste".
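
A minimal sketch of what such a cleanup task could look like, assuming "older than a month" means a modification time more than 30 days in the past. The function name, the 30-day default and the folder path are placeholders, and the poster's actual setup is not described in the thread.

```python
import time
from pathlib import Path
from typing import Optional


def purge_old_files(
    folder: Path, max_age_days: float = 30.0, now: Optional[float] = None
) -> list[Path]:
    """Delete files under folder whose modification time is older than max_age_days."""
    reference = time.time() if now is None else now
    cutoff = reference - max_age_days * 86400  # 86400 seconds per day
    removed: list[Path] = []
    # Collect the paths first so nothing is deleted while the tree is being walked.
    for path in list(folder.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Scheduling something like `purge_old_files(Path.home() / "temporary")` with Task Scheduler or cron would reproduce the behaviour described above. Since it deletes files for good, it is worth trying on a test folder first.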
 

HellManMayo

Please be patient, I have autism.
Joined
Apr 24, 2022
Messages
78
Reaction score
606
Awards
60
I can't agree about data creation from a sustainability standpoint, but when it comes to actually usable data, information overload is a legitimate problem everyone struggles with. We create data about songs to download or Chinese torrents of games from 2010, and then we lose it because there's no good system for organizing and presenting everything you want after you save it. There's also the appeal of minimalism (optimize your system so only the things you need are saved), which leaves you more juicy hard drive space.

I just learned that there's an entire field in academia called visual analytics/cognitive sciences that's trying to solve this problem. Right now the best methods on the market are:
1. better design of digital interfaces so that it highlights what you need
2. intelligent agents (AI) sort through immense data and give you the stuff you want

Right now method 2 isn't really feasible, because machine learning requires a lot of processing power and it's often wrong without human supervision.
 

manpaint


I am more concerned about the data on my hard drive than about privacy, but thanks for this information. I have made a copy of your post in my archives and I might look into some of the things you mentioned.

On my PC I have a folder called "temporary" where I have a task running that checks for files older than a month and deletes them. This is good for stuff you know you won't need later and helps cut back on "waste".

I have something similar set up, although it is not automated.

I just learned that there's an entire field in academia called visual analytics/cognitive sciences that's trying to solve this problem. Right now the best methods on the market are:
1. better design of digital interfaces so that it highlights what you need
2. intelligent agents (AI) sort through immense data and give you the stuff you want

Right now method 2 isn't really feasible, because machine learning requires a lot of processing power and it's often wrong without human supervision.

I personally wouldn't trust an AI to decide which files to keep. This sounds like a disaster waiting to happen to me. The only way I could see this being useful would be for temporary operating system files, but I am pretty sure the management of those files is already automated to some degree.
 

Andrew Eldritch

Definitely Not Goth
Joined
Sep 22, 2022
Messages
35
Reaction score
69
Awards
16
This is what concerns me as well. Cataloguing someone's Google search queries is like reading their mind. You google things you wouldn't talk to your best friends about.

For me it's almost the opposite. I already lost some photos and videos forever because I didn't bother enough to save them properly.
The short shelf life of our storage media was already discussed.
I just got a CD player and I tried to listen to some of my CDs from way back. Unfortunately they have all degraded and keep skipping and stuttering. I thought they would keep longer... I have vinyl records from my grandparents that are still fine. For music, vinyl seems to be the most stable, consumer-friendly format if you want to keep your collection for your whole lifetime.

I also make a point of collecting books I like on paper, even if they are shitty paperbacks from the thrift store (actually, especially then). I don't want to rely on digital media for the things that are most important to me. Unfortunately this does not lead to a minimalist lifestyle.
 

nsequeira119

DNW Expert
Joined
Jan 11, 2024
Messages
297
Reaction score
469
Awards
82
Website
tinyurl.com
This is an interesting viewpoint. I'm sure you're not the only one who feels like this. It would be sort of close-minded to view the creation of data only as a net positive, as there are undoubtedly potential downsides to it. Some mediums, for instance, take up more data than others. Raw text takes up less data than a picture, and in turn a 3D render takes up more data than a picture. So there are comparisons to be made. If creating data bothers you, maybe try creating the least data possible and seeing how that goes.

I've made an entire universe effectively based on the premise of excessive data preservation, and whether or not humans should even be preserving so much data. There comes a point where you start to wonder whether most of the data we currently possess is useless junk. There is a whole lot of empty, vapid crap which we might be better off without. Or we might not. That's the dilemma, really.

Anyway, as an atheist, I don't believe in sin or whatever, nor do I believe you necessarily have to contribute to society. Do whatever feels right for you, delete a couple of large files on your hard drive if it feels healthy (it could even feel liberating), and otherwise don't worry too much over it. Hard drives, like all things, are subject to entropy, and your data will eventually decay no matter what, so enjoy it while it lasts.
 