I have been shitposting and trolling a bit too much recently, so to balance things out, here is an effort post.
The duality of the net: somehow "The Internet is Forever," yet it is also very ephemeral.
Our playlists look like this
Hosting sites frequently pull the plug on old content, leading to vast link rot.
The internet might remember your cringy Facebook or racist Twitter posts, but it will easily forget old stop-motion Bionicle videos and all your image uploads from the past decade.
If you like something, SAVE IT.
If you want to share something, HOST IT.
This post will examine and encourage (easy) ways to download, store, and share media you interact with.
This is not comprehensive, but it should serve as a good starting point for most travelers.
ARCHIVING
YouTube Videos:
Videos, playlists, and entire channels can easily be archived with yt-dlp. It comes in command-line and GUI flavors.
yt-dlp (command line): https://github.com/yt-dlp/yt-dlp
yt-dlp-gui (Windows GUI): https://github.com/kannagi0303/yt-dlp-gui/releases
If you ever need help using the CLI version, just ask god.
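For the CLI flavor, a channel or playlist grab might look roughly like the sketch below. The function name archive_channel, the archive/ folder, and the downloaded.txt file are made up for illustration; the flags themselves are regular yt-dlp options.

```shell
#!/bin/sh
# Sketch: archive every video from a channel or playlist URL with yt-dlp.
# archive_channel, archive/ and downloaded.txt are illustrative names.
archive_channel() {
    url="$1"
    # --download-archive: remember finished video IDs so re-runs skip them
    # --write-info-json / --write-thumbnail / --write-description: keep metadata too
    # --ignore-errors: skip deleted or private videos and keep going
    # -o: sort files into uploader/playlist folders
    yt-dlp \
        --download-archive downloaded.txt \
        --write-info-json \
        --write-thumbnail \
        --write-description \
        --ignore-errors \
        -o 'archive/%(uploader)s/%(playlist)s/%(title)s [%(id)s].%(ext)s' \
        "$url"
}

# Usage: archive_channel "https://www.youtube.com/playlist?list=YOUR_PLAYLIST_ID"
```

Running the same command again later only fetches whatever is new, which makes it easy to put on a schedule.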
Entire Websites
You can use wget (CLI) or HTTrack (GUI) to download entire websites. Either can be set to download external resources as well (such as Imgur links) or only things hosted directly on the site.
Code:
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent $URL_HERE
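To also grab assets that live on other hosts (the Imgur case mentioned above), wget has to be told it may leave the starting domain. A rough sketch, where mirror_site is a made-up helper name and the domain list is a placeholder you would adjust per site:

```shell
#!/bin/sh
# Sketch: mirror a site and also fetch assets from an allow-list of other hosts.
# mirror_site and the example domains are illustrative, not part of wget.
mirror_site() {
    url="$1"        # e.g. https://example.com/
    domains="$2"    # comma-separated hosts wget may touch, e.g. example.com,imgur.com
    # --span-hosts lets wget follow links off the starting host;
    # --domains limits that to the listed hosts so the crawl stays bounded.
    # --wait/--random-wait space out requests to go easy on the server.
    wget --mirror --convert-links --adjust-extension --page-requisites --no-parent \
        --span-hosts --domains="$domains" \
        --wait=1 --random-wait \
        "$url"
}

# Usage: mirror_site "https://example.com/" "example.com,imgur.com"
```

Without the --domains allow-list, --span-hosts can wander across half the internet, so keep the list short.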
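Static sites, imageboards, and forums such as this one don't take up a terrible amount of space. IIRC Agora is ~200GB, Lainchan is just 13GB, and sites like textfiles.com less than that.
If a community is ever shut down, having an archive of its content, even if it's a few months old, is incredibly useful in its revival.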
Discords
I very much dislike Discord for its walled-garden, non-indexable bs, but that's where people live. If you want to look back at a Discord's ancient history, it's better to do it with an offline copy than to scroll and load and scroll and load, or to fight Discord's shitty search. Use this: https://github.com/Tyrrrz/DiscordChatExporter
STORING
Local Strategies:
Really, you can't beat just storing on your hard drive.
Data in cold storage (on a disk that stays unplugged for long periods of time) lasts longer on an HDD than on flash storage (SSDs, SD cards, USB sticks).
To store more data on your HDD of choice, compress your shit!
Now, data that is already compressed (like .mp4s or .jpgs) won't get any smaller if you compress it again with WinRAR or whatever, but for websites and other uncompressed media (documents, HTML, uncompressed AVI) you can save a lot of space by compressing. PNGs are already compressed internally, so expect smaller gains there. Use WinRAR or 7-Zip with the default settings to achieve a decent ratio without taking a ton of CPU time, then store that away.
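If you would rather script the compression than click through the GUI, a minimal sketch with 7-Zip's command line could look like this. It assumes the 7z command is on your PATH; pack_folder is just an illustrative name.

```shell
#!/bin/sh
# Sketch: pack a folder into a .7z with 7-Zip's default settings, then test it.
# pack_folder is an illustrative name; assumes the 7z command is installed.
pack_folder() {
    src="$1"             # folder to compress, e.g. a wget mirror
    out="${src%/}.7z"    # mysite/ -> mysite.7z
    # a = add files to an archive, t = test the archive's integrity
    7z a "$out" "$src" && 7z t "$out"
}

# Usage: pack_folder mysite/
```

Testing the archive right after creating it is cheap insurance before you delete the uncompressed copy.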
Cloud Strategies:
An archive isn't very good if it only exists in one place. Bit rot, drive failure, water damage, accidental formatting, or any number of other things can ruin your data hoard.
Upload in multiple places!
If any of your archive is copyrighted material, pack it into an archive file and add a password so hosts can't peep at the data inside.
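A password on its own still leaves the file names readable in some formats. With 7-Zip's .7z format you can hide those too. A small sketch, assuming the 7z command is installed (pack_private is a made-up name):

```shell
#!/bin/sh
# Sketch: create a password-protected .7z whose file list is encrypted as well.
# pack_private is an illustrative name; assumes the 7z command is installed.
pack_private() {
    src="$1"
    out="${src%/}.7z"
    # -p with no value makes 7z prompt for the password (keeps it out of shell history)
    # -mhe=on encrypts the archive headers, so the names inside are hidden too
    7z a -p -mhe=on "$out" "$src"
}

# Usage: pack_private my-collection/
```

Lose the password and the archive is gone for good, so store it somewhere that is not the same cloud account.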
The big three:
MEGA.NZ - 15GB for free users, 5GB allowed to be uploaded per day
Google Drive - 15GB for free users
OneDrive - 5GB for free users
More listed here
Another strat is to upload data as a video for ∞ cloud storage:
View: https://www.youtube.com/watch?v=8I4fd_Sap-g
YouTube may eventually set up detection for this kind of thing, but the same trick should work on other video platforms (VidLii, BitChute, ok.ru, nicovideo.jp).
There are other projects on GitHub that do the same thing if the limitations of this one are annoying
(but you can split your archive into parts, upload the parts into a playlist, and then download the playlist back later, so the limitation is easily circumvented).
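The split-and-rejoin part of that can be done with plain coreutils. A minimal sketch (split_archive and join_archive are made-up names, and the part size is whatever the target platform tolerates):

```shell
#!/bin/sh
# Sketch: cut a big archive into fixed-size parts and glue them back together.
# split_archive / join_archive are illustrative names.
split_archive() {
    file="$1"
    size="$2"    # part size, e.g. 1000m for megabytes or 512k for kilobytes
    # Produces file.part-aa, file.part-ab, ... next to the original
    split -b "$size" "$file" "$file.part-"
}

join_archive() {
    file="$1"
    # The shell expands the glob in sorted order, which matches split's suffixes
    cat "$file".part-* > "$file"
}

# Usage: split_archive hoard.7z 1000m   (upload the parts)
#        join_archive hoard.7z          (after downloading them all back)
```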
One thing to keep in mind WHENEVER using a cloud service is that your data and account can be purged at any time. Do not rely on just one service.
A good strat is to have data stored on an HDD that's plugged into your main desktop, on a large portable HDD that's occasionally plugged into your desktop, and on a cloud service.
Check your backups regularly. Every other week is plenty to make sure your data hasn't been surprise purged.
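One way to make that check more than eyeballing file names is to keep a checksum list next to the data and re-verify it on each copy. A sketch assuming GNU coreutils' sha256sum is available (on macOS, shasum -a 256 is the usual stand-in); the function names and the SHA256SUMS file name are just illustrative:

```shell
#!/bin/sh
# Sketch: record checksums once, then re-verify a backup copy later.
# make_manifest / check_manifest and SHA256SUMS are illustrative names;
# assumes GNU coreutils' sha256sum is installed.
make_manifest() {
    # Hash every file under the folder, skipping the manifest itself
    ( cd "$1" && find . -type f ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS )
}

check_manifest() {
    # Exits non-zero if any listed file is missing or no longer matches
    ( cd "$1" && sha256sum -c SHA256SUMS )
}

# Usage: make_manifest /path/to/hoard     (once, when the archive is made)
#        check_manifest /path/to/backup   (on each copy, every couple of weeks)
```

This catches silent bit rot as well as missing files, which a simple directory listing would not.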
SHARING
Any data worth saving is worth sharing. If you are utilizing cloud services it's easy enough to link to where it's being stored, but if you're keeping things local...
BitTorrent
Pretty much every torrent client supports making a torrent. Just add a public tracker such as udp://tracker.openbittorrent.com:6969, then seed and share (this will ofc require your machine to be on for people to download).
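If you would rather build the .torrent from a terminal than from a client's "create torrent" dialog, the mktorrent tool is one option. A sketch assuming mktorrent is installed; make_torrent is a made-up name and the tracker is the one mentioned above:

```shell
#!/bin/sh
# Sketch: build a .torrent for a folder from the command line.
# make_torrent is an illustrative name; assumes the mktorrent tool is installed.
make_torrent() {
    src="$1"
    # -a sets the announce (tracker) URL, -o names the output .torrent file
    mktorrent -a udp://tracker.openbittorrent.com:6969 -o "${src%/}.torrent" "$src"
}

# Usage: make_torrent my-archive/
```

You still need to load the resulting .torrent into your client and point it at the existing data so it starts seeding.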
IPFS
Similar to BitTorrent, you can share files over IPFS. This can be useful if the person you're trying to share with has P2P traffic blocked, as IPFS supports public web gateways for hosted content. It's a neat protocol (it also requires your machine to be on for others to download).
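With the ipfs command-line client, sharing a folder and getting a gateway link could look like the sketch below. It assumes the ipfs CLI is installed and your node is running; share_ipfs is a made-up name, and ipfs.io is just one of the public gateways.

```shell
#!/bin/sh
# Sketch: add a folder to IPFS and print a public-gateway link for it.
# share_ipfs is an illustrative name; assumes the ipfs CLI is installed.
share_ipfs() {
    # -r adds the folder recursively, -Q prints only the final (root) CID
    cid=$(ipfs add -r -Q "$1") || return 1
    echo "https://ipfs.io/ipfs/$cid"
}

# Usage: share_ipfs my-archive/
# Keep `ipfs daemon` running so the gateway can actually fetch the data from you.
```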
Webserver
BitTorrent is not as efficient as a direct server-to-client protocol, but many ISPs don't allow port forwarding on residential networks. However, some VPN providers, like Proton, allow port forwarding (Mullvad used to, but removed the feature in 2023, so check what your provider currently offers). Forward a port through your VPN, download Apache, edit httpd.conf so "Listen 80" becomes "Listen $Forwarded_Port", drop the files you want to share into the htdocs folder, and run httpd.exe. You can now share your files directly via an IPv4 link.
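The httpd.conf edit is a one-line change, so it can be scripted if you have a Unix-style shell handy (Git Bash or WSL on Windows). A sketch only: set_listen_port is a made-up name and the port number is a placeholder for whatever your VPN hands you.

```shell
#!/bin/sh
# Sketch: point Apache at the port your VPN forwarded.
# set_listen_port is an illustrative name; the port value is a placeholder.
set_listen_port() {
    conf="$1"
    port="$2"
    # Rewrite only the exact "Listen 80" line (trailing whitespace tolerated),
    # writing to a temp file first and then swapping it in.
    sed "s/^Listen 80[[:space:]]*\$/Listen $port/" "$conf" > "$conf.new" \
        && mv "$conf.new" "$conf"
}

# Usage: set_listen_port conf/httpd.conf 51413
# Then restart httpd and test from outside your network with the forwarded port.
```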
Further Reading
I'll stop there, as this is supposed to be for the everyman, and hosting a webserver, even without getting into router port forwarding and DNS n shit, is still technical. But it's plenty cheap to get a dedicated device (Raspberry Pi or shitty Craigslist PC) to be a NAS, seedbox, and webserver.
https://old.reddit.com/r/DataHoarder/wiki/index
The DataHoarder sub is a great resource for information and frequently leads community efforts to archive en masse (such as their Imgur archival effort).
awesome-datahoarding is a good resource for tools (imageboard thread archiving, Instagram scraping, Telegram ripping, etc.): https://github.com/simon987/awesome-datahoarding
If anyone here considers themselves a data hoarder or internet archivist, share your set-ups, tools, and what your targeted media is! If you're interested in getting started, ask any questions you have :D