AI/Machine Learning as a way of "declassifying" documents?

SolidStateSurvivor

This is Extremely Dangerous to Our Democracy
Joined
Feb 15, 2022
Messages
1,125
Reaction score
5,423
Awards
246
Website
youtuube.neocities.org
There's a comment that has stuck with me for a few months now. I saw it in speculation about the nature of a raid conducted on an Area 51 website admin. It was something to the effect that enough partial information exists out there, in documents and on the net, to piece together the bigger picture of classified intel. The speculation was that the site admin may have been getting too thorough in his research, and that the raid was conducted as a warning of sorts. I'm unsure of just how true it is, but it got me thinking.

Given how far the AI/machine learning field has come, where we have technology like ChatGPT that can write essays with accurately cited sources, would it be theoretically possible to create such a program and feed it enough primary source material for it to connect the dots and fill in censored/partially blacked-out documents? There are plenty of primary documents out there related to shady happenings, but a shortage of manpower to thoroughly comb through it all.

So what if the human element was removed? Instead, the AI is tasked with researching all available material, and then does its best to fill in the redacted info and provide sources elaborating on its speculation.

Is this at all feasible? Could it yield accurate information in place of redactions and usher in a new era of information transparency?
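
To make the idea concrete, here's a very rough sketch of what "filling in a blank from other documents" could look like at its simplest. It is not how ChatGPT works; it just counts which words show up next to the same neighbouring words elsewhere in a corpus. The function name, the `[REDACTED]` marker, and the scoring are all assumptions made up for illustration.

```python
from collections import Counter

REDACTED = "[redacted]"  # hypothetical marker for a blacked-out word


def _tokens(sentence):
    """Lowercase, split on whitespace, and strip trailing punctuation."""
    return [w.strip(".,;:!?") for w in sentence.lower().split()]


def guess_redaction(corpus, redacted_sentence, top_n=3):
    """Rank corpus words that appear next to the same neighbours as the blank."""
    tokens = _tokens(redacted_sentence)
    i = tokens.index(REDACTED)
    left = tokens[i - 1] if i > 0 else None
    right = tokens[i + 1] if i + 1 < len(tokens) else None

    votes = Counter()
    for sentence in corpus:
        words = _tokens(sentence)
        for j, word in enumerate(words):
            prev_word = words[j - 1] if j > 0 else None
            next_word = words[j + 1] if j + 1 < len(words) else None
            # One point per matching neighbour (left and/or right).
            score = int(prev_word == left) + int(next_word == right)
            if score:
                votes[word] += score
    return votes.most_common(top_n)
```

The output is only a ranked list of guesses with vote counts. Nothing in it says whether the top guess is actually what sits under the black bar.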
 

LostintheCycle

Formerly His Holelineß
Joined
Apr 4, 2022
Messages
963
Reaction score
3,818
Awards
245
Maybe you could try something like that, but a couple problems popped into my head.
1. What if people use cherry-picked primary sources or even falsified sources to nudge the 'truth' into what they believe? People with particular bias would certainly do this, and you're left with the task of having to pick through all sources yourself, so the problem of having to comb through all sources manually is not solved.
2. Ultimately, how would you verify whether it is true or not? After declassification you still won't be able to reach a definite answer. There's still some amount of uncertainty even with the above problem put aside, but it is so much worse because you have to trust where this 'declassified document' comes from.
My opinion is that this will only serve to make the waters muddier.
 

Outer Heaven

Stranger in a strange land
Bronze
Joined
Oct 25, 2021
Messages
781
Reaction score
5,621
Awards
230
Damn, this is an interesting as hell idea. There are 2 immediate flaws I can see with this though. The first is that governments will quickly catch on and find a way to ban you from doing it if it's effective. Kinda like how every page you ever print has a way to track you, the government can find a way to tag documents so they can't be declassified this way. The second issue is that this might work for filling in some words, but it will not work for things like names of people or certain places, which is what redactions are primarily composed of. The AI will have no way of guessing which general authorized the human testing, or where it took place if the document refers to the location in code words.
 

eve

Professional Loser
Bronze
Joined
Feb 16, 2022
Messages
233
Reaction score
1,032
Awards
108
Website
evvv.org
cool idea, but i think itzz important 2 remember that ai cant rlly "fill in the blanks" and it wouldnt be any more reliable than any humans guess. that and (at least 4 now) ai like chatgpt is only as intelligent as the models and source material it haz been trained on :p
 

SolidStateSurvivor

LostintheCycle said:
"1. What if people use cherry-picked primary sources or even falsified sources to nudge the 'truth' into what they believe? People with particular bias would certainly do this, and you're left with the task of having to pick through all sources yourself, so the problem of having to comb through all sources manually is not solved."
Most research has this as a potential issue. Even if one looks at all the documents, they will inevitably be swayed one way or the other and stick with a conviction. I trust the AI to remain a bit more impartial if it were fed everything.


LostintheCycle said:
"2. Ultimately, how would you verify whether it is true or not? After declassification you'll still not be able to reach a definite answer, there's still some amount of uncertainty even with the above problem put aside, but it is so much worse because you have to trust where this 'declassified document' comes from."
Valid point, it does indeed boil down to educated guesses. Even if it can't 100% nail declassifying, it could at least piece together public information to give detailed, supported answers. Personally, I've never fully trusted official government information releases, but they're the best one has for this sort of thing. They're not perfect, but the truth is in there, somewhere.

Outer Heaven said:
"The first is that governments will quickly catch on and find a way to ban you from doing it if it's effective."
There's nothing illegal about this, but glowies can plant cheese pizza and send you away to be merc'd in jail anytime they want. A possibility, but one could operate in relative secrecy if they know what they're doing.

Outer Heaven said:
"The ai will have no way of guessing which general authorized the human testing or where it is if the document refers to the location in code words."
It could theoretically be possible to get a good background on individuals. For instance, with the human testing example you gave, it could narrow the list down to generals with experience in biology/medicine.
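
As a rough sketch of that narrowing-down step, assuming someone has already compiled background records on the individuals: the `Candidate` record, its fields, and the placeholder names in the usage lines are all hypothetical, and no real data is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical background record for one individual."""
    name: str
    fields: frozenset  # areas of experience, e.g. {"medicine"}
    active_from: int   # first year in a relevant post
    active_to: int     # last year in a relevant post


def narrow_down(candidates, required_field, year):
    """Keep candidates with the given background who were active that year."""
    return [
        c for c in candidates
        if required_field in c.fields and c.active_from <= year <= c.active_to
    ]


# Usage with placeholder records:
people = [
    Candidate("General A", frozenset({"medicine"}), 1950, 1965),
    Candidate("General B", frozenset({"logistics"}), 1950, 1965),
]
shortlist = narrow_down(people, "medicine", 1960)
```

This only shrinks the list of possibilities. It can't pick the right name unless the shortlist happens to come down to a single person.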
 

Regal

Well-Known Traveler
Joined
Nov 20, 2022
Messages
340
Reaction score
1,217
Awards
111
Technically possible, maybe, but no way to confirm accuracy. The training dataset required to do this right is more than the documents. You would probably need to give it a very complete dataset of world events and history for it to make some articulate guesses. Probably some additional datasets I'm not thinking about too. It takes a lot of data to make an educated guess.

The better AI/ML use case would be to provide summaries of declassified documents. One of the hurdles of the conspiracy community (or whatever) is manpower. Being able to get the valuable data out of a 300-page document in minutes instead of a week is the biggest immediate value of AI/ML in this scenario.
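
For a sense of what the simplest version of that looks like, here's a minimal sketch of an extractive summarizer that keeps the sentences whose words are most frequent in the document. A modern language model would do this very differently. The stopword list, the scoring rule, and the function name are assumptions for illustration only.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "was", "at", "by"}


def _content_words(text):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(_content_words(text))

    def score(sentence):
        words = _content_words(sentence)
        # Average document-wide frequency of the sentence's content words.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]
```

Picking sentences straight out of the document means the summary can't invent anything that isn't in the source, which matters when accuracy is the whole point.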
 

Sinthôme

Golem Aglow
Joined
Apr 8, 2022
Messages
31
Reaction score
71
Awards
14
Like others said, it's totally contingent on the sensitivity of the field at hand and the lacunae being patched over in the intelligence, as well as on how the output would be tested and confirmed as accurate or not. In other words, whether training on a data set like this would "work" depends on the stakes of getting it right and what it's applied to.
 

Taleisin

Lab-coat Illuminatus
Bronze
Joined
Nov 8, 2021
Messages
636
Reaction score
3,316
Awards
213
The CIA reading room has enough documents to train something like this, especially if you also fed in some WikiLeaks-type stuff.
 