Idea: Using AI to eradicate the private data companies have on you.

2    20 Aug 2016 15:53 by u/roznak

It could work like this:

  • Use Google/Bing/Amazon/Cortana/Siri... searches to retrieve what companies know about you.
  • Feed that to an AI that learns
  • Make the AI execute Google/Bing/Amazon/Cortana/Siri searches that make the original data less accurate.

This could be coded into a cool app and left running as a continuous process.
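As a rough sketch of that loop, assuming some `submit_search` callable that talks to a search engine; all topic names and templates below are invented for illustration:

```python
import random

# Hypothetical decoy topics and templates -- none of these come from the post.
DECOY_TOPICS = ["gardening", "vintage cars", "opera", "carpentry", "bird watching"]
TEMPLATES = ["best {} tips", "{} for beginners", "history of {}"]

def make_decoy_query(rng: random.Random) -> str:
    """Build one plausible-looking search about an interest you don't have."""
    return rng.choice(TEMPLATES).format(rng.choice(DECOY_TOPICS))

def run_noise_loop(submit_search, iterations: int, rng: random.Random) -> list[str]:
    """Submit decoy searches; `submit_search` stands in for whatever API
    actually talks to Google/Bing/etc.  Returns the queries it sent."""
    sent = []
    for _ in range(iterations):
        query = make_decoy_query(rng)
        submit_search(query)   # a real app would also sleep a randomized
        sent.append(query)     # interval here to look human
    return sent
```

A real version would of course hit live services; the stand-in callable just keeps the sketch self-contained.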

14 comments

1

Actually that's fucking brilliant. It's like having another you... swimming through the internet looking at stuff, so services wouldn't know your true interests. I mean, it's creepy: I made choices inside my own brain and someone is selling them for profit.

I think I had a similar idea but possibly for metadata on services you can't escape: https://voat.co/v/CodeRequest/comments/969369

3

I think neural networks would be great for such a thing, since they can learn and adapt. It is a low-scale attack, so big companies won't notice that their data is slowly turning useless. It is not a massive flood of data but a very slow crawl. I think the AI could learn by trial and error, and learn to outsmart the big companies' AIs.

Maybe the AI could even generate a fake ID for you. Maybe convince Bing that you are Robert Redford.

Imagine that the AI also generates new photos and uploads them to new online profiles. The neural network AIs now in use don't really interpret images, but you can generate fake profile photos that would fool a face detection AI (but not a human).

The key here is that it is a slow process, to go unnoticed. Slow, gradual changes.
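That slow ramp-up could be sketched as a drift schedule: the share of fake activity grows linearly over months and then stays under a cap. The 180-day ramp and 40% cap below are invented numbers, not anything from the thread:

```python
def fake_fraction(day: int, ramp_days: int = 180, cap: float = 0.4) -> float:
    """Fraction of activity that should be fake on a given day.
    Ramps up linearly over `ramp_days` so the profile drifts slowly
    instead of flooding, then stays at `cap` to avoid detection."""
    return min(cap, cap * day / ramp_days)
```

The pacing function would then drive how many decoy actions the app schedules on any given day.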

0

I agree, like one day 'you' the AI posts a photo and a black lesbian is tagged with your name. Next month an old white lady. Etc.

1

This would also piss off the data miners behind Microsoft Edge. They would probably have to add captchas once they suspected an AI was using their browser.

0

But that is the whole point: how do you tell that an AI is behind the fake searches? The AI adapts to new situations.

1

You forgot Facebook and Amazon. Most data is now collected from your smartphone via behavior analysis (location, movement, etc.). That is hard to fake. You should also expect the data to be used by intelligence agencies. Deviations from the norm put you on a list.

0

"Deviations from the norm put you on a list." Good point, and considering this actually makes the solution easier.

How viable would it be to collect others' habits (say, view the most popular videos, etc.) and generate searches for those and related content?

It should be easier than coming up with a random "normal" search pattern, and it would be more likely to look normal because it's a copy.
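A sketch of that copying approach, assuming the popular queries come from some trending feed; the function and parameter names here are made up for illustration:

```python
import random

def blend_queries(real_history: list[str],
                  popular_queries: list[str],
                  fake_fraction: float,
                  rng: random.Random) -> list[str]:
    """Mix copies of crowd-popular queries into a search stream so the
    fake traffic mimics what 'normal' users already do.  In practice
    `popular_queries` would come from a trending-topics feed."""
    out = []
    for query in real_history:
        out.append(query)
        # Follow some real searches with a popular one, at the given rate.
        if rng.random() < fake_fraction:
            out.append(rng.choice(popular_queries))
    return out
```

Because the injected queries are literal copies of popular ones, they blend into the norm rather than standing out as statistical outliers.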

0

For your example of YouTube videos: links to videos are often shared, and I would expect that Google has already developed something like "propagation graphs" between Google accounts. There is a substantial amount of money behind YouTube views, and much effort is made to guarantee that the clicks are real.

Any manipulation should be plausible and match with your remaining data.

0

My AI would end up searching for child porn and land me in jail.

0

I've been thinking about this myself; however, I've been thinking along a different path.

Instead of removing your own data, why not 'pollute' the entire data stream? How? Perhaps this could take the form of a browser extension which you would activate, or set a frequency, whereby it makes random searches, goes to random sites, etc.

Or perhaps there are better ways to do this.

I just imagine that when social networking data 'customers' start finding that their data is shit, it's going to cause havoc.

0

Maybe we do not need to pollute everything, just enough that big data companies start to distrust the data.

0

In my vision it would be an extension you install, and you could set it to "pollute" the stream only when the computer isn't being used or actively browsed, etc.

The more bad data the better. What is this "just small enough" stuff? I want a SETI@home-style crowd-sourced marketing disinformation campaign.

One application might even be a screensaver rather than a browser plugin. It needs to be massive amounts of bad data, from massive amounts of people all over.
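The idle-time gating described in this thread could be sketched like this; the 300-second minimum, the action list, and all names are assumptions, not something from the posts:

```python
import random

def plan_session(idle_seconds: float, seconds_per_action: float,
                 actions: list[str], rng: random.Random,
                 min_idle: float = 300.0) -> list[str]:
    """Given how long the machine has been idle, pick as many random fake
    actions (searches, page visits) as fit in that window.  Below the
    idle threshold it stays quiet so noise never interleaves with real
    browsing."""
    if idle_seconds < min_idle:
        return []                                  # user may come back soon
    budget = int(idle_seconds // seconds_per_action)
    return [rng.choice(actions) for _ in range(budget)]
```

A screensaver or extension would call this each time the idle timer ticks over, then execute whatever the plan returns.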

0
  • Identify which browser functions are used to fingerprint / gather data.
  • Write a DLL that lets you inject your own versions of each of the identified functions.
  • Feed your DLL functions data created by a simple Markov chain, so it looks like real data but is nonsense as it relates to your data profile.
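The Markov chain in the last step could be as simple as this word-level sketch; the function names are illustrative, and a real DLL would feed the generated strings into the spoofed browser calls:

```python
import random
from collections import defaultdict

def build_chain(corpus: str) -> dict:
    """Map each word to the list of words that follow it in the text."""
    words = corpus.split()
    chain = defaultdict(list)
    for a, b in zip(words, words[1:]):
        chain[a].append(b)
    return chain

def generate(chain: dict, start: str, length: int, rng: random.Random) -> str:
    """Walk the chain to produce plausible-looking but meaningless text,
    which the injected functions would return instead of real data."""
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break                      # dead end: no recorded follower
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)
```

Trained on a few pages of ordinary text, the output is locally plausible word-to-word but carries no real information about you.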