v/programming: It would be cool if someone built a system/protocol that could reduce jpeg transmission sizes by comparing parts of image to a local catalog.

5 04 Dec 2015 01:18 by u/TheRealTruth

Just an idea, but instead of having to upload a 2 MB JPEG, maybe you could just upload a short text string that tells some kind of image decoder where to look in a catalog, so it can reconstruct a JPEG that looks pretty much the same.

Edit: or that pi thing someone posted a while back, where you can find any number in pi. I wonder if it takes longer to find an index, or enough indexes, in pi than it does to send the file.

5 comments

4 u/scorinth 04 Dec 2015 04:11

"comparing parts of image to a local catalog"

This is essentially how JPEG compression already works. It breaks the image into 8x8 tiles and then figures out how to recreate each tile by mixing together a selection of patterns from a fixed local catalog (the 64 cosine patterns of the discrete cosine transform).
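For anyone who would rather read code than watch a video, here is a rough sketch of the "mix patterns from a catalog" step in plain Python. It builds the 64 cosine patterns, measures how much of each is in a tile, and rebuilds the tile from only a few of them. The names (`CATALOG`, `analyze`, `synthesize`, `keep_largest`) and the sample gradient tile are made up for illustration, and keeping the largest coefficients is a simplification: real JPEG rounds the coefficients with a quantization table and then entropy-codes them.

```python
import math

N = 8  # JPEG works on 8x8 tiles


def basis(u, v):
    """One fixed cosine pattern (an orthonormal 2-D DCT-II basis function)."""
    cu = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
    cv = math.sqrt(1 / N) if v == 0 else math.sqrt(2 / N)
    return [[cu * cv
             * math.cos((2 * x + 1) * u * math.pi / (2 * N))
             * math.cos((2 * y + 1) * v * math.pi / (2 * N))
             for y in range(N)] for x in range(N)]


# The "local catalog": 64 patterns that every decoder can regenerate itself.
CATALOG = {(u, v): basis(u, v) for u in range(N) for v in range(N)}


def analyze(tile):
    """Forward transform: how much of each catalog pattern is in the tile."""
    return {uv: sum(tile[x][y] * pat[x][y] for x in range(N) for y in range(N))
            for uv, pat in CATALOG.items()}


def synthesize(coeffs):
    """Inverse transform: rebuild a tile as a weighted mix of patterns."""
    out = [[0.0] * N for _ in range(N)]
    for uv, c in coeffs.items():
        pat = CATALOG[uv]
        for x in range(N):
            for y in range(N):
                out[x][y] += c * pat[x][y]
    return out


def keep_largest(coeffs, k):
    """Crude stand-in for quantization: keep only the k strongest patterns."""
    top = sorted(coeffs, key=lambda uv: abs(coeffs[uv]), reverse=True)[:k]
    return {uv: coeffs[uv] for uv in top}


# Hypothetical sample tile: a smooth gradient, typical of photo content.
tile = [[16 * x + 2 * y for y in range(N)] for x in range(N)]
coeffs = analyze(tile)
approx = synthesize(keep_largest(coeffs, 6))
max_err = max(abs(tile[x][y] - approx[x][y])
              for x in range(N) for y in range(N))
print("largest pixel error using 6 of 64 patterns:", round(max_err, 2))
```

Using all 64 coefficients gives the tile back exactly (up to floating-point error), so the savings come entirely from throwing away or coarsening the weak ones.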

Here's a video that breaks it down. It's pretty good if you can keep your eyes from glazing over.

Sorry to be the bearer of bad news.

1 u/Aussiesurvivor 04 Dec 2015 01:55

That's a pretty cool idea. I'd imagine it'd be possible in the same way that your phone calls do something similar: if a part of the transmission is lost, it uses an algorithm to calculate what the missing part was and fills it in. Similar to a sudoku type thing. /somethingalongthelinesof

0 u/qzx 04 Dec 2015 03:21

There's a buttload of possible/probable JPEGs, so you would need a very, very large catalog, and you'd have to have that catalog on every computer that would look up your string index. If every computer had to download the catalog, which would probably be terabytes or larger in size, that would defeat the purpose.
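To put rough numbers on that, here is a back-of-the-envelope sketch. The 8x8 grayscale tile and the 1 TB catalog size are assumptions picked for illustration, and `index_bits` is a hypothetical helper name.

```python
def index_bits(catalog_entries: int) -> int:
    """Bits needed to address one entry in a catalog of the given size."""
    return max(1, (catalog_entries - 1).bit_length())


TILE_BYTES = 8 * 8                     # assumed: 8x8 tile, 1 byte per pixel
all_tiles = 256 ** TILE_BYTES          # every possible tile of that size
catalog_1tb = 10 ** 12 // TILE_BYTES   # tiles that fit in an assumed 1 TB catalog

print("bits to send a tile raw:            ", TILE_BYTES * 8)
print("bits to index a complete catalog:   ", index_bits(all_tiles))
print("bits to index a 1 TB catalog:       ", index_bits(catalog_1tb))
print("tiles in a 1 TB catalog:            ", catalog_1tb)
```

A catalog that contains every possible tile needs an index exactly as long as the tile itself, so nothing is saved. A catalog small enough to store has short indexes but covers only a vanishing fraction of possible tiles, so most lookups would have to settle for an approximate match.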

0 u/tame 04 Dec 2015 03:23

Sounds like a cool idea, basically use your existing web cache as a giant lookup table. The tricky bit is how you let the web server know what images you've got cached, so it can say which ones to use as parts of the new image.
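One way that negotiation could look, as a minimal sketch only: the client advertises short hashes of the blocks it has cached, and the server swaps any matching block for a reference. Everything here (`BLOCK`, `digest`, `encode`, `decode`, the 8-byte truncated hash, the fake image bytes) is a hypothetical design for illustration, not an existing protocol, and it only handles exact matches on fixed block boundaries.

```python
import hashlib

BLOCK = 64  # bytes per block; an arbitrary choice for this sketch


def digest(block: bytes) -> bytes:
    """Short content hash used as the block's name (truncated SHA-256)."""
    return hashlib.sha256(block).digest()[:8]


def blocks(data: bytes):
    """Split data into fixed-size blocks."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def encode(data: bytes, client_has: set):
    """Server side: send a reference for any block the client already holds."""
    message = []
    for b in blocks(data):
        d = digest(b)
        message.append(("ref", d) if d in client_has else ("raw", b))
    return message


def decode(message, cache: dict) -> bytes:
    """Client side: resolve references from the local cache."""
    return b"".join(cache[payload] if kind == "ref" else payload
                    for kind, payload in message)


# Stand-in data: the new "image" shares its first half with a cached one.
cached_image = bytes(range(256)) * 2
new_image = cached_image[:256] + b"\x00" * 256

cache = {digest(b): b for b in blocks(cached_image)}
message = encode(new_image, set(cache))
sent = sum(len(payload) for _, payload in message)
print("bytes sent:", sent, "of", len(new_image))
```

Note that the client's list of digests has to travel to the server first, and with a large cache that list can easily outweigh the savings on a single image.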

Also, if you allow approximate matches (to improve compression ratio), you risk your images being subtly doctored by the algorithm, as happened with Xerox scanners a couple of years ago.

Technically you may be able to find any bit string in the digits of pi (has this been proven? "it's infinite and doesn't repeat" doesn't necessarily prove it) but in general the index saying which digit the bit string starts at would take more bits than just storing the image directly.
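You can try this yourself with a small sketch. It computes decimals of pi with Machin's formula using only integer arithmetic, then looks up where a few digit strings first appear and how many digits the index takes. The function names and the sample targets are made up for illustration, and 20,000 decimals is an arbitrary cutoff, so longer strings will often not be found at all.

```python
def arctan_inv(x: int, one: int) -> int:
    """arctan(1/x) scaled by `one`, via the alternating Taylor series."""
    power = one // x            # one / x**(2j+1), starting at j = 0
    total = power
    x2 = x * x
    k = 3
    sign = -1
    while power:
        power //= x2
        total += sign * (power // k)
        sign = -sign
        k += 2
    return total


def pi_decimals(n: int, guard: int = 10) -> str:
    """First n decimals of pi (after the '3.') as a string.

    Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239).
    The guard digits absorb the rounding error of the integer divisions.
    """
    one = 10 ** (n + guard)
    scaled = 16 * arctan_inv(5, one) - 4 * arctan_inv(239, one)
    return str(scaled // 10 ** guard)[1:n + 1]


def index_cost(target: str, digits: str):
    """Return (index, number of digits in the index), or None if absent."""
    i = digits.find(target)
    return None if i < 0 else (i, len(str(i)))


digits = pi_decimals(20000)
for target in ["7", "42", "2015", "041215"]:   # arbitrary sample strings
    print(target, "->", index_cost(target, digits))
```

If the digits behave like random ones (which is the unproven part), a k-digit string first shows up roughly 10^k digits in, so writing down the index costs about k digits, the same as writing the string itself. Occasionally you get lucky and the index is shorter, but just as often it is longer.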

0 u/TheRealTruth [OP] 04 Dec 2015 05:24

Yeah, someone already wrote it. It's on GitHub somewhere.