Estimate the amount of incompressible data on Linux
Run the following command on your Linux system to find out how many JPEGs, GIFs, zipped files and similar poorly compressible files it holds.
# find / \( -fstype iso9660 -o -fstype proc -o -fstype sysfs -o -fstype nfs -o -type s \) -prune -o -type f -exec ls -l {} \; | egrep -e '\.gif$|\.mov$|\.jpe?g$|\.zip$|\.gz$|\.Z$' > /var/tmp/find.out
#
You can add further extensions, for example other movie formats, to the egrep pattern. Enter the command on a single line. Next, calculate the amount of data (in MiB) occupied by the files listed in the find output:
# awk '{ s += $5 } END { printf ("%d\n", s/1024/1024) }' /var/tmp/find.out
137225
#
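To see which file types contribute most to that total, the same find output can be grouped by extension. The following is only a sketch: the function name size_by_ext is hypothetical, and it assumes the `ls -l` layout used above, with the size in field 5 and the path in the last field, so file names containing spaces or names without a dot are not handled cleanly.

```shell
# Hypothetical helper: per-extension totals (in MiB) from `ls -l` style lines.
size_by_ext() {
    awk '{
        n = split($NF, parts, ".")   # extension = text after the last dot
        sum[parts[n]] += $5          # field 5 of `ls -l` is the size in bytes
    }
    END {
        for (e in sum) printf "%s %d\n", e, sum[e] / 1024 / 1024
    }' "$1" | sort -k2,2nr           # largest total first
}
```

It would be called as `size_by_ext /var/tmp/find.out`.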
Count the matching files:
# wc -l /var/tmp/find.out
548319 /var/tmp/find.out
#
So the number of poorly compressible (or incompressible) files on this system is around 550,000, and the disk space used by those files is around 134 GiB (137225 MiB) in this example.
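Both figures can also be obtained in a single pass without the intermediate file. The sketch below is one possible variant, not the procedure used for the numbers above: the function name sum_incompressible is hypothetical, it assumes GNU find (for -printf), and it matches extensions with -name instead of egrep.

```shell
# Hypothetical helper: count matching files and sum their sizes in one pass.
# Assumes GNU find (-printf) and the extension list used in this article.
sum_incompressible() {
    find "${1:-/}" \( -fstype iso9660 -o -fstype proc -o -fstype sysfs -o -fstype nfs \) -prune -o \
        -type f \( -name '*.gif' -o -name '*.mov' -o -name '*.jpeg' -o -name '*.zip' \
                   -o -name '*.gz' -o -name '*.Z' \) -printf '%s\n' |
    # One size in bytes per line: add them up and count the lines.
    awk '{ s += $1; n++ } END { printf "%d files, %d MiB\n", n, s / 1024 / 1024 }'
}
```

Called without an argument it scans from /; a directory can be passed to restrict the scan, for example `sum_incompressible /home`.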