
SHA-256 hash algorithm weakness, a n00b in cryptography question

Hello guys, as you may know, one of my passions is probability/entropy calculation (see this topic).

Now I had a cool idea (in my opinion): to try to understand how the SHA-256 algorithm impacts the entropy of a good-quality long string (63 characters long, 32 bit if I'm not mistaken). In plainer English: I took a 63-character string, hashed it with SHA-256, and tried to see how this affects the entropy of the output string (the SHA-256 hash).

To do that I needed to see which characters SHA-256 maps to, and saw that it uses the letters a-h and the digits 0-9, so only 18 characters in total. In theory, I've read that SHA-256 has 2^256 possible outputs, but this raises an eyebrow for me, as I don't see how 2^256 different variants are possible out of only 18 different characters.

Anyway, I took an example string, hashed it, and compared the entropy of the original string with the entropy of the hashed string. The result was not very flattering for this hashing algorithm, and I'll disclose my result and conclusion.

To summarize: a good-quality 63-character string with high entropy produced a 64-character SHA-256 string with crippled entropy (meaning many characters were repeated excessively and others did not occur at all, lowering the quality). To put it into perspective, the entropy decreased from the initial 10^77 to 10^26. That is comparable to a high-quality string of only 14 characters (a-Z, 0-9, special characters).
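To compare these figures on one scale, it can help to convert a count of possibilities into bits with log2. The sketch below is only arithmetic on the numbers quoted in this thread; the 94-character alphabet is an assumption taken from the 32 symbols + 26 a-z + 26 A-Z + 10 digits mentioned later in the topic.

```python
import math


def bits(count):
    """Return log2(count): the bits needed to index `count` equally likely outcomes."""
    return math.log2(count)


# Figures quoted in the post (orders of magnitude only).
print(f"10^77 possibilities ~ {bits(1e77):.1f} bits")
print(f"10^26 possibilities ~ {bits(1e26):.1f} bits")

# Size of the SHA-256 output space.
print(f"2^256 possibilities = {bits(2**256):.1f} bits")

# A 14-character string over an assumed 94-character alphabet.
print(f"94^14 possibilities ~ {bits(94**14):.1f} bits")
```

Working in bits makes it easier to see that 10^77 is roughly the size of the whole SHA-256 output space, while 10^26 is in the same range as a 13 or 14 character string over that alphabet.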

So in conclusion, I see no benefit in using a super long, complex string, since almost everything gets chopped by the SHA-256 algorithm anyway. The only benefit is when the initial string is not that complex; in that case the algorithm does scramble things up, with a positive impact. So the question is: am I missing something, or is it really as bad as my result suggests? I'm going to add my calculations, in case I haven't missed anything major about this hashing algorithm so far.

Thanks to everyone who took the time to read about my current 'paradigm' ;) ... my current thing ;D




Re: SHA-256 hash algorithm weakness, a n00b in cryptography question

Reply #1
But if the output is a RANDOM sequence of a-h and 0-9, then surely it doesn't matter whether characters are repeated or unused? Each position can be any character, but only one of them, and until the string is guessed you don't know whether some are unused. A simple example with the characters a and b:
ab ba aa bb - if you say all characters must be used and none repeated, you lose half the possibilities. Isn't the repeated-character rule for password security due more to the human tendency to pick something easily remembered than to mathematical probability? "password" is a rubbish password because we know it's a word; "ejdycolm" might be better, but only because it isn't a word, yet both are 8-letter a-z passwords.
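The a/b example can be enumerated directly. This is a minimal sketch using only the two-letter alphabet from the example above; the variable names are illustrative.

```python
from itertools import permutations, product

alphabet = "ab"

# Every 2-character string over the alphabet (repeats allowed).
all_strings = ["".join(p) for p in product(alphabet, repeat=2)]

# Only strings that use each character at most once.
no_repeats = ["".join(p) for p in permutations(alphabet, 2)]

print(all_strings)  # ['aa', 'ab', 'ba', 'bb']
print(no_repeats)   # ['ab', 'ba']
print(len(no_repeats) / len(all_strings))  # 0.5
```

Forbidding repeats shrinks the search space from 4 strings to 2, so a guesser who knows the rule has less work to do.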
If you study how hashing and encryption methods work internally, they do things like complex sums, delete half the result, repeat, repeat again, add in some "salt" to mix with the original password, perhaps some random noise, more sums, partial deletions and so on. Crackers work by guessing an original password, going through the options (possibly trying likely ones and their variants first) and calculating its hash to see if it matches. There are often several possible passwords that create the same hash, as I mentioned before, so even your very secure original string might have the same hash as something simpler.

Re: SHA-256 hash algorithm weakness, a n00b in cryptography question

Reply #2
Yeah, thinking twice, there's no need for good entropy in the output as long as the hash can't be used to reverse-engineer and discover the secret key. So even if the output entropy is low, that doesn't mean the seed can be discovered easily by any means. Of course there might be collisions, meaning two different strings give the same hash, but that must be pretty rare.
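How rare "pretty rare" is can be estimated with a birthday-style bound. The sketch below assumes that SHA-256 outputs behave like independent, uniformly random 256-bit values, which is a modelling assumption and not something established in this thread; the function name and the 10^18 figure are made up for illustration.

```python
from fractions import Fraction


def collision_probability_upper_bound(n_hashes, bits=256):
    """Union bound: P(any two of n hashes collide) <= n*(n-1)/2 / 2**bits."""
    pairs = n_hashes * (n_hashes - 1) // 2
    return Fraction(pairs, 2**bits)


# Hypothetical workload: a billion billion distinct inputs hashed.
p = collision_probability_upper_bound(10**18)
print(float(p))
```

Under that assumption the chance of even one accidental collision stays far below 10^-40 for this workload.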

Re: SHA-256 hash algorithm weakness, a n00b in cryptography question

Reply #3
Actually, reading a bit more on the topic shows I should do more reading! Plus I should have paid much more attention in maths classes ... ;D Each character of the sha256sum output is a hexadecimal digit, representing 4 bits or a "nibble". So 64 nibbles = 256 bits, each of which can be 0 or 1, hence 2^256 possible results, which is the same value as 16^64. Collisions are in fact unlikely, contrary to what I said before, as this is a very big number at current processing speeds.
I think the point here is that often when you create a sha256 hash you take a simple string, such as a short password, and obfuscate it to make it more secure. However, if you start with something of similar or greater complexity than the output hash, you end up with something simpler. For example, a sha256 checksum for a software tarball is, for practical purposes, unique to the file but much shorter.
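The hexadecimal point is easy to confirm with Python's standard `hashlib`. This is a small sketch; the input bytes are an arbitrary placeholder and not the string discussed in this topic.

```python
import hashlib

data = b"example input"  # placeholder input, chosen only for illustration

raw = hashlib.sha256(data).digest()         # raw bytes
hex_text = hashlib.sha256(data).hexdigest()  # same value written as hex digits

print(len(raw))       # 32 bytes = 256 bits
print(len(hex_text))  # 64 hex digits, 4 bits each
print(set(hex_text) <= set("0123456789abcdef"))  # True: only 0-9 and a-f
print(16**64 == 2**256)                          # True
```

The digest is always 32 bytes; the 64-character form is just a way of printing it.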

Re: SHA-256 hash algorithm weakness, a n00b in cryptography question

Reply #4
Ah, so I got it wrong; that's why I did not see any g or h occurring. So yeah, it's a hexadecimal format. Now that I know that, it's clear why 16^64 = 2^256.

G and H must be used by the algorithm only temporarily or whatever, but the final hash is hex. Thanks for the tip :D

Re: SHA-256 hash algorithm weakness, a n00b in cryptography question

Reply #5

So this is how it looks. From the entropy point of view, the SHA-256 hash algorithm downgrades (or encodes, however you want to put it) the input.

From 4.7*10^77 it chops it down to around 2.9*10^23.

Sure, the entropy itself is not very important, but what it may reveal is that the number of collisions could be much higher than we might expect, and that would be pretty horrible for any cryptography that claims to be widely used.





Re: SHA-256 hash algorithm weakness, a n00b in cryptography question

Reply #6
Perhaps I made a mistake typing it out, because you posted an image so I couldn't copy and paste it, but when I tried:
`D+"]nKE_rp_IhW1.=<:,1%NM&'#GbPiy\L$Jcz*)gF![jvwU9{V/0Q(;5tRXxe
on that website I got:
2529809ea0be4e5b45b9f56cbe2aab0e30f2953c37d350ef524ce5d9da886585
and then I could reproduce that result on the desktop by:
Code:
$ nano -L testsha256sum.txt
(paste in the same string, save and close)
$ sha256sum testsha256sum.txt
It gave the same result. (nano -L is required so that no newline is written at the end of the file; it also proved difficult to escape that string on the command line to echo it and pipe it to sha256sum.)
I'm not sure sha256 is intended to create high-entropy output, and wasn't your string deliberately made to avoid character duplication and repetition? If you think the output isn't truly random and has some pattern, I suppose you might find some weakness though!
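The trailing-newline detail matters because the hash covers every byte of the input. A minimal sketch with Python's `hashlib`, using a placeholder string instead of the one from the post to avoid shell-escaping issues:

```python
import hashlib

text = "example string"  # placeholder, not the string from this topic

# Hash of the text alone, as saved by an editor that adds no final newline.
without_newline = hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hash of the same text followed by the newline most editors and `echo` append.
with_newline = hashlib.sha256((text + "\n").encode("utf-8")).hexdigest()

print(without_newline)
print(with_newline)
print(without_newline == with_newline)  # False
```

Hashing the string inside a script like this also sidesteps the quoting problems of passing it through the shell.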

Re: SHA-256 hash algorithm weakness, a n00b in cryptography question

Reply #7
Yeah, my bad. This is the string in the example:

`D+"]nKE_rp_IhWl.=<:,1%NM&'#GbPiy\L$Jcz*)gF![jvwU9{V/OQ(;5tRXxe

and hashed

13d31ba2ee6e46601129986fd3598638fa5ebad8bb0d27079e8bbea963a14bb2

(Your string is slightly different: you wrote hW1 instead of hWl, and a zero instead of a capital O, i.e. /0Q instead of /OQ, somewhere near the end.)
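The two transcriptions can be compared character by character. The sketch below copies both strings as they appear in this thread (raw triple-quoted literals avoid escaping the quotes and the backslash) and also hashes them to show that two look-alike characters are enough to give unrelated digests. It makes no claim about which digest each string should produce.

```python
import hashlib

# Strings copied from the two posts above.
typed = r'''`D+"]nKE_rp_IhW1.=<:,1%NM&'#GbPiy\L$Jcz*)gF![jvwU9{V/0Q(;5tRXxe'''
posted = r'''`D+"]nKE_rp_IhWl.=<:,1%NM&'#GbPiy\L$Jcz*)gF![jvwU9{V/OQ(;5tRXxe'''

# Zero-based positions where the two strings differ.
diffs = [(i, a, b) for i, (a, b) in enumerate(zip(typed, posted)) if a != b]
print(diffs)  # [(15, '1', 'l'), (53, '0', 'O')]

digest_typed = hashlib.sha256(typed.encode("utf-8")).hexdigest()
digest_posted = hashlib.sha256(posted.encode("utf-8")).hexdigest()

# Count hex positions that happen to agree between the two digests.
same_positions = sum(x == y for x, y in zip(digest_typed, digest_posted))
print(digest_typed)
print(digest_posted)
print(same_positions, "of 64 hex positions match")
```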


My initial string has entropy on steroids; you would never, or only very rarely, get such a string through true/pure random generation, even from cosmic background noise. I've personally learned that random things in general offer a satisfactory level of entropy, but they fall pretty far short of a specially crafted, engineered string (so to speak).

The initial string has only 24|18|18|3, meaning 24 symbols, 18 a-z, 18 A-Z and 3 digits (0-9).

This proportion is the golden ratio that gives the highest number of arrangements, bearing in mind that there are 32 symbols, 26 a-z, 26 A-Z and 10 digits (0-9).

To craft it a bit more, I added a bit of 'salt', meaning it no longer fits the pattern of a 63-character string with no repetition: I randomly chose one symbol, the underscore (_), to repeat, just so the string belongs to the arrangements with repetition and not only to plain arrangements. Arrangements with repetition are more numerous but have a huge downside: repetition shrinks entropy. So I chose the case where entropy suffers least, as the price of blending into the bigger pile of arrangements with repetition.

The SHA-256 hash algorithm is not bad in terms of random entropy, which, as I already said, is not that great either.

You can see that the letter c does not show up at all, while b occurs 8 times (excessively), and e and 6 are also excessive at 6 times each, vastly lowering the entropy of the final hash. Several other characters also occur 5 times, which is again a no-no in terms of entropy.
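These counts can be reproduced by tallying the hex digits of the digest posted above. A minimal sketch with `collections.Counter`; the digest is copied verbatim from this reply.

```python
from collections import Counter

HEX_DIGITS = "0123456789abcdef"
posted_digest = "13d31ba2ee6e46601129986fd3598638fa5ebad8bb0d27079e8bbea963a14bb2"

counts = Counter(posted_digest)

# Print how often each hex digit appears (0 for digits that never occur).
for digit in HEX_DIGITS:
    print(digit, counts[digit])

missing = [d for d in HEX_DIGITS if counts[d] == 0]
print("missing:", missing)  # ['c']
```

With 64 positions and 16 possible digits, the average is 4 occurrences per digit, so counts of 8 and 0 are the two extremes in this particular digest.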

Imagine how many hashed strings have characters repeating 8/6/5 times: very few compared with the others. Those strings stand out from the bunch, making them much easier to discover.
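How common such digests are can be estimated, as a rough check, under the assumption that each of the 64 hex digits is uniform and independent. That is the usual idealised model of a hash output and is an assumption here, not something demonstrated in this thread; the function names are illustrative.

```python
from math import comb


def prob_some_digit_missing(length=64, alphabet=16):
    """P(at least one symbol never appears), by inclusion-exclusion."""
    return sum(
        (-1) ** (k + 1) * comb(alphabet, k) * ((alphabet - k) / alphabet) ** length
        for k in range(1, alphabet)
    )


def prob_digit_at_least(times, length=64, alphabet=16):
    """P(one chosen symbol appears at least `times` times), binomial tail."""
    p = 1 / alphabet
    return sum(
        comb(length, j) * p**j * (1 - p) ** (length - j)
        for j in range(times, length + 1)
    )


print(f"some digit missing:        {prob_some_digit_missing():.3f}")
print(f"a given digit 8+ times:    {prob_digit_at_least(8):.3f}")
```

Under that model, roughly a quarter of all 64-digit hex strings lack at least one digit, so a missing digit alone does not single a digest out.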

So what can be done is to write a script that assesses not only the initial string but also the hashed string, and decides whether there's any need to generate another one to meet higher standards. My educated guess is that if the final hash has good entropy, there is little chance that the hash has any collisions, and therefore better security/privacy.


Re: SHA-256 hash algorithm weakness, a n00b in cryptography question

Reply #8
I'm going to demonstrate why random alone is not good enough and why there's a need for what I would call smart random, or targeted randomness. Again, I have to point out that cryptography is more complicated, but I'll stick to the probability aspects, which I like to think I have mastered far better. :)

 

Re: SHA-256 hash algorithm weakness, a n00b in cryptography question

Reply #9
It took me a lot of time to choose a good example of why it is ridiculous to choose a 100% randomly generated password. Since this is a bigger, or better said more important, subject, I'll open a separate topic on it and include my mathematical calculation of why choosing a random password is pretty dumb and why we should stop doing that to ourselves. This will of course be received with anger by many with 'random supremacy' in their little minds. Strong claims demand strong proofs, and that's exactly what I'll be providing.

See this topic https://forum.artixlinux.org/index.php/topic,4690.new.html#new