Topic: [SOLVED] count with python/awk or other script/command

Re: [SOLVED] count with python/awk or other script/command

Reply #30
And a last question: is this good enough to escape those problematic symbols, like I did below?
Code: [Select]
"!\"#$%&\'()*+,-./:;<=>?@[\]^_`{|}~"
It looks correct. In case you don't know, another way in Python to define strings that contain both single and double quotes is to use triple quotes.
Then no escaping of the contained quotes is needed.
Code: [Select]
'''!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~'''
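A quick way to check this yourself (a small sketch, not taken from the replies above): both spellings should give the same 32 characters, and those happen to be the same characters as `string.punctuation` in the standard library. One caveat, as far as I know: a lone backslash before `]` is not a recognised escape sequence, so newer Python versions warn about it. The sketch avoids that by writing `\\` in the first string and using a raw string (the `r` prefix) for the second.

```python
import string

# Escaped form inside double quotes; the backslash itself is written as \\.
escaped = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"

# Raw triple-quoted form: no escaping of the quotes or of the backslash.
triple = r'''!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~'''

same_text = escaped == triple
matches_stdlib = triple == string.punctuation
print(same_text, matches_stdlib, len(triple))
```

If both comparisons print `True`, either spelling (or simply `string.punctuation`) can be used for the symbols group.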

Re: [SOLVED] count with python/awk or other script/command

Reply #31



I think the script as it is now is restricted by these two lines:
Code: [Select]
        for letter in line.lower():
            if letter.isalpha():

I think I need to find something else instead of line.lower() and letter.isalpha().

Do you think isascii would fix the problem?




You're right.
The lower() method converts every letter to lowercase, so uppercase letters would never be counted separately; you can just delete it.

isascii() might not be the solution because, if I remember correctly, it also accepts some control characters (like the end-of-line '\n').
I think you should use
Code: [Select]
isprintable()
instead, but it will count a space as a character, so if you have spaces in your data and want to get rid of them, add a check for that:
Code: [Select]
and not letter.isspace()
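To see how these three checks differ, here is a small illustrative sketch (the sample characters and the helper name `keep` are made up for the demonstration, they are not from the script):

```python
samples = ["a", "Z", "7", "#", " ", "\n", "\t", "é"]

for ch in samples:
    # repr() makes control characters such as '\n' visible in the output.
    print(repr(ch), ch.isascii(), ch.isprintable(), ch.isspace())


def keep(ch):
    """Filter discussed above: printable and not whitespace."""
    return ch.isprintable() and not ch.isspace()


kept = [ch for ch in "aB 3\t#\n" if keep(ch)]
print(kept)
```

The newline passes isascii() but fails isprintable(), while a plain space passes isprintable(), which is why the extra isspace() check is needed.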

And in this case I personally would use triple quotes as suggested above, but both methods should work.

Code: [Select]
with open("fin.txt") as file:
    line_list = []
    line_count = 0
    for line in file:
        # Keep every printable, non-whitespace character of the line.
        letter_list = []
        for letter in line:
            if letter.isprintable() and not letter.isspace():
                letter_list.append(letter)
        if len(letter_list) != 0:
            # Split the characters into chunks ("words") of 6.
            words = []
            for x in range(0, len(letter_list), 6):
                word = letter_list[x: x + 6]
                words.append(word)
            line_list.append(words)
        line_count += 1

# Character groups: symbols, uppercase letters, lowercase letters, digits.
groups = ["""!"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~""", "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz", "0123456789"]
result = []

for la in line_list:
    gr = {}
    for word in la:
        word.sort()
        counter = ""
        for group in groups:
            # Count how many characters of this word belong to the group.
            count = 0
            for n in range(len(group)):
                count += word.count(group[n])
            if count != 0:
                counter += str(count)
        gr["".join(word)] = "".join(sorted(counter, reverse=True))
    result.append(gr)

for value in result:
    for word, score in value.items():
        print(f"{word} {score}", end="\t")
    print()
print(f"Line count = {line_count}")

I cleaned up the output a bit for better readability.

Re: [SOLVED] count with python/awk or other script/command

Reply #32

Awesome! Works like a charm! Yes, triple quotes seem to be the cleanest way. The rest also works flawlessly 🎯

I'm starting to like Python even more.

 Tnx a lot!!!

 ✌🏻


Re: [SOLVED] count with python/awk or other script/command

Reply #33

Found a small bug. In some cases the script reads and counts correctly, but when it sorts the counts in descending order it gives a wrong result.

So it counts 12_2_2, but when it tries to sort that in descending order it gives 2_2_2_1 (see the image). How can this be fixed? Another thing, even if minor: it counts OK, but it changes the original string.
So for `1l,*}:{)k;%{?5( it turns it into %()*,15:;?`kl{{}
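Both symptoms can be reproduced in isolation. The following is only a sketch with made-up variable names, not the script itself: gluing the counts into one string and sorting its characters splits 12 into '1' and '2', and list.sort() reorders the list it is called on, whereas sorted() returns a new list.

```python
counts = [12, 2, 2]

# Counts glued into one string, then the individual characters are sorted.
as_text = "".join(str(c) for c in counts)              # "1222"
sorted_chars = "".join(sorted(as_text, reverse=True))

# Sorting the integers themselves keeps 12 in one piece.
sorted_ints = sorted(counts, reverse=True)

# sorted() leaves the word untouched; word.sort() would reorder it in place.
word = list("`1l,*}")
original = "".join(word)
copy_sorted = "".join(sorted(word))

print(sorted_chars, sorted_ints, original, copy_sorted)
```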




Re: [SOLVED] count with python/awk or other script/command

Reply #34
Try this:
Code: [Select]
with open("fin.txt") as file:
    line_list = []
    line_count = 0
    for line in file:
        # Keep every printable, non-whitespace character of the line.
        letter_list = []
        for letter in line:
            if letter.isprintable() and not letter.isspace():
                letter_list.append(letter)
        if len(letter_list) != 0:
            # Split the characters into chunks ("words") of 16.
            words = []
            for x in range(0, len(letter_list), 16):
                word = letter_list[x: x + 16]
                words.append(word)
            line_list.append(words)
        line_count += 1


# Character groups: symbols, uppercase letters, lowercase letters, digits.
groups = [
    """!"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~""",
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "abcdefghijklmnopqrstuvwxyz",
    "0123456789",
]
result = []

for la in line_list:
    gr = {}
    for word in la:
        # word.sort() is disabled so the word keeps its original order.
        counter = []
        for group in groups:
            count = 0
            for n in range(len(group)):
                count += word.count(group[n])
            # Store every count as an int, zeros included.
            counter.append(count)
        # Sorting ints (not digit characters) keeps counts above 9 intact.
        gr["".join(word)] = sorted(counter, reverse=True)
    result.append(gr)

for value in result:
    for word, score in value.items():
        print(word, end=" ")
        for number in score:
            print(number, end=" ")
        print("\t", end="")
    print()
print(f"Line count = {line_count}")

The way 'count' was stored and sorted earlier would not work if it can be greater than 9, because each digit of the string was sorted as a separate character. So I changed it from storing the counts as a string to storing them as a list. Now it should work. I also disabled 'word.sort()', so words should now keep their original order (not sorted).
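For anyone who prefers a more compact form, the counting step could also be written as a small function. This is just a sketch under my own naming (`GROUPS` and `group_counts` are not in the script), and it takes the four groups from the standard `string` module instead of typing them out:

```python
import string

# Same four groups as in the script, taken from the standard library.
GROUPS = [string.punctuation, string.ascii_uppercase,
          string.ascii_lowercase, string.digits]


def group_counts(word):
    """Return the per-group character counts of word, largest first."""
    counts = [sum(1 for ch in word if ch in group) for group in GROUPS]
    return sorted(counts, reverse=True)


example = "`1l,*}:{)k;%{?5("
print(example, group_counts(example))
```

Like the script above, this keeps zero counts in the result and leaves the word itself unchanged.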


Hope it works this time  :)

Re: [SOLVED] count with python/awk or other script/command

Reply #35

Cool as ice!  :D Infinite thanks!!! However, I already have some results, as those script errors have only a marginal impact on the conclusion.

I changed the strategy. Now I'm leaning on the random factor a bit, just enough that I don't need to calculate too much.

The easiest way to determine which pattern is #1 is still through random extraction: see which array shows up most often. This, however, applies only when we are in rather low numbers, like t=94 and a small s=16. If we go bigger, like t=94 and s=63, or higher still with s=94, then the most frequent pattern will no longer correspond to the most potent pattern. In that case we need to take all the arrays that appear among the first 100 or 200 and let a script calculate automatically which one holds the bigger numbers.
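The random-extraction idea might look roughly like this in Python. It is a sketch under assumptions that are not stated in the thread: each string is taken to be s distinct characters drawn uniformly from the 94, and the names `pattern_of` and `tally_patterns`, the number of trials and the seed are all made up for illustration.

```python
import random
import string
from collections import Counter

GROUPS = [string.punctuation, string.ascii_uppercase,
          string.ascii_lowercase, string.digits]   # sizes 32, 26, 26, 10
ALPHABET = "".join(GROUPS)                         # t = 94 characters


def pattern_of(chars):
    """Per-group counts of the drawn characters, largest first."""
    counts = [sum(1 for ch in chars if ch in g) for g in GROUPS]
    return tuple(sorted(counts, reverse=True))


def tally_patterns(s=16, trials=20_000, seed=0):
    """Assumption: each string is s distinct characters (no repeats)."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(trials):
        tally[pattern_of(rng.sample(ALPHABET, s))] += 1
    return tally


tally = tally_patterns()
for pattern, hits in tally.most_common(5):
    print(pattern, hits)
```

The patterns printed first are the ones that showed up most often in this particular run; a different seed or more trials may reorder close contenders.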

Below is an example cracked with both tools, random and calc.

This example used t=94 and s=16, resulting in a best s pattern of 6 5 4 1 when we use the t pattern 32 26 26 10. Check it out.



The final conclusion from the above calculation is as follows. The best is to choose:

6 elements from symbols
5 from letters
4 from LETTERS
1 from numbers (0-9)
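One way such a calculation could be scripted is to enumerate every possible split of the 16 positions over the four groups and add up the exact probabilities per sorted pattern. This is only a sketch and not the tool used above: it assumes the 16 characters are distinct and drawn uniformly from the 94, and the function name `pattern_probabilities` is hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import comb


def pattern_probabilities(sizes=(32, 26, 26, 10), s=16):
    """Exact probability of each sorted count pattern.

    Assumption: s distinct characters drawn uniformly from all groups.
    """
    total = comb(sum(sizes), s)
    probs = defaultdict(float)
    for counts in product(*(range(min(n, s) + 1) for n in sizes)):
        if sum(counts) != s:
            continue
        # Number of ways to pick counts[i] characters from group i.
        ways = 1
        for n, k in zip(sizes, counts):
            ways *= comb(n, k)
        probs[tuple(sorted(counts, reverse=True))] += ways / total
    return dict(probs)


probs = pattern_probabilities()
ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
for pattern, p in ranked[:5]:
    print(pattern, round(p, 4))
```

Because several different splits share one sorted pattern (for example 6 from symbols and 5 from lowercase, or the other way round), their probabilities are summed under the same key.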

However, to match the other criterion (where groups are basically elements themselves), we must repeat only 1 element in the final string, no matter which element. Example string: '(}1BNOPS^_abhr{

The result will be a 'hyper string'.

Edit: tested the latest script, and now it is really pure gold  🎯 🏆


 

Re: [SOLVED] count with python/awk or other script/command

Reply #36

Here's a secondary/residual conclusion: arrays do not like symmetry. A 9|7 split is preferred over a symmetric 8|8. On top of that, 13 people out of 500 will have a 20x weaker secret, and 63% will have just an average Joe's secret.

So when 94 is split in half, it's better to take 9 from one group and 7 from the second group (for a 16-character string), and not 8|8 as most would have guessed.
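If "preferred" is read as "shows up more often", the 9|7 versus 8|8 comparison can be checked with a few lines. Again a sketch under assumptions of mine: two groups of 47 characters each, and 16 distinct characters drawn uniformly from the 94.

```python
from fractions import Fraction
from math import comb

half = 47      # 94 split into two equal groups
s = 16

total = comb(2 * half, s)
# 8|8 can only happen one way round; 9|7 covers both (9, 7) and (7, 9).
p_8_8 = Fraction(comb(half, 8) ** 2, total)
p_9_7 = Fraction(2 * comb(half, 9) * comb(half, 7), total)

print(float(p_8_8), float(p_9_7), p_9_7 / p_8_8)
```

Under these assumptions the 9|7 pattern comes out more likely than 8|8, mainly because it can occur in two mirrored ways while the symmetric split can occur in only one.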

 See image.