|
I recently signed up for an account on a very large investment / bank platform and they have the following password requirements (which are mostly stupid) -- see snapshot[^]
The requirement I'd like to talk about (which is obviously arbitrary) is :
Special characters except for # & * < > ( ) ' [ ]
Also, note, there aren't too many "special chars" left after all those are removed.
This is obviously arbitrary and the developers of that site & the password requirements decided not to allow common characters which would cause them problems when parsing the password string.
I'm wondering what constitutes a "special character"?
1. Above ascii 127 (even though surely sites use UTF-8 character encoding now?
2. If that is true, then would any emoji (🤖🦍🦄😻👻) count as a "special char"?
3. Anything that isn't alphanumeric?
4. Arbitrary : defined by each site you visit. DING! DING! DING! We have us a winner!!!
Length, Most Important Password Requirement
They have a max of 20 which is ridiculous.
I'm guessing that is related to the algorithm they use to parse your incoming password -- especially since the last requirement is:
No sequences (e.g. 12345 or 111)
I'm guessing that since extremely long passwords would take their algorithm longer to search for sequences they decide not to allow long passwords.
Doesn't pass the sniff test.
|
|
|
|
|
I just counted the special characters on my keyboard - 32 of them. The bank isn't allowing 10 of them.
123456789 123456789 123456789 12 <= Counter
`~!@#$%^&*()_-+={[}]|\:;"'<,>.?/
Their allowable character set contains 74 characters. The issue is many people forget about some of the common characters that aren't alphabetic and only think about the characters on the top row of the keyboard.
|
|
|
|
|
Very good point. I agree with that.
I'm still interested in why those particular characters are "special characters"
Just because they appear on my keyboard?
This means that
1. their algorithm probably contains an array with the non-allowed characters.
2. they compare each char in your password string to the array ala nonAllowedChars.Contains(password[i])
Seems interesting. Also, of course, there is apparently no standard for what is a "special char".
They could just:
1. hash whatever you sent
2. compare hash to known bad password hashes - deny if fails and allow if passes
I don't know. It seems odd that they look at cleartext password.
|
|
|
|
|
Some of those "invalid" characters are used as "escape" characters in character codes, strings, and HTML / XML.
(I can't show examples because they "get escaped" here)
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
|
|
|
|
|
Gerry Schmitz wrote: Some of those "invalid" characters are used as "escape" characters in character codes, strings, and HTML / XML
I agree with this also, but I also know that converting text input to Base64 to pass it across the line is extremely easy. You would hope they would just :
1. take the cleartext input
2. convert to base64
3. send base64
4. convert back to UTF-8 on server side.
5. hash the cleartext
6. look up the hash to see if it matches known bad passwords
7. if it matches bad, reject, if not then save it as the user's hash.
Next when user logs in, just hash their cleartext and match it to what the user's hash should be.
|
|
|
|
|
The (extra) handling of said characters is another topic.
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
|
|
|
|
|
raddevus wrote: 4. convert back to UTF-8 on server side. Well, UTF-8 isn't 'cleartext', but a multi-byte encoding of a 32-bit character code. There aren't very many applications around prepared to handle 32 bit character codes.
At least in the Western world, the common approach is to go for 16 bit UTF-16 codes internally - and for those falling outside that range, say 'To heck with those'. Even if you are not prepared to handle those UTF-16 codes pointing you over to the more exotic characters, you will be able to handle almost all the textual information you are likely to encounter, regardless of language.
In the age of 16-bit Windows, the 8-bit ISO 8859 encoding was the standard (with a few Window specific extensions). Since the advent of 32 bit Windows, UTF-16 has been the standard internal representation - although you shouldn't rely firmly on all applications to handle anything outside the 'Basic plane' (the first 64 kibi characters). (I would never trust that to be true for any *nix application without thorough testing!)
So: If the application on the server side is prepared for 16 bit characters, most likely according to UTF-16, you are probably fine. (Full 32-bit UTF is a non-essential 'nice to have'-feature.) An ability to accept and forward UTF-8 as a byte stream is not sufficient unless you can also process it as unpacked characters requiring more than 8 bits.
|
|
|
|
|
Cleartext is just a term for unencrypted data in any form. Well, I might've used the wrong term: plaintext[^]
|
|
|
|
|
greetings kind regards
i have always wondered as to the requirement of some sites to avoid certain "special" characters . i do not see how parsing a string of such would be any different from parsing exempli gratia "this_is_my_password" . perhaps you know if so i seek your knowledge .
as to length does it not seem reasonable to limit same ? further my password manager originally written by Bruce Schneier generates passwords af length 12 . i assume he knows his stuff .
|
|
|
|
|
No, it makes no sense to limit passwords to a shortened length.
Here is why:
1. You cleartext password is __never__ stored so it's length doesn't matter.
2. Your cleartext password is hashed before it is stored at the site you're logging into (probably SHA256) and that means that it is 32 bytes long when it is stored. A SHA256 hash is always EXACTLY 32 bytes long (neither shorter nor longer). So the site never has to worry about max length.
If the site does store your cleartext password it would be a terrible security problem.
So, why would they ever want to know anything about your password before it is hashed?
Well, they are attempting to insure you are not using a weak password which already has an obvious SHA256 hash that hackers know.
Google and Microsoft allow 64 character passwords and my passwords for those accounts are that long -- since they are often used for SSO (single sign on) which allows you to sign into many sites.
Meaning that you want to keep those accounts very secure so longer is always better.
|
|
|
|
|
raddevus wrote: 1. You cleartext password is __never__ stored so it's length doesn't matter.
2. Your cleartext password is hashed before it is stored at the site you're logging into (probably SHA256) You didn't put a huge, boldfaced, underlined, uppercased IF in front of those assumptions. You should have
|
|
|
|
|
raddevus wrote: weak password which already has an obvious SHA256 hash
Which would mean it's not salted, yes? Simply hashing only the password is still a weak link in the process.
|
|
|
|
|
An important consideration for security, but really at a different level from using a (sha256) hash as opposed to cleartext storage.
|
|
|
|
|
PIEBALDconsult wrote: Which would mean it's not salted, yes? Simply hashing only the password is still a weak link in the process.
Well, I'm ambivalent about that.
If the hacker has exfiltrated your entire db of hashed passwords then she has probably gotten your source code where you apply the salt too.
Also, it is an interesting point, because there are huge rainbow tables of passwords now and they are used by hackers to compare hashes to see what the passwords were.
As a matter of fact, when Google and other companies tell you that your password has been compromised they know because they check the huge databases of hashes that match. So I'm not sure.
The new thing that seems to be to do is to multi-hash -- hash the password hash multiple times (100s or 1000s of times). Because the the hacker has to use expensive CPU time to hash multiple times to see if it matches. As a matter of fact, my password manager (insert gratuitous self-promotion here) C'YaPass now allows the user to pick a number of times to multi-hash their hashed password.
C’YaPass: The Best Password Manager You’ve Never Used (A Complete Password EcoSystem)[^]
|
|
|
|
|
Google allows up to 100 character passwords and maybe more now.
Cheers,
Russ
|
|
|
|
|
RussellT wrote: Google allows up to 100 character passwords and maybe more now.
Yes! Exactly. My Google password is 64 characters long.
My Microsoft one is also 64 characters long.
Neither of those require any special character at all.
|
|
|
|
|
Maybe the restrictions on special characters is because they do password processing by some sort of regex processing. (Obligatory xkcd: xkcd: Bobby Tables[^])
I 'sort of' (but only sort of) can accept that they for simplicity restrict passwords to 8 bit characters. Unless they do full UTF-8/16 processing, some UTF byte values of e.g. an emoji may interfere with their regex processing; an intermediate UTF-8 byte may, in isolation, look like one of their special characters. Also: We still have a lot of text based internet protocols, developed before the internet community realized that there is a world outside 7-bit ASCII. If you connect through a protocol that is not updated, and not protected by some encoding of binaries, some of your emoji byte values in UTF coding may be misinterpreted as a protocol control character.
Yet, anno 2023 (and really even anno 2000 or 1990!) I would take for granted that both text oriented protocols and web sites can handle any ISO 8859 variant, 8-bit characters, or the numerous IBM code pages for the 128-255 range, without misbehaving or breaking down.
What scares me most is the limitation to 20 chars, clearly suggesting that they do not hash it but store it as plaintext. If hashed, there would be no reason for limiting the length. And ... there is no reason why they should store the password as plaintext. They should not!
If they do not store it as plaintext, and if they accept any ISO 8859 8-bit character, I do not have any difficulties creating 20-character passwords that would not appear in any dictionary attack. Attacking a 160 bit key by brute force is something that the attacker will do only if she expects to find something really valuable behind the locked door. (Besides: What happened to the old technique of incurring an exponentially increasing delay for each unsuccessful attempt to log in to an account? That prevents all brute force attacks!)
The restriction on repeating sequences I take as their attempt to discipline their users to create better passwords. They should include 'qwerty' and 'asdf' in the list as well. And several others. Even though this is an 'arbitrary' restriction, there are so many users out there ignorant with passwords that I accept it as a way to give those ignorant users a kick in the behind.
|
|
|
|
|
All great points and I agree with you.
trønderen wrote: Maybe the restrictions on special characters is because they do password processing by some sort of regex processing.
Yes, that is my guess at why they limit it to 20. I'm guessing that their algorithm would get stuck or they would possibly incur some regex backtracking[^] and they would hit some limitation which would cause their pattern matching to blow up.
That's all I can think because surely they aren't saving cleartext password.
Like you said, they're trying to give the user a good kick in the seat of the pants and insure they aren't using a bad password so they're looking at what the password contains, but it just seems bad.
I mean they could just
1. hash the cleartext
2. look up the hash in a known bad passwords table
If the hash isn't in the known bads then save it as the user's hash (which can be matched later).
|
|
|
|
|
[edit]
I missed your point at first. That is a deep approach that will likely frustrate any user that is not using password generation.
[Hash leakage hardening]
That is why you add a little “salt” that is different for each user. Prevents rainbow table lookup attacks if the hashes are exposed.
|
|
|
|
|
trønderen wrote: What scares me most is the limitation to 20 chars, clearly suggesting that they do not hash it but store it as plaintext
There are other possible explanations.
1. They simply do not know how normal hashing works.
2. They just needed/wanted a length on that field for other reasons so they chose an arbitrary value.
|
|
|
|
|
jschell wrote: 1. They simply do not know how normal hashing works. I wouldn't be surprised!
jschell wrote: 2. They just needed/wanted a length on that field Or maybe they didn't know how to create an input field of arbitrary length. They created a fixed input field of 20 chars and don't know how to expand it.
|
|
|
|
|
You need to limit "all" fields, then check for "nonsense" sequence of repeating blanks, etc.
People fall asleep leaning on the keyboard.
And the last thing you want is a report with a "blank field" that runs for pages ... and you wonder "how did it ..."
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
|
|
|
|
|
What bugs me just as much: In an input field for phone number or bank account / credit card number, and they accept digits only. There is a convention for writing credit card numbers as 1234 2345 3456 4567 - not as 1234234534564567, which makes it a lot harder to detect a typo. Bank account numbers are written as 1234 56 7890, or sometimes as 1234.56.7890. Again: 1234567890 makes it much harder to verify. Phone numbers are 404 55 606, not 40455606 (and today, you see phone numbers up to 14 digits, even domestic ones, making it even harder to detect a typo).
If they really insist on handling a structured number as a singe long integer, those 'readability spaces' can be removed by a single program code line, in several commonly used text processing tools/libraries. It takes a lot more effort to display an error message box declaring 'Spaces are not permitted in phone numbers' (or whatever).
While the number of sites rejecting 'readability spaces' seems to be declining, I have seen an increase in another 'facility' that I wouldn't say 'bugs' me, it rather amuses me: Entry fields for phone or account numbers with a spin button. As when I typed the wrong account or phone number, and really would like to spin from the mistyped number to the correct one. This started long before the AI wave, so don't blame AI!
|
|
|
|
|
When I've had to validate "contact" information, we (simply) used "national address" databases (world wide subscription) in real time to validate all contact information. We did what USPS, UPS and FedEx said to do in their specs. Incremental searching while you typed; you didn't even have to complete your own address information, since it "had" to be in the "database".
A call center app for all of New Zealand, 1,000,000+ addresses, with realtime incremental searching. (Different DB; custom API)
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
|
|
|
|
|
trønderen wrote: What bugs me just as much: In an input field for phone number or bank account / credit card number, and they accept digits only
I found a product that I really wanted. The order form would not accept 16 digits. Too long it said.
So I copied the page locally, hacked it, then submitted my order. Far as I can recall I got it.
|
|
|
|