Unicode is a way to encode the things that humans use to write stuff into a computer.
ASCII is for example another way, as is EBCDIC.
All these methods translate squiggles that we’ve used for centuries into something that can be represented inside a computer.
For example, the letter “A” is under ASCII represented by the number 65.
This post is pointing out that there are two characters that look identical, but have different numbers, which means that what the user sees is identical, but what the computer sees is different.
This fact is actively used for phishing, as you can craft domains looking nearly identical to the original one, but leading to your IP address hosting the phishing mask.
One of my favorites was using Japanese full stop (U+3002) in place of periods in a bare IP or anywhere you would use a period in a FQDN (fully qualified domain name). Only tested in Chrome at the time, but the browser would “correct” it for you and take you to the intended page.
Unicode is a way to encode the things that humans use to write stuff into a computer.
ASCII is for example another way, as is EBCDIC.
All these methods translate squiggles that we’ve used for centuries into something that can be represented inside a computer.
For example, the letter “A” is under ASCII represented by the number 65.
This post is pointing out that there are two characters that look identical, but have different numbers, which means that what the user sees is identical, but what the computer sees is different.
This is the basis for much tomfoolery.
This fact is actively used for phishing, as you can craft domains looking nearly identical to the original one, but leading to your IP address hosting the phishing mask.
One of my favorites was using Japanese full stop (U+3002) in place of periods in a bare IP or anywhere you would use a period in a FQDN (fully qualified domain name). Only tested in Chrome at the time, but the browser would “correct” it for you and take you to the intended page.