Lately some clients of our software began contacting support with a specific problem: their input looked right (they even sent screenshots as proof), but when the system parsed the string and validated it with a regular expression (for numbers, for example), the client received an error saying that the parameter had been entered incorrectly. It's not hard to guess that if the input looks correct but still fails validation, it may contain a hidden character (one that the rendering font does not display), much like a line ending or a tab. For example, a value pasted from an editor that supports Unicode shows up as вЂЋ01026019 when viewed in a non-Unicode (ANSI) encoding, but as 01026019 in Unicode. In our client's case it was the RTL mark: in Unicode, the RLM character is encoded at U+200F RIGHT-TO-LEFT MARK (HTML &#8207; or &rlm;).
We do not need to support every language and script included in Unicode, so the easiest fix in our situation was to strip such characters.
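Before stripping anything, it helps to confirm that a hidden character really is the culprit. The following is only a diagnostic sketch, not our production code: the digits-only pattern and the names HiddenCharDemo, IsValid and DumpCodePoints are assumptions made for illustration. It shows a value that looks like plain digits failing validation, and prints each character as a code point so the invisible one becomes visible.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HiddenCharDemo
{
    // Assumed validation rule for illustration: ASCII digits only.
    private static readonly Regex DigitsOnly = new Regex("^[0-9]+$");

    public static bool IsValid(string s) => DigitsOnly.IsMatch(s);

    // Renders every UTF-16 code unit as U+XXXX so invisible characters show up.
    public static string DumpCodePoints(string s) =>
        string.Join(" ", s.Select(c => $"U+{(int)c:X4}"));

    public static void Main()
    {
        // U+200F (RIGHT-TO-LEFT MARK) followed by the digits the user sees.
        string pasted = "\u200F01026019";

        Console.WriteLine(IsValid(pasted));       // False: hidden character in front
        Console.WriteLine(IsValid("01026019"));   // True: same digits typed by hand
        Console.WriteLine(DumpCodePoints(pasted)); // first entry is U+200F
    }
}
```

Logging the output of something like DumpCodePoints for rejected input is a cheap way to tell a real typo from an invisible character that came along with a copy and paste.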
Here is a C# snippet that removes several such Unicode characters from a string using a regex: