I have a fun challenge I couldn't find an answer for here. I have a string of text that may contain an account number.
Example:
"Hi, my account number is 1234 5678 9012 2345 and I'm great."
The account number is entered by the user, so it can come in many flavours. Some basic and possible variants:
1234 1234 1234 1234
1234 1234 1234 1234 1
BE12 1234 1234 1234
1234-1234-1234-1234
1234.1234.1234.1234
1234123412341234
12341234 1234 1234
1234-1234-1234-1234-1
1234.1234.1234.1234.1
12341234123412341
12341 234 1234 12341
BE12-1234-1234-1234
be12-1234-1234 1234
Be12.1234.1234-1234
BE12123412341234
(Basically digits separated by a hyphen, a space or a dot, with the exception of the IBAN format, which has two letters at the beginning.)
What I need as output is everything masked, except the last four digits.
"Hi, my account number is **** **** **** 2345 and I'm great."
How I think I should approach this problem:
- Analyze every string and try to find the account number patterns above
- Create a magical regular expression that replaces the account number the way I need
- If there's an account number, use this regex to mask it.
What would be your approach?
Thanks!
You could match all of the above with:
\b[\dX][-. \dX]+(\d{4})\b
... and replace it with "*" x (strlen(match) - 4) + \1, see a demo on regex101.com.
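In JavaScript:
var string = "Hi, my account number is 1234 5678 9012 2345 and I'm great.";
var new_string = string.replace(/\b[\dX][-. \dX]+(\d{4})\b/g, function(match, capture) {
    // Mask everything except the captured last four digits.
    // (Array(n).join("*") would yield only n - 1 stars, so use repeat().)
    return "*".repeat(match.length - 4) + capture;
});
console.log(new_string);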
See a demo on ideone.com.
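The pattern above uses X as a stand-in for the letters of the IBAN prefix, so a real prefix such as BE12 would not be matched as written. Below is a minimal sketch of one way to cover that case while keeping the separators visible, as in the output the question asks for. The function name maskAccountNumbers and the assumption that the prefix is exactly two letters directly followed by a digit are mine, not part of the answer. Any run of five or more digits will be treated as an account number, so expect false positives.

```javascript
// Hypothetical helper (name and pattern are assumptions, not from the answer).
// Masks every letter and digit of a detected account number except the
// last four digits, and leaves spaces, dots and hyphens where they were.
function maskAccountNumbers(text) {
  // Optional two-letter prefix, a digit, any mix of digits and
  // separators, then four digits that end at a word boundary.
  const pattern = /\b(?:[A-Za-z]{2})?\d[-. \d]*\d{4}\b/g;
  return text.replace(pattern, (match) => {
    const head = match.slice(0, -4); // part to hide
    const tail = match.slice(-4);    // last four digits, kept as-is
    return head.replace(/[A-Za-z\d]/g, "*") + tail;
  });
}

console.log(maskAccountNumbers("Hi, my account number is 1234 5678 9012 2345 and I'm great."));
console.log(maskAccountNumbers("be12-1234-1234 1234"));
```

Because the replacement only touches letters and digits, the grouping the user typed survives, which may or may not be what you want if the grouping itself is considered sensitive.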
Borrowing Jan's awesome regex pattern, it can be extended to capture the last digit too (see the example below).
Note: his method of using replace() is better, and I recommend using it for clarity. This is only to offer an alternative approach using match().
// Setup
let str = `1234 1234 1234 1234
1234-1234-1234-1234
12341234123412341234
1234 1234 1234 1234 1
12341234123412341
1234-1234-1234-1234-1
1234.1234.1234.1234.1
XX12 3456 1234 1234
XX123456123123
XX12-3456-1234-1234
XX12.3456.1234.1234
This is a sentence for visual proof 1234 5678 9012 3456
And some XX32 1111.2222-9999-2 more proof`,
    nums = str.split('\n');
var re = /(\b[\dX][-. \dX]+(\d{4}.?\d?)\b)/;
// Convert Nums
var converted = nums.map(num => {
    let match = num.match(re);
    if (match) {
        let orig = match[1],   // the whole account number
            end = match[2],    // last four digits plus an optional trailing digit
            hidden = orig.slice(0, orig.length - end.length);
        // Mask every non-whitespace character of the leading part
        hidden = hidden.replace(/\S/g, "X") + end;
        num = num.replace(orig, hidden);
    }
    return num;
});
// Visual Verification
console.log(converted);
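The match() version handles only the first account number on each line. If a single string may contain several, the same extended pattern can be given the g flag and passed to replace() with a callback. This is only a sketch of that variation; the function name maskKeepingTail is made up for illustration.

```javascript
// Hypothetical helper: same extended pattern as above, but global, so
// every account number in the text is masked, not just the first one.
function maskKeepingTail(text) {
  const re = /\b[\dX][-. \dX]+(\d{4}.?\d?)\b/g;
  return text.replace(re, (orig, end) => {
    // Everything before the kept tail gets masked; whitespace is preserved.
    const hidden = orig.slice(0, orig.length - end.length);
    return hidden.replace(/\S/g, "X") + end;
  });
}

console.log(maskKeepingTail("1234 1234 1234 1234 1"));
console.log(maskKeepingTail("a 1234 5678 9012 3456 b 1111-2222-3333-4444"));
```

Note that the dot and hyphen separators are masked as X here too, since the callback replaces every non-whitespace character, just like the original loop.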
str.replace(/\b[\dX][-. \dX]+(\d{4})\b/g, '**** **** **** $1')
console.log("Hi, my account number is 1234-5678-9012-2345 and I'm great.".replace(/\b[\dX][-. \dX]+(\d{4})\b/g, '**** **** **** $1'));
console.log("Hi, my account number is 1234 5678 9012 2345 and I'm great.".replace(/\b[\dX][-. \dX]+(\d{4})\b/g, '**** **** **** $1'));
console.log("Hi, my account number is 1234.5678.9012.2345.1 and I'm great.".replace(/\b[\dX][-. \dX]+(\d{4})\b/g, '**** **** **** $1'));
console.log("Hi, my account number is XX123456123123 and I'm great.".replace(/\b[\dX][-. \dX]+(\d{4})\b/g, '**** **** **** $1'));
Use a regex with a negative lookahead trick: simply find
\d{4}([ .-])(?![A-Za-z])
and replace with
****\1
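In JavaScript the backreference in the replacement string is written $1 rather than \1. Here is a rough sketch of the lookahead idea; the function name maskGroups is just for illustration, and the character class is written as [ .-] so that the hyphen is a literal hyphen and not a range.

```javascript
// Hypothetical helper: masks each group of four digits that is followed
// by a separator, unless the separator is followed by a letter (which
// is taken to mean the number has ended and normal text continues).
function maskGroups(text) {
  return text.replace(/\d{4}([ .-])(?![A-Za-z])/g, "****$1");
}

console.log(maskGroups("Hi, my account number is 1234 5678 9012 2345 and I'm great."));
console.log(maskGroups("1234-1234-1234-1234"));
```

This relies entirely on the separators, so a number typed without any (1234123412341234) is left untouched, and a final group followed by a full stop and a space would be masked as well.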