I am attempting to pull some information from my tnsnames file using regex. I started with the following pattern:
MYSCHEMA *? = *?[\W\w\S\s]*\(HOST *?= *?(?<host>\w+\s?)\)\s?\(PORT *?= *?(?<port>\d+)\s?\)[\W\w\S\s]*\(SERVICE_NAME *?= *?(?<servicename>\w+)\s?\)
which worked fine when MYSCHEMA was the only schema in the file, but when there are other schemas listed after MYSCHEMA it matches all the way to the last schema.
I have since created a new pattern:
MYSCHEMA *=\s*\(DESCRIPTION =\s*\(ADDRESS *= *\(PROTOCOL *= *TCP\)\(HOST *= *(?<host>\w+)\)\(PORT *= *(?<port>\d+)\)\)\s*\(CONNECT_DATA *=\s*(?<serverdedicated>\(SERVER *= *DEDICATED\))\s*\(SERVICE_NAME *= *(?<servicename>[\w\.]+) *\)\s*\)\s*\)
This pattern matches MYSCHEMA only, but I had to add every element that appeared in MYSCHEMA entry, and it won't match MYOTHERSCHEMA if it does not contain all the same elements.
Ideally, I'd like a pattern that matches MYSCHEMA entry only, and captures HOST, PORT and SERVICE NAME, and optionally (SERVER = DEDICATED) (which I didn't have in the first pattern) to named groups.
Below is the sample tnsnames that I've been using for testing:
SOMESCHEMA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = REMOTEHOST)(PORT = 1234))
)
(CONNECT_DATA = (SERVICE_NAME = REMOTE)
)
)
MYSCHEMA =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = MYHOST)(PORT = 1234))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = MYSERVICE.LOCAL )
)
)
MYOTHERSCHEMA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = MYHOST)(PORT = 1234))
)
(CONNECT_DATA =
(SERVICE_NAME = MYSERVICE.REMOTE)
)
)
SOMEOTHERSCHEMA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = LOCALHOST)(PORT = 1234))
)
(CONNECT_DATA =
(SERVICE_NAME = LOCAL)
)
)
The following regex will parse out individual TNS entries, and all you have to do is parse each result how you see fit for the names / values
([\w-]+)\s*=(?:\s|.)+?\)\s*\)\s*\)\s*(?=[\w\-])
Example: https://regexr.com/3r2vn
This should do it, using balanced groups. And modify the switch/case for your needs.
Well, since I haven't found a compelling answer to this issue (no offense @Mikael Svenson), I have just stuck with the second pattern listed in my question. It is sufficient for the time being, as the tnsnames.ora file always follows that exact pattern within our organization. Should the tnsnames.ora file format change, I will most likely adopt a parser.
This expression parse the schema with one address on address_list, etc. Hope this helps.
--begin (?>((?[\n][\s][^(][\w_.]+)[\s]=[\s]))(?>(?[\n\s(]DESCRIPTION[\s=\s](?>(?[\n\s(]*ADDRESS_LIST[\s=\s]*[\n\s(]ADDRESS[\s=\s](?([\n\s(]COMMUNITY)([\n\s(]COMMUNITY[\s=\s](?[\w.)]+))|())[\s\n](?([\n\s(]PROTOCOL)([\n\s(]PROTOCOL[\s=\s](?[\w.)]+))|())[\s\n](?([\n\s(]HOST)([\n\s(]HOST[\s=\s](?[\w.)]+))|())[\s\n](?([\n\s(]PORT)([\n\s(]PORT[\s=\s](?[\w.)]+))|())[\s\n](?())())|())))[\s\n](?>(?[\n][\s][(]CONNECT_DATA\s*[=]\s*[\n](?([(]SID\s[=]\s*)(([(]SID\s*[=]\s*(?[\w.]+)\s*[)]))|())[\s\n](?([(]SERVER\s[=]\s*)(([(]SERVER\s*[=]\s*(?[\w.]+)\s*[)]))|())[\s\n]*(?([(]SERVICE_NAME\s*[=]\s*)(([(]SERVICE_NAME\s*[=]\s*(?[\w.]+)\s*[)]))|())[\s\n](?())())|())))[\s\n](?())())|()))) *--end
Undoubtedly, the problem is the multiple that is in the form of writing that file. As you say the file must be unique, this is solved by using the TNS_ADMIN variable.