We have a large number (read: 50,000) of relatively small (read under 500K, typically under 50K) log files created using log4net from our client application. A typical log looks like:
Start Painless log
Framework:8.1.7.0
Application:8.1.7.0
2010-05-05 19:26:07,678 [Login ] INFO Application.App.OnShowLoginMessage(194) - Validating Credentials...
2010-05-05 19:26:08,686 [1 ] INFO Application.App.OnShowLoginMessage(194) - Checking for Application Updates...
2010-05-05 19:26:08,830 [1 ] INFO Framework.Globals.InstanceStartup(132) - Application Startup
2010-05-05 19:26:09,293 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Purchase History Data>:True
2010-05-05 19:26:09,293 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Shopping Assistant>:True
2010-05-05 19:26:09,294 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Shopping List>:True
2010-05-05 19:26:09,294 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Teeth>:True
2010-05-05 19:26:09,294 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Scanner>:True
2010-05-05 19:26:09,294 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Value Comparison>:True
2010-05-05 19:26:09,294 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Lotus Notes CRM>:True
2010-05-05 19:26:09,295 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Salesforce.com>:False
2010-05-05 19:26:09,295 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Lotus Notes Mail>:True
2010-05-05 19:26:09,295 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Sales Leads>:True
2010-05-05 19:26:09,295 [1 ] INFO Framework.PluginManager.LogPluginState(150) - Plugin <Configurator>:True
2010-05-05 19:26:09,297 [1 ] INFO Application.App.OnShowLoginMessage(194) - Validating Database...
2010-05-05 19:26:10,342 [1 ] INFO Application.App.OnShowLoginMessage(194) - Validating Database...
2010-05-05 19:26:10,489 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Global Handlers...
2010-05-05 19:26:10,495 [1 ] INFO Application.App.OnShowLoginMessage(194) - Starting Main Window...
2010-05-05 19:26:10,496 [1 ] INFO Application.App.OnShowLoginMessage(194) - Initializing Components...
2010-05-05 19:26:11,145 [1 ] INFO Application.App.OnShowLoginMessage(194) - Restoring Location...
2010-05-05 19:26:11,164 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Plug Ins...
2010-05-05 19:26:11,169 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Panels...Order Manager
2010-05-05 19:26:11,181 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Orders...
2010-05-05 19:26:11,274 [1 ] INFO Application.App.OnShowLoginMessage(194) - Done Loading 1 Order
2010-05-05 19:26:11,314 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Panels...All Products
2010-05-05 19:26:11,471 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Purchase History Data
2010-05-05 19:26:11,545 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Shopping List
2010-05-05 19:26:11,597 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Teeth
2010-05-05 19:26:11,768 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Scanner
2010-05-05 19:26:11,810 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Value Comparison
2010-05-05 19:26:11,840 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...Sales Leads
2010-05-05 19:26:11,922 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Tabbed Areas...(Done!)
2010-05-05 19:26:11,923 [1 ] INFO Application.App.OnShowLoginMessage(194) - Adding Handlers...
2010-05-05 19:26:11,925 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Main Window Handlers...
2010-05-05 19:26:11,932 [1 ] INFO Application.App.OnShowLoginMessage(194) - Loading Main Menu...
2010-05-05 19:26:11,949 [1 ] INFO Application.App.OnShowLoginMessage(194) - Initialization Complete.
2010-05-05 19:26:13,662 [1 ] INFO Framework.ProductSearch.Search(342) - User entered term: <>
I'd like to be able to to parse these logs server side (either when they're uploaded or nightly) to extract either exceptions (which always log at ERROR or FATAL) or other specific log messages like:
2010-05-05 20:05:24,951 [1 ] INFO Framework.ProductSearch.Search(342) - User entered term: <kavo>
to get the 'kavo' term so we can find out what people are really searching for.
I've tried parsing the text by hand (using String.Split() and similar methods) but it really feels like I'm reinventing the wheel.
Is there a nice library to do this kind of log extracting?
You could use Microsoft's Log Parser (Download here). The tool also exposes a COM interface that will let you access the results programmatically. This blog post has a small (partial) instructions for doing that with version 2.1.
So now instead of feeling it is reinventing the wheel, you probably feel like it is over-engineered. ;)
can log4net not read it's own format patterns? Once it reads the format back in, can't you then parse just the %message% section?
A better question maybe why you don't just write that search term to a database at the same time you make the log call or have another log file through log4net that just logs the %message% and all you write to that log is the search term? I really don't think you want to be parsing logs if you can avoid it...at least going forward.
This looks like a typical case of .NET regular expressions
This way you can search for patterns like "User entered term: ". You can extract the group match afterwards
We ended up simply outputting XML as our log format and it became trivial to parse.