User Tags
LogZilla's Rewrite Rules feature also allows the extraction and "tagging" of arbitrary data from incoming events.
For more examples, be sure to check our GitHub page, which also contains recently added rules submitted by the community that have not yet been incorporated into the software.
Extracting Insight From Arbitrary Data
User Tags allow extraction and transformation of arbitrary data from incoming events in order to gain insight into metrics such as:
- Device types
- Users
- Locations
- GeoIP
- Authentication Failures
- Audit Log Tracking
- Malware Types/Sources/Destinations
These are only a few of the thousands of possibilities for what users can extract as tags using LogZilla's rule parser.
User Tags make it possible to extract and track any information that may provide insight into day-to-day NetOps, SecOps, DevOps, etc. functions.
For example, given a list of incoming events such as:

```
%AUTHPRIV-3-SYSTEM_MSG: pam_aaa:Authentication failed for user bob from 10.87.8.1
Log-in failed for user 'agents' from 'ssh'
```

It is easy to extract and track the names of the users as well as their source addresses:
- Create a new rule such as `100-failed-login-tracking.yaml`
- Add the pattern match and user tag of your choice
- Set the rule to mark this event as `actionable` in the system (note that statuses may also be set as `non-actionable`).
```yaml
rewrite_rules:
  -
    comment: "Auth Fail User Tracking"
    match:
      field: "message"
      op: "=~"
      value: "for (?:user)? '?([^\\s']+)'? from '?([^\\s']+)'?"
    tag:
      authfail_users: "$1"
      authfail_source: "$2"
    rewrite:
      status: "actionable"
```
- Add your new rule using `logzilla rules add 100-failed-login-tracking.yaml`
- Add a new `TopN` widget to any dashboard (such as `Top Hosts`) and edit that widget to select the newly created user tag field:

  Screenshot: User Tags Field Selector

- Your `TopN` chart will now display the top 5 Client Usernames.

  Screenshot: Top Auth Fail Usernames chart
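To illustrate the extraction itself, the rule above would produce roughly the following tags for the two sample events shown earlier (an illustrative mapping only, not an actual file format):

```yaml
# Tags extracted by the rule above (illustrative)
- event: "%AUTHPRIV-3-SYSTEM_MSG: pam_aaa:Authentication failed for user bob from 10.87.8.1"
  authfail_users: "bob"
  authfail_source: "10.87.8.1"
- event: "Log-in failed for user 'agents' from 'ssh'"
  authfail_users: "agents"
  authfail_source: "ssh"
```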
Match/Update Based on Previously Created Tags
It is also possible to set a custom tag, then use that tag in the same rule file or even in other rule files. If a tag-based match/update is used, that tag must of course be created first. If it is created in another rule file, be sure that the rule file containing the new tag comes first alphabetically, since rule files are processed in alphabetical order. For example:
`001-rule.yaml` - create the tag based on a message match:
```yaml
rewrite_rules:
  -
    comment:
      - "Extract denied List Name, Protocol and Port Numbers from Cisco Access List logs"
      - "Sample Log: Oct 4 22:33:40.985 UTC: %SEC-6-IPACCESSLOGP: list PUBLIC_INGRESS denied tcp 201.166.237.25(59426) -> 212.174.130.30(23), 1 packet"
    match:
      field: "message"
      op: "=~"
      value: "list (\\S+) denied (\\S+) \\d+\\.\\d+\\.\\d+\\.\\d+\\((\\d+)\\).+?\\d+\\.\\d+\\.\\d+\\.\\d+\\((\\d+)\\)"
    tag:
      cisco_acl_deny_acl_name: "$1"
      cisco_acl_deny_src_proto: "$2"
      cisco_acl_deny_src_port: "$3"
      cisco_acl_deny_dst_port: "$4"
```
`002-rule.yaml` - use the tag created in `001-rule.yaml` to map port numbers to names:
```yaml
first_match_only: true
rewrite_rules:
  -
    comment: "Match on previously created Cisco ACL tags and convert the port numbers extracted and stored in that same tag to a name for ports 22, 23, 80 and 443"
    match:
      field: "cisco_acl_deny_dst_port"
      value: "22"
    tag:
      cisco_acl_deny_dst_port: "ssh"
  -
    match:
      field: "cisco_acl_deny_dst_port"
      value: "23"
    tag:
      cisco_acl_deny_dst_port: "telnet"
  -
    match:
      field: "cisco_acl_deny_dst_port"
      value: "80"
    tag:
      cisco_acl_deny_dst_port: "http"
  -
    match:
      field: "cisco_acl_deny_dst_port"
      value: "443"
    tag:
      cisco_acl_deny_dst_port: "https"
```
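As a worked example (illustrative only), the sample log shown in the `001-rule.yaml` comment would end up with roughly the following tags after both rule files have run, because the destination port `23` is mapped to `telnet` by `002-rule.yaml`:

```yaml
# Resulting tags for the sample PUBLIC_INGRESS deny log (illustrative)
cisco_acl_deny_acl_name: "PUBLIC_INGRESS"
cisco_acl_deny_src_proto: "tcp"
cisco_acl_deny_src_port: "59426"
cisco_acl_deny_dst_port: "telnet"   # port 23 converted to a name by 002-rule.yaml
```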
Example 2
In the example below, a previous rule file (or even a rule earlier in the same file) has created the `su_sessions` user tag. The example below assumes this has already been done.
The rule below tells the system to match on `su_sessions` and set the `program` to `su`, but only if the matched value is not an empty string (blank messages).
```yaml
rewrite_rules:
  -
    comment: "Track su sessions"
    match:
      field: "su_sessions"
      op: "ne"
      value: ""
    rewrite:
      program: "su"
```
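For reference, a preceding rule that creates the `su_sessions` tag could look something like the following. This is only a sketch: the message pattern is an assumption for illustration, not part of the official rules.

```yaml
# Hypothetical sketch only: the match pattern below is an assumption
rewrite_rules:
  -
    comment: "Create the su_sessions tag from su session messages"
    match:
      field: "message"
      op: "=~"
      value: "session opened for user (\\S+)"
    tag:
      su_sessions: "$1"
```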
Makemeta
A helper script located on our GitHub can be used to create rules automatically using a tab-separated file as input. You can download the script here.
Input fields
The `.tsv` (tab-separated values) file must contain at least 6 columns.
Columns 1-4
Columns 1-4 must be:
- Column 1: Indicates whether or not (0 or 1) a user tag should also be created for this entry.
- Column 2: The string you want to match on, for example: `my.host.com` or `foo bar baz`.
- Column 3: The field to match on in LogZilla, such as `host`, `program`, `message`, etc.
- Column 4: Defines the match operator to use. Options are:
| Operator | Match Type | Description |
|---|---|---|
| `eq` | String or Integer | Matches entire incoming message against the string/integer specified in the match condition |
| `ne` | String or Integer | Does not match anything in the incoming message match field |
| `gt` | Integer Only | Given integer is greater than the incoming integer value |
| `lt` | Integer Only | Given integer is less than the incoming integer value |
| `ge` | Integer Only | Given integer is greater than or equal to the incoming integer value |
| `le` | Integer Only | Given integer is less than or equal to the incoming integer value |
| `=~` | RegEx | Match based on RegEx pattern |
| `!~` | RegEx | Does not match based on RegEx pattern |
| `=*` | RegEx | RegEx appears anywhere in the incoming message |
Columns 5 and greater
All columns after column 4 are key-value pairs to be added. For example, given the following entire row in a file:
```
1 10.1.2.3 host eq deviceID rtp-core-sw DeviceDescription RTP Core Layer2 DeviceImportance High DeviceLocation Raleigh DeviceContact [email protected]
```
key="value"
pairs, like so:
Key = DeviceImportance, value = High
Key = DeviceDescription, value = RTP Core Layer2
Key = DeviceLocation, value = Raleigh
Key = deviceID, value = rtp-core-sw
Key = DeviceContact, value = [email protected]
Note that every key must be followed by a value. For example, the following row is missing the value for its last key (`DeviceContact`):

```
1 10.1.2.3 host eq deviceID rtp-core-sw DeviceDescription RTP Core Layer2 DeviceImportance High DeviceLocation Raleigh DeviceContact
```

This would produce errors when the perl script runs, e.g.:
```
Odd number of elements in hash assignment at ./makemeta line 60, <$fh> line 4.
Use of uninitialized value $kvs{"DeviceContact"} in string comparison (cmp) at ./makemeta line 78, <$fh> line 4.
Use of uninitialized value $kvs{"DeviceContact"} in string comparison (cmp) at ./makemeta line 78, <$fh> line 4.
Use of uninitialized value $kvs{"DeviceContact"} in string comparison (cmp) at ./makemeta line 78, <$fh> line 4.
Use of uninitialized value $kvs{"DeviceContact"} in string eq at ./makemeta line 80, <$fh> line 4.
```
Usage
```
./makemeta
Usage:
    makemeta
        -debug  [-d] <1 or 2>
        -format [-f] (json or yaml - default: yaml)
        -infile [-i] (Input filename, e.g.: test.tsv)
```

Sample `test.tsv` file:

```
1 <TAB> host-a <TAB> host <TAB> eq <TAB> deviceID <TAB> lax-srv-01 <TAB> DeviceDescription <TAB> LA Server 1
```
User Tags
If column 1 of your `.tsv` contains a `1`, user tags will also be created for every key/value pair. As such, you will now see these fields available in your widgets. For example, the following rule:
```yaml
- match:
  - field: host
    op: eq
    value: host-a
  tag:
    metadata_importance: High
    metadata_roles: Core
    metadata_locations: Los Angeles
  update:
    message: $MESSAGE DeviceDescription="LA Server 1" DeviceLocation="Los Angeles" DeviceImportance="Low" deviceID="lax-srv-01" DeviceContact="[email protected]"
- match:
  - field: message
    op: =~
    value: down
  update:
    message: $MESSAGE DeviceImportance="Med" DeviceDescription="NYC Router" DeviceLocation="New York" deviceID="nyc-rtr-01" DeviceContact="[email protected]"
```
This will produce available fields similar to those in the screenshot below:
Screenshot: Available Fields
Caveats/Warnings
- Tag names are free-form, allowing any alphabetic characters. Once a message matches the pattern, the tag is automatically created in the API, then made available in the UI. If a tag is created but does not show up in the UI, it may simply mean there have been no matches on it yet (note: users may want to try a browser refresh to ensure a non-cached page is loaded).
- Any `_` in the tag name will be converted to a space character when displayed in the UI.
- Tagging highly variable data may result in degradation or even failure of metrics tracking (not log storage/search), depending on the capability of your system. This is due to cardinality limitations in InfluxDB. The following article outlines this limitation in more detail.
NOTE: certain user tag names are reserved for LogZilla internal use and cannot be used as user tags; in these cases you will need to choose an alternative (a simple option would be to prefix the field name with `ut_`, as shown in the sketch after the list below).
The reserved names are:
* first_occurrence
* last_occurrence
* counter
* message
* host
* program
* cisco_mnemonic
* severity
* facility
* status
* type
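For example, a minimal sketch of the `ut_` prefix approach, for a rule that would otherwise want to use the reserved name `program` as a tag (the match pattern is hypothetical, for illustration only):

```yaml
rewrite_rules:
  -
    comment: "Illustrative only: use ut_program because program is reserved"
    match:
      field: "message"
      op: "=~"
      value: "program=(\\S+)"   # hypothetical pattern, not from the official rules
    tag:
      ut_program: "$1"
```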
CAUTION: Care should be taken to keep the number of entries (unique values) per tag below 1 million.
Tag Performance
As with any large-scale data stream manipulation, performance degradation can occur depending on many variables such as CPU, memory, disk I/O, and, of course, the way the rules are presented to the parsing engine.
Ensuring Good Rule Performance
When writing large rulesets, it may be useful to use a precheck match on a plain string before matching on a large regular expression pattern. The precheck in this context is not a special type; rather, it uses the same syntax as any other match, but with `eq` (string) instead of `=~` (regex). This also ensures that "generic" regex patterns don't match a message they were not intended for.
The example below shows how to use an `eq` (string) match for the incoming event. Then, if the string matches, a more complex regex (`=~`) match may be used.
Sample "pre-match"
```yaml
rewrite_rules:
  - comment:
      - 'Vendor: HP Aruba'
      - 'Type: Hardware'
      - 'Category: 802.1x'
      - 'Description: This log event informs the number of auth timeouts for the last known time for 802.1x authentication method.'
      - 'Sample Log: <NUMBER_OF> auth-timeouts for the last <TIME> sec.'
    match:
      - field: message
        op: eq
        value: auth-timeouts for the last
      - field: message
        op: =~
        value: \S+ auth-timeouts \S+ \S+ \S+ \S+ sec .*
    rewrite:
      program: HP_Switch
    tag:
      category: 802.1x
      type: hardware
      vendor: HP
```
Bad Regex
It's important to make sure that the regex used is as efficient as possible. This will go a long way when using thousands of rules. In the example above, a prematch should not even be needed (it was used for demonstration purposes only). A much better method for the example above would have been to skip the prematch and simply use a better regex pattern such as `\S+ auth-timeouts for the last \S+ sec`.
Be sure to use a tool such as RegEx101 to make sure the patterns work and that they perform well.
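For reference, a simplified version of the rule above using only the tighter regex might look like the following. This is a sketch: only the match section differs from the documented example.

```yaml
rewrite_rules:
  - comment: 'HP Aruba 802.1x auth-timeouts (simplified match, illustrative only)'
    match:
      field: message
      op: =~
      value: \S+ auth-timeouts for the last \S+ sec
    rewrite:
      program: HP_Switch
    tag:
      category: 802.1x
      type: hardware
      vendor: HP
```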
Testing
A command-line tool, `logzilla rules`, may be used to perform various functions including:

- `list` - List rewrite rules
- `reload` - Reload rewrite rules
- `add` - Add rewrite rule
- `remove` - Remove rewrite rule
- `export` - Save rule to file
- `enable` - Enable rewrite rule
- `disable` - Disable rewrite rule
- `errors` - Show rules having errors, with counts
- `performance` - Test rules single-thread performance
- `test` - Check rule for validity and correct operation

To add your rule, simply type `logzilla rules add myfile.yaml`.
Tag Naming
For rules provided by LogZilla (and as a recommendation), user tag names fall into one of two categories. The first category covers the most common data fields found in event log messages. For these fields, the user tags and their meanings are:
| User Tag | Example | Meaning |
|---|---|---|
| `SrcIP` | `127.0.0.1` | source IPv4 address |
| `SrcIPv6` | `2001:0db8:85a3:0000:0000:8a2e:0370:7334` | source IPv6 address |
| `DstIP` | `11.22.33.44` | destination IPv4 address |
| `DstIPv6` | `2001:0db8:85a3:0000:0000:8a2e:0370:7334` | destination IPv6 address |
| `SrcPort` | `dynamic` | source port (instead of numeric this will be provided as a descriptive abbreviation, or `dynamic` if usage unspecified) |
| `DstPort` | `https` | destination port (instead of numeric this will be provided as a descriptive abbreviation, or `dynamic` if usage unspecified) |
| `Proto` | `TCP` | communications protocol (typically `TCP`, `UDP`, or `ICMP`) |
| `MAC` | `00:00:5e:00:53:af` | MAC address |
| `IfIn` | `enp8s0` | interface in |
| `IfOut` | `enp8s0` | interface out |
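As an example of this naming convention in a rule, a tag section for a firewall-style message might look like the sketch below. The match pattern is an assumption for illustration, not an official rule:

```yaml
rewrite_rules:
  -
    comment: "Illustrative only: tag source/destination using the recommended names"
    match:
      field: "message"
      op: "=~"
      value: "from (\\d+\\.\\d+\\.\\d+\\.\\d+) to (\\d+\\.\\d+\\.\\d+\\.\\d+)"   # hypothetical pattern
    tag:
      SrcIP: "$1"
      DstIP: "$2"
```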
The second category of user tag names covers all other tags besides those in the list above. This obviously constitutes a large variety of tag names. These tag names are set to be the same as the vendor field names, so that those familiar with the vendor's event log messages will be able to find that same data as LogZilla user tags with the same names. Some examples of such tags might be:
| Vendor Field and User Tag | Meaning |
|---|---|
| `act` | action taken |
| `cat` | category |
| `cnt` | count |
| `dhost` | destination host |
| `dvchost` | device host |
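As with the first category, these vendor-named tags can be populated directly by a rule that reuses the vendor's field names. The key=value style pattern below is a hypothetical sketch, not an official rule:

```yaml
rewrite_rules:
  -
    comment: "Illustrative only: reuse vendor field names as user tag names"
    match:
      field: "message"
      op: "=~"
      value: "act=(\\S+) .*dhost=(\\S+)"   # hypothetical key=value style pattern
    tag:
      act: "$1"
      dhost: "$2"
```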