...
_source=BridgeServer2-Prod MetricsFilter | parse "\"status\":*," as status | where status >= 400 and status < 500 and status != 401 and status != 404 | count by status, uri | sort by _count
This query matches our 4XX alarm. It finds all requests that returned a 4XX error, except 401s (which are surprisingly common) and 404s (frequent bot scans), and orders them by the most frequent status code and URI combination.
_source=BridgeServer2-Prod MetricsFilter | parse "\"status\":*," as status | where status >= 400 and status < 500 and status != 401 | parse "\"remote_address\":\"*\"" as ipAddress | count by ipAddress | order by _count desc
...
This simple query parses the userId from our MetricsFilter. The nodrop means that if the line doesn’t have a userId, we still preserve the line in our results, but the parsed userId shows up as blank.
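For readers new to nodrop, here is a rough Python sketch of the behavior described above. It is only an emulation, not how Sumo Logic implements parsing; the sample log lines, the regular expression, and the parse_user_id helper are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical log lines; the field layout is an assumption for illustration.
LINES = [
    '{"uri":"/v3/participants/self","userId":"abc123","status":200}',
    '{"uri":"/v3/auth/signIn","status":401}',
]

# Roughly equivalent to a parse anchor of "userId":"*"
USER_ID = re.compile(r'"userId":"([^"]*)"')


def parse_user_id(lines, nodrop):
    """Return (line, userId) pairs, emulating a parse with or without nodrop."""
    results = []
    for line in lines:
        match = USER_ID.search(line)
        if match:
            results.append((line, match.group(1)))
        elif nodrop:
            # nodrop: keep the line and leave the parsed field blank
            results.append((line, ""))
        # without nodrop, a line that fails to parse is dropped
    return results


with_nodrop = parse_user_id(LINES, nodrop=True)
without_nodrop = parse_user_id(LINES, nodrop=False)
```

With nodrop both sample lines survive and the second has a blank userId; without it, only the line that contains a userId remains.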
(_source=bridge-exporter-prod or _source=bridgeworker-prod or _source=BridgeServer2-prod) not ("connection reset by peer" or " HTTP/1.1\" 404 " or "com.newrelic") | where _loglevel = "ERROR"
Find all ERROR logs across all 3 applications, ignoring New Relic errors, 404s for URIs containing the string “error”, and TCP connection resets.
(_source=bridge-exporter-prod or _source=bridgeworker-prod or _source=BridgeServer2-prod) not (MetricsFilter or " HTTP/1.1\" ") | where isBlank(_loglevel)
Find all unparsed log messages across all applications.
Graphs
_source=BridgeServer2-Prod MetricsFilter | parse "\"elapsedMillis\":*}" as latency | num(latency) | timeslice 1h | pct(latency,50,95,99) by _timeslice | order by _timeslice asc
...