
Sumo Logic Examples

Querying by Common Fields

_source=BridgeServer2-Prod MetricsFilter | parse "\"status\":*," as status | where status >= 400 and status != 401 and status != 404 and status < 500 | count by status, uri | sort by _count

This query matches our 4XX alarm. It finds all requests that returned a 4XX error, except 401s (which are surprisingly common) and 404s (frequent bot scans), and orders them by the most frequent status code and URI combination.
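As a rough illustration of what this pipeline computes, here is a minimal Python sketch run locally over a few made-up log lines. The sample lines, the URIs, and the helper names `parse_anchor` and `count_4xx` are hypothetical, and Sumo Logic's parse operator is only approximated by turning the `*` wildcard into a non-greedy regex group. The sketch also parses the uri from the line itself, which is an assumption about the log format.

```python
import re
from collections import Counter

def parse_anchor(pattern, line):
    """Approximate a parse anchor: '*' captures the text between the literals."""
    regex = "(.*?)".join(re.escape(part) for part in pattern.split("*"))
    match = re.search(regex, line)
    return match.groups() if match else None

# Hypothetical MetricsFilter lines; the real log format may differ.
LINES = [
    '{"uri":"/v3/participants","status":400,"elapsedMillis":12}',
    '{"uri":"/v3/participants","status":400,"elapsedMillis":9}',
    '{"uri":"/v3/auth/signIn","status":401,"elapsedMillis":5}',
    '{"uri":"/robots.txt","status":404,"elapsedMillis":1}',
    '{"uri":"/v3/uploads","status":412,"elapsedMillis":7}',
    '{"uri":"/v3/uploads","status":200,"elapsedMillis":30}',
]

def count_4xx(lines):
    """Count 4XX responses by (status, uri), skipping 401s and 404s."""
    counts = Counter()
    for line in lines:
        status_match = parse_anchor('"status":*,', line)
        uri_match = parse_anchor('"uri":"*"', line)
        if status_match is None or uri_match is None:
            continue  # a line that does not match the anchor is dropped
        status = int(status_match[0])
        if 400 <= status < 500 and status not in (401, 404):
            counts[(status, uri_match[0])] += 1
    return counts.most_common()  # most frequent combination first

if __name__ == "__main__":
    for (status, uri), count in count_4xx(LINES):
        print(status, uri, count)
```

Only the two 400s and the single 412 survive the filter in this sample; the 401, 404, and 200 lines are excluded.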

 

_source=BridgeServer2-Prod MetricsFilter | parse "\"status\":*," as status | where status >= 400 and status < 500 and status != 401 | parse "\"remote_address\":\"*\"" as ipAddress | count by ipAddress | order by _count desc

Groups the results by IP Address and orders them by most common IP Address.

 

_source=BridgeServer2-Prod MetricsFilter | parse "\"status\":*," as status | where status >= 400 and status < 500 and status != 401 | parse "\"user_id\":\"*\"" as userId | count by userId | order by _count desc

Groups the results by User ID and orders them by most common User ID.

 

_source=BridgeServer2-Prod MetricsFilter | parse "\"user_id\":\"*\"" as userId nodrop

This simple query parses the userId from our MetricsFilter. The nodrop means that if the line doesn’t have a userId, we still preserve the line in our results, but the parsed userId shows up as blank.
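The effect of nodrop can be sketched in a few lines of Python. This is only a local emulation over two made-up log lines, not how Sumo Logic implements the operator; the function name `parse_user_id` and the sample data are hypothetical.

```python
import re

USER_ID = re.compile(r'"user_id":"(.*?)"')

# Hypothetical MetricsFilter lines; the second one has no user_id.
SAMPLE = [
    '{"uri":"/v3/activities","user_id":"abc123","status":200}',
    '{"uri":"/v3/auth/signIn","status":401}',
]

def parse_user_id(lines, nodrop=False):
    """Return (line, userId) pairs, mimicking parse with and without nodrop."""
    results = []
    for line in lines:
        match = USER_ID.search(line)
        if match:
            results.append((line, match.group(1)))
        elif nodrop:
            results.append((line, ""))  # keep the line, leave userId blank
        # without nodrop, a line that does not match is dropped
    return results

if __name__ == "__main__":
    print(len(parse_user_id(SAMPLE)))               # lines kept without nodrop
    print(len(parse_user_id(SAMPLE, nodrop=True)))  # lines kept with nodrop
```

Without nodrop the sign-in line disappears from the results; with nodrop it is kept and its userId is an empty string.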

 

(_source=bridge-exporter-prod or _source=bridgeworker-prod or _source=BridgeServer2-prod) not ("connection reset by peer" or " HTTP/1.1\" 404 " or "com.newrelic") | where _loglevel = "ERROR"

Finds all ERROR logs across all three applications, ignoring New Relic errors, 404s for URIs containing the string “error”, and TCP connection resets.

 

(_source=bridge-exporter-prod or _source=bridgeworker-prod or _source=BridgeServer2-prod) not (MetricsFilter or " HTTP/1.1\" ") | where isBlank(_loglevel)

Find all unparsed log messages across all applications.

Graphs

_source=BridgeServer2-Prod MetricsFilter | parse "\"elapsedMillis\":*}" as latency | num(latency) | timeslice 1h | pct(latency,50,95,99) by _timeslice | order by _timeslice asc

Shows hourly latency at the 50th percentile (median), 95th percentile, and 99th percentile. Works best if you use the Line Chart option.
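For readers who want to see what the timeslice and pct steps amount to, the following Python sketch buckets latencies by hour and takes percentiles per bucket. It uses the nearest-rank percentile method, which is an assumption; Sumo Logic may compute pct differently. The event data and function names are hypothetical.

```python
import math
from collections import defaultdict

def nearest_rank(sorted_values, pct):
    """Nearest-rank percentile (an assumption, not necessarily Sumo Logic's method)."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]

def hourly_percentiles(events, pcts=(50, 95, 99)):
    """events: iterable of (epoch_seconds, latency_ms).

    Returns {hour_start_seconds: {pct: latency}} with hours in ascending order.
    """
    buckets = defaultdict(list)
    for ts, latency in events:
        buckets[ts - ts % 3600].append(float(latency))  # similar to `timeslice 1h`
    result = {}
    for hour in sorted(buckets):  # similar to `order by _timeslice asc`
        values = sorted(buckets[hour])
        result[hour] = {p: nearest_rank(values, p) for p in pcts}
    return result

# Hypothetical events: latencies of 1..100 ms in the first hour,
# and a single slow call in the second hour.
EVENTS = [(i, i) for i in range(1, 101)] + [(3600 + 5, 250)]

if __name__ == "__main__":
    for hour, stats in hourly_percentiles(EVENTS).items():
        print(hour, stats)
```

Each hour becomes one point per percentile series, which is why a line chart with three lines is a natural fit.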

Advanced Queries

_source=BridgeServer2-Prod MetricsFilter reauth "\"status\":200" "\"user_agent\":\"Blood Pressure/88" | parse "\"remote_address\":\"*\"" as ipAddress | where [subquery: _source=BridgeServer2-Prod MetricsFilter reauth "\"status\":404" "\"user_agent\":\"Blood Pressure/88" | parse "\"remote_address\":\"*\"" as ipAddress | count by ipAddress | compose ipAddress] | count by ipAddress | order by _count desc

Counts successful reauth calls by IP address, restricted to IP addresses that also had failing (404) reauth calls in the same time frame.

 

_source=BridgeServer2-Prod MetricsFilter | parse "\"request_id\":\"*\"" as requestId | where [subquery:_source=BridgeServer2-Prod error "broken pipe" | parse regex "(?<D>ERROR)" | parse "BridgeExceptionHandler - request: * " as requestId | compose requestId]

Get all MetricsFilter entries for all requests affected by broken pipe errors.

Both of these are nested queries (subqueries). Negation also works, using where ![subquery: ...].
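Conceptually, the subquery runs first, compose hands its values to the outer query, and the outer where keeps only the matching rows. The Python sketch below models that as a set-membership filter over made-up rows; it is a simplification for illustration, and the function names and sample data are hypothetical.

```python
def compose(child_rows, field):
    """Collect the distinct values of `field` from the child query's results."""
    return {row[field] for row in child_rows if field in row}

def where_subquery(parent_rows, child_rows, field, negate=False):
    """Keep parent rows whose `field` is among the child values.

    With negate=True, keep the rows whose `field` is NOT among them,
    which models the ![subquery: ...] form.
    """
    values = compose(child_rows, field)
    return [row for row in parent_rows if (row.get(field) in values) != negate]

# Hypothetical parsed results: MetricsFilter entries and broken pipe errors.
METRICS = [
    {"requestId": "r1", "status": 200},
    {"requestId": "r2", "status": 500},
    {"requestId": "r3", "status": 200},
]
BROKEN_PIPE = [{"requestId": "r2"}]

if __name__ == "__main__":
    print(where_subquery(METRICS, BROKEN_PIPE, "requestId"))
    print(where_subquery(METRICS, BROKEN_PIPE, "requestId", negate=True))
```

In this sample only the r2 entry matches the broken pipe subquery, and the negated form returns r1 and r3 instead.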