Friday 24 June 2016

Working with Logstash



Logstash is a centralized tool to collect and aggregate logs. It is intuitive, and its configuration is so easy to understand that you will just love it.

This post describes how to work with Logstash and its configuration.

In a nutshell, Logstash is composed of three main components:

  1. Input
  2. Filter
  3. Output
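These three map directly onto the top-level sections of a Logstash configuration file. An empty skeleton looks like this (the filter block is optional):

## shape of a logstash.conf: three top-level sections
input  { }   ## where log events come from
filter { }   ## optional: transform events in flight
output { }   ## where events are sent or stored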


- Input : the medium/source through which Logstash receives your log events.

A valid input source could be stdin, tcp, udp, zeromq, etc. In fact, Logstash has a wide range of input plugins you can choose from (the Logstash documentation lists them all).

The input block essentially looks like this.

input {
   stdin {
      codec => 'plain'
    }
}
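If you want to try a block like this without writing a config file, the -e flag lets you pass the configuration inline on the command line:

logstash -e "input { stdin { codec => 'plain' } } output { stdout { } }"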



- Output : the destination to which Logstash sends or stores its events.

Just like input, Logstash provides a wide range of output plugins as well.

The vanilla output block looks like this -

output {
   stdout {
      codec => 'rubydebug'
   }
}
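stdout is handy while experimenting; in practice you would swap in another output plugin. As a sketch, assuming an Elasticsearch instance is reachable on localhost:9200, shipping events there looks like this:

output {
   elasticsearch {
      hosts => ['localhost:9200']
   }
}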

If you aren't planning to perform any filtering on the data or log messages you receive, these two blocks (input and output) are, most of the time, sufficient to get started with Logstash.

Note: We are making a minor adjustment to our working example. Instead of stdin, we will use tcp as the input plugin.

A final look at our configuration -

## logstash.conf
input {
   tcp {
      port => '5300'
   }
}

output {
   stdout {
      codec => 'rubydebug'
   }
}

Testing Configuration -

logstash -f logstash.conf --configtest
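If the file parses cleanly, Logstash acknowledges it with a line like:

Configuration OK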


Running Logstash -

logstash -f logstash.conf


The field reference below should give you a sense of what a Logstash output event looks like.


Note: I used telnet to send logs to Logstash.
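For example, with the tcp input above listening on port 5300, a telnet session might look like this (the typed line becomes the event's message):

telnet localhost 5300
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Received Message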

@timestamp: an ISO 8601 timestamp.
message: the event's message.
@version: the version of the event format. The current version is 1.
host: the host from which the event was sent.
port: the port of the client.
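Putting those fields together, a received event rendered by the rubydebug codec looks roughly like this (the host name and client port here are illustrative):

{
       "message" => "Received Message",
      "@version" => "1",
    "@timestamp" => "2016-06-20T14:00:23.320Z",
          "host" => "werain",
          "port" => 52876
}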

- Filter : filter plugins are used to massage (filter) the logs, if needed, so that you can modify a received log message before outputting it via the output plugin.

A simple filter block looks like this (we will explore it in our next example):

filter {
   grok {
     ## grok filter plugin 
   }
}

To demonstrate the power of Logstash, let us work through a demo example.

Here we have an application which generates logs of various types:

  •  Custom debugging logs.
  •  SQL logs, etc.

Example -

[20-JUN-2016 14:00:23 UTC] Received Message

[20-JUN-2016 14:00:24 UTC] Before query the IP Address
(1.0ms)  SELECT "ip_addresses"."address" FROM "ip_addresses" WHERE "ip_addresses"."resporg_accnt_id" = 3
[20-JUN-2016 14:00:24 UTC] After query the IP Address
[20-JUN-2016 14:00:24 UTC] The Ip address found is X.X.X.X

[20-JUN-2016 14:00:27 UTC] Quering ResporgID
ResporgAccountId Load (2.0ms)  SELECT resporg_account_ids.*, tfxc_fees.fee as fee FROM "resporg_account_ids" AS resporg_account_ids LEFT JOIN ip_addresses ON resporg_account_ids.id = ip_addresses.resporg_accnt_id LEFT JOIN tfxc_fees ON resporg_account_ids.id = tfxc_fees.resporg_account_id_id WHERE "resporg_account_ids"."active" = 't' AND (((ip_addresses.address = 'x.x.x.x' AND ip_addresses.reserve = 't') AND ('x.x.x.x' = ANY (origin_sip_trunk_ip))) OR (resporg_account_ids.resporg_account_id = 'XXXX') OR (resporg_account_ids.resporg_account_id = 'XXXX'))
[20-JUN-2016 14:00:27] Resporg ID is TIN

[20-JUN-2016 14:00:29 UTC] Querying Freeswitchinstance 
FreeswitchInstance Load (1.0ms)  SELECT  "freeswitch_instances".* FROM "freeswitch_instances" WHERE "freeswitch_instances"."state" = 'active'  ORDER BY "freeswitch_instances"."calls_count" ASC, "freeswitch_instances"."average_system_load" ASC LIMIT 1
[20-JUN-2016 14:00:29 UTC] FreeswitchInstance is IronMan.

[20-JUN-2016 14:00:29 UTC] Get the individual rate
IndividualCeilingRate Load (0.0ms)  SELECT  "individual_ceiling_rates".* FROM "individual_ceiling_rates" WHERE "individual_ceiling_rates"."resporg_account_id_id" = 7 AND "individual_ceiling_rates"."originating_resporg_id" = 3 LIMIT 1
[20-JUN-2016 14:00:29 UTC] The individual rate is 20

[20-JUN-2016 14:00:30 UTC] Query the individual rate
Rate Load (1.0ms)  SELECT  "rates".* FROM "rates" WHERE "rates"."resporg_account_id_id" = 3 LIMIT 1
[20-JUN-2016 14:00:30 UTC] The Selected rate is 40 


Now, we need our system to output (or store) the logs based on their type (SQL or custom).

This is where the power of the filter plugin shines.


GROK filter plugin

A closer look at the grok filter plugin shows that one can add a regex against which incoming log events are matched (for filtering).

Note: Grok ships with a wide range of predefined regex patterns (120+) that you can choose from. But its power is not limited to the predefined patterns; one can provide a custom regex pattern as well (as in our case).
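For instance, using only predefined patterns, you could pull the address out of the "Ip address found" line in the example logs above (client_ip is just an illustrative field name):

filter {
   grok {
      ## %{IP} is one of grok's predefined patterns
      match => { "message" => "The Ip address found is %{IP:client_ip}" }
   }
}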

In our case, we can apply a regex to either the SQL or the custom logs (we are choosing the SQL messages) and then segregate them.

Note: If you need help building patterns to match your logs, you will find the grokdebug and grokconstructor applications quite useful.

The regex -
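As a sketch, a custom pattern file defining ARSQL (the name the configuration below relies on) could key off the query-timing prefix of the SQL lines above; treat this exact regex as an assumption:

## ./pattern/arsql -- custom grok pattern (an assumed reconstruction)
ARSQL \(\d+\.\d+ms\)\s+SELECT\s+.*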
Let's define our configuration now.

## input the log events via tcp, tagging each one with type "custom"
input {
   tcp {
      port => '5300'
      type => 'custom'
   }
}

filter {
  ## apply this filter only to log events of type "custom"
  if ([type] == "custom") {
    grok {
       ## load your custom regex patterns from this directory
       patterns_dir => "./pattern"
       ## compare the message against the regex you supplied
       match => { "message" => "%{ARSQL:sql}" }
       ## if the message matches the regex, add a field called "grok" with the value "match"
       add_field => {"grok" => "match"}
    }

    ## if the grok field is present, the regex above matched
    if ([grok] == 'match') {
       ## apply the mutate filter plugin to replace the type "custom" with "sql"
       mutate {
         replace => {"type" => "sql"}
         ## remove the grok field that was added by the earlier filter
         remove_field => ["grok"]
       }
    }
  }
}

## output plugin. For now we are using rubydebug, but we could just as easily use any other output plugin.
output {
   stdout {
      codec => 'rubydebug'
   }
}


Let's examine the output -


{
       "message" => "Received Message",
      "@version" => "1",
    "@timestamp" => "2016-06-20T14:00:23.320Z",
          "host" => "werain",
          "type" => "custom" ## custom tag
}

{
       "message" => "(1.0ms)  SELECT \"ip_addresses\".\"address\" FROM \"ip_addresses\" WHERE \"ip_addresses\".\"resporg_accnt_id\" = 3",
      "@version" => "1",
    "@timestamp" => "2016-06-20T14:00:24.520Z",
          "host" => "werain",
          "type" => "sql" ## we have successfully managed to change the type to "sql" (from "custom")
                          ## based on the grok regex filtering
}


Notice the type sql having been mutated (replaced) in place of the custom type.

Note: If that is not enough, you can even have Logstash hand events off to an external program for filtering. If you want, simply try out the demo example and Logstash configuration above.


That's all, folks. I hope I managed to do justice to the amazing tool called Logstash, which has made my log-management tasks so much easier.

Thanks.

