Logstash is a centralized tool for collecting and aggregating logs. It is so intuitive, and its configuration is so easy to understand, that you will just love it.
This post describes how to work with Logstash and its configuration.
In a nutshell, Logstash is composed of three main components:
- Input
- Filter
- Output
- Input : The medium/source through which Logstash receives your log events.
A valid input source could be stdin, tcp, udp, zeromq, etc. In fact, Logstash has a wide range of input plugins to choose from (for the full list of input plugins, click here).
The input block essentially looks like this:
input {
  stdin {
    codec => 'plain'
  }
}
- Output : The destination or medium to which Logstash sends or stores its events.
Just like inputs, Logstash provides a wide range of output plugins as well.
The vanilla output block looks like this:
output {
  stdout {
    codec => 'rubydebug'
  }
}
If you are not planning to filter the data or log messages you receive, the above blocks (input and output) are usually sufficient to get started with Logstash.
Note: We are making a minor adjustment to our working example. Instead of stdin, we will use tcp as the input plugin.
A final look at our configuration:
## logstash.conf
input {
  tcp {
    port => '5300'
  }
}
output {
  stdout {
    codec => 'rubydebug'
  }
}
Testing the configuration (on Logstash 5.0 and later the flag is --config.test_and_exit instead of --configtest) -
logstash -f logstash.conf --configtest
Starting Logstash -
logstash -f logstash.conf
The screenshots below may help you understand what the Logstash output looks like.
Note: I used Telnet to send logs to Logstash.
@timestamp: An ISO 8601 timestamp.
message: The event's message.
@version: The version of the event format. The current version is 1.
host: The host from which the event was sent.
port: The port of the client.
- Filter : Filter plugins are used to massage (filter) the logs, if needed, so that a received log message can be modified before it is sent out via an output plugin.
A simple filter block looks like this (we will explore it in the next example):
filter {
  grok {
    ## grok filter plugin
  }
}
To show the power of Logstash, let us work through a demo example.
Here we have an application that generates logs of various types:
- Custom debugging logs.
- SQL logs etc.
[20-JUN-2016 14:00:23 UTC] Received Message
[20-JUN-2016 14:00:24 UTC] Before query the IP Address
(1.0ms) SELECT "ip_addresses"."address" FROM "ip_addresses" WHERE "ip_addresses"."resporg_accnt_id" = 3
[20-JUN-2016 14:00:24 UTC] After query the IP Address
[20-JUN-2016 14:00:24 UTC] The Ip address found is X.X.X.X
[20-JUN-2016 14:00:27 UTC] Quering ResporgID
ResporgAccountId Load (2.0ms) SELECT resporg_account_ids.*, tfxc_fees.fee as fee FROM "resporg_account_ids" AS resporg_account_ids LEFT JOIN ip_addresses ON resporg_account_ids.id = ip_addresses.resporg_accnt_id LEFT JOIN tfxc_fees ON resporg_account_ids.id = tfxc_fees.resporg_account_id_id WHERE "resporg_account_ids"."active" = 't' AND (((ip_addresses.address = 'x.x.x.x' AND ip_addresses.reserve = 't') AND ('x.x.x.x' = ANY (origin_sip_trunk_ip))) OR (resporg_account_ids.resporg_account_id = 'XXXX') OR (resporg_account_ids.resporg_account_id = 'XXXX'))
[20-JUN-2016 14:00:27] Resporg ID is TIN
[20-JUN-2016 14:00:29 UTC] Querying Freeswitchinstance
FreeswitchInstance Load (1.0ms) SELECT "freeswitch_instances".* FROM "freeswitch_instances" WHERE "freeswitch_instances"."state" = 'active' ORDER BY "freeswitch_instances"."calls_count" ASC, "freeswitch_instances"."average_system_load" ASC LIMIT 1
[20-JUN-2016 14:00:29 UTC] FreeswitchInstance is IronMan.
[20-JUN-2016 14:00:29 UTC] Get the individual rate
IndividualCeilingRate Load (0.0ms) SELECT "individual_ceiling_rates".* FROM "individual_ceiling_rates" WHERE "individual_ceiling_rates"."resporg_account_id_id" = 7 AND "individual_ceiling_rates"."originating_resporg_id" = 3 LIMIT 1
[20-JUN-2016 14:00:29 UTC] The individual rate is 20
[20-JUN-2016 14:00:30 UTC] Query the individual rate
Rate Load (1.0ms) SELECT "rates".* FROM "rates" WHERE "rates"."resporg_account_id_id" = 3 LIMIT 1
[20-JUN-2016 14:00:30 UTC] The Selected rate is 40
Now, we need our system to output (or store) the logs based on their type (SQL or custom).
This is where the power of the filter plugin shines.
GROK filter plugin
A closer look at the grok filter plugin shows that one can apply a regex to incoming log events (for filtering).
Note: Grok ships with a wide range of regex patterns (120+) to choose from. But its power is not limited to the predefined patterns; one can provide a custom regex pattern as well (as in our case).
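As a rough illustration of the predefined patterns, the sketch below splits a timestamped line from the demo log into two fields. DATA and GREEDYDATA are patterns that ship with grok; the field names logged_at and text are hypothetical, chosen only for this example.

```
filter {
  grok {
    ## DATA and GREEDYDATA are predefined grok patterns;
    ## logged_at and text are hypothetical field names
    match => { "message" => "\[%{DATA:logged_at}\] %{GREEDYDATA:text}" }
  }
}
```

For a line such as "[20-JUN-2016 14:00:23 UTC] Received Message", this would put the bracketed timestamp into logged_at and the rest of the line into text.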
In our case, we can apply a regex to either the SQL or the custom logs (we choose the SQL messages) and then segregate them.
Note: If you need help building patterns to match your logs, you will find the grokdebug and grokconstructor applications quite useful.
The Regex -
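The original regex is not reproduced here, so the following is only an assumed stand-in showing what a custom pattern file could look like. The file name is hypothetical, and the regex is a guess based on the SQL lines in the demo log: a timing such as "(1.0ms)" followed by a SQL statement. A pattern file holds one pattern per line, with the pattern name, a space, and then the regex.

```
# ./pattern/arsql  (hypothetical file name inside the patterns_dir)
# Assumed stand-in for the ARSQL pattern, not the original regex:
# a timing like "(1.0ms)" followed by a SQL statement.
ARSQL \(%{NUMBER}ms\)\s+(?:SELECT|INSERT|UPDATE|DELETE)\b.*
```

Because grok matches anywhere in the message, a pattern like this would match lines that start with the timing as well as lines such as "Rate Load (1.0ms) SELECT ...".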
Let's define our configuration now.
## Input the log events via TCP.
input {
  tcp {
    port => '5300'
    ## assumption: tag every incoming event as type custom so that
    ## the conditional in the filter below applies to it
    type => 'custom'
  }
}
filter {
  ## apply this filter only to log events of type custom
  if ([type] == "custom") {
    grok {
      ## load your custom regex patterns
      patterns_dir => "./pattern"
      ## compare the message against the custom regex
      match => { "message" => "%{ARSQL:sql}" }
      ## if the message matches the regex, add a field called "grok" with the value "match"
      add_field => {"grok" => "match"}
    }
    ## if the grok field equals match, the regex above matched
    if ([grok] == 'match') {
      ## apply the mutate filter plugin to replace the type from custom to sql
      mutate {
        replace => {"type" => "sql"}
        ## remove the grok field that was added by the earlier filter
        remove_field => ["grok"]
      }
    }
  }
}
## Output plugin. For now we use rubydebug, but any other output plugin could be used just as easily.
output {
  stdout {
    codec => 'rubydebug'
  }
}
Let us examine the output:
{
  "message" => "Received Message",
  "@version" => "1",
  "@timestamp" => "2016-06-20T14:00:23.320Z",
  "host" => "werain",
  "type" => "custom" ## custom tag
}
{
  "message" => "(1.0ms) SELECT "ip_addresses"."address" FROM "ip_addresses" WHERE "ip_addresses"."resporg_accnt_id" = 3 ",
  "@version" => "1",
  "@timestamp" => "2016-06-20T14:00:24.520Z",
  "host" => "werain",
  "type" => "sql" ## we have successfully changed the type to sql (from custom)
                  ## based on the grok regex filtering
}
Notice that the type of the second event has been mutated (replaced) from custom to sql.
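With the type set, the output block can route events by it, which was the goal stated earlier. Here is a minimal sketch, assuming the file output plugin is used as the destination; both paths are hypothetical and any other output plugin could take their place.

```
output {
  if [type] == "sql" {
    ## SQL events go to their own file (hypothetical path)
    file {
      path => "/var/log/logstash/sql.log"
    }
  } else {
    ## everything else, i.e. the custom debugging logs (hypothetical path)
    file {
      path => "/var/log/logstash/custom.log"
    }
  }
}
```

The conditional uses the same [type] field reference as the filter block, so the routing follows directly from the mutate step.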
Note: If that is not enough, you can ask Logstash to filter events through an external program. If you want, you can simply try my demo example and the Logstash configuration defined over here and here.
That's all, folks. I hope I managed to do justice to the amazing library called Logstash, which has made my log-management tasks so much easier.
Thanks.