Adventures In Hindsight 2
This is a follow-up to my previous post about Hindsight experimentation and discovery.
Outputting to ES
This one is fairly easy, since it uses the Lua sandbox extensions for Elasticsearch. I ended up with the following run/output/es.cfg config:
filename = "elasticsearch_bulk_api.lua"
message_matcher = "TRUE"
memory_limit = 200 * 1024 * 1024
-- ticker_interval = 10
address = "es"
port = 9200
timeout = 10
flush_count = 1
flush_on_shutdown = false
preserve_data = not flush_on_shutdown --in most cases this should be the inverse of flush_on_shutdown
discard_on_error = true
max_retry = 1
-- See the elasticsearch module directory for the various encoders and configuration documentation.
-- https://mozilla-services.github.io/lua_sandbox_extensions/elasticsearch/io_modules/encoders/elasticsearch/payload.html
encoder_module = "encoders.elasticsearch.payload"
encoders_elasticsearch_common = {
es_index_from_timestamp = true,
index = "logs-v1-%{%Y.%m.%d}",
type_name = "%{Logger}",
}
Several things to note here: my configuration values are for testing; flushing the bulk at a count of 1 makes no sense outside that context. I also discard messages on error, which I would not do in production. An interesting thing, and maybe a best practice, is that the index name embeds a "version". Note also that it is built by interpolating variables.
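For contrast, a production-leaning variant of those same keys might look like the sketch below. The specific flush_count and max_retry values are my assumptions, not tested recommendations; tune them to your ingestion volume.

```lua
-- sketch of production-leaning values (assumptions, not tested settings)
flush_count = 1000          -- batch bulk requests instead of flushing each message
flush_on_shutdown = true
preserve_data = not flush_on_shutdown
discard_on_error = false    -- keep messages on error instead of dropping them
max_retry = 5               -- allow several retries before giving up
```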
Getting input from the network
To get input from syslog, RFC 3164 formatted, I rely on the udp module, as follows in run/input/syslog1.cfg (the naming matters here, as we'll see later):
filename = "udp.lua"
instruction_limit = 0
-- listen on all interfaces
address = "0.0.0.0"
-- unprivileged port
port = 1514
-- decode the flow !
decoder_module = "decoders.syslog"
-- display errors when decoder fails
send_decode_failures = true
Doing so, it listens on port 1514, but the interesting part is the use of a decoder: it splits the message, separating the standard fields from the payload.
logger -n 127.0.0.1 --rfc3164 -P 1514 test
This will give output looking like this:
:Uuid: EAB2822B-985-4CAA-AA96-A83D0834A58
:Timestamp: 2017-07-31T07:45:58.000000000Z
:Type: <nil>
:Logger: input.syslog1 <<<<< Note that the input file name reflects into logger name
:Severity: 5
:Payload: test
:EnvVersion: <nil>
:Pid: <nil>
:Hostname: fe5a39526650
:Fields:
| name: syslogfacility type: 3 representation: <nil> value: 1
| name: programname type: 0 representation: <nil> value: root
| name: sender_ip type: 0 representation: <nil> value: 127.0.0.1
| name: sender_port type: 2 representation: <nil> value: 44439
Here I still need to explore (and probably write my own) subdecoders, which may be the subject of another post. Regarding the Logger field, you can think of using it in your index name to split sources at indexing time, so that each source gets its own policies in your ES storage; for example, with a logger named awesome_app you could create an index named logs-v1-awesome_app-2017.07.31.
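Following that idea, the index line from the output configuration above could interpolate the Logger field alongside the date. This is a sketch; I have not verified this exact combination:

```lua
-- sketch: embed the Logger field in the index name (assumption, untested)
encoders_elasticsearch_common = {
    es_index_from_timestamp = true,
    -- e.g. a logger named awesome_app on 2017-07-31 would yield
    -- an index named logs-v1-awesome_app-2017.07.31
    index = "logs-v1-%{Logger}-%{%Y.%m.%d}",
    type_name = "%{Logger}",
}
```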
Pruning input
This one cleans up messages that have been processed by the outputs (using their checkpoints to do so), which is something you clearly want. It relies on the standard prune_input module:
filename = "prune_input.lua"
message_matcher = "TRUE"
input = true