Adventures In Hindsight
I was recently introduced to hindsight by beorn, and since I found the learning curve a bit steep I thought this blog migration would be a nice occasion to share some “field notes”. This is by no means a “hello world” for hindsight, a best-practices guide, or anything that should be considered a tutorial. The goal is to share some feedback and help newcomers like me avoid getting stuck on some basic steps. I really do think that the hindsight documentation could be improved for beginners, but that’s not the point here.
A bit of context
This discovery of hindsight has a goal: replacing my logstash setup with something lighter and more reliable. At this point I’m not even sure it will succeed, but I will at least have learned new things :) Another thing is that my knowledge of Lua does not extend much beyond writing a configuration for my hammerspoon, so I may write stupid code, since I am way outside my comfort zone.
Starting slowly
When I discover a new technology, I tend to read the docs first, looking for some common sections like “Installing software”, “Quickstart”, “Hello world in XXX” etc. Unfortunately, none of this exists in the hindsight documentation. The “Install” part is somewhat covered in the github Readme, but there is no fast track. Even if building is not hard, it requires some pretty recent dependencies, and fortunately Debian has packages for them. For my first setup I was handed freshly built packages, a bit of cheating here :D The only thing to mention is that hindsight alone, without the Lua sandbox extensions, is pretty much an empty shell. You will need to install the extensions that match your needs.
Configuration breakdown
Following the schema in the architecture overview, you will see three main components: input, analysis and output. I have only explored input and output at the time of this writing.
Here is (partly) what my setup looks like:
├── hindsight.cfg
├── load
│ ├── analysis
│ ├── input
│ └── run
├── run
│ ├── analysis
│ ├── input
│ │ ├── input_test.cfg
│ │ ├── input_test.lua
│ │ ├── prune_input.cfg
│ │ ├── syslog.cfg
│ │ ├── syslog2.cfg
│ └── output
│ ├── counter.cfg
│ ├── counter.lua
│ ├── es.cfg
│ ├── es.rtc
│ ├── heka_debug.cfg
Some of these files come from the benchmark directory of hindsight. It contains interesting resources, like input_test, which reads messages from a file, and counter, which reports how many messages were processed.
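To relate the load and run directories in the tree above to the top-level configuration, here is a rough sketch of what a hindsight.cfg can look like. This is not my actual file: the key names are the ones I remember from the hindsight configuration docs, and every path is an assumption for a Debian-style package install, so check them against your own setup.

```lua
-- hindsight.cfg: a minimal sketch, not a reference configuration.
-- All paths below are assumptions; adjust them to your install.

output_path       = "output" -- where hindsight keeps its output queues and checkpoints
sandbox_load_path = "load"   -- plugins dropped here get (re)loaded at runtime
sandbox_run_path  = "run"    -- running plugins live in run/input, run/analysis, run/output

-- Where the sandboxes look for Lua modules (assumed package locations).
analysis_lua_path  = "/usr/lib/luasandbox/modules/?.lua"
analysis_lua_cpath = "/usr/lib/luasandbox/modules/?.so"

-- Input and output plugins additionally get the io modules.
io_lua_path  = analysis_lua_path  .. ";/usr/lib/luasandbox/io_modules/?.lua"
io_lua_cpath = analysis_lua_cpath .. ";/usr/lib/luasandbox/io_modules/?.so"
```

Since the configuration is plain Lua, string concatenation like the io_lua_path line works as you would expect.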
Step one: getting some input
My first thought was to read from syslog. For some reason I could not get it working in my awkward setup. Since I work on a mac I built a docker environment to tinker with hindsight, and it’s probably there that the magic did not happen, so I fell back on reading from a file, which led me to input_test.lua. Here is how things go: if you want to use some custom code (here for input) you have to make it live alongside its .cfg file in the input directory. This directive is documented in the configuration section that applies to hindsight.cfg. So my input_test.cfg consists of:
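filename = "input_test.lua" -- the plugin code, living next to this .cfg
instruction_limit = 0 -- 0 means no instruction limit for this plugin
input_file = "sample_logs/meh.log" -- the file the plugin reads from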
The meh.log file is formatted as specified by the input_test.lua grammar.
That was the first step. Next I wanted to send this input to stdout, which seemed like the simplest form of output to me.
Step two: outputting what we got in
To achieve this I used a Lua sandbox extension that ships in a package, so I only needed a heka_debug.cfg here. Its content is:
filename = "heka_debug.lua" -- the file to use
message_matcher = "TRUE" -- display ALL messages
Starting hindsight with the command hindsight hindsight.cfg 7 (yes, the log level has to be numeric, which is not mentioned in the docs) gives me the following output:
:Uuid: F06E5BE4-E640-48CD-AFD9-3F56D12F9EF7
:Timestamp: 1970-01-01T00:00:03.416000000Z
:Type: logfile
:Logger: FxaAuthWebserver
:Severity: 7
:Payload: <nil>
:EnvVersion: <nil>
:Pid: <nil>
:Hostname: trink-x230
:Fields:
| name: body_bytes_sent type: 3 representation: B value: 6428434
| name: remote_addr type: 0 representation: ipv4 value: 1.2.3.4
| name: request type: 0 representation: <nil> value: GET /foobar HTTP/1.1
| name: status type: 3 representation: <nil> value: 200
| name: http_user_agent type: 0 representation: <nil> value: curl-and-your-UA
| name: http_x_forwarded_for type: 0 representation: <nil> value: true
| name: http_referer type: 0 representation: <nil> value: -
So weeee \o/
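Matching everything with "TRUE" gets noisy quickly. As a hedged sketch, assuming the message matcher syntax works the way I understand it from the docs, the same debug output could be narrowed down to the kind of message shown above. The file name heka_debug_200.cfg is hypothetical, and the field names are simply the ones visible in the dump.

```lua
-- heka_debug_200.cfg (hypothetical name): same debug plugin, narrower matcher.
filename = "heka_debug.lua"

-- Only display messages of type "logfile" whose status field equals 200.
-- Header fields are referenced by name, dynamic fields through Fields[...].
message_matcher = "Type == 'logfile' && Fields[status] == 200"
```

I have not gone further than this with matchers yet, so take it as a starting point rather than a recipe.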
That’s all for this one. Next time I will try to cover the following points:
- the gotchas I faced, before I forget them :)
- getting input from network
- decoders
- outputting to ElasticSearch