I'm a big fan of metrics. My metrics toolchain of choice is Sensu -> Graphite -> Grafana. Sensu runs on each node in my infrastructure and reports metrics to Graphite. Grafana makes the Graphite graphs pretty and adds quite a bit of functionality to them. The only problem with metrics is that they're usually completely separated from the alerts and events in the infrastructure that could explain a change in behavior.

The excellent folks at Etsy have a great write-up on showing deploys in Graphite graphs. When you're deploying more than one service, though, it's important to know which service's deployment caused the uptick or downtick in your metrics.

While digging around for a good way to add events to my graphs, I came across the events feature in Graphite (available as of version 0.9.9, according to some sources). The documentation on events seems fairly sparse, but the feature works well with the Annotations feature in Grafana. The advantage of using events instead of a "1" metric the way Etsy does is that I can add metadata to the event, and it will show up when I mouse over the annotation in Grafana.

A quick and dirty way to get events into Graphite is a simple curl -X POST:

curl -X POST "http://graphite.internal.server/events/" -d '{"what": "bad code push", "tags": "production deploy", "data":"Jeff plays too much Ingress"}'

I can replace "what" with the service that was deployed, "tags" with the appropriate tags, and "data" with pertinent details such as the version of the service that was deployed.

curl -X POST "http://graphite.internal.server/events/" -d '{"what": "Web Service", "tags": "production deploy web", "data":"version 1.1.7"}'
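
In a deploy script, the same request could be wrapped in a small shell function so every service posts its event the same way. This is only a minimal sketch: the function names (json_escape, graphite_event_json, post_graphite_event) and the GRAPHITE_URL variable are made up for illustration, and the escaping covers only backslashes and double quotes, so it is not a general JSON encoder.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: the names and GRAPHITE_URL are illustrative assumptions.
GRAPHITE_URL="${GRAPHITE_URL:-http://graphite.internal.server}"

# Escape backslashes and double quotes so a value can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the event body from three arguments: what, tags (space delimited), data.
graphite_event_json() {
  printf '{"what": "%s", "tags": "%s", "data": "%s"}' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")"
}

# POST the event; --fail makes curl exit non-zero on an HTTP error status,
# so a deploy script can notice when the event was not recorded.
post_graphite_event() {
  curl --silent --show-error --fail -X POST "${GRAPHITE_URL}/events/" \
    -d "$(graphite_event_json "$1" "$2" "$3")"
}
```

A deploy script could then end with something like: post_graphite_event "Web Service" "production deploy web" "version 1.1.7"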

If you follow the directions on the Annotations page and choose Graphite Events as your Type, you can then type in the tags to look for. By default the tag list is space delimited and treated as an OR. This means "production deploy" will find events tagged "production" or "deploy".
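
To make the OR behavior concrete, here is a small shell sketch that mimics the matching described above. It is an illustration only, not how Graphite or Grafana implement it, and the function name tags_match_any is made up for this example.

```shell
#!/usr/bin/env bash
# Illustration only: mimics the OR matching described above.
# Succeeds if any space-delimited tag in $1 (the annotation query)
# also appears in $2 (the tags stored on an event).
tags_match_any() {
  local wanted have
  for wanted in $1; do        # unquoted on purpose: split on spaces
    for have in $2; do
      [ "$wanted" = "$have" ] && return 0
    done
  done
  return 1
}
```

With a query of "production deploy", an event tagged "production web" matches, an event tagged "staging deploy" also matches, and an event tagged "staging web" does not.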

The annotation will contain (from top to bottom) the "what" (deployed service), tags, deploy time, and "data" (deployed version). The tags allow for filtering based on what matters to the graph. If we're looking at overall Website Response Time, we'll probably want every related service that could affect Website Response Time. If it's a specific API, we're probably looking at fewer deploys. In this case, any event tagged with "production" or "deploy" will be displayed.

This will allow us to see the impact of various services being pushed with little effort. Events just need to be tagged correctly to show up in the appropriate graph. We can see short-term and long-term impact, and that will help us make better decisions about deployments moving forward.

There are limitations, however. Annotations are dashboard-wide, meaning that if you have multiple graphs on a single dashboard, every graph on the dashboard is annotated. This just means you need to be mindful (if you aren't already) about what is being displayed and annotated on your dashboard. You could also do things the way the Etsy write-up describes and put your annotations on your Graphite graphs instead of in Grafana, but I think you'll lose some of the extra functionality I've written about here.