Grafana for Telemetry: Things I wish I knew before I started

There are a number of free frameworks and tools that make viewing telemetry or monitoring data much easier. One such tool is Grafana. In Grafana you simply select a panel type, specify a query to retrieve the data of interest, and configure the presentation. Grafana supports many input (i.e., data source) and output (i.e., visualization panel) plugins. For simple dashboards you can get the job done in a day even if you have no experience with Grafana or the data source's query language. Pretty good, isn't it? And monitoring dashboards look great in Grafana.

I have been working with Grafana (and the InfluxDB time-series database) for a year now, mostly on (near) real-time telemetry of high-bit-rate optical communications systems. Grafana was key to getting us going and producing a PoC quickly. Going from PoC to product, however, has been a little more complicated.

Here are a few things you should know before picking Grafana for your project (these apply to version 6, the current release as I write):

  1. Grafana cannot refresh values at intervals shorter than 5 seconds, so depending on your needs Grafana may or may not be the right tool for the job;
  2. A Grafana graph panel shows the data points of the variable returned by a query. So if you need to show a label that is derived from a number of other variables, in addition to the value returned by the query, you will have to add a field to the database structure so that Grafana can show that label. You will also have to do some processing to compute the label before storing it in the database. Worse, depending on the database engine (e.g., InfluxDB), that little trick costs you more RAM and storage (because of higher series cardinality) and lower performance;
  3. If that derived label depends on multiple device variables that may come and go with the dynamics of your application, and the label is used as a search key in a query, you take on the additional and rather complex task of keeping all the variables and the label in sync across devices, memory, database, and Grafana. It seems easy, but it is not when you have many of them to deal with;
  4. The Grafana panel typically used to show status (the singlestat-based tile) supports only 3 colors. So if you need to represent more than 3 states, each with its own color to ease visualization, Grafana will not do it unless you can collapse the states into 3 colors. There is an open ticket to remove this limitation, but no release date so far;
  5. Grafana provides alerts and notifications by comparing the values returned by a query against a predefined threshold. However, Grafana supports only constant thresholds, not variable-based ones. So if you change a threshold value in a database or device, Grafana will not pick up the change. If I'm not mistaken, there is an open ticket to address this, and there are certainly many requests for it;
  6. Grafana provides the means to monitor highly dynamic systems whose components are added or removed on the fly. Nevertheless, there is a learning curve. There are a number of ways to make a curve appear in or disappear from a graph panel, and a number of ways to make a graph panel appear in or disappear from a dashboard. To make it work you will have to design the database structure and queries accordingly and most likely implement some pre-storage processing logic.
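To make points 2 and 3 more concrete, here is a minimal Python sketch of the kind of pre-storage processing involved. It is an illustration under stated assumptions, not the code from my project: the measurement name `optical`, the tag names `device` and `status`, the variables `los` and `pre_fec_ber`, and the rule inside `derive_label` are all hypothetical. It also assumes the derived label is stored as an InfluxDB tag, which is what makes it usable as a search key and what raises series cardinality. The helper does no escaping of spaces or commas, so it only handles simple values.

```python
def derive_label(readings):
    """Collapse several device variables into one status label (hypothetical rule)."""
    if readings["los"]:
        return "down"
    if readings["pre_fec_ber"] > 1e-3:
        return "degraded"
    return "ok"


def series_key(measurement, tags):
    """Measurement plus sorted tag set: the part of a point that identifies its series."""
    return measurement + "," + ",".join(f"{k}={v}" for k, v in sorted(tags.items()))


def to_line(measurement, tags, fields, ts_ns):
    """Format one point in InfluxDB line protocol (no escaping, simple values only)."""
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{series_key(measurement, tags)} {field_str} {ts_ns}"


# Three hypothetical samples from the same device, each in a different condition.
samples = [
    {"los": False, "pre_fec_ber": 1e-5},
    {"los": False, "pre_fec_ber": 5e-3},
    {"los": True, "pre_fec_ber": 0.5},
]

plain_series, labelled_series, lines = set(), set(), []
for i, sample in enumerate(samples):
    base_tags = {"device": "trx1"}
    label_tags = {**base_tags, "status": derive_label(sample)}
    plain_series.add(series_key("optical", base_tags))
    labelled_series.add(series_key("optical", label_tags))
    lines.append(
        to_line("optical", label_tags, {"pre_fec_ber": sample["pre_fec_ber"]}, (i + 1) * 1_000_000_000)
    )

for line in lines:
    print(line)
print("series without label tag:", len(plain_series))
print("series with label tag:", len(labelled_series))
```

In this toy run a single device produces one series without the label tag and three with it, because every distinct tag value opens a new series. Multiply that by the number of devices and label values and the extra RAM and storage add up quickly.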

These are the main Grafana drawbacks I have run into in a year of monitoring highly dynamic optical communications systems.
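As an illustration of the "collapse the states into 3 colors" workaround from point 4, the sketch below assigns each state a numeric code so that all states of the same severity fall in the same numeric band. It assumes the status tile colors a value by comparing it against two numeric thresholds; the state names, the band width of 10, and the thresholds 10 and 20 are hypothetical choices of mine, not Grafana defaults. Each state keeps a unique code, so the value can still be mapped back to its own text even though only three colors are available.

```python
# Hypothetical states grouped by the color they should share.
BANDS = {
    "green": ["in_service", "standby"],
    "orange": ["degraded", "maintenance"],
    "red": ["loss_of_signal", "fault"],
}
BAND_WIDTH = 10  # room for up to 10 states per color band


def build_codes(bands, band_width=BAND_WIDTH):
    """Give every state a unique number inside its color band."""
    codes = {}
    for band_index, states in enumerate(bands.values()):
        if len(states) > band_width:
            raise ValueError("too many states for one band")
        for offset, state in enumerate(states):
            codes[state] = band_index * band_width + offset
    return codes


def color_for(code, thresholds=(10, 20), colors=("green", "orange", "red")):
    """Mimic two-threshold coloring: count how many thresholds the value has reached."""
    return colors[sum(code >= t for t in thresholds)]


codes = build_codes(BANDS)
for state, code in codes.items():
    print(f"{state}: code={code}, color={color_for(code)}")
```

The number written to the database is the code, so this is one more piece of pre-storage logic to maintain, and it still does not give each state its own color.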

I still think Grafana is a tremendous tool that can get you from zero to production very quickly, with beautiful results, and I strongly recommend it for that. Nevertheless, you should analyze your application's requirements and understand whether Grafana is the best choice before starting implementation, because Grafana is not the silver bullet many might think it is. In my case, even after a year with Grafana (using InfluxDB as the main data source), there are still some things I can't do with it. And to get some things done, the code logic became more complex and both RAM and storage consumption increased.

