Automation & orchestration with fdmon/blackbox cloud

Since the beginning of the project, the fdmon (Fast Deployment Monitoring) solution has included automation, orchestration and scheduling as standard features. Monitoring and orchestration are unified, so your orchestration code can easily depend on the status of monitored resources, their related objects and their current attributes (including trending metrics and forecasts).

For example, the following instruction:

install_package (intersect (query ("DEV1"), query ("disk", "/", "FREE", "+7d", ">", "1024")), "gcc") ;

will install the gcc package, including its dependencies, on all servers that are in the DEV1 group and whose forecast free space in the root file system, 7 days from now, is greater than 1 GB.

The following one:

reboot (children (intersect (query ("SITE2"), query ("ESX_TEST")))) ;

will reboot all virtual machines hosted by ESX servers that are in the ESX_TEST group and located on the SITE2 site.

The following one:

snap (query ("PROD_DB")) ;

will create a VMware snapshot of every virtual machine in the PROD_DB group.
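
To make the selection logic of these three examples concrete, here is a minimal Python model of it. It is only a sketch: the CI class, the sample inventory and the helper names query_group and query_root_free_forecast are invented for illustration, and fdmon's real query () function takes the arguments shown in the examples above, not these.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set


@dataclass(frozen=True)
class CI:
    """Hypothetical inventory entry; not fdmon's actual data model."""
    name: str
    groups: FrozenSet[str]
    parent: Optional[str] = None     # e.g. the ESX host of a virtual machine
    root_free_mb_in_7d: int = 0      # assumed: forecast of free space on "/", in MB


# Invented sample data.
INVENTORY = [
    CI("esx1", frozenset({"SITE2", "ESX_TEST"})),
    CI("esx2", frozenset({"SITE1", "ESX_TEST"})),
    CI("vm1", frozenset({"DEV1"}), parent="esx1", root_free_mb_in_7d=4096),
    CI("vm2", frozenset({"DEV1"}), parent="esx1", root_free_mb_in_7d=512),
    CI("vm3", frozenset({"PROD_DB"}), parent="esx2", root_free_mb_in_7d=2048),
]


def query_group(group: str) -> Set[str]:
    """Names of all CIs that belong to the given group."""
    return {ci.name for ci in INVENTORY if group in ci.groups}


def query_root_free_forecast(min_mb: int) -> Set[str]:
    """Names of all CIs whose 7-day forecast of free space exceeds min_mb."""
    return {ci.name for ci in INVENTORY if ci.root_free_mb_in_7d > min_mb}


def intersect(a: Set[str], b: Set[str]) -> Set[str]:
    """CIs present in both lists."""
    return a & b


def children(parents: Set[str]) -> Set[str]:
    """CIs whose parent is one of the given CIs."""
    return {ci.name for ci in INVENTORY if ci.parent in parents}


if __name__ == "__main__":
    # Targets of the install_package example.
    print(intersect(query_group("DEV1"), query_root_free_forecast(1024)))
    # Targets of the reboot example.
    print(children(intersect(query_group("SITE2"), query_group("ESX_TEST"))))
```

The point of the model is that every query returns a plain list of CIs, so selections based on groups, forecasts and parent/child relations can be combined freely before a task is applied to the result.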

In fact, the functions used in these examples are built on the task_exec () internal function:

  • install_package (LIST, package) = task_exec (LIST, "install", package)
  • reboot (LIST) = task_exec (LIST, "reboot") + monitoring maintenance mode activation
  • snap (LIST) = task_exec (LIST, "parent.snap")

Note: the "parent." prefix means that the "snap" task is executed from the parent of each CI in the list, not from the CI itself.
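
The way such wrappers and the "parent." prefix could fit together can be sketched in Python as follows. This is an illustration under assumptions, not fdmon's implementation: the PARENT table is invented, the function only records the calls instead of executing anything, and passing the child CI to the parent's task is a guess.

```python
from typing import Dict, Iterable, List, Tuple

# Invented parent relation: each virtual machine points to its ESX host.
PARENT: Dict[str, str] = {"vm1": "esx1", "vm2": "esx1", "vm3": "esx2"}

# (executor, task, target, arguments)
Call = Tuple[str, str, str, Tuple[str, ...]]


def task_exec(targets: Iterable[str], task: str, *args: str) -> List[Call]:
    """Return the calls that would be issued, one per target CI.

    A task named "parent.X" is run by the parent of each CI, with the CI
    itself as the target (assumption); otherwise the CI runs the task.
    """
    prefix = "parent."
    via_parent = task.startswith(prefix)
    name = task[len(prefix):] if via_parent else task
    calls = []
    for ci in sorted(targets):
        executor = PARENT[ci] if via_parent else ci
        calls.append((executor, name, ci, args))
    return calls


def install_package(targets: Iterable[str], package: str) -> List[Call]:
    return task_exec(targets, "install", package)


def snap(targets: Iterable[str]) -> List[Call]:
    return task_exec(targets, "parent.snap")


if __name__ == "__main__":
    print(install_package({"vm1"}, "gcc"))
    print(snap({"vm1", "vm3"}))
```

The reboot () wrapper is left out because its second half, activating the monitoring maintenance mode, has no obvious stand-in in this model.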

All functions are normalized: whatever the underlying technology, the function is the same. For example, the install_package () function can be used unchanged on any Unix or Windows system. The fdmon orchestration thus provides an abstraction layer between the monitored equipment and the orchestration code. Administration and orchestration no longer depend on the technology vendors, whatever the technology type (operating system, database, storage, backup, virtualization, network, ...).

Indeed, the tasks referenced in the task_exec () function correspond to "primitives" associated with the monitored technologies. They are defined and invoked on the fdmon Proxy Server only (not on the cloud, nor on the monitored components). For example, Linux primitives are written in bash, whereas Windows ones are written in PowerShell. Primitives accept arguments provided by workflows, or by the user when invoked manually. Customers can implement their own primitives or use their own Ansible playbooks as primitives. The automation and orchestration features described here do not need any agent.
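
As an illustration of this abstraction layer, the Python sketch below maps a normalized task name to a technology-specific primitive and builds the command line that a proxy could run. The script paths and the table layout are hypothetical, and nothing is executed; how the fdmon Proxy Server actually stores and launches primitives is not described here.

```python
from typing import Dict, List, Tuple

# Hypothetical registry: (technology, task) -> interpreter and script.
PRIMITIVES: Dict[Tuple[str, str], List[str]] = {
    ("linux", "install"): ["bash", "primitives/linux/install.sh"],
    ("windows", "install"): ["powershell", "-File", "primitives/windows/install.ps1"],
}


def build_command(technology: str, task: str, *args: str) -> List[str]:
    """Return the argv for the primitive implementing `task` on `technology`."""
    try:
        base = PRIMITIVES[(technology, task)]
    except KeyError:
        raise LookupError(
            f"no primitive for task {task!r} on {technology!r}"
        ) from None
    # Arguments supplied by the workflow (or the user) are appended as-is.
    return [*base, *args]


if __name__ == "__main__":
    # The same normalized task resolves to a different script per technology.
    print(build_command("linux", "install", "gcc"))
    print(build_command("windows", "install", "gcc"))
```

The orchestration code only ever names the task ("install"); the choice between a bash and a PowerShell primitive is made from the technology of the target.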

As seen previously, complex workflows can be implemented in a few minutes in a real programming language (C style, but with far fewer declarative and syntactic constraints). It provides a complete set of high-level functions covering all the monitoring features of fdmon, including Smart Inventory (the structured, in-memory, real-time database of all resources, objects, metrics and attributes), trending, logs, and interactions with other automation solutions on the market.

Workflows can be triggered manually by the user, automatically by a specific event, or by the fdmon scheduler. A schedule is based not only on time (period, frequency, seasonality, ...) but also on the status of resources, objects, metrics, forecasts, results of tasks or other workflows, etc. (the concept of context).

For example, we can define the following schedule (or context):

  • From Monday to Friday, except on 1st January
  • Every hour at 05, from 08:00 to 18:00
  • "backup" resource status of servers "serv01", "serv02" and "serv03" is green

In this example, if we associate a workflow with this "context", the workflow will be triggered every hour from 08:00 to 18:00, only if the last backup of the serv01, serv02 and serv03 servers was successful.
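
The context above mixes time conditions with resource status. A possible way to evaluate it is sketched in Python below. The function name and the status dictionary are invented, and "at 05" is read here as minute 5 of each hour from 08:05 to 17:05, which is an interpretation rather than a documented rule.

```python
from datetime import datetime
from typing import Dict

SERVERS = ("serv01", "serv02", "serv03")


def context_matches(now: datetime, backup_status: Dict[str, str]) -> bool:
    """True when both the time rules and the status rule of the context hold."""
    if now.weekday() > 4:                    # Monday is 0, Friday is 4
        return False
    if (now.month, now.day) == (1, 1):       # excluded date: 1st January
        return False
    if now.minute != 5 or not 8 <= now.hour < 18:
        return False
    # Status condition: the "backup" resource of every server must be green.
    return all(backup_status.get(server) == "green" for server in SERVERS)


if __name__ == "__main__":
    all_green = {server: "green" for server in SERVERS}
    print(context_matches(datetime(2024, 1, 3, 9, 5), all_green))
    print(context_matches(datetime(2024, 1, 1, 9, 5), all_green))
```

A scheduler built this way would evaluate the context at each tick and start the associated workflow only when the function returns True.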

Within a given workflow, all interactions (tasks or primitives) are executed asynchronously, and the wait_task () function makes it possible to implement any form of parallelism, with synchronization points at any time. Furthermore, locking mechanisms can be implemented between distinct workflows.
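
The same pattern, tasks started asynchronously followed by an explicit synchronization point, can be illustrated with Python's asyncio. This is an analogy only: run_task is a stand-in that does no remote work, the CI names are invented, and asyncio.gather plays the role that wait_task () plays in an fdmon workflow.

```python
import asyncio
from typing import List, Tuple


async def run_task(ci: str, task: str, log: List[Tuple[str, str]]) -> str:
    """Stand-in for a remote task: yields control once, then records itself."""
    await asyncio.sleep(0)
    log.append((ci, task))
    return "ok"


async def workflow() -> Tuple[List[str], List[Tuple[str, str]]]:
    log: List[Tuple[str, str]] = []
    # Start both snapshots without waiting for either of them.
    snaps = [asyncio.create_task(run_task(ci, "snap", log)) for ci in ("db1", "db2")]
    # Synchronization point: continue only when every snapshot has finished.
    results = await asyncio.gather(*snaps)
    # This step is guaranteed to run after both snapshots.
    await run_task("db1", "upgrade", log)
    return list(results), log


if __name__ == "__main__":
    results, log = asyncio.run(workflow())
    print(results)
    print(log)
```

Placing the synchronization point later, or waiting on a subset of the tasks, gives the other forms of parallelism mentioned above.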

For each CI or group of CIs, we can define task execution permission rules (allow, deny) associated with users, contexts, or an external ITSM solution (for change management compliance; this feature will be described in a future post). The logs of all tasks and workflows are kept, which gives complete traceability of all interactions with your IT infrastructure. As a reminder, all these interactions are managed from the cloud, without any incoming connection to the IT infrastructure (when fdmon is not used as an on-premises solution) and without any user connection to the monitored equipment.

The AUTO_TASK and AUTO_FIX monitoring parameters configure the automated execution of a task or a workflow, with or without delay:

  • AUTO_FIX: triggers a task or a workflow when an event occurs on the CI and the resource
  • AUTO_TASK: triggers a task or a workflow when another task or workflow completes or fails

For example, the following parameter (defined for a given CI or group of CIs):

AUTO_FIX appl,orange,,,install_package

will trigger the automatic installation of any missing critical package, or the reinstallation of any critical package that has been accidentally removed, depending on the configuration of the "appl" resource (Applications).
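
For readers who want to process such parameters in their own tooling, the Python sketch below splits the line above into fields. The field names are assumptions: only the resource ("appl") and the triggered task (install_package) are explained in this article, "orange" is presumed to be the status that fires the action, and the two empty fields are kept verbatim because their meaning is not given here.

```python
from typing import List, NamedTuple


class AutoFix(NamedTuple):
    resource: str        # monitored resource, e.g. "appl"
    status: str          # assumed: status that triggers the action
    other: List[str]     # fields not explained in the article, kept as-is
    action: str          # task or workflow to run


def parse_auto_fix(line: str) -> AutoFix:
    """Parse a line of the form 'AUTO_FIX f1,f2,...,fN' (N >= 3)."""
    keyword, _, value = line.strip().partition(" ")
    if keyword != "AUTO_FIX":
        raise ValueError(f"not an AUTO_FIX parameter: {line!r}")
    fields = [field.strip() for field in value.split(",")]
    if len(fields) < 3:
        raise ValueError(f"too few fields in {line!r}")
    return AutoFix(fields[0], fields[1], fields[2:-1], fields[-1])


if __name__ == "__main__":
    print(parse_auto_fix("AUTO_FIX appl,orange,,,install_package"))
```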

The FIX parameter associates a corrective action (task or workflow) with a resource; unlike AUTO_FIX, the action is triggered manually by the user from the blackbox cloud interface.

Furthermore, a set of specific HTML tags provided by fdmon lets you create buttons and dropdown menus on your dashboards, from which you can trigger or check tasks or workflows. A common use case for this feature is implementing your own cloud management interface, surrounded by all the monitoring features provided by fdmon. You can apply a task or a workflow to the result of a query executed from the fdmon search engine. Finally, you can create your own specific resources (Meta-Resources) that are the result of one task or workflow, or of a combination of them.

By converging monitoring, trending, automation, orchestration and scheduling into a single product, and by normalizing all IT operations, blackbox cloud provides a simple way to dramatically increase the productivity of your IT infrastructure management and to implement autonomous data centers.

For more details : [email protected]
