Monday, May 20, 2013

Rapid testing of open source monitoring tools: a 2013 capabilities update

It has been a busy week so far. I've been re-examining the state of the different monitoring solutions based on open source software, and since Monday I have deployed Nagios, Icinga, Ganglia, Cacti, OpenNMS and Zabbix, and I'm installing Sensu now.

Basically, OpenNMS is what worked best out of the box. The first complete installation plus configuration took only a couple of hours (the first one was also the test), and then, with a few clicks, auto-discovery swept a range of my test network and detected the devices in it. Setting thresholds and notifications was a bit more work: another hour, spent reading some rather confusing documentation, plus confused mailing-list requests for help from users and the vague answers they got. Sure, the solution turned out to be quite simple and intuitive... once I had done it the first time. Basically it works quite well, but it is somewhat unstable, in the sense that quickly deploying additional plugins (through apt, for example) will not always leave you with a completely stable OpenNMS. You can have the service running perfectly for hours after installing plugins, and then, at the first system reboot, something triggers a startup failure. The error messages are basically the dump output of the Java VM and rarely contain information useful for recovery. (The profile of the answers on the forums and lists suggests that many advanced OpenNMS users are used to identifying which part of the software's configuration to change or fix directly from the JVM dump, rather than finding that information in some documentation.)

For example, installing the DHCP monitoring plugin, configuring it, and then uninstalling it left the software unable to start, because the binaries needed to start the service were missing. In this case I was lucky: the error message clearly indicated that the failure was because it could not start the DHCP monitoring service, and the solution was simply to comment the service out again in the corresponding configuration file (and thereby disable the attempt to start it when OpenNMS boots).
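As a rough illustration of that fix, here is a minimal Python sketch that comments out a service block in an XML configuration. The `<service>` element layout and the name `Dhcpd` are assumptions made for the example, not taken from my actual OpenNMS files, so check your own configuration before trying anything like this.

```python
import re

# Matches either an existing XML comment or a <service>...</service> block.
# Matching comments first lets us leave already-disabled services untouched.
_BLOCK = re.compile(r"<!--.*?-->|<service\b.*?</service>", re.DOTALL)


def comment_out_service(xml_text: str, service_name: str) -> str:
    """Wrap each <service> block that mentions service_name in an XML comment.

    Assumption: services are declared as <service>...</service> elements
    (hypothetical layout, used here only for illustration).
    """
    def wrap(match):
        block = match.group(0)
        if block.startswith("<!--") or service_name not in block:
            return block  # existing comment, or a different service
        return "<!-- " + block + " -->"

    return _BLOCK.sub(wrap, xml_text)


sample = (
    "<services>"
    "<service><name>Dhcpd</name></service>"
    "<service><name>Pollerd</name></service>"
    "</services>"
)
disabled = comment_out_service(sample, "Dhcpd")
print(disabled)
```

Running the function a second time on its own output changes nothing, which is handy if the cleanup ends up in a script that runs more than once.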

Cacti was very easy to install. Using Cacti is so simple that almost no one has bothered to write tutorials on how to add a device and then generate (and arrange) the graphs. But "simple" does not mean it is fully intuitive, and I spent a good half hour playing with the GUI to understand the workflow for adding devices and generating graphs, which is the basic reason to deploy Cacti in the first place. (Apparently OpenNMS generates "exactly" the same information, but you have to navigate several menus to find it; in Cacti you have it in view right after logging in.)


Ganglia is always my first choice for gathering performance and usage information from servers, mainly because it installs quickly: it requires no more than installing the server software, installing the client software, and "hooking" them together in the configuration (you have to tell the client software, the Ganglia agent on the monitored server, which server or servers will accept its communications). After installing Ganglia and leaving it to collect data, I started reviewing the other options, and by the time I was halfway to being fairly set on implementing OpenNMS and Cacti, Ganglia had already built graphical profiles of my test systems.

I installed Zabbix in minutes (and a couple of agents as well). The GUI is very attractive, although not as intuitive as OpenNMS's (which is not too intuitive either). Anyway, I found the auto-discovery feature and triggered a quick discovery, which failed to capture a single device on the same network range that I had loaded into OpenNMS two hours earlier (where OpenNMS had detected my test servers and devices perfectly, SNMP data included). So I went through the GUI looking for documentation and found no further explanation of the ("intuitive") procedure for adding devices. I then searched forums and mailing lists and came back with nothing; I guess it is so easy that nobody bothers to write a step-by-step guide. So for now I have left Zabbix running unconfigured, until I find out how to add devices. Meanwhile, with every Google search I keep finding recommendations saying that Zabbix is "very easy". I guess they refer to the installation, but I will have to devote more than the 20 minutes I spent in order to conclude anything about the software (and to be able to load at least a couple of test devices). If it does work, it may turn out to be somewhat more useful than even OpenNMS.


Nagios is what I left to test at the end. (The same goes for Icinga: in my first contact with it I used my Nagios expertise and could configure and manage it without any issue, so I can confirm that the skills are portable.) It is tempting to go with the software that is easiest and fastest to deploy, but that does not always mean the software is reasonably easy to manage day to day (well, in the case of Nagios it IS easy to manage), or that it scales well even in the medium term.

Nagios does not scale at all well in dynamic environments where production servers go up and down constantly, and the basic test of this is to use Nagios to monitor cloud environments. If you implement Nagios in virtualized environments, you quickly see that only your stable production servers are constantly monitored, while the other servers, which are powered on and off dynamically even though they are in production, are slowly left behind, with the Nagios configuration dedicated only to the servers that run continuously without dynamic downtimes.
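To make the "static configuration" problem concrete, here is a minimal Python sketch of one common workaround: regenerating the Nagios host definitions from whatever list of VMs is running at the moment. The `running_vms` dictionary, the VM names and the `generic-host` template name are assumptions for the example; in practice the list would come from your hypervisor or cloud API, and Nagios still has to be reloaded every time the file changes, which is exactly the friction I am complaining about.

```python
def render_hosts(vms, template="generic-host"):
    """Render one Nagios 'define host' block per VM.

    vms maps host names to addresses. The template name is an
    assumption: use whatever host template your installation defines.
    """
    blocks = []
    for name, address in sorted(vms.items()):
        blocks.append(
            "define host {\n"
            f"    use        {template}\n"
            f"    host_name  {name}\n"
            f"    address    {address}\n"
            "}\n"
        )
    return "\n".join(blocks)


# Hypothetical inventory; a real one would be queried from the virtual infrastructure.
running_vms = {"vm-b": "10.0.0.2", "vm-a": "10.0.0.1"}
config_text = render_hosts(running_vms)
print(config_text)
```

This only moves the problem around: the monitoring configuration is still a snapshot, just one that is refreshed more often.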

Besides, there is the temptation to integrate Nagios + Cacti, Nagios + RRD, Nagios + whatever: a solution that will quickly stop correctly reflecting the true overall performance profile of virtualized environments. Unless, of course, you choose to manage the architecture so that your "fixed" production servers always run on certain hypervisors, while the others, meaning the production servers that are dynamically powered on and off and the test servers (which are created and deleted regularly), are confined to other hypervisors.

Mmm, there is a problem with that: being able to use idle capacity on the hypervisors is precisely the reason to virtualize in the first place. So "limiting" the design of the virtualized infrastructure because one (1) piece of software lacks the capability to "follow" the dynamism of the virtual infrastructure means giving up much of what the virtualization solution can do. Consider, for example, how dramatic the loss of capability is when complex configurations run on the virtualized infrastructure's servers (ones that define hierarchies for dynamically powering off VMs on a hypervisor under certain performance profiles, for example).

Sure, Nagios can be "adapted" to dynamic scenarios, but those adaptations will be static (basically you could "play" with scheduled downtimes, making them match the periods when you estimate the virtualized infrastructure will power the VMs off). The result is that on one hand we configure the virtual infrastructure to adapt automatically, and on the other hand we have to keep (re)adapting the monitoring software's configuration for those servers manually.
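For reference, "playing" with scheduled downtimes could look roughly like the Python sketch below, which builds a `SCHEDULE_HOST_DOWNTIME` external command line for Nagios. The host name, the times and the author are made up for the example, and the sketch only builds the string: writing it to the Nagios external command file is left out because the path of that file depends on the installation.

```python
import time


def host_downtime_command(host, start, end, author, comment, now=None):
    """Build a Nagios SCHEDULE_HOST_DOWNTIME external command line.

    start and end are Unix timestamps. The downtime is 'fixed' (1) with
    no trigger (0); the duration field is filled in for completeness.
    """
    now = int(time.time()) if now is None else now
    duration = end - start
    return (
        f"[{now}] SCHEDULE_HOST_DOWNTIME;{host};{start};{end};"
        f"1;0;{duration};{author};{comment}\n"
    )


# Hypothetical example: a VM we estimate will be powered off for one hour.
line = host_downtime_command(
    "vm-a", start=1000, end=4600, author="ops",
    comment="nightly power-off", now=900,
)
print(line, end="")
```

The weak point is obvious: the start and end times are estimates made by hand, while the virtual infrastructure decides on its own when the VM really goes down.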

Almost none of these conclusions is new (see the monitoring-sucks links), nor is using commercial software the solution (in general it has the same limitations in adapting to dynamic infrastructures), and the same thing happens with the rest of the software I tried: OpenNMS, Cacti, Ganglia, etc.

I still have to test Groundwork and HypericHQ (similar to Zabbix: commercial, but at least open source or freeware) and see how they behave. I find it funny how the websites of all the monitoring software on sale say they are the best, or something like this :-D > "The World's Largest Monitoring Web Applications"
