5.2.1.1. ON Core Monitoring

We strongly recommend having a monitoring process in place for each role (Sensor, Core, Analytics) in any production environment.

We classify the monitoring methods as follows:

  • Trending: Monitoring of system resources to track hardware performance and status.

  • External Network Services: Availability of the network services, checked from outside the server.

  • Processes and Events to be monitored: Processes that must be up and running, along with their related events.

  • Healthcheck: ON Core has multiple internal checks to make sure services are up and running as expected.

  • Monitor User: Checks whether RADIUS requests are processed properly.

To better understand how to monitor the ON Core, we recommend reviewing the ON Core Architecture section.

5.2.1.1.2. External Network Services

Check service availability:

  • DNS server (ports TCP/53 and UDP/53), if this service is enabled.

  • DHCP server (port UDP/67), if this service is enabled.

  • DHCP-HELPER-READER service (port UDP/67), if this service is enabled.

  • RADIUS server (ports UDP/1812 and UDP/1813)
    • Consider using a RADIUS connection check with a valid user and credentials.

  • MySQL server (port TCP/3306)
    • The iptables firewall must be modified to allow access from the monitoring server to this service.

  • Queues server (port TCP/4730)
    • The iptables firewall must be modified to allow access from the monitoring server to this service.

  • HTTP/HTTPS server (ports TCP/80 and TCP/443): In addition to checking the HTTP/HTTPS service, a status page is available at http://openNACServer/status. Its output is a JSON document like the following:

    {"db":1,"queue":{"pending_jobs":0,"running_jobs":0,"available_workers":5}}

    The db field must be 1. The expected values of the queue field depend on your queue configuration and usage.
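As an illustration, the status page output could be evaluated with a small script. The following is a minimal sketch, not part of the product: the function name `evaluate_status`, the `max_pending_jobs` limit, and the rule that at least one worker must be available are assumptions chosen for the example, since the acceptable queue values depend on each deployment. Only the requirement that db equals 1 comes from the text above.

```python
import json

def evaluate_status(body, max_pending_jobs=100):
    """Return a list of problems found in the status page JSON (empty if healthy).

    max_pending_jobs is a hypothetical limit; tune it to your queue usage.
    """
    status = json.loads(body)
    problems = []

    # Documented rule: the db field must be 1 (accepts 1 or "1").
    if str(status.get("db")) != "1":
        problems.append("db is not 1")

    queue = status.get("queue", {})
    # Assumed rule: at least one worker should be available.
    if queue.get("available_workers", 0) < 1:
        problems.append("no available queue workers")
    # Assumed rule: the backlog of pending jobs should stay bounded.
    if queue.get("pending_jobs", 0) > max_pending_jobs:
        problems.append("too many pending jobs")

    return problems

sample = '{"db":1,"queue":{"pending_jobs":0,"running_jobs":0,"available_workers":5}}'
print(evaluate_status(sample))  # prints []
```

The body passed to the function would be whatever the monitoring server retrieves from the status page; how it is fetched (and over HTTP or HTTPS) is left to the monitoring tool in use.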

5.2.1.1.3. Processes and Events to be monitored

The following services can be externally monitored:

  • httpd

  • krb5kdc

  • radiusd

  • opennac

  • mysqld

  • RADIUS log events monitoring:
    • Authentication failures: alert when there are more than 100 per minute.

    • Duplicated request errors: alert when there are more than 50 per minute.
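The two RADIUS log thresholds above can be sketched as a per-minute counter. This is only an illustrative example: it assumes the log lines have already been parsed into (timestamp, kind) pairs, and the labels `auth_fail` and `duplicate_request` are hypothetical names, not identifiers taken from the RADIUS log format.

```python
from collections import Counter
from datetime import datetime

# Thresholds taken from the list above.
MAX_AUTH_FAILS_PER_MINUTE = 100
MAX_DUPLICATE_ERRORS_PER_MINUTE = 50

def find_alerts(events):
    """Return (minute, kind, count) for every minute that exceeds its threshold.

    events: iterable of (datetime, kind) pairs, where kind is one of the
    hypothetical labels "auth_fail" or "duplicate_request".
    """
    limits = {
        "auth_fail": MAX_AUTH_FAILS_PER_MINUTE,
        "duplicate_request": MAX_DUPLICATE_ERRORS_PER_MINUTE,
    }
    counts = Counter()
    for when, kind in events:
        if kind in limits:
            # Group events by the minute in which they occurred.
            minute = when.replace(second=0, microsecond=0)
            counts[(minute, kind)] += 1
    return sorted(
        (minute, kind, n)
        for (minute, kind), n in counts.items()
        if n > limits[kind]
    )

# 101 authentication failures within the same minute exceed the limit of 100.
events = [(datetime(2024, 1, 1, 12, 0, i % 60), "auth_fail") for i in range(101)]
print(find_alerts(events))
```

Counting per calendar minute keeps the sketch simple; a sliding window would react faster to bursts that straddle two minutes.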

5.2.1.1.4. Healthcheck

The out-of-the-box ON Core instances check different modules. The modules checked for each ON Core role are listed below.

5.2.1.1.4.1. ON Principal

To configure the ON Principal healthcheck, see the healthcheck configuration section. The following modules are checked:

  • AD_DOMAIN_MEMBER

  • ADM_USER_PASSWD_EXPIRATION

  • BACKEND

  • BACKUP

  • CACHE

  • CAPTIVE_PORTAL

  • CAPTIVE_PORTAL_THEMES

  • COLLECTD

  • DB

  • DISK_BACKUP

  • DISK_ROOT

  • DISK_TMP

  • DISK_VAR

  • DISK_VAR_LOG

  • DNS

  • FILEBEAT

  • HTTP_CERTIFICATE

  • LDAP

  • LOGCOLLECTOR

  • NTLM

  • PORTAL

  • QUEUE

  • RADIUS

  • RADIUS_CERTIFICATE

  • RAM

  • SYSTEM_INFO

  • SYSTEM_LOAD

  • TIME_SYNC

  • UDS

  • VPN_NODES

  • WINBIND

5.2.1.1.4.2. ON Worker

To configure the ON Worker healthcheck, see the healthcheck configuration section. The following modules are checked:

  • AD_DOMAIN_MEMBER

  • BACKEND

  • BACKUP

  • CACHE

  • CAPTIVE_PORTAL

  • CAPTIVE_PORTAL_THEMES

  • COLLECTD

  • DB

  • DBREPLICATION

  • DHCPHELPERREADER

  • DISK_BACKUP

  • DISK_ROOT

  • DISK_TMP

  • DISK_VAR

  • DISK_VAR_LOG

  • DNS

  • FILEBEAT

  • HTTP_CERTIFICATE

  • LOGCOLLECTOR

  • NTLM

  • NXLOG

  • PORTAL

  • QUEUE

  • RADIUS

  • RADIUS_CERTIFICATE

  • RAM

  • SYSTEM_INFO

  • SYSTEM_LOAD

  • TIME_SYNC

  • UDS

  • VPN_NODES

  • WINBIND

5.2.1.1.4.3. ON Proxy

To configure the ON Proxy healthcheck, see the healthcheck configuration section. The following modules are checked:

  • BACKUP

  • COLLECTD

  • DISK_BACKUP

  • DISK_ROOT

  • DISK_TMP

  • DISK_VAR

  • DISK_VAR_LOG

  • DNS

  • LOGCOLLECTOR

  • RADIUS

  • RADIUS_CERTIFICATE

  • RAM

  • SYSTEM_INFO

  • SYSTEM_LOAD

  • TIME_SYNC

5.2.1.1.4.4. ON Portal

To configure the ON Portal healthcheck, see the healthcheck configuration section. The following modules are checked:

  • BACKUP

  • CACHE

  • CAPTIVE_PORTAL_THEMES

  • COLLECTD

  • DISK_BACKUP

  • DISK_ROOT

  • DISK_TMP

  • DISK_VAR

  • DISK_VAR_LOG

  • DNS

  • HTTP_CERTIFICATE

  • LOGCOLLECTOR

  • PORTAL

  • RAM

  • SYSTEM_INFO

  • SYSTEM_LOAD

  • TIME_SYNC

5.2.1.1.5. Monitor User

The Monitor user is a utility integrated into the tool that helps determine whether RADIUS server authentications are working.

This user carries out an authentication against the RADIUS server every minute by sending polevals that simulate the authentication process.

The result of this authentication process is shown in the ON NAC > Default view > Unassigned window. This user always has the MAC address 00:00:00:00:00:00, the IP address 0.0.0.0, and the User value “monitor”.

If the Last Access value of this user is more than 1 minute old, either the RADIUS server is down or the authentication process is not working as it should.
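The Last Access rule can be expressed as a simple staleness check. The following is a minimal sketch under the assumption that the Last Access timestamp of the monitor user has already been obtained by some means; the function name `monitor_user_is_stale` is hypothetical, and how the value is read from the system is outside the scope of this example.

```python
from datetime import datetime, timedelta

# The monitor user authenticates every minute, so an older
# Last Access value indicates a problem.
MAX_AGE = timedelta(minutes=1)

def monitor_user_is_stale(last_access, now):
    """Return True if the last access is more than MAX_AGE before now."""
    return now - last_access > MAX_AGE

now = datetime(2024, 1, 1, 12, 0, 0)
print(monitor_user_is_stale(datetime(2024, 1, 1, 11, 59, 30), now))  # prints False
print(monitor_user_is_stale(datetime(2024, 1, 1, 11, 57, 0), now))   # prints True
```

In practice a small tolerance on top of the one-minute interval may help avoid false alarms caused by scheduling delays.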

Important

To prevent unwanted access to the system through this monitor user, it is important to create a policy that restricts its access. This allows the monitoring tool to be used without compromising security. An example of such a policy is shown below; it must be placed as #1 in the list of policies.

When creating this user, it is also important to configure a complex password and not to write it down.

  • Name: Monitor

  • Enabled: YES

  • Preconditions: Users (it will have to be created if it does not exist)
  • Preconditions: Sources
    • Supplicant User: YES

    • User: YES

  • Postconditions
    • VLAN
      • ID: 4095

      • Type: Service

      • VLAN by default: false

      • Name: ACCESS DENIED

This username and password will be used by the different network devices to check the status of the RADIUS server.