5.1.1.1. ON Core Monitoring
We strongly recommend having a monitoring process in place for each role (Sensor, Core, Analytics) in every production environment.
Monitoring methods are classified as follows:
- Trending: Tracks system resources, hardware performance, and status.
- External Network Services: Checks service availability from outside the server.
- Processes and Events to be monitored: Verifies that processes are up and running and tracks their related events.
- Healthcheck: ON Core runs multiple internal checks to make sure services are up and running as expected.
- Monitor User: Checks that RADIUS requests are processed properly.
To better understand how to monitor the ON Core, we recommend reviewing the ON Core Architecture section.
5.1.1.1.1. Trending
The status of the system resources is available under Status -> Trending. The following system resources are monitored:
- CPU 
- OpenNAC 
- Disk 
- Interface 
- Load 
- Memory 
- Mysql 
- Redis 
- Other 
- Conntrack 
5.1.1.1.2. External Network Services
Check the availability of the following services:
- DNS server (ports TCP/53 and UDP/53), if this service is enabled.
- DHCP server (port UDP/67), if this service is enabled.
- DHCP-HELPER-READER service (port UDP/67), if this service is enabled.
- RADIUS server (ports UDP/1812 and UDP/1813). We recommend running a RADIUS connection check with a valid user and credentials.
- MySQL server (port TCP/3306). The iptables firewall must be modified to allow access from the monitoring server to this service.
- Queues server (port TCP/4730). The iptables firewall must be modified to allow access from the monitoring server to this service.
- HTTP/HTTPS server (ports TCP/80 and TCP/443). Apart from checking the HTTP/HTTPS service, a status page is available at http://openNACServer/status, which returns JSON like the following:
{"db":1,"queue":{"pending_jobs":0,"running_jobs":0,"available_workers":5}}
The db field must be 1. The expected values in queue depend on your queue configuration and usage.
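The status page check can be sketched in Python as follows. This is a minimal illustration, not part of the product: the function name `evaluate_status` and the queue thresholds (`max_pending_jobs`, at least one available worker) are assumptions and should be adapted to your own queue configuration. Only the requirement that db equals 1 comes from the text above.

```python
import json

def evaluate_status(body, max_pending_jobs=100):
    """Return a list of problems found in the status JSON; empty means healthy."""
    problems = []
    status = json.loads(body)
    # The db field must be 1 (accepts the number 1 or the string "1").
    if str(status.get("db")) != "1":
        problems.append("db is not 1")
    queue = status.get("queue", {})
    # Queue thresholds are site-specific assumptions, not documented values.
    if queue.get("available_workers", 0) < 1:
        problems.append("no available workers")
    if queue.get("pending_jobs", 0) > max_pending_jobs:
        problems.append("too many pending jobs")
    return problems

sample = '{"db":1,"queue":{"pending_jobs":0,"running_jobs":0,"available_workers":5}}'
print(evaluate_status(sample))  # prints []
```

In practice, the body would be fetched from the status URL by the monitoring server and any returned problem would raise an alert.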
5.1.1.1.3. Processes and Events to be monitored
The following services can be monitored externally:
- httpd
- krb5kdc
- radiusd
- opennac
- mysqld
RADIUS log events to monitor:
- Authentication failures: more than 100 per minute.
- Errors regarding duplicated requests: should not exceed 50 per minute.
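The per-minute thresholds above can be evaluated with a small counting routine. The sketch below assumes the RADIUS log has already been parsed into `(timestamp, kind)` tuples; the parsing step, the event kind names `auth_fail` and `duplicate`, and the function name `minutes_over_limit` are hypothetical and depend on your log format.

```python
from collections import Counter
from datetime import datetime

# Thresholds taken from the list above.
AUTH_FAIL_LIMIT = 100
DUPLICATE_LIMIT = 50

def minutes_over_limit(events, kind, limit):
    """Return the minutes in which more than `limit` events of `kind` occurred.

    `events` is an iterable of (datetime, kind) tuples; each minute is
    represented by a datetime truncated to the minute.
    """
    per_minute = Counter(
        ts.replace(second=0, microsecond=0) for ts, k in events if k == kind
    )
    return sorted(minute for minute, count in per_minute.items() if count > limit)

# Example: 101 authentication failures within the same minute.
base = datetime(2024, 1, 1, 12, 0)
events = [(base.replace(second=i % 60), "auth_fail") for i in range(101)]
print(minutes_over_limit(events, "auth_fail", AUTH_FAIL_LIMIT))
```

Any minute returned by the function exceeded the threshold and is a candidate for an alert.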
5.1.1.1.4. Healthcheck
Out-of-the-box ON Core instances check different modules. The checks available for each ON Core role are listed below:
5.1.1.1.4.1. ON Principal
To configure the ON Principal healthcheck, see the healthcheck configuration section. The following checks are available:
- AD_DOMAIN_MEMBER 
- ADM_USER_PASSWD_EXPIRATION 
- BACKEND 
- BACKUP 
- CACHE 
- CAPTIVE_PORTAL 
- CAPTIVE_PORTAL_THEMES 
- COLLECTD 
- DB 
- DISK_BACKUP 
- DISK_ROOT 
- DISK_TMP 
- DISK_VAR 
- DISK_VAR_LOG 
- DNS 
- FILEBEAT 
- HTTP_CERTIFICATE 
- LDAP 
- LOGCOLLECTOR 
- NTLM 
- PORTAL 
- QUEUE 
- RADIUS 
- RADIUS_CERTIFICATE 
- RAM 
- SYSTEM_INFO 
- SYSTEM_LOAD 
- TIME_SYNC 
- UDS 
- VPN_NODES 
- WINBIND 
5.1.1.1.4.2. ON Worker
To configure the ON Worker healthcheck, see the healthcheck configuration section. The following checks are available:
- AD_DOMAIN_MEMBER 
- BACKEND 
- BACKUP 
- CACHE 
- CAPTIVE_PORTAL 
- CAPTIVE_PORTAL_THEMES 
- COLLECTD 
- DB 
- DBREPLICATION 
- DHCPHELPERREADER 
- DISK_BACKUP 
- DISK_ROOT 
- DISK_TMP 
- DISK_VAR 
- DISK_VAR_LOG 
- DNS 
- FILEBEAT 
- HTTP_CERTIFICATE 
- LOGCOLLECTOR 
- NTLM 
- NXLOG 
- PORTAL 
- QUEUE 
- RADIUS 
- RADIUS_CERTIFICATE 
- RAM 
- SYSTEM_INFO 
- SYSTEM_LOAD 
- TIME_SYNC 
- UDS 
- VPN_NODES 
- WINBIND 
5.1.1.1.4.3. ON Proxy
To configure the ON Proxy healthcheck, see the healthcheck configuration section. The following checks are available:
- BACKUP 
- COLLECTD 
- DISK_BACKUP 
- DISK_ROOT 
- DISK_TMP 
- DISK_VAR 
- DISK_VAR_LOG 
- DNS 
- LOGCOLLECTOR 
- RADIUS 
- RADIUS_CERTIFICATE 
- RAM 
- SYSTEM_INFO 
- SYSTEM_LOAD 
- TIME_SYNC 
5.1.1.1.4.4. ON Portal
To configure the ON Portal healthcheck, see the healthcheck configuration section. The following checks are available:
- BACKUP 
- CACHE 
- CAPTIVE_PORTAL_THEMES 
- COLLECTD 
- DISK_BACKUP 
- DISK_ROOT 
- DISK_TMP 
- DISK_VAR 
- DISK_VAR_LOG 
- DNS 
- HTTP_CERTIFICATE 
- LOGCOLLECTOR 
- PORTAL 
- RAM 
- SYSTEM_INFO 
- SYSTEM_LOAD 
- TIME_SYNC 
5.1.1.1.5. Monitor User
The Monitor user is a utility integrated into the product that helps determine whether RADIUS server authentications are working.
This user performs an authentication against the RADIUS server every minute by sending polevals that simulate the authentication process.
The result of this authentication process is shown in the ON NAC > Default view > Unassigned window. This user always has the MAC address 00:00:00:00:00:00, the IP address 0.0.0.0, and the user name "monitor".
If the Last Access value of this user is older than 1 minute, either the RADIUS server is down or the authentication process is not working as it should.
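The Last Access rule can be expressed as a simple staleness check. This is a sketch only: how the Last Access timestamp of the monitor user is retrieved is not covered here and is left as an assumption, and the function name `radius_looks_down` is hypothetical.

```python
from datetime import datetime, timedelta

def radius_looks_down(last_access, now, max_age=timedelta(minutes=1)):
    """Return True when the monitor user's last access is older than max_age."""
    return now - last_access > max_age

now = datetime(2024, 1, 1, 12, 5, 0)
print(radius_looks_down(datetime(2024, 1, 1, 12, 4, 30), now))  # prints False
print(radius_looks_down(datetime(2024, 1, 1, 12, 2, 0), now))   # prints True
```

A monitoring job may want a slightly larger `max_age` than one minute to tolerate small delays between the authentication attempt and the check.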
Important
To prevent unwanted access to the system through this monitor user, it is important to create a policy that restricts its access. This way, the monitoring tool can be used without compromising security. An example of such a policy is shown below; it must be placed at position #1 in the list of policies.
When creating this user, it is also important to configure a complex password and not to write it down.
Name: Monitor
Enabled: YES
- Preconditions: Users (the user will have to be created if it does not exist)
User ID: monitor
E-mail: monitor@monitor.es
Password: <Random>
TTL (in minutes): 0
- Preconditions: Sources
Supplicant User: YES
User: YES
- Postconditions
- VLAN
ID: 4095
Type: Service
VLAN by default: false
Name: ACCESS DENIED
This username and password will be used by the different network devices to check the status of the RADIUS server.