Support Training Part II


  • Data Processor Overview.

  • Sending Data to the Platform Manager.

  • The State Folder.

  • Log Spooling.

  • Diagnostic Logs.

  • Troubleshooting Data Processor.

Chap V. Data Processor

LogRhythm Components Review

Sending Data to the Platform Manager

Disk Spooling Parameters

The Insert Manager also supports disk spooling to handle overload from unprocessed logs. Those configuration settings are as follows.
 

The State Folder

The Data Processor's Mediator maintains state files to keep track of unprocessed logs and Events so that data is not lost if the Mediator Server service shuts down. During normal operation, logs and Events are held in memory; the corresponding data files are stored in a sub-directory of the Mediator Server at this location (by default):
C:\Program Files\LogRhythm\LogRhythm Mediator Server\state

When the Mediator Server service is restarted, the logs and Events are read from the files in the state folder and then processed in first in first out (FIFO) order.

Unprocessed Logs

In the Deployment Monitor, growth in specific component processing queues is an indication that logs cannot be processed quickly enough to keep up with the current volume. If this is happening, check the UnprocessedLogs state folder, located (by default) at: C:\Program Files\LogRhythm\LogRhythm Mediator Server\state\UnprocessedLogs
 

Under normal operation, this folder should be empty. If logs are present in the state folder, this indicates logs are being written to the local drive and there may be an issue with processing.
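For example, from a command prompt on the Data Processor host, you can quickly confirm whether the folder is empty (adjust the path if LogRhythm is installed in a non-default location):

  REM List the UnprocessedLogs state folder; under normal operation it should contain no files.
  dir "C:\Program Files\LogRhythm\LogRhythm Mediator Server\state\UnprocessedLogs"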
 

Log Spooling

Log spooling occurs when the Mediator cannot process logs as quickly as they arrive and unprocessed logs are written to disk. If spooling is suspected, check the UnprocessedLogs state folder described above; under normal operation it should be empty, and files accumulating there indicate the Mediator is not keeping up with the current log volume.
 

Diagnostic Logs

Troubleshooting Data Processors

Problems with the Data Processor (DP) can affect the LogRhythm Platform's overall log volume and database insert rates. You may need to troubleshoot the Mediator to verify it is handling the current processing load if there are issues with:

  • Receiving log data from a Log Source.
  • Inserting log data into a database or archives.
  • Keeping up with processing logs.
     

Log Processing Issue

In scenarios where you are experiencing MPE Rule processing issues, your counters may show the following:

  • Queue Count Unprocessed Logs will be high or full.
  • Rate Logs Processed will be low or zero.
  • The Unprocessed Queue is the bottleneck for the rest of the system, so downstream queues and rates will be low or empty. If this folder starts spooling, the Mediator cannot keep up with the current log volume, or it has fallen behind and something else is causing the issue.
  • Investigate MPE Rule processing stall warnings and unprocessed queue spooling warnings in the scmedsvr.log and scmpe.log files (see the example below).
    • When a rule takes longer than 1 second to process, a warning message is generated. Gather the Rule ID, Regex ID, and Log Source ID from this message to identify what to troubleshoot further. The Log Source configuration, the regular expression used for the MPE Rule, or the insert rate may need to be adjusted to resolve processing issues.
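A quick way to check for these warnings from a command prompt is sketched below. The log file names come from the bullet above; the logs folder path is the typical default Mediator install location and may differ in your deployment.

  REM Search the Mediator diagnostic logs for MPE rule stall and spooling warnings.
  REM Adjust the paths if the Mediator Server is installed in a non-default location.
  findstr /i "stall spool" "C:\Program Files\LogRhythm\LogRhythm Mediator Server\logs\scmedsvr.log"
  findstr /i "stall spool" "C:\Program Files\LogRhythm\LogRhythm Mediator Server\logs\scmpe.log"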

Archiving Issue

Common causes of archiving issues include the following:

  • Disk space exhaustion at the active and/or inactive archive folder locations.
    • Free up disk space to resolve this issue (a quick check is shown below).
  • Networking issues when writing the inactive archive to a remote location, or between the Data Processor and Platform Manager when writing to the LogRhythm EMDB. These issues can be sporadic and should be verified by examining errors in the archive.log file.
  • Database performance issues slowing Data Processor writes to the LogRhythm EMDB.
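As referenced above, a quick check from a command prompt might look like the following; the archive.log path shown is the typical Mediator default and may differ in your deployment.

  REM Check free space on each local drive (archive folders often live on a dedicated drive).
  wmic logicaldisk get DeviceID,FreeSpace,Size

  REM Scan the archive log for recent errors and warnings.
  findstr /i "error warn" "C:\Program Files\LogRhythm\LogRhythm Mediator Server\logs\archive.log"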

Indexing Issue

In scenarios where you are experiencing log indexing problems, the performance counters and diagnostic logs may show the following:

  • Rate Messages Sent and Rate Acks Received will be low or zero.
  • The # Messages Waiting for Ack will be high.
    • If the # Messages Waiting for Ack is high, the Data Indexer may be backed up, and lowering the insert rate may be needed.
  • If you find you are having issues with log indexing (see the example below):
    • Investigate errors or warnings regarding reliable messaging spooling in the scmedsvr.log file.
    • Review the anubis.log file for errors or warnings regarding the gigawatt database being full.
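A sketch of those two log checks from a command prompt is shown below; the scmedsvr.log path is the typical Mediator default, and the anubis.log location is a placeholder to replace with the logs folder on your Indexer host (use grep instead of findstr on a Linux Indexer).

  REM On the Data Processor: look for reliable messaging spooling warnings.
  findstr /i /c:"reliable" /c:"spool" "C:\Program Files\LogRhythm\LogRhythm Mediator Server\logs\scmedsvr.log"

  REM On the Data Indexer: look for warnings that the gigawatt database is full.
  REM <DX logs folder> is a placeholder; locate anubis.log on the Indexer host first.
  findstr /i "gigawatt" "<DX logs folder>\anubis.log"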

Common causes of indexing problems include the following:

  • Index rate is very high and causing indices to get very low.
  • Logs are not found in the database.

Event Insert Issues

Event insert issues are most commonly caused by an Events Insert Rate that is above 5%.

  • If there is a high volume of Events, check the SQL maintenance jobs.
    • In LogRhythm version 7.1+, Event Index Maintenance jobs only take place on Sundays. If those jobs do not complete, Events are spooled. You can rerun the Saturday job, or wait for the Sunday Maintenance to re-run that same job.
      • Add the Events Index Maintenance step to the Weekday Maintenance job. This helps the Sunday Maintenance job complete the Events Index Maintenance step in less than 5 hours.

In scenarios where you are experiencing Event insert issues, the performance counters and diagnostic logs may show the following:

  • EMIM Overall Insert Rate will be low or zero.
  • EMIM Realtime Insert Queue Size and EMIM Disk Insert Queue Size will be very high.

LogMart Forwarding Issues

Common causes of LogMart issues include the following:

  • The LogMart database is at or near capacity.
    • You can quickly review the Deployment Monitor to see the LogMart DB Utilization.
  • MS SQL Maintenance jobs running for extended periods cause slow processing of the LoadTable.
  • Too much data to load in the five-minute batch insert period.

In scenarios where you are experiencing LogMart forwarding problems, the performance counters or diagnostic logs may show the following:

  • Log Forwarding Rate will be low or zero.
  • Queue Counter Unprocessed Logs and % Full LogMart Heap will be high.
     

AI Engine Forwarding Issue

All identified logs are sent to AI Engine. Creating a new GLPR (Global Log Processing Rule) is the only way you can prevent a log from being forwarded to AI Engine.
In scenarios where you are experiencing AI Engine forwarding issues, the performance counters and diagnostic logs may show the following:

  • Rate Logs Flushed will be low or zero.
  • Data Queue Size (Kb) will be high or constantly growing.
  • Investigate errors and warnings regarding AI Engine data forwarding in the scmedsvr.log and lraiedp.log files.
  • Review the Performance Counters for AI Engine Data Provider. The Total Logs Flushed number may not be increasing steadily if logs are not being forwarded.

Updates to the network, configuration of the component, or adjustments to AI Engine Alarm Rules to help with performance may be needed to resolve issues.

Chap VI. Data Indexers (DX)

 

  • Data Indexer Overview.

  • Working with DX on Windows.

  • Working with DX on Linux.


Data Indexer (DX) Overview

The Data Indexer (also known as the Indexer or DX) provides persistence and search capabilities, as well as high-performance, distributed, and highly scalable indexing of machine and forensic data. Data Indexers store both the original log message and metadata parsed from the logs to enable search-based analytics in the LogRhythm Platform.
The DX is supported on Windows Server 2008 R2, Windows Server 2012 R2, Windows Server 2016, and CentOS Linux 7.x minimal, as follows:

  • On Windows: You can install the Data Indexer on an XM appliance or an upgraded Data Processor appliance (this configuration is called a DPX, and the Indexer is "pinned" to the Data Processor).
  • On Linux: You can install a single Indexer or a cluster of three to 10 Data Indexers on a Linux DX appliance, your own Linux server, or a virtual machine. This configuration is called a DX, and the Data Indexer is installed alone. DXs can be clustered in a replicated configuration to enable high availability, improved search performance, and support for a greater number of simultaneous users.

Working with Data Indexers on a Windows Host

If you encounter issues when structured searches are performed, review the following information.

  • In a web browser, navigate to: localhost:9200/_cat/indices?v
    • Look at the field_translations entry and confirm it contains 119 docs, to make sure structured investigations work in the Console (a command-line version of this check is shown below).
      • If this number is 0, analysts will not be able to perform searches for structured metadata and will receive a Fields Translations error.
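If curl is available on the DX host (it is not included on older Windows Server versions, in which case use the browser method above), the same check can be run from a command prompt; the index name field_translations and the URL come from the steps above.

  REM Show the field_translations index and its docs.count column.
  curl -s "http://localhost:9200/_cat/indices?v" | findstr field_translations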

Confirm the Status of the DX

The status of a Windows Data Indexer (DX) can be confirmed via a web browser.
Via Web Browser

To review the state of the Data Indexer (DX), open a web browser and navigate to:

localhost:9200/_cluster/health?pretty

This page will show one of three possible states of the DX cluster:

  • Green = All is good.
  • Yellow = Indices and shards are still being assigned and started. Log data is still actively being indexed.
    • As long as the percentage complete continues to increase, you just need to wait. If you have large indices, this status is normal.
  • Red = Cluster is not accepting logs from the DXReliablePersist folder (DXRP) on the Mediator. Log data is not being indexed.

Troubleshooting with Grafana

The LogRhythm Data Indexer provides several options for monitoring the health of the system, which can be found in the Grafana dashboard.

To access Grafana, open a browser on the host where the Windows DX is installed, and navigate to the following URL:

http://localhost:8110/

The LogRhythm Monitoring dashboard will display the System CPU, System Memory, and System Disk Used. If Elasticsearch is using an excessive amount of CPU or Memory, restart the Elasticsearch service to free up resources.
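A restart can be performed from the Services console (services.msc) or from an elevated command prompt. The exact Elasticsearch service name varies between Data Indexer versions, so look it up first; the commands below are a sketch with a placeholder name.

  REM Find the Elasticsearch service registered on this DX.
  sc query state= all | findstr /i "elasticsearch"

  REM Restart it using the SERVICE_NAME reported above (placeholder shown here).
  net stop "<Elasticsearch service name>"
  net start "<Elasticsearch service name>"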

Working with Data Indexers on Linux Hosts

What version of the LogRhythm Data Indexer software is installed on this appliance?

  sudo cat /usr/local/logrhythm/version

Confirm the Status of the DX

The status of a Linux Data Indexer (DX) can be confirmed via the command line.
 

Via Command Line

You can verify the state, or health, of the DX by opening a command prompt and running the following commands:

  # Watch overall cluster health, refreshing every 2 seconds.
  watch -n 2 'curl -s "localhost:9200/_cluster/health?pretty"'

  # List any indices that are in a red state.
  curl -s "localhost:9200/_cat/indices?v" | grep "red"

  # Review how shards are allocated across the cluster.
  curl -s "localhost:9200/_cat/shards?v"

Chap VII. Reports

  • Report Center.

  • Report Templates.

  • Running a Report as an Investigation.

  • Report Packages.


Report Center

Report Templates

Report templates are used to format the data returned by the report filters. Report templates can be created by any LogRhythm administrator to customize the layout or format of the data defined in a report.
Report Templates cannot be edited or cloned; they can only be created.

Each report template contains two columns that define the data contained in the report and the organization, or layout, for that data.

Running a Report as an Investigation

A benefit of system reports is the ability to run them as an Investigation. When run as an Investigation, the report passes all of its data filter settings and Log Source Criteria to the Investigate tool, which then queries the Data Processor's database.

Report Packages

The final component of the Report Center is the Report Packages tab. Many report packages are available out-of-the-box, mainly associated with compliance modules, or you can create custom report packages.
Below is a screenshot of the Log Volume Daily Reports package and some of the reports included with that report package.

Chap VIII. Alarms

  • Alarms Overview.

  • Managing Alarms.

  • System vs Custom Alarms.

  • Create Alarm Rule from Log Data.

  • Drilldown.

  • Tuning Alarm Rules for Performance.

  • Troubleshooting Alarms.


Alarms Overview

An Alarm Rule is a saved set of criteria that looks for a matching log or group of logs. Alarms are flexible and can be configured to monitor for individual occurrences of a log message, or monitor for several types of different logs. When logs are identified as meeting the criteria specified in the Alarm Rule, an Alarm is generated.

 

There are two varieties of Alarm Rules in LogRhythm:

  • Traditional Alarm Rules.
  • AI Engine Alarm Rules.

Managing Alarms

Management of Alarm Rules is performed through the Alarms or AI Engine tab in the Deployment Manager.

System vs. Custom Alarm Rules

LogRhythm has hundreds of system Alarm Rules available out-of-the-box.
System Alarm Rules have been created and tested by the LogRhythm Labs team.
Custom Alarm Rules are created by administrators. Most issues related to Alarms stem from custom Alarm Rules.

Create Alarm Rules from Log Data

To create a new Alarm rule from log data, locate your desired log message in the Personal Dashboard or Investigate tool, and simply right-click and select Create Alarm Rule. You can create Alarm Rules directly from the Personal Dashboard in the Aggregate Log/Event List view or from the Investigate tool in the Log Viewer page.

Drilldown

Drilldown is a feature that allows you to view the detailed log data associated with an AI Engine Alarm.
Keep in mind that only Events cached in memory are available during a Drilldown operation. If the log messages that triggered the Alarm were sent directly to archives instead of being indexed in a database, perhaps due to the Classification Based Data Management (CBDM) settings, you may not see the expected data after performing a Drilldown.

Tuning Alarm Rules for Performance

Alarm Rules, and especially AI Engine Alarm Rules, add powerful capabilities to the detection of risks and threats in an environment. It might seem wise to enable all out-of-the-box Alarm Rules; however, remember that each rule comes at a cost to performance.
 

AI Engine tracks the performance of a rule with a CPU Cost value, shown in the AI Engine tab. The more logs an Alarm Rule must examine and the more filters it must process, the higher the associated CPU Cost.

Troubleshooting Alarms

When troubleshooting issues with Alarms, review the scarm.log and nfns.log files found in the following directory:

C:\Program Files\LogRhythm\LogRhythm Alarming and Response Manager\logs
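For example, using the directory above, the ARM logs can be scanned for memory warnings and other errors from a command prompt:

  REM Check the ARM diagnostic logs for maximum-memory warnings and other errors.
  findstr /i /c:"maximum memory" "C:\Program Files\LogRhythm\LogRhythm Alarming and Response Manager\logs\scarm.log"
  findstr /i "error warn" "C:\Program Files\LogRhythm\LogRhythm Alarming and Response Manager\logs\nfns.log"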


If you observe "Maximum Memory reached" error or warning messages in the scarm.log file, the MaxServiceMemory_ARM value can be increased in the Platform Manager Advanced Properties window in the Client Console (this value is entered in MB).
