The event log is a Coral feature that makes it possible to log incoming and outgoing events for every actor in a runtime. To enable the event log in your runtime, you first need to make sure that the event log is enabled on the platform.
In your Coral configuration file, set the following value:
Additionally, coral.cassandra.event-log.table
should be set to a valid table name. For the event log to work properly, a valid connection to Cassandra must be provided.
If this is all set up properly, it is now possible to enable the event log in your runtime. The event log is configured per actor, and can be enabled for the following types of events:
Incoming messages
Enable incoming messages for an actor if you want to use the event log for state recalculation through spark-batch or log-batch. You cannot enable spark-batch or log-batch for an actor which does not have incoming event logging enabled.
Outgoing messages
Enable outgoing message logging for an actor if you want to use the event log for the following reasons:
suspicious
and marks all suspicious transactions with true
and the others with false
, you can perform an offline analysis to investigate properties of suspicious transactions in a Spark job.Both incoming and outgoing messages are written to the same event log table in Cassandra. By default, both incoming and outgoing event logging are disabled for all actors.
Please keep in mind that event logging can create a large load on your Cassandra instance and on Coral in general. If an actor receives 100.000 events per second, this means that 100.000 events per second will be written to the event log table if incoming messages are logged for an actor. If outgoing messages are also logged, there can be a maximum of 200.000 messages written to Cassandra per second, and this is just for a single actor. If we assume that a single event is 1 KB, a single actor alone will write 97 MB to the event log per second. To put this into perspective, this is almost 3 petabyte per year (3.079.687.500 MB)!
The number of messages becomes even bigger if you have multiple actors in your runtime and you enable event logging for incoming and outgoing events for all of them. It is therefore recommended to only turn on event logging for certain actors if you really need it, and to be very reluctant to enable incoming and outgoing event logging for all of your actors in your runtime.
It is generally a good idea to think really hard about what you want to log. There are no hard rules, but the following guidelines could provide helpful:
Overall, try to reduce the number of logged messages to an absolute minimum.
The log file table is structured as follows:
Name | Type | Description |
logtype | int | 1 for incoming and 2 for outgoing messages |
owner | text | the unique name of the owner of the runtime |
runtime | text | the name of the runtime |
actor | text | the name of the actor |
day | int | the day, ranging from 1 to 31 in which the event was logged |
hour | int | the hour, ranging from 0 to 23 in which the event was logged |
timestamp | bigint | the timestamp on which the event was logged |
json | text | the JSON object in textual representation |
fjson | text | the flattened JSON in textual representation |
The primary key for the event log table is (logtype, owner, runtime, actor, day, hour)
. This means that if you want to query for events spanning multiple hours, days, actors, runtimes or owners, you have to break the ranges up in separate queries.
Below, an example of a fraction of an actor definition is shown, wich uses incoming event logging to calculate spark batch every hour:
Note that it would have been an invalid runtime definition if incoming
was set to false
and spark-batch
was enabled for this actor.
As described in the section about state calculation, the event log can be turned on or off for individual actors, and for incoming and outgoing messages separately. The event log is meant as a mechanism to store events to perform long-running state calculations on and to make offline analysis possible. The event log, however, is not meant as a mechanism for fast recovery after an actor, runtime or machine crashes.
For this reason, Coral implements persistency for resiliency purposes separately from the event log. In contrast to the event log, Akka persistence cannot be turned on or off for individual actors, and is always enabled for all actors. It can be turned on or off for the entire platform, however.
The number of events and the number of snapshots stored in the persistency tables can be configured, but it is recommended not to extend the storage period for the persistency tables beyond the minimum amount of time needed to store a single snapshot and all events after that. For instance, if an actor is configured to calculate a new snapshot every hour, it is sufficient to configure persistency to keep data for a little longer than an hour. Older snapshots can safely be discarded.
Keep in mind that persistency is turned on for all actors by default, meaning that the size of the storage can grow very fast. It is therefore recommended to keep the time to store persistency snapshots and events as short as possible.