Appuri Documentation

Welcome to the Appuri Documentation. You'll find comprehensive guides and instructions to help you start working with Appuri as quickly as possible, as well as support if you get stuck.

The status page gives an overview of the state of your Appuri Data Warehouse and of all the event data being transferred into it.

This includes disk utilization, free space, and errors. However, there are some caveats, so please read the following sections carefully.

Event Activity

This is a graph of events being transferred into your data warehouse, together with a log of how many events were processed over time. It is particularly useful for developers who want to verify that activity is being transferred and processed into the Appuri Data Warehouse, but it can be misleading at first glance. The sections below briefly explain each Event Activity metric; please feel free to reach out if you have any questions.

Events Collected

This is an estimated count of all events received by the Appuri Data Warehouse. Each event sent will appear here, although some may later be filtered out by the de-duplication and validation processes. This count is not intended for auditing purposes, but it is useful for estimating utilization and storage requirements.

Events Ingested

This is an estimated count of unique events available in Redshift (and the Appuri Platform) after filtering, de-duplication, and validation. From time to time, this number may appear larger than Events Collected (due to data normalization), smaller (due to heavy de-duplication), or roughly equal. This is normal: the count is not intended for auditing purposes, only for estimating utilization and storage requirements.

Average Backlog

This is the average number of collected events that are queued and waiting to be processed. Because Appuri performs several steps to schematize, de-duplicate, and normalize your data, a backlog will occur from time to time. The average backlog helps us allocate additional computation resources to accounts with heavy data ingestion.

Cluster Utilization

This shows the amount of disk space available to your Redshift cluster and how much of it is currently in use. Please note that 100% utilization is not possible for a Redshift cluster. Because background jobs need some free space to run, we recommend resizing to a larger cluster once utilization reaches 70%. In most cases, you will experience delays when querying data, and background jobs will begin to fail, once utilization reaches about 75%.

We send automated warning messages at 80% utilization, and stop event ingestion at 90% to prevent system failures.
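The thresholds described above can be summarized in a short Python sketch. This is only an illustration of the documented percentages, not Appuri's implementation: the function names, the status labels, and the idea of computing utilization from used and total gigabytes are all assumptions made for the example.

```python
# Thresholds (percent of disk used) as stated in the Cluster Utilization text.
RESIZE_RECOMMENDED = 70   # resizing to a larger cluster is recommended
JOB_FAILURES_LIKELY = 75  # query delays and background job failures begin
WARNING_SENT = 80         # automated warning messages are sent
INGESTION_STOPPED = 90    # event ingestion is stopped


def utilization_percent(used_gb: float, total_gb: float) -> float:
    """Return disk utilization as a percentage (hypothetical helper)."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return 100.0 * used_gb / total_gb


def utilization_status(percent: float) -> str:
    """Map a utilization percentage to an illustrative status label."""
    if percent >= INGESTION_STOPPED:
        return "ingestion stopped"
    if percent >= WARNING_SENT:
        return "warning sent"
    if percent >= JOB_FAILURES_LIKELY:
        return "job failures likely"
    if percent >= RESIZE_RECOMMENDED:
        return "resize recommended"
    return "ok"


if __name__ == "__main__":
    # Example: 720 GB used out of 1000 GB of cluster disk space.
    pct = utilization_percent(720, 1000)
    print(f"{pct:.0f}% -> {utilization_status(pct)}")
```

Checking the most severe threshold first means each utilization value maps to exactly one label, so a cluster at 85% is reported as having triggered a warning rather than merely being due for a resize.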

Frequently Asked Questions

Q. My utilization is only at 80%. Why am I being asked to add a new Redshift node to my cluster?

A. Most tasks in Redshift require temporary space to complete. Because some jobs begin to fail at as little as 75% utilization, we recommend adding nodes or enabling a retention policy at this point to reduce utilization. See Cluster Utilization above for more details.

Q. How can I increase my cluster size to make more space available?

A. You can add nodes to your cluster on the Settings page. Note that only users with the Admin role may increase the node count.
