# Scale and Size Honeycomb Refinery

> Size and scale your Refinery cluster using Honeycomb's recommended configuration options and the Refinery Operations Board Template to track performance.

export const CalloutExample = ({children}) => {
  return <Callout icon="clipboard-check" color="#6B7280">
      {children}
    </Callout>;
};

<Tip>
  Use the [Refinery Board Template](/observe/boards/templates/#refinery-operations) to create Boards that provide an overview of your sampling operations.
</Tip>

Refinery offers a range of [configuration options](/manage-data-volume/sample/honeycomb-refinery/configure/) to help operators tune it for varying volumes and shapes of telemetry data.

After your initial [setup](/manage-data-volume/sample/honeycomb-refinery/set-up/), we recommend increasing RAM and CPU cores as needed.
Use the guidance on this page for scaling, and consult our [troubleshooting](/troubleshoot/common-issues/refinery/) documentation for additional support.

<Note>
  Refinery is a stateful service and is not optimized for dynamic auto-scaling.
  Changes in cluster membership can temporarily cause inconsistent sampling decisions or dropped traces.
  We recommend provisioning Refinery for your anticipated peak load.
</Note>

## Understanding Stress Relief

Refinery includes a built-in mechanism called [Stress Relief](/manage-data-volume/sample/honeycomb-refinery/configure/#stress-relief) that activates when the system is under heavy load.
Frequent or prolonged activations indicate that Refinery is under-provisioned for the current load.
You can monitor this via the `stress_relief_activated` field in Refinery internal metrics.

### Identifying Resources to Adjust

To determine which resources need to be increased, check the activation reasons in Refinery logs:

1. Look for log messages like `StressRelief has been activated`.
2. Check the `reason` field in the log message to understand what triggered the activation.
   For example, a reason of [`MaxAlloc`](/manage-data-volume/sample/honeycomb-refinery/configure/#maxalloc) indicates a sudden memory usage spike.
3. Use this information to determine which resources need to be increased, such as memory, CPU, or queue sizes.
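If you have many activations to review, tallying them by reason can show which resource to look at first. The following is a minimal Python sketch, not an official tool. It assumes your log records have already been parsed into dictionaries; the `msg` key name and the sample records are assumptions for illustration, so adjust them to match your logger's actual output.

```python
from collections import Counter

# Message text to look for, as described in step 1 above.
ACTIVATION_MESSAGE = "StressRelief has been activated"


def count_activation_reasons(records, message_key="msg", reason_key="reason"):
    """Tally Stress Relief activations by reason from parsed log records.

    `records` is an iterable of dicts. The key names are assumptions;
    pass different ones if your log format uses other field names.
    """
    reasons = Counter()
    for record in records:
        # Only count records that announce an activation.
        if ACTIVATION_MESSAGE in str(record.get(message_key, "")):
            reasons[record.get(reason_key, "unknown")] += 1
    return reasons


# Hypothetical sample records, for illustration only.
sample_records = [
    {"msg": "StressRelief has been activated", "reason": "MaxAlloc"},
    {"msg": "StressRelief has been activated", "reason": "MaxAlloc"},
    {"msg": "StressRelief has been activated"},
    {"msg": "some unrelated log line"},
]

for reason, count in count_activation_reasons(sample_records).most_common():
    print(f"{reason}: {count}")
```

A reason that dominates the tally is usually the best starting point for deciding which resource to increase.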

## Scaling Refinery

Scaling Refinery effectively involves choosing the right balance between vertical and horizontal scaling.

### Vertical vs. Horizontal Scaling

We recommend prioritizing vertical scaling (adding resources to existing nodes) over horizontal scaling (adding more nodes) whenever possible.
This approach:

* Reduces cluster size
* Decreases the amount of peer-to-peer communication traffic
* Simplifies management by having fewer nodes to maintain

Make sure a small number of nodes can handle your peak load before adding instances.

<Info>
  Refinery's maximum throughput is limited by single-thread CPU performance.
  If Refinery is not using all of its allocated CPU but is still falling behind on incoming traffic, adding more CPU to a single host will not increase throughput.
  In this case, increase cluster size to add parallel Refinery instances.
</Info>

## Managing Queues and Mapping Resources

Queues control how spans are buffered before sampling.
Proper queue configuration ensures that Refinery can handle peak load efficiently.

### Configuring `IncomingQueueSize`

The [`IncomingQueueSize`](/manage-data-volume/sample/honeycomb-refinery/configure/#incomingqueuesize) value sets the maximum number of spans that a Refinery host can receive and queue for sampling.
Monitor the current queue size using the `collector_incoming_queue_length` metric and watch for `incoming_router_dropped` values above 0.

#### Interpreting Queue Length Metrics

Understand what queue behavior tells you about Refinery’s ability to handle incoming traffic.

* **Temporary increases:** Normal during traffic spikes, when Refinery briefly cannot process incoming data as fast as it arrives.
* **Rising trend:** Indicates Refinery is gradually falling behind the incoming load.
* **Queue at maximum:** Indicates Refinery cannot handle peak load and is dropping data.
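The three patterns above can be sketched as a simple classifier over a time-ordered window of `collector_incoming_queue_length` samples. This is an illustrative heuristic only: the function name, the labels, and every threshold (`full_fraction`, `rising_ratio`, `near_peak`, `drained_fraction`) are assumptions, not values recommended by Refinery.

```python
from statistics import mean


def classify_queue(samples, queue_size, full_fraction=0.99,
                   rising_ratio=1.5, near_peak=0.9, drained_fraction=0.5):
    """Classify a time-ordered window of queue-length samples.

    Returns "at_maximum", "rising", "temporary_spike", or "steady".
    All thresholds are illustrative defaults, not official guidance.
    """
    if not samples:
        raise ValueError("need at least one sample")

    peak = max(samples)
    latest = samples[-1]

    # The queue reached (or nearly reached) its configured size.
    if peak >= queue_size * full_fraction:
        return "at_maximum"

    # Rising trend: the later half averages well above the earlier half,
    # and the most recent sample is still close to the peak.
    half = len(samples) // 2
    if half >= 1:
        early = mean(samples[:half])
        late = mean(samples[half:])
        if late > early * rising_ratio and latest >= peak * near_peak:
            return "rising"

    # Temporary spike: the queue grew but has since drained.
    if latest < peak * drained_fraction:
        return "temporary_spike"

    return "steady"


# Hypothetical samples, with IncomingQueueSize assumed to be 1000.
print(classify_queue([10, 10, 500, 20, 10, 10], queue_size=1000))
print(classify_queue([100, 200, 300, 400, 500, 600], queue_size=1000))
```

A longer window makes the trend check less sensitive to a single noisy sample, at the cost of reacting more slowly.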

#### Scaling Guidance

Use these steps to decide how to adjust queues, CPU, and cluster size for optimal performance.

1. **Memory check:** If `memory_inuse` stays below 80% of allocated memory, try increasing [`IncomingQueueSize`](/manage-data-volume/sample/honeycomb-refinery/configure/#incomingqueuesize) to absorb load.
2. **Queue size limitation:** Increasing queue size delays failure but does not increase overall throughput.
3. **CPU scaling:** To increase throughput, identify whether CPU is the bottleneck and scale CPU resources accordingly.
4. **Horizontal scaling:** Add instances only if vertical scaling is insufficient.
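The steps above, together with the note on single-thread CPU limits, can be sketched as a small decision helper. This is a rough illustration under stated assumptions: it reads the memory check as "memory use is under 80% of the allocation", the 90% CPU saturation threshold is invented for the example, and the function and parameter names are hypothetical.

```python
def scaling_suggestions(memory_fraction, cpu_fraction, sustained_backlog,
                        memory_threshold=0.80, cpu_saturated=0.90):
    """Suggest scaling steps in the order given in the guidance above.

    memory_fraction:   memory_inuse divided by allocated memory (0.0 to 1.0)
    cpu_fraction:      CPU in use divided by allocated CPU (0.0 to 1.0)
    sustained_backlog: True if the queue keeps growing or sits at its maximum
    The thresholds are assumptions for illustration.
    """
    suggestions = []

    # Step 1: with memory headroom, a larger queue can absorb bursts.
    # Step 2 still applies: this delays failure, it does not add throughput.
    if memory_fraction < memory_threshold:
        suggestions.append("increase IncomingQueueSize to absorb bursts")

    if sustained_backlog:
        if cpu_fraction >= cpu_saturated:
            # Step 3: CPU is the bottleneck, so scale vertically first.
            suggestions.append("add CPU to existing nodes")
        else:
            # Falling behind with idle CPU suggests the single-thread
            # limit, so more CPU on one host will not help (step 4).
            suggestions.append("add Refinery instances")

    return suggestions


# Hypothetical readings: 60% memory, 95% CPU, and a growing queue.
for step in scaling_suggestions(0.60, 0.95, sustained_backlog=True):
    print(step)
```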

### Configuring `PeerQueueSize`

The [`PeerQueueSize`](/manage-data-volume/sample/honeycomb-refinery/configure/#peerqueuesize) value sets the maximum number of spans that can be received from peer Refinery hosts and queued for sampling.

Apply the same scaling strategy as [`IncomingQueueSize`](/manage-data-volume/sample/honeycomb-refinery/configure/#incomingqueuesize), but note that adding instances to reduce peer queue length has diminishing returns: more peers increase overall cluster communication overhead, reinforcing the preference for vertical scaling.

### Configuring `AvailableMemory`

The [`AvailableMemory`](/manage-data-volume/sample/honeycomb-refinery/configure/#availablememory) value sets the maximum amount of RAM that Refinery can use for processing and queues.

#### Setting Initial Memory Values

Set memory values to ensure Refinery has enough headroom for normal operation.

* Set `AvailableMemory` to roughly 85% of total system memory.
* Set [`MaxMemoryPercentage`](/manage-data-volume/sample/honeycomb-refinery/configure/#maxmemorypercentage) to `75`, indicating that Refinery can use up to 75% of [`AvailableMemory`](/manage-data-volume/sample/honeycomb-refinery/configure/#availablememory).

<CalloutExample>
  For a 4GB system, set [`AvailableMemory`](/manage-data-volume/sample/honeycomb-refinery/configure/#availablememory) to \~3.4GB and [`MaxMemoryPercentage`](/manage-data-volume/sample/honeycomb-refinery/configure/#maxmemorypercentage) to 75% (\~2.5GB usable).
</CalloutExample>
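The same arithmetic can be applied to any host size. The helper below is a minimal sketch whose function name and defaults simply mirror the 85% and 75% figures above; it only computes numbers and does not produce Refinery configuration syntax.

```python
def memory_settings(total_gb, available_fraction=0.85, max_memory_percentage=75):
    """Return (AvailableMemory in GB, usable memory in GB) for a host.

    The defaults mirror the starting values suggested above.
    """
    available_gb = total_gb * available_fraction
    usable_gb = available_gb * max_memory_percentage / 100
    return available_gb, usable_gb


# The 4GB system from the example above.
available, usable = memory_settings(4)
print(f"AvailableMemory: about {available:.2f} GB")
print(f"Usable at MaxMemoryPercentage 75: about {usable:.2f} GB")
```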

#### Tuning Memory for Stability

Adjust memory allocations to prevent restarts and handle peak load safely.

Monitor `process_uptime_seconds` for unexpected restarts.
If Refinery restarts due to Out-of-Memory exceptions or the host's Out-of-Memory Killer, either increase the memory made available to the Refinery host or reduce [`MaxMemoryPercentage`](/manage-data-volume/sample/honeycomb-refinery/configure/#maxmemorypercentage) to provide more headroom.
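One way to spot restarts in `process_uptime_seconds` is to look for points where the value drops, since uptime only resets when the process starts again. The following is a minimal sketch that assumes you have exported a time-ordered list of samples for a single Refinery host; the function name and sample values are hypothetical.

```python
def count_restarts(uptime_samples):
    """Count restarts in a time-ordered series of uptime readings.

    Uptime normally grows between samples, so any decrease means the
    process restarted at some point between the two readings.
    """
    restarts = 0
    for previous, current in zip(uptime_samples, uptime_samples[1:]):
        if current < previous:
            restarts += 1
    return restarts


# Hypothetical samples taken once a minute from one host.
samples = [100, 160, 220, 30, 90, 10]
print(count_restarts(samples))
```

If the process restarts more than once between two samples, this counts only one restart, so a shorter sampling interval gives a more accurate count.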
