RabbitMQ Stream PerfTest is a Java tool to test the RabbitMQ Stream Plugin.
Getting Started
Stream PerfTest is based on the stream Java client and uses the stream protocol to communicate with a RabbitMQ cluster. Use PerfTest if you want to test streams or queues with AMQP 0.9.1.
Stream PerfTest is usable as an uber JAR downloadable from GitHub Release or as a Docker image. It can be built separately as well.
Snapshots are on GitHub Release as well. Use the pivotalrabbitmq/stream-perf-test:dev image to use the latest snapshot in Docker.
Pre-requisites
Stream PerfTest requires Java 11 or later to run.
With Docker
The performance tool is available as a Docker image . You can use the Docker image to list the available options:
docker run -it --rm pivotalrabbitmq/stream-perf-test --help
There are all sorts of options; if none is provided, the tool will start publishing to and consuming from a stream created only for the test.
When using Docker, the container running the performance tool must be able to connect to the broker, so you have to figure out the appropriate Docker configuration to make this possible. You can have a look at the Docker network documentation to find out more.
Docker on macOS
Docker runs in a virtual machine on macOS, so do not expect high performance when using RabbitMQ Stream and the performance tool inside Docker on a Mac.
The next sections show a couple of options to use the Docker image easily.
With Docker Host Network Driver
This is the simplest way to run the image locally, with a local broker running in Docker as well. The containers use the host network, which is perfect for experimenting locally.
# run the broker
docker run -it --rm --name rabbitmq --network host rabbitmq:4.0
# open another terminal and enable the stream plugin
docker exec rabbitmq rabbitmq-plugins enable rabbitmq_stream
# run the performance tool
docker run -it --rm --network host pivotalrabbitmq/stream-perf-test
Docker Host Network Driver Support
According to Docker’s documentation, the host networking driver only works on Linux hosts. Nevertheless, the commands above work on some Mac hosts.
With Docker Bridge Network Driver
With the bridge network driver, containers need to be able to communicate with each other. This can be done by defining a network and running the containers in this network.
# create a network
docker network create stream-perf-test
# run the broker
docker run -it --rm --network stream-perf-test --name rabbitmq rabbitmq:4.0
# open another terminal and enable the stream plugin
docker exec rabbitmq rabbitmq-plugins enable rabbitmq_stream
# run the performance tool
docker run -it --rm --network stream-perf-test pivotalrabbitmq/stream-perf-test \
--uris rabbitmq-stream://rabbitmq:5552
With the Java Binary
The Java binary is available on GitHub Release . Snapshots are available as well. To use the latest snapshot:
wget https://github.com/rabbitmq/rabbitmq-java-tools-binaries-dev/releases/download/v-stream-perf-test-latest/stream-perf-test-latest.jar
To launch a run:
$ java -jar stream-perf-test-latest.jar
17:51:26.207 [main] INFO c.r.stream.perf.StreamPerfTest - Starting producer
1, published 560277 msg/s, confirmed 554088 msg/s, consumed 556983 msg/s, latency min/median/75th/95th/99th 2663/9799/13940/52304/57995 µs, chunk size 1125
2, published 770722 msg/s, confirmed 768209 msg/s, consumed 768585 msg/s, latency min/median/75th/95th/99th 2454/9599/12206/23940/55519 µs, chunk size 1755
3, published 915895 msg/s, confirmed 914079 msg/s, consumed 916103 msg/s, latency min/median/75th/95th/99th 2338/8820/11311/16750/52985 µs, chunk size 2121
4, published 1004257 msg/s, confirmed 1003307 msg/s, consumed 1004981 msg/s, latency min/median/75th/95th/99th 2131/8322/10639/14368/45094 µs, chunk size 2228
5, published 1061380 msg/s, confirmed 1060131 msg/s, consumed 1061610 msg/s, latency min/median/75th/95th/99th 2131/8247/10420/13905/37202 µs, chunk size 2379
6, published 1096345 msg/s, confirmed 1095947 msg/s, consumed 1097447 msg/s, latency min/median/75th/95th/99th 2131/8225/10334/13722/33109 µs, chunk size 2454
7, published 1127791 msg/s, confirmed 1127032 msg/s, consumed 1128039 msg/s, latency min/median/75th/95th/99th 1966/8150/10172/13500/23940 µs, chunk size 2513
8, published 1148846 msg/s, confirmed 1148086 msg/s, consumed 1149121 msg/s, latency min/median/75th/95th/99th 1966/8079/10135/13248/16771 µs, chunk size 2558
9, published 1167067 msg/s, confirmed 1166369 msg/s, consumed 1167311 msg/s, latency min/median/75th/95th/99th 1966/8063/9986/12977/16757 µs, chunk size 2631
10, published 1182554 msg/s, confirmed 1181938 msg/s, consumed 1182804 msg/s, latency min/median/75th/95th/99th 1966/7963/9949/12632/16619 µs, chunk size 2664
11, published 1197069 msg/s, confirmed 1196495 msg/s, consumed 1197291 msg/s, latency min/median/75th/95th/99th 1966/7917/9955/12503/15386 µs, chunk size 2761
12, published 1206687 msg/s, confirmed 1206176 msg/s, consumed 1206917 msg/s, latency min/median/75th/95th/99th 1966/7893/9975/12503/15280 µs, chunk size 2771
...
^C
Summary: published 1279444 msg/s, confirmed 1279019 msg/s, consumed 1279019 msg/s, latency 95th 12161 µs, chunk size 2910
The previous command will start publishing to and consuming from a stream named stream, which will be created. The tool outputs live metrics on the console and writes more detailed metrics in a stream-perf-test-current.txt file that gets renamed to stream-perf-test-yyyy-MM-dd-HHmmss.txt when the run ends.
To see the options:
java -jar stream-perf-test-latest.jar --help
The performance tool also comes with a completion script. You can download it and enable it in your ~/.zshrc file:

alias stream-perf-test='java -jar target/stream-perf-test.jar'
source ~/.zsh/stream-perf-test_completion

Note the activation requires an alias, which must be stream-perf-test. The command can be anything though.
Common Usage
Connection
The performance tool connects by default to localhost, on port 5552, with default credentials (guest/guest), on the default / virtual host. This can be changed with the --uris option:

java -jar stream-perf-test.jar --uris rabbitmq-stream://rabbitmq-1:5552

The URI follows the same rules as the AMQP 0.9.1 URI, except the protocol must be rabbitmq-stream.
The next command shows how to set up the different elements of the URI:
java -jar stream-perf-test.jar \
  --uris rabbitmq-stream://guest:guest@localhost:5552/%2f
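The trailing %2f in the URI is the default / virtual host, percent-encoded. A quick shell sketch (illustrative only, not part of the tool) shows where %2f comes from:

```shell
# '/' is ASCII 0x2f, so the default virtual host is
# percent-encoded as %2f in the URI
encoded=$(printf '%%%02x' "'/")   # leading quote makes printf emit the char code
echo "$encoded"                   # prints %2f
```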
The option accepts several values, separated by commas. By doing so, the tool will be able to pick another URI for its "locator" connection, in case a node crashes:
java -jar stream-perf-test.jar \
  --uris rabbitmq-stream://rabbitmq-1:5552,rabbitmq-stream://rabbitmq-2:5552
Note the tool uses those URIs only for management purposes; it does not use them to distribute publishers and consumers across a cluster.
It is also possible to enable TLS by using the rabbitmq-stream+tls scheme:

java -jar stream-perf-test.jar \
  --uris rabbitmq-stream+tls://guest:guest@localhost:5551/%2f
Note the performance tool will automatically configure the client to trust all server certificates and to not use a private key (for client authentication).
Have a look at the connection logic section of the stream Java client in case of connection problems.
Publishing Rate
It is possible to limit the publishing rate with the --rate option:
java -jar stream-perf-test.jar --rate 10000
RabbitMQ Stream can easily saturate the resources of the hardware; it can especially max out the storage IO. Reasoning about a system under severe constraints can be difficult, so setting a low publishing rate is a good way to get familiar with the performance tool and the semantics of streams.
Number of Producers and Consumers
You can set the number of producers and consumers with the --producers and --consumers options, respectively:
java -jar stream-perf-test.jar --producers 5 --consumers 5
With the previous command, you should see a higher consuming rate than publishing rate. This is because the 5 producers publish as fast as they can and each consumer consumes the messages from all 5 publishers. In theory the consuming rate should be 5 times the publishing rate, but as stated previously, the performance tool may put the broker under severe constraints, so the numbers may not add up.
You can set a low publishing rate to verify this theory:
java -jar stream-perf-test.jar --producers 5 --consumers 5 --rate 10000
With the previous command, each producer should publish 10,000 messages per second, that is 50,000 messages per second overall. As each consumer consumes every published message, the consuming rate should be 5 times the publishing rate, that is 250,000 messages per second. Using a small publishing rate should leave plenty of resources to the system, so the rates should tend towards those values.
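The arithmetic behind those expected rates can be spelled out explicitly; the sketch below just restates the reasoning above, it is not something the tool computes for you:

```shell
producers=5
consumers=5
rate=10000                                   # per-producer rate cap (--rate)

publish_rate=$((producers * rate))           # overall publishing rate
consume_rate=$((consumers * publish_rate))   # every consumer gets every message
echo "$publish_rate"   # 50000
echo "$consume_rate"   # 250000
```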
Streams
The performance tool uses a stream named stream by default. The --streams option allows specifying streams that the tool will try to create. Note producer and consumer counts must be set accordingly, as they are not spread across the streams automatically. The following command runs a test with 3 streams, with a producer and a consumer on each of them:
java -jar stream-perf-test.jar --streams stream1,stream2,stream3 \
  --producers 3 --consumers 3
The stream creation process has the following semantics:
- the tool always tries to create streams.
- if the target streams already exist and have the exact same properties as the ones the tool uses (see retention below), the run will start normally as stream creation is idempotent.
- if the target streams already exist but do not have the exact same properties as the ones the tool uses, the run will start normally as well, and the tool will output a warning.
- for any other errors during creation, the run will stop.
- the streams are not deleted after the run.
- if you want the tool to delete the streams after a run, use the --delete-streams flag.
Specifying streams one by one can become tedious as their number grows, so the --stream-count option can be combined with the --streams option to specify a number or a range and a stream name pattern, respectively. The following table shows the usage of these 2 options and the resulting exercised streams. Do not forget to also specify the appropriate number of producers and consumers if you want all the declared streams to be used.
Options | Computed Streams | Details |
---|---|---|
--stream-count 5 | stream-1,stream-2,stream-3,stream-4,stream-5 | Stream count starts at 1. |
--stream-count 5 --streams new-stream-%d | new-stream-1,new-stream-2,new-stream-3,new-stream-4,new-stream-5 | Possible to specify a Java printf-style format string . |
--stream-count 10 --streams stream-%d | stream-1,stream-2,…,stream-9,stream-10 | Not bad, but not correctly sorted alphabetically. |
--stream-count 10 --streams stream-%02d | stream-01,stream-02,…,stream-09,stream-10 | Better for sorting. |
--stream-count 10 | stream-01,stream-02,…,stream-09,stream-10 | The default format string handles the sorting issue. |
--stream-count 50-500 | stream-050,stream-051,…,stream-499,stream-500 | Ranges are accepted. |
--stream-count 50-500 --streams stream-%03d | stream-050,stream-051,…,stream-499,stream-500 | Default format string. |
Publishing Batch Size
The default publishing batch size is 100, that is, a publishing frame is sent every 100 messages. The following command sets the batch size to 50 with the --batch-size option:
java -jar stream-perf-test.jar --batch-size 50
There is no ideal batch size; it is a tradeoff between throughput and latency. High batch size values should increase throughput (usually good) and latency (usually not so good), whereas low batch size values should decrease throughput (usually not good) and latency (usually good).
Unconfirmed Messages
A publisher can have at most 10,000 unconfirmed messages at any point. If it reaches this value, it has to wait until the broker confirms some messages. This avoids fast publishers overwhelming the broker. The --confirms option allows changing the default value:
java -jar stream-perf-test.jar --confirms 20000
High values should increase throughput at the cost of consuming more memory, whereas low values should decrease throughput and memory consumption.
Message Size
The default size of a message is 10 bytes, which is rather small. The --size option lets you specify a different size, usually higher, to have a value close to your use case. The next command sets a size of 1 KB:
java -jar stream-perf-test.jar --size 1024
Note the message body size cannot be smaller than 8 bytes, as the performance tool stores a long in each message to calculate the latency. Note also the actual size of a message will be slightly higher, as the body is wrapped in an AMQP 1.0 message.
Connection Pooling
The performance tool does not use connection pooling by default: each producer and consumer has its own connection. This can be appropriate to reach maximum throughput in performance test runs, as producers and consumers do not share connections. But it may not always reflect what applications do, as they may have slow producers and not-so-busy consumers, so sharing connections becomes interesting to save some resources.
It is possible to configure connection pooling with the --producers-by-connection and --consumers-by-connection options. They accept a value between 1 and 255, the default being 1 (no connection pooling).
In the following example we use 10 streams with 1 producer and 1 consumer on each of them. As the rate is low, we can re-use connections:
java -jar stream-perf-test.jar --producers 10 --consumers 10 --stream-count 10 \
  --rate 1000 \
  --producers-by-connection 50 --consumers-by-connection 50
With connection pooling, we end up using 2 connections for the producers and consumers, instead of 20.
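The resulting connection count is a ceiling division of the producer (or consumer) count by the pool size; the sketch below restates that arithmetic (this is an assumption about the pooling model for illustration, not tool output):

```shell
producers=10
per_connection=50   # --producers-by-connection

# ceiling division: connections needed to host all producers
connections=$(( (producers + per_connection - 1) / per_connection ))
echo "$connections"   # 1 for the producers (likewise 1 for the consumers: 2 overall)
```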
Advanced Usage
Retention
If you run performance tests for a long time, you might be interested in setting a retention strategy for the streams the performance tool creates for a run. This would typically avoid saturating the storage devices of your servers. The default values are 20 GB for the maximum size of a stream and 500 MB for each segment file that composes a stream. You can change these values with the --max-length-bytes and --stream-max-segment-size-bytes options:
java -jar stream-perf-test.jar --max-length-bytes 10gb \
  --stream-max-segment-size-bytes 250mb
Both options accept units (kb, mb, gb, tb), as well as no unit to specify a number of bytes.
It is also possible to use the time-based retention strategy with the --max-age option. This can be less predictable than --max-length-bytes in the context of performance tests though. The following command shows how to set the maximum age of segments to 5 minutes with a maximum segment size of 250 MB:
java -jar stream-perf-test.jar --max-age PT5M \
  --stream-max-segment-size-bytes 250mb
The --max-age option uses the ISO 8601 duration format.
Offset (Consumer)
Consumers start by default at the very end of a stream (offset next). It is possible to specify an offset to start from with the --offset option, if you have existing streams and you want to consume from them at a specific offset. The following command sets the consumer to start consuming at the beginning of a stream:
java -jar stream-perf-test.jar --offset first
The accepted values for --offset are first, last, next (the default), an unsigned long for a given offset, and an ISO 8601 formatted timestamp (e.g. 2020-06-03T07:45:54Z).
Offset Tracking (Consumer)
A consumer can track the point it has reached in a stream to be able to restart where it left off in a new incarnation. The performance tool has the --store-every option to tell consumers to store the offset every x messages, to be able to measure the impact of offset tracking in terms of throughput and storage. This feature is disabled by default. The following command shows how to store the offset every 100,000 messages:
java -jar stream-perf-test.jar --store-every 100000
Consumer Names
When using --store-every (see above) for offset tracking, the performance tool uses a default name with the pattern {stream-name}-{consumer-number}. So the default name of a single tracking consumer consuming from the stream stream will be stream-1.
The consumer name pattern can be set with the --consumer-names option, which uses the Java printf-style format string. The stream name and the consumer number are injected as arguments, in this order.
The following table illustrates some examples for the --consumer-names option for a s1 stream and a second consumer:
Option | Computed Name | Details |
---|---|---|
%s-%d | s1-2 | Default pattern. |
consumer-%s-%d | consumer-s1-2 | |
%2$d-%1$s | 2-s1 | The argument indexes (%1$s for the stream, %2$d for the consumer number) can change the injection order. |
uuid | (random UUID) | Random UUID that changes for every run. |
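As with stream names, the shell's printf can preview the simple consumer name patterns (an illustrative sketch; positional indexes like %1$s are Java-specific and will not work in the shell):

```shell
# default pattern: stream name injected first, consumer number second
printf '%s-%d\n' s1 2   # prints s1-2
```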
Note you can use --consumer-names uuid to change the consumer names for every run. This can be useful when you want to use tracking consumers in different runs but you want to force the offset they start consuming from. With consumer names that do not change between runs, tracking consumers would ignore the specified offset and would start where they left off (this is the purpose of offset tracking).
Producer Names
You can use the --producer-names option to set the producer name pattern and therefore enable message deduplication (using the default publishing sequence starting at 0 and incremented for each message). The same naming options apply as above in consumer names, with the only difference that the default pattern is empty (i.e. no deduplication). Here is an example of the usage of the --producer-names option:
java -jar stream-perf-test.jar --producer-names %s-%d
The run will start one producer and will use the stream-1 producer reference (the default stream is stream and the number of the producer is 1).
Load Balancer in Front of the Cluster
A load balancer can misguide the performance tool when it tries to connect to nodes that host stream leaders and replicas. The Connecting to Streams blog post covers why client applications must connect to the appropriate nodes in a cluster.
Use the --load-balancer flag to make sure the performance tool always goes through the load balancer that sits in front of your cluster:
java -jar stream-perf-test.jar --uris rabbitmq-stream://my-load-balancer:5552 \
  --load-balancer
The same blog post covers why a load balancer can make things more complicated for client applications like the performance tool and how they can mitigate these issues.
Single Active Consumer
If the --single-active-consumer flag is set, the performance tool will create single active consumer instances. This means that if there are more consumers than streams, there will be only one active consumer at a time on a stream, provided they share the same name. Note offset tracking gets enabled automatically with --single-active-consumer if it is not already (using 10,000 for --store-every).
Let’s see a couple of examples.
In the following command we have 1 producer publishing to 1 stream and 3 consumers on this stream. As --single-active-consumer is used, only one of these consumers will be active at a time.
java -jar stream-perf-test.jar --producers 1 --consumers 3 --single-active-consumer \
  --consumer-names my-app
Note we use a fixed value for the consumer names: if they don’t have the same name, the broker will not consider them as a group of consumers, so they will all get messages, like regular consumers.
In the following example we have 2 producers for 2 streams and 6 consumers overall (3 for each stream). Note the consumers have the same name on their streams thanks to --consumer-names my-app-%s, as %s is a placeholder for the stream name.
java -jar stream-perf-test.jar --producers 2 --consumers 6 --stream-count 2 \
  --single-active-consumer --consumer-names my-app-%s
Super Streams
The performance tool has a --super-streams flag to enable super streams on the publisher and consumer sides. This support is meant to be used with the --single-active-consumer flag, to benefit from both features. We recommend reading the appropriate sections of the documentation to understand the semantics of the flags before using them.
Let’s see some examples.
The example below creates 1 producer and 3 consumers on the default stream stream, which is now a super stream because of the --super-streams flag:
java -jar stream-perf-test.jar --producers 1 --consumers 3 --single-active-consumer \
  --super-streams --consumer-names my-app
The performance tool creates 3 individual streams by default; they are the partitions of the super stream. They are named stream-0, stream-1, and stream-2, after the name of the super stream, stream.
The producer will publish to each of them, using a hash-based routing strategy .
A consumer is composite with --super-streams: it creates a consumer instance for each partition. This makes 9 consumer instances overall (3 composite consumers times 3 partitions), spread evenly across the partitions, but with only one active at a time on a given stream. Note we use a fixed consumer name so that the broker considers the consumers to belong to the same group and enforces the single active consumer behavior.
The next example is more convoluted. We are going to work with 2 super streams (--stream-count 2 and --super-streams). Each super stream will have 5 partitions (--super-stream-partitions 5), so this is 10 streams overall (stream-1-0 to stream-1-4 and stream-2-0 to stream-2-4).
Here is the command line:
java -jar stream-perf-test.jar --producers 2 --consumers 6 --stream-count 2 \
  --super-streams --super-stream-partitions 5 \
  --single-active-consumer \
  --consumer-names my-app-%s
We see also that each super stream has 1 producer (--producers 2) and 3 consumers (--consumers 6). The composite consumers will spread their consumer instances across the partitions. Each partition will have 3 consumer instances but only 1 active at a time, thanks to --single-active-consumer and --consumer-names my-app-%s (the consumers on a given stream have the same name, so the broker makes sure only one consumes at a time).
Note the performance tool does not use connection pooling by default. The command above opens a significant number of connections (30 just for consumers) and may not reflect exactly how applications are deployed in the real world. Don't hesitate to use the --producers-by-connection and --consumers-by-connection options to make the runs as close to your workloads as possible.
Monitoring
The tool can expose some runtime information on HTTP. The default port is 8080. The following options are available:
- --monitoring: adds a threaddump endpoint to display a thread dump of the process. This can be useful to inspect threads if the tool seems blocked.
- --prometheus: adds a metrics endpoint to expose metrics using the Prometheus format. The endpoint can then be declared in a Prometheus instance to scrape the metrics.
- --monitoring-port: sets the port to use for the web server.
Synchronizing Several Instances
This feature is available only on Java 11 or later.
Instances of the performance tool can synchronize to start at the same time.
This can prove useful when you apply different workloads and want to compare them on the same monitoring graphics.
The --id flag identifies the group of instances that need to synchronize and the --expected-instances flag sets the size of the group.
Let’s start a couple of instances to compare the impact of message size. The first instance uses 100-byte messages:
java -jar stream-perf-test.jar --id msg-size-comparison --expected-instances 2 \
  --size 100
The instance will wait until the second one is ready:
java -jar stream-perf-test.jar --id msg-size-comparison --expected-instances 2 \
  --size 200
Both instances must share the same --id if they are to communicate and synchronize. The default synchronization timeout is 10 minutes. This can be changed with the --instance-sync-timeout flag, using a value in seconds.
Instance synchronization is compatible with PerfTest, the AMQP 0.9.1 performance tool for RabbitMQ: instances of both tools can synchronize with each other. The 2 tools use the same flags for this feature.
Instance synchronization requires IP multicast to be available.
IP multicast is not necessary when the performance tool runs on Kubernetes pods.
In this case, the tool asks Kubernetes for a list of pod IPs.
The performance tool instances are expected to run in the same namespace, and the namespace must be available in the MY_POD_NAMESPACE environment variable or provided with the --instance-sync-namespace flag.
As soon as the namespace information is available, the tool will prefer listing pod IPs over using IP multicast.
Here is an example of using instance synchronization on Kubernetes by providing the namespace explicitly:
java -jar stream-perf-test.jar --id workload-1 --expected-instances 2 \
  --instance-sync-namespace qa
The performance tool needs permission to ask Kubernetes for a list of pod IPs. This is done by creating various policies, e.g. with YAML. See the Kubernetes discovery protocol for JGroups page for more information.
Using Environment Variables as Options
Environment variables can sometimes be easier to work with than command line options. This is especially true when using a manifest file for configuration (with Docker Compose or Kubernetes) and the number of options used grows.
The performance tool automatically uses environment variables that match the snake case version of its long options. E.g. it automatically picks up the value of the BATCH_SIZE environment variable for the --batch-size option, but only if the environment variable is defined.
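The option-to-variable mapping is mechanical: drop the leading dashes, uppercase, and turn dashes into underscores. A shell sketch of the rule (the helper name is made up for illustration):

```shell
# map a long option name to its environment variable equivalent
# (illustrative helper, not part of the tool)
opt_to_env() {
  echo "${1#--}" | tr 'a-z-' 'A-Z_'
}

opt_to_env --batch-size     # BATCH_SIZE
opt_to_env --stream-count   # STREAM_COUNT
```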
You can list the environment variables that the tool picks up with the following command:
java -jar stream-perf-test.jar --environment-variables
The short version of the option is -env
.
To avoid collisions with environment variables that already exist, it is possible to specify a prefix for the environment variables that the tool looks up.
This prefix is defined with the RABBITMQ_STREAM_PERF_TEST_ENV_PREFIX environment variable, e.g.:

RABBITMQ_STREAM_PERF_TEST_ENV_PREFIX="STREAM_PERF_TEST_"

With RABBITMQ_STREAM_PERF_TEST_ENV_PREFIX="STREAM_PERF_TEST_" defined, the tool looks for the STREAM_PERF_TEST_BATCH_SIZE environment variable, not BATCH_SIZE.
Logging
The performance tool binary uses Logback with an internal configuration file. The default log level is warn with a console appender. It is possible to define loggers directly from the command line; this is useful for quick debugging. Use the rabbitmq.streamperftest.loggers system property with name=level pairs, e.g.:

java -Drabbitmq.streamperftest.loggers=com.rabbitmq.stream=debug -jar stream-perf-test.jar

It is possible to define several loggers by separating them with commas, e.g. -Drabbitmq.streamperftest.loggers=com.rabbitmq.stream=debug,com.rabbitmq.stream.perf=info.
It is also possible to use an environment variable:
export RABBITMQ_STREAM_PERF_TEST_LOGGERS=com.rabbitmq.stream=debug
The system property takes precedence over the environment variable.
Use the environment variable with the Docker image:
docker run -it --rm --network host \
  --env RABBITMQ_STREAM_PERF_TEST_LOGGERS=com.rabbitmq.stream=debug \
  pivotalrabbitmq/stream-perf-test