JDBC Wrapper Valkey Cache Test — The ElastiCache Serverless timeout pitfall
Introduction
In the previous article, we verified that the Remote Query Cache Plugin with node-based ElastiCache for Valkey and 1 million rows can improve query latency by up to 300x.
This article shares the results of testing the same plugin with ElastiCache for Valkey Serverless. The key finding: the first connection to ElastiCache Serverless always times out, causing CacheMonitor to enter an unstable state. For background, see the official docs: "Caching database query results" and "Remote Query Cache Plugin".
Test Environment
| Item | Value |
|---|---|
| Region | ap-northeast-1 (Tokyo) |
| Database | Aurora PostgreSQL Serverless v2 (16.6, 0.5-2 ACU) |
| Cache | ElastiCache for Valkey Serverless (Valkey 8.1) |
| Client | EC2 t3.small (Amazon Linux 2023, same VPC) |
| Java | Amazon Corretto 21.0.10 |
| AWS JDBC Wrapper | 3.3.0 |
| PostgreSQL JDBC | 42.7.8 |
| Valkey Glide | 2.3.0 |
| Commons Pool | 2.12.0 |
Prerequisites:
- AWS CLI configured (rds:*, elasticache:*, ec2:* permissions)
- Java 21 + Maven
Skip to Summary if you only want the findings.
Setup
Infrastructure setup (VPC / Aurora / ElastiCache / EC2)
VPC, Subnets, and Security Groups
# VPC
VPC_ID=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
--tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=jdbc-cache-test}]' \
--query 'Vpc.VpcId' --output text --region ap-northeast-1)
aws ec2 modify-vpc-attribute --enable-dns-hostnames '{"Value":true}' --vpc-id $VPC_ID
# Subnets (3 AZs)
SUBNET_A=$(aws ec2 create-subnet --vpc-id $VPC_ID --cidr-block 10.0.1.0/24 \
--availability-zone ap-northeast-1a --query 'Subnet.SubnetId' --output text --region ap-northeast-1)
SUBNET_C=$(aws ec2 create-subnet --vpc-id $VPC_ID --cidr-block 10.0.2.0/24 \
--availability-zone ap-northeast-1c --query 'Subnet.SubnetId' --output text --region ap-northeast-1)
SUBNET_D=$(aws ec2 create-subnet --vpc-id $VPC_ID --cidr-block 10.0.3.0/24 \
--availability-zone ap-northeast-1d --query 'Subnet.SubnetId' --output text --region ap-northeast-1)
# IGW for SSH access
IGW_ID=$(aws ec2 create-internet-gateway --query 'InternetGateway.InternetGatewayId' \
--output text --region ap-northeast-1)
aws ec2 attach-internet-gateway --internet-gateway-id $IGW_ID --vpc-id $VPC_ID
RTB_ID=$(aws ec2 describe-route-tables --filters "Name=vpc-id,Values=$VPC_ID" \
--query 'RouteTables[0].RouteTableId' --output text --region ap-northeast-1)
aws ec2 create-route --route-table-id $RTB_ID --destination-cidr-block 0.0.0.0/0 \
--gateway-id $IGW_ID --region ap-northeast-1
aws ec2 modify-subnet-attribute --subnet-id $SUBNET_A --map-public-ip-on-launch
# Security groups
SG_EC2=$(aws ec2 create-security-group --group-name jdbc-cache-test-ec2 \
--description "EC2" --vpc-id $VPC_ID --query 'GroupId' --output text --region ap-northeast-1)
SG_AURORA=$(aws ec2 create-security-group --group-name jdbc-cache-test-aurora \
--description "Aurora" --vpc-id $VPC_ID --query 'GroupId' --output text --region ap-northeast-1)
SG_CACHE=$(aws ec2 create-security-group --group-name jdbc-cache-test-cache \
--description "ElastiCache" --vpc-id $VPC_ID --query 'GroupId' --output text --region ap-northeast-1)
MY_IP="$(curl -s https://checkip.amazonaws.com)/32"
aws ec2 authorize-security-group-ingress --group-id $SG_EC2 --protocol tcp --port 22 --cidr $MY_IP
aws ec2 authorize-security-group-ingress --group-id $SG_AURORA --protocol tcp --port 5432 --source-group $SG_EC2
aws ec2 authorize-security-group-ingress --group-id $SG_CACHE --protocol tcp --port 6379 --source-group $SG_EC2
Aurora PostgreSQL Serverless v2
aws rds create-db-subnet-group --db-subnet-group-name jdbc-cache-test \
--db-subnet-group-description "JDBC cache test" \
--subnet-ids "$SUBNET_A" "$SUBNET_C" "$SUBNET_D" --region ap-northeast-1
aws rds create-db-cluster --db-cluster-identifier jdbc-cache-test \
--engine aurora-postgresql --engine-version 16.6 \
--master-username postgres --master-user-password '<password>' \
--db-subnet-group-name jdbc-cache-test \
--vpc-security-group-ids $SG_AURORA \
--serverless-v2-scaling-configuration MinCapacity=0.5,MaxCapacity=2 \
--storage-encrypted --no-deletion-protection --region ap-northeast-1
aws rds create-db-instance --db-instance-identifier jdbc-cache-test-writer \
--db-cluster-identifier jdbc-cache-test \
--db-instance-class db.serverless --engine aurora-postgresql --region ap-northeast-1
aws rds wait db-instance-available --db-instance-identifier jdbc-cache-test-writer --region ap-northeast-1
ElastiCache for Valkey Serverless
aws elasticache create-serverless-cache \
--serverless-cache-name jdbc-cache-test \
--engine valkey \
--subnet-ids "$SUBNET_A" "$SUBNET_C" "$SUBNET_D" \
--security-group-ids $SG_CACHE --region ap-northeast-1
EC2 Instance
AMI_ID=$(aws ec2 describe-images --owners amazon \
--filters "Name=name,Values=al2023-ami-2023.*-x86_64" "Name=state,Values=available" \
--query 'sort_by(Images, &CreationDate)[-1].ImageId' --output text --region ap-northeast-1)
aws ec2 create-key-pair --key-name jdbc-cache-test --key-type ed25519 \
--query 'KeyMaterial' --output text > jdbc-cache-test.pem
chmod 600 jdbc-cache-test.pem
aws ec2 run-instances --image-id $AMI_ID --instance-type t3.small \
--key-name jdbc-cache-test --security-group-ids $SG_EC2 \
--subnet-id $SUBNET_A \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=jdbc-cache-test}]' \
--region ap-northeast-1
# Use the key pair created above; without -i, ssh will not offer this key
ssh -i jdbc-cache-test.pem ec2-user@<public-ip> 'sudo dnf install -y java-21-amazon-corretto-devel maven postgresql16'
Test Data
PGPASSWORD='<password>' psql -h <aurora-endpoint> -U postgres -d postgres -c "
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
category VARCHAR(50) NOT NULL,
price NUMERIC(10,2) NOT NULL,
stock INT NOT NULL DEFAULT 0
);
INSERT INTO products (name, category, price, stock) VALUES
('MacBook Pro 14', 'laptop', 248000, 50),
('ThinkPad X1 Carbon', 'laptop', 198000, 30),
('Dell XPS 15', 'laptop', 178000, 25),
('iPhone 16 Pro', 'phone', 159800, 100),
('Galaxy S25 Ultra', 'phone', 189800, 80),
('Pixel 9 Pro', 'phone', 129800, 60),
('iPad Air', 'tablet', 98800, 70),
('Galaxy Tab S10', 'tablet', 118800, 40),
('AirPods Pro', 'audio', 39800, 200),
('Sony WH-1000XM5', 'audio', 44800, 150);
"
Test Application
The plugin requires only three changes from a standard JDBC connection:
- Change the URL prefix to jdbc:aws-wrapper:postgresql://
- Set wrapperPlugins to remoteQueryCache
- Set cacheEndpointAddrRw to the ElastiCache endpoint
ElastiCache Serverless requires TLS, and cacheUseSSL defaults to true, so no explicit setting is needed. In the node-based article, we set cacheUseSSL=false to disable TLS — that's not an option with Serverless.
Properties props = new Properties();
props.setProperty("user", "postgres");
props.setProperty("password", password);
props.setProperty("wrapperPlugins", "remoteQueryCache");
props.setProperty("cacheEndpointAddrRw", "my-cache.serverless.apne1.cache.amazonaws.com:6379");
Connection conn = DriverManager.getConnection(
"jdbc:aws-wrapper:postgresql://my-aurora.cluster-xxx.ap-northeast-1.rds.amazonaws.com:5432/postgres",
props);
To cache a query, add a SQL comment hint with the TTL:
ResultSet rs = stmt.executeQuery(
"/* CACHE_PARAM(ttl=300s) */ SELECT * FROM products WHERE category = 'laptop' ORDER BY price DESC");
Queries without the hint execute directly against the database as usual.
pom.xml (full)
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>cachetest</groupId>
<artifactId>jdbc-cache-test</artifactId>
<version>1.0</version>
<properties>
<maven.compiler.source>21</maven.compiler.source>
<maven.compiler.target>21</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>software.amazon.jdbc</groupId>
<artifactId>aws-advanced-jdbc-wrapper</artifactId>
<version>3.3.0</version>
</dependency>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.7.8</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-pool2</artifactId>
<version>2.12.0</version>
</dependency>
<dependency>
<groupId>io.valkey</groupId>
<artifactId>valkey-glide</artifactId>
<version>2.3.0</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<version>2.0.16</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.4.2</version>
<configuration>
<archive><manifest>
<mainClass>cachetest.QueryCacheTest</mainClass>
</manifest></archive>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<version>3.8.1</version>
<executions>
<execution>
<id>copy-deps</id><phase>package</phase>
<goals><goal>copy-dependencies</goal></goals>
<configuration>
<outputDirectory>${project.build.directory}/lib</outputDirectory>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
QueryCacheTest.java (full test application)
package cachetest;
import java.sql.*;
import java.time.Duration;
import java.time.Instant;
import java.util.Properties;
public class QueryCacheTest {
public static void main(String[] args) throws Exception {
String dbEndpoint = args[0], dbPassword = args[1], cacheEndpoint = args[2];
Properties props = new Properties();
props.setProperty("user", "postgres");
props.setProperty("password", dbPassword);
props.setProperty("wrapperPlugins", "remoteQueryCache");
props.setProperty("cacheEndpointAddrRw", cacheEndpoint + ":6379");
// Serverless requires TLS — cacheUseSSL defaults to true
String url = "jdbc:aws-wrapper:postgresql://" + dbEndpoint + ":5432/postgres";
String query = "/* CACHE_PARAM(ttl=60s) */ SELECT * FROM products "
+ "WHERE category = 'laptop' ORDER BY price DESC";
System.out.println("=== Cache Behavior Test (ElastiCache Serverless) ===");
try (Connection conn = DriverManager.getConnection(url, props)) {
for (int i = 1; i <= 8; i++) {
Instant start = Instant.now();
try (Statement s = conn.createStatement();
ResultSet r = s.executeQuery(query)) {
int count = 0;
while (r.next()) count++;
long ms = Duration.between(start, Instant.now()).toMillis();
System.out.printf(" Query %d: %d rows, %d ms%n", i, count, ms);
}
if (i <= 2) Thread.sleep(1000);
}
}
}
}
Build and run
# Create directory structure
mkdir -p jdbc-cache-test/src/main/java/cachetest
# Place pom.xml in jdbc-cache-test/ and QueryCacheTest.java in jdbc-cache-test/src/main/java/cachetest/
export JAVA_HOME=/usr/lib/jvm/java-21-amazon-corretto
cd jdbc-cache-test && mvn package -q
CP="target/jdbc-cache-test-1.0.jar"
for jar in target/lib/*.jar; do CP="$CP:$jar"; done
java -cp "$CP" cachetest.QueryCacheTest \
"<aurora-endpoint>" "<password>" "<cache-serverless-endpoint>"
Test 1: Initial Connection and Cache Behavior (10-row table)
Ran the same query 8 times on a single connection to observe cache behavior.
Query 1: 3 rows, 10568 ms
Query 2: 3 rows, 7 ms
Query 3: 3 rows, 7 ms
Query 4: 3 rows, 19 ms
Query 5: 3 rows, 100 ms
Query 6: 3 rows, 216 ms
Query 7: 3 rows, 110 ms
Query 8: 3 rows, 90 ms
Query 1 took ~10.5 seconds. The logs reveal a TimeoutException on the first cache read, causing CacheMonitor to transition from HEALTHY to SUSPECT:
[HEALTHY→SUSPECT] jdbc-cache-test-xxx.serverless.apne1.cache.amazonaws.com:6379
READ failed: CONNECTION - TimeoutException: Request timed out
Query 1 fell back to the database (cache miss) and wrote the result to the cache in the background. Queries 2-3 at 7 ms are suspiciously fast: CacheMonitor was still in the SUSPECT state, so the plugin bypassed the cache entirely and queried the database directly. Once CacheMonitor's periodic health check (which runs every few seconds) succeeded and the state returned to HEALTHY, the plugin used the cache again, so Queries 5-8 at 90-216 ms represent actual cache hit latency.
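The routing behavior inferred from these logs can be written down as a small state machine. This is a toy model of what was observed in this test, not the plugin's actual code: the class and method names are hypothetical, and the real CacheMonitor may well have more states and transitions than the two shown here.

```java
/** Toy model of the HEALTHY/SUSPECT routing observed above. Not the plugin's real implementation. */
final class CacheHealthModel {
    enum State { HEALTHY, SUSPECT }

    private State state = State.HEALTHY;

    State state() {
        return state;
    }

    /** Called when the periodic health check succeeds. */
    void onHealthCheckSuccess() {
        state = State.HEALTHY;
    }

    /**
     * Routes one query that carries a cache hint.
     *
     * @param cacheReadSucceeds whether a cache read would succeed if it were attempted
     * @return a label describing where the result came from
     */
    String execute(boolean cacheReadSucceeds) {
        if (state == State.SUSPECT) {
            // While SUSPECT, the cache is not consulted at all.
            return "database (cache bypassed)";
        }
        if (!cacheReadSucceeds) {
            // For example, the TimeoutException seen on the first read.
            state = State.SUSPECT;
            return "database (fallback after failed read)";
        }
        return "cache";
    }

    public static void main(String[] args) {
        CacheHealthModel monitor = new CacheHealthModel();
        System.out.println("Query 1: " + monitor.execute(false)); // first read times out
        System.out.println("Query 2: " + monitor.execute(true));  // still SUSPECT
        monitor.onHealthCheckSuccess();                           // health check recovers
        System.out.println("Query 3: " + monitor.execute(true));  // cache is used again
    }
}
```

In this model, Query 2 goes to the database even though the cache would have answered, which matches the 7 ms readings seen while the monitor was SUSPECT.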
Additionally, the CacheMonitor health checks consistently failed with a pool borrow timeout:
SEVERE: Non-recoverable error (DATA) for jdbc-cache-test-xxx:6379:
Timeout waiting for idle object, borrowMaxWaitDuration=PT0.1S
The internal borrow timeout is hardcoded at 100ms, which is shorter than the TLS handshake time for ElastiCache Serverless.
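One way to check this claim in your own environment is to time a plain TLS connection to the cache endpoint from the client host and compare it with the 100 ms budget. The sketch below uses only the JDK's JSSE classes. It measures DNS lookup, TCP connect, and TLS handshake together, which only approximates what the plugin's Valkey Glide client does internally, and the class name and the 30-second probe timeout are arbitrary choices for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.time.Duration;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

/** Times a TLS connection to a host and compares it with the 100 ms borrow timeout from the log. */
final class TlsHandshakeTimer {
    /** Borrow timeout exactly as printed in the log line (ISO-8601 duration, i.e. 100 ms). */
    static final Duration BORROW_MAX_WAIT = Duration.parse("PT0.1S");

    /** Returns the time spent on DNS lookup + TCP connect + TLS handshake. */
    static Duration measure(String host, int port, Duration timeout) throws IOException {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        int timeoutMs = (int) timeout.toMillis();
        long start = System.nanoTime();
        try (SSLSocket socket = (SSLSocket) factory.createSocket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs); // bound the handshake reads as well
            socket.startHandshake();
        }
        return Duration.ofNanos(System.nanoTime() - start);
    }

    static boolean exceedsBorrowTimeout(Duration elapsed) {
        return elapsed.compareTo(BORROW_MAX_WAIT) > 0;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TlsHandshakeTimer.java <cache-serverless-endpoint>");
            return;
        }
        Duration elapsed = measure(args[0], 6379, Duration.ofSeconds(30));
        System.out.printf("connect + TLS handshake: %d ms (borrow timeout %d ms, exceeded: %b)%n",
                elapsed.toMillis(), BORROW_MAX_WAIT.toMillis(), exceedsBorrowTimeout(elapsed));
    }
}
```

Run it from the EC2 client a few times, since the first connection may behave differently from later ones.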
Test 2: Do Heavy Queries Hit the Same Issues? (1M-row table)
Tested with 1 million rows and the same aggregate queries from the node-based article to check whether the initial timeout still occurs and how cache performance compares.
Test data was replaced with 1 million rows (see the node-based article setup for the INSERT statement).
Test 2 reproduction steps
Replace test data with 1 million rows.
PGPASSWORD='<password>' psql -h <aurora-endpoint> -U postgres -d postgres -c "
TRUNCATE products;
INSERT INTO products (name, category, price, stock)
SELECT
'Product-' || i,
(ARRAY['laptop','phone','tablet','audio','camera','monitor','keyboard','mouse'])[1 + (i % 8)],
(random() * 500000 + 1000)::numeric(10,2),
(random() * 1000)::int
FROM generate_series(1, 1000000) AS i;
ANALYZE products;
"
Use the QueryCacheTest.java from the node-based article, modified for Serverless: remove the cacheUseSSL setting (Serverless requires TLS, default is true).
Results (all values in ms):
[Category stats]
No Cache: avg=410.3, median=401.0, min=322, max=500
Cached: avg=36.9, median=2.5, min=1, max=344
[Price tier]
No Cache: avg=421.4, median=404.0, min=333, max=513
Cached: avg=1.4, median=1.0, min=1, max=3
[Window function]
No Cache: avg=847.6, median=753.5, min=643, max=1400
Cached: avg=1.1, median=1.0, min=1, max=2
The initial timeout and the CacheMonitor SUSPECT transition occurred with heavy queries too. The Cached max of 344 ms for Category stats comes from CacheMonitor bypassing the cache while in the SUSPECT state and falling back to the database.
However, once CacheMonitor recovered to HEALTHY, cache hits stabilized at 1-2 ms, on par with the node-based cluster. This contrasts with the 90-216 ms seen in Test 1 with the 10-row table. The difference is that aggregate results are just a few rows, so TLS overhead is relatively small compared to the serialized result set.
The initial timeout is a Serverless-specific issue, but once stable, heavy queries benefit from caching just as effectively.
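The gap between avg=36.9 and median=2.5 in the Category stats row is what a single database fallback does to a mean. The sketch below shows this with made-up samples chosen to resemble that row; they are not the measured data. It also computes speedups from the medians reported above. The record and method names are hypothetical.

```java
import java.util.Arrays;

/** Summary statistics for latency samples in milliseconds. */
record LatencySummary(double avg, double median, long min, long max) {

    static LatencySummary of(long... samplesMs) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        // For an even sample count, the median is the mean of the two middle values.
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
        double avg = Arrays.stream(sorted).average().orElseThrow();
        return new LatencySummary(avg, median, sorted[0], sorted[n - 1]);
    }

    /** How many times faster the cached path is than the uncached path. */
    static double speedup(double noCacheMs, double cachedMs) {
        return noCacheMs / cachedMs;
    }

    public static void main(String[] args) {
        // Illustrative samples only: one 344 ms fallback among otherwise fast cache hits.
        LatencySummary cached = LatencySummary.of(344, 3, 2, 3, 2, 1, 2, 3, 3, 2);
        System.out.printf("Cached: avg=%.1f, median=%.1f, min=%d, max=%d%n",
                cached.avg(), cached.median(), cached.min(), cached.max());

        // Median-based speedups, using the medians from the Test 2 results.
        System.out.printf("Category stats:  %.1fx%n", speedup(401.0, 2.5));
        System.out.printf("Price tier:      %.1fx%n", speedup(404.0, 1.0));
        System.out.printf("Window function: %.1fx%n", speedup(753.5, 1.0));
    }
}
```

With these samples the single outlier lifts the mean to 36.5 ms while the median stays at 2.5 ms, so the median is the better figure for steady-state cache hit latency. By median, the reported numbers work out to roughly 160x, 404x, and 754x.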
Summary
- The first connection always times out: The TLS handshake to ElastiCache Serverless exceeds the plugin's internal connection timeout (default 2s). Even increasing cacheConnectionTimeoutMs to 10s or 30s doesn't help because the Valkey Glide client has its own timeout. The first query always falls back to the database as a cache miss, adding significant latency at application startup.
- CacheMonitor health checks consistently fail: The internal connection pool borrow timeout is fixed at 100ms, shorter than ElastiCache Serverless TLS connection establishment. Health checks always report Non-recoverable error (DATA), but the data-path cache reads/writes use a separate path and still function. However, the unstable health state causes intermittent cache bypasses.
- Serverless cache hit latency varies with result set size: With node-based, cache hits were a stable 1-4ms regardless of result set size. With Serverless, simple queries against a 10-row table showed 90-216ms, while aggregate queries against 1 million rows (returning just a few rows) achieved 1-2ms. TLS overhead increases latency when it's large relative to the serialized result set. For heavy queries, Serverless caching still delivers significant benefit.
- Fail-safe behavior works correctly: When the cache is unhealthy, the plugin automatically falls back to the database. With failWhenCacheDown set to false (default), cache issues never cascade into application failures.
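For reference, the settings named in this summary can be spelled out explicitly instead of relying on defaults. The sketch below uses only property names that appear in this article. The value formats ("true", "false", milliseconds as a plain number) are assumptions to check against the plugin documentation, and the helper class itself is hypothetical.

```java
import java.util.Properties;

/** Builds connection properties for the Remote Query Cache Plugin with defaults made explicit. */
final class ServerlessCacheProps {

    static Properties build(String user, String password, String cacheHost) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("wrapperPlugins", "remoteQueryCache");
        props.setProperty("cacheEndpointAddrRw", cacheHost + ":6379");

        // ElastiCache Serverless requires TLS; true is already the default.
        props.setProperty("cacheUseSSL", "true");

        // Default per this article: cache problems fall back to the database
        // instead of failing the query.
        props.setProperty("failWhenCacheDown", "false");

        // Assumed to be in milliseconds. In the tests above, raising this to 10s
        // or 30s did not remove the first-query timeout.
        props.setProperty("cacheConnectionTimeoutMs", "10000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("postgres", "<password>", "<cache-serverless-endpoint>");
        props.stringPropertyNames().stream().sorted()
                .filter(name -> !name.equals("password"))
                .forEach(name -> System.out.println(name + "=" + props.getProperty(name)));
    }
}
```

Pass the result to DriverManager.getConnection together with the jdbc:aws-wrapper:postgresql:// URL, as in the test application.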
With ElastiCache Serverless, the initial connection timeout and CacheMonitor instability are significant concerns. As confirmed in the node-based article, a node-based ElastiCache cluster (without TLS) does not exhibit these issues and works correctly from the first query. I strongly recommend testing with your actual query patterns and latency requirements before production adoption.
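When testing your own query patterns, it helps to report the first execution separately from the rest, since the first one absorbs the initial cache timeout. The following is a minimal sketch of such a probe; the class and method names are hypothetical, and the sleeping workload in main is a stand-in for a real JDBC query.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

/** Times repeated executions and separates the first (cold) run from the later (warm) runs. */
final class LatencyProbe {

    /** Runs the query the given number of times and returns each wall-clock latency in ms. */
    static long[] time(Callable<?> query, int runs) throws Exception {
        long[] ms = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.call();
            ms[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return ms;
    }

    /** Median of every sample except the first one. */
    static double warmMedian(long[] ms) {
        if (ms.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        long[] warm = Arrays.copyOfRange(ms, 1, ms.length);
        Arrays.sort(warm);
        int n = warm.length;
        return (n % 2 == 1) ? warm[n / 2] : (warm[n / 2 - 1] + warm[n / 2]) / 2.0;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload: replace with a lambda that runs your hinted query
        // and iterates over the whole ResultSet.
        long[] samples = time(() -> {
            Thread.sleep(5);
            return null;
        }, 8);
        System.out.printf("first run: %d ms, warm median: %.1f ms%n",
                samples[0], warmMedian(samples));
    }
}
```

Because the health state can stay unsettled for several seconds after startup, let the probe run long enough to cover at least one health check interval before trusting the warm median.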
Cleanup
Resource deletion commands
# Aurora
aws rds delete-db-instance --db-instance-identifier jdbc-cache-test-writer \
--skip-final-snapshot --region ap-northeast-1
aws rds wait db-instance-deleted --db-instance-identifier jdbc-cache-test-writer --region ap-northeast-1
aws rds delete-db-cluster --db-cluster-identifier jdbc-cache-test \
--skip-final-snapshot --region ap-northeast-1
# ElastiCache
aws elasticache delete-serverless-cache --serverless-cache-name jdbc-cache-test --region ap-northeast-1
# EC2
aws ec2 terminate-instances --instance-ids <instance-id> --region ap-northeast-1
# Wait for deletions, then remove network resources
aws ec2 delete-key-pair --key-name jdbc-cache-test --region ap-northeast-1
aws ec2 delete-security-group --group-id <sg-aurora> --region ap-northeast-1
aws ec2 delete-security-group --group-id <sg-cache> --region ap-northeast-1
aws ec2 delete-security-group --group-id <sg-ec2> --region ap-northeast-1
aws rds delete-db-subnet-group --db-subnet-group-name jdbc-cache-test --region ap-northeast-1
aws ec2 detach-internet-gateway --internet-gateway-id <igw-id> --vpc-id <vpc-id> --region ap-northeast-1
aws ec2 delete-internet-gateway --internet-gateway-id <igw-id> --region ap-northeast-1
aws ec2 delete-subnet --subnet-id <subnet-a> --region ap-northeast-1
aws ec2 delete-subnet --subnet-id <subnet-c> --region ap-northeast-1
aws ec2 delete-subnet --subnet-id <subnet-d> --region ap-northeast-1
aws ec2 delete-vpc --vpc-id <vpc-id> --region ap-northeast-1