Lambda Rust First-Request 900ms Explained — TLS Handshake Accounts for 99%

Introduction

In the previous post, I discovered that the first request from a Rust Lambda + DynamoDB function takes roughly 900ms. Init Duration (112ms) was fast, yet the first DynamoDB call was abnormally slow. I hypothesized that DNS resolution, TCP connection, and TLS handshake all happen on the first request, but didn't measure the breakdown.

This article measures each phase individually to identify the bottleneck, then quantifies the effect of connection warming as a mitigation.

Test Environment

Same environment as the previous post.

Item | Value
Region | ap-northeast-1 (Tokyo)
Architecture | arm64 (Graviton)
Memory | 128 MB
Rust version | 1.94.1
DynamoDB endpoint | dynamodb.ap-northeast-1.amazonaws.com

Implementation

Since the SDK doesn't expose its internal connection handling, I measured each network phase individually. Specifically, the handler executes DNS resolution → TCP connection → TLS handshake manually right after Init, measuring each with Instant. Then it executes a GetItem via the SDK to record the SDK's first-request time.
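The measurement pattern for every phase is the same: capture `Instant::now()` before the await and convert `elapsed()` to milliseconds afterward. As a minimal std-only illustration of that pattern (the `timed` helper here is for illustration only and does not appear in the deployed function):

```rust
use std::time::Instant;

/// Illustrative helper: time a closure and return (elapsed_ms, result).
/// The deployed function applies this same Instant-based pattern to each
/// async network phase individually.
fn timed<T>(f: impl FnOnce() -> T) -> (f64, T) {
    let start = Instant::now();
    let out = f();
    (start.elapsed().as_secs_f64() * 1000.0, out)
}

fn main() {
    // Time an arbitrary piece of work the same way the handler times DNS/TCP/TLS.
    let (ms, sum) = timed(|| (0u64..1_000_000).sum::<u64>());
    println!("summed to {sum} in {ms:.3} ms");
}
```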

This measurement does not directly observe the SDK's internal implementation. The actual processing order and overhead within the SDK may differ.

Breakdown measurement Rust function code (Cargo.toml + src/main.rs)
Cargo.toml
[package]
name = "rust-latency-breakdown"
version = "0.1.0"
edition = "2021"
 
[dependencies]
aws-config = { version = "1", features = ["behavior-version-latest"] }
aws-sdk-dynamodb = "1"
lambda_http = "1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["macros", "net"] }
tokio-rustls = "0.26"
rustls = "0.23"
webpki-roots = "0.26"
src/main.rs
use aws_sdk_dynamodb::types::AttributeValue;
use aws_sdk_dynamodb::Client;
use lambda_http::{Body, Error, Request, Response, run, service_fn};
use serde::Serialize;
use std::sync::Arc;
use std::time::Instant;
use tokio::net::TcpStream;
 
const DDB_HOST: &str = "dynamodb.ap-northeast-1.amazonaws.com";
const DDB_ADDR: &str = "dynamodb.ap-northeast-1.amazonaws.com:443";
 
#[derive(Serialize)]
struct BreakdownResult {
    mode: String,
    dns_ms: f64,
    tcp_ms: f64,
    tls_ms: f64,
    sdk_first_call_ms: f64,
    total_breakdown_ms: f64,
}
 
async fn measure_dns() -> Result<(f64, std::net::SocketAddr), Error> {
    let start = Instant::now();
    let addr = tokio::net::lookup_host(DDB_ADDR)
        .await?
        .next()
        .ok_or("DNS resolution failed")?;
    Ok((start.elapsed().as_secs_f64() * 1000.0, addr))
}
 
async fn measure_tcp(addr: std::net::SocketAddr) -> Result<(f64, TcpStream), Error> {
    let start = Instant::now();
    let stream = TcpStream::connect(addr).await?;
    Ok((start.elapsed().as_secs_f64() * 1000.0, stream))
}
 
async fn measure_tls(stream: TcpStream) -> Result<f64, Error> {
    let mut root_store = rustls::RootCertStore::empty();
    root_store.extend(webpki_roots::TLS_SERVER_ROOTS.iter().cloned());
    let config = rustls::ClientConfig::builder()
        .with_root_certificates(root_store)
        .with_no_client_auth();
    let connector = tokio_rustls::TlsConnector::from(Arc::new(config));
    let server_name = DDB_HOST.try_into()?;
    let start = Instant::now();
    let _tls_stream = connector.connect(server_name, stream).await?;
    Ok(start.elapsed().as_secs_f64() * 1000.0)
}
 
async fn function_handler(
    client: &Client, _event: Request,
) -> Result<Response<Body>, Error> {
    let (dns_ms, addr) = measure_dns().await?;
    let (tcp_ms, stream) = measure_tcp(addr).await?;
    let tls_ms = measure_tls(stream).await?;
 
    let sdk_start = Instant::now();
    client.get_item()
        .table_name("lambda-rust-bench")
        .key("pk", AttributeValue::S("bench".into()))
        .key("sk", AttributeValue::S("item-0".into()))
        .send().await?;
    let sdk_first_call_ms = sdk_start.elapsed().as_secs_f64() * 1000.0;
 
    let result = BreakdownResult {
        mode: "breakdown".into(),
        dns_ms, tcp_ms, tls_ms, sdk_first_call_ms,
        total_breakdown_ms: dns_ms + tcp_ms + tls_ms,
    };
    let body = serde_json::to_string(&result)?;
    let resp = Response::builder()
        .status(200)
        .header("content-type", "application/json")
        .body(body.into())
        .map_err(Box::new)?;
    Ok(resp)
}
 
#[tokio::main]
async fn main() -> Result<(), Error> {
    let config = aws_config::load_defaults(
        aws_config::BehaviorVersion::latest(),
    ).await;
    let client = Client::new(&config);
    run(service_fn(|event: Request| {
        let client = client.clone();
        async move { function_handler(&client, event).await }
    })).await
}
Deploy steps
Terminal
# Reuse IAM role and DynamoDB table from previous post
cargo lambda build --release --arm64
cargo lambda deploy rust-latency-breakdown \
  --iam-role arn:aws:iam::${ACCOUNT_ID}:role/lambda-rust-dynamodb-role \
  --region ap-northeast-1 --memory 128 --timeout 30 \
  --env-vars TABLE_NAME=lambda-rust-bench
Benchmark execution steps (4 cold starts)
Terminal
REGION="ap-northeast-1"
FN="rust-latency-breakdown"
PAYLOAD='{"requestContext":{"http":{"method":"GET"}},"queryStringParameters":{"mode":"breakdown"},"rawPath":"/","headers":{}}'
 
for i in 1 2 3 4; do
  aws lambda update-function-configuration \
    --function-name "$FN" \
    --environment "Variables={TABLE_NAME=lambda-rust-bench,RUN=$i$(date +%s)}" \
    --region "$REGION" --output text --query 'FunctionName' > /dev/null
  aws lambda wait function-updated --function-name "$FN" --region "$REGION"
  sleep 2
 
  aws lambda invoke --function-name "$FN" --cli-binary-format raw-in-base64-out \
    --payload "$PAYLOAD" --region "$REGION" --log-type Tail \
    --query 'LogResult' --output text \
    /tmp/breakdown_${i}.json | base64 -d | grep REPORT
  cat /tmp/breakdown_${i}.json
done

Breaking Down the First 900ms

Results from 4 forced cold starts.

Phase | #1 | #2 | #3 | #4 | Average
DNS resolution | 11.1 ms | 4.3 ms | 11.7 ms | 2.2 ms | 7.3 ms
TCP connection | 0.6 ms | 0.6 ms | 1.0 ms | 0.5 ms | 0.7 ms
TLS handshake | 679 ms | 820 ms | 662 ms | 819 ms | 745 ms
Manual total | 691 ms | 825 ms | 675 ms | 822 ms | 753 ms

Reference metric | #1 | #2 | #3 | #4 | Average
SDK first GetItem | 84 ms | 90 ms | 99 ms | 80 ms | 88 ms
Init Duration | 91 ms | 118 ms | 89 ms | 118 ms | 104 ms
Lambda Duration | 812 ms | 945 ms | 793 ms | 922 ms | 868 ms

TLS handshake at 745ms accounts for 99% of the manual measurement total. DNS resolution is 7ms, TCP connection is under 1ms — neither is the bottleneck.

At 128MB, Lambda has very limited CPU resources: CPU is allocated in proportion to memory, and 128MB corresponds to roughly 7% of a vCPU (a full vCPU at 1,769MB). TLS handshakes require cryptographic operations such as key exchange and certificate verification, which take 700ms+ under this CPU constraint. The 662-820ms variance in TLS timing may also reflect fluctuations in Lambda's CPU allocation.
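To put the 7% figure in context, here is a back-of-envelope sketch. The ~50ms full-vCPU handshake cost below is an assumed figure for illustration, not a measurement; only the 1,769MB-per-vCPU ratio comes from AWS's documented allocation model.

```rust
// Lambda allocates CPU proportionally to memory, reaching one full vCPU
// at 1,769 MB. This is an estimate of how a CPU-bound handshake stretches
// at 128 MB, not a measurement.
fn vcpu_fraction(memory_mb: f64) -> f64 {
    memory_mb / 1769.0
}

fn main() {
    let frac = vcpu_fraction(128.0);
    println!("128 MB ≈ {:.1}% of a vCPU", frac * 100.0);

    // Assumption: if the handshake needs ~50 ms of full-vCPU crypto time,
    // at ~7% CPU it stretches to roughly 50 / frac ≈ 690 ms — the same
    // order of magnitude as the 662-820 ms measured.
    let assumed_full_vcpu_handshake_ms = 50.0;
    println!(
        "estimated handshake at 128 MB: {:.0} ms",
        assumed_full_vcpu_handshake_ms / frac
    );
}
```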

The SDK's first GetItem took 88ms. The TLS connection established by the manual measurement is independent of the SDK's connection pool, so the SDK must open its own new connection. Why 88ms is so much faster than the manual TLS handshake (745ms) is not entirely clear; possible factors include OS-level DNS caching and TLS session resumption. (Note that building the webpki-roots certificate store is not part of the 745ms in the code above: the timer starts only after the ClientConfig and connector are constructed.)

Correspondence with Previous 900ms

In the previous post, the SDK's first request took roughly 912ms. This article's manual total (753ms) + SDK first GetItem (88ms) = 841ms, which roughly corresponds to the previous 912ms. The ~70ms gap can be explained by the manual measurement not being identical to the SDK's internal processing, plus per-measurement variance.
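The correspondence can be written out as a quick sanity check, using the averages from the measurements above:

```rust
// Sanity check: the manually measured phases plus the SDK's first call
// should roughly reproduce the previous post's ~912 ms first request.
fn main() {
    let manual_total_ms = 753.0; // DNS + TCP + TLS, averaged over 4 runs
    let sdk_first_call_ms = 88.0; // SDK GetItem over its own connection
    let previous_first_request_ms = 912.0;

    let accounted = manual_total_ms + sdk_first_call_ms;
    let gap = previous_first_request_ms - accounted;
    // The ~70 ms remainder is attributed to measurement differences
    // between the manual path and the SDK's internals, plus run-to-run
    // variance.
    println!("accounted for: {accounted} ms, unexplained gap: {gap} ms");
}
```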

Connection Warming Effect

Now that TLS handshake is identified as the bottleneck, let's test warming the SDK's connection during Init. Adding client.list_tables().limit(1).send().await at the end of Init establishes a TLS connection in the SDK's connection pool before the handler runs. ListTables is used because it doesn't depend on any specific table and works in any environment.

Warmup Rust function code (Cargo.toml + src/main.rs)
Cargo.toml
[package]
name = "rust-latency-warmup"
version = "0.1.0"
edition = "2021"
 
[dependencies]
aws-config = { version = "1", features = ["behavior-version-latest"] }
aws-sdk-dynamodb = "1"
lambda_http = "1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["macros"] }
src/main.rs
use aws_sdk_dynamodb::types::AttributeValue;
use aws_sdk_dynamodb::Client;
use lambda_http::{Body, Error, Request, Response, run, service_fn};
use serde::Serialize;
use std::time::Instant;
 
#[derive(Serialize)]
struct WarmupResult {
    mode: String,
    warmup_ms: f64,
    sdk_first_call_ms: f64,
}
 
#[tokio::main]
async fn main() -> Result<(), Error> {
    let config = aws_config::load_defaults(
        aws_config::BehaviorVersion::latest(),
    ).await;
    let client = Client::new(&config);
 
    // Warm the connection during Init
    let warmup_start = Instant::now();
    let _ = client.list_tables().limit(1).send().await;
    let warmup_ms = warmup_start.elapsed().as_secs_f64() * 1000.0;
 
    run(service_fn(move |_event: Request| {
        let client = client.clone();
        let warmup_ms = warmup_ms;
        async move {
            let sdk_start = Instant::now();
            client.get_item()
                .table_name("lambda-rust-bench")
                .key("pk", AttributeValue::S("bench".into()))
                .key("sk", AttributeValue::S("item-0".into()))
                .send().await
                .map_err(|e| -> Error { e.into() })?;
            let sdk_first_call_ms = sdk_start.elapsed().as_secs_f64() * 1000.0;
 
            let result = WarmupResult {
                mode: "warmup".into(),
                warmup_ms, sdk_first_call_ms,
            };
            let body = serde_json::to_string(&result)?;
            let resp = Response::builder()
                .status(200)
                .header("content-type", "application/json")
                .body(Body::from(body))
                .map_err(Box::new)?;
            Ok::<_, Error>(resp)
        }
    })).await
}
Deploy steps (warmup version)
Terminal
cargo lambda build --release --arm64
cargo lambda deploy rust-latency-warmup \
  --iam-role arn:aws:iam::${ACCOUNT_ID}:role/lambda-rust-dynamodb-role \
  --region ap-northeast-1 --memory 128 --timeout 30 \
  --env-vars TABLE_NAME=lambda-rust-bench
Benchmark execution steps (warmup version)
Terminal
REGION="ap-northeast-1"
FN="rust-latency-warmup"
 
for i in 1 2 3 4; do
  aws lambda update-function-configuration \
    --function-name "$FN" \
    --environment "Variables={TABLE_NAME=lambda-rust-bench,RUN=$i$(date +%s)}" \
    --region "$REGION" --output text --query 'FunctionName' > /dev/null
  aws lambda wait function-updated --function-name "$FN" --region "$REGION"
  sleep 2
 
  aws lambda invoke --function-name "$FN" --cli-binary-format raw-in-base64-out \
    --payload '{"requestContext":{"http":{"method":"GET"}},"queryStringParameters":{},"rawPath":"/","headers":{}}' \
    --region "$REGION" --log-type Tail \
    --query 'LogResult' --output text \
    /tmp/warmup_${i}.json | base64 -d | grep REPORT
  cat /tmp/warmup_${i}.json
done

Results from 4 cold starts.

Metric | Without warming (previous) | With warming | Change
Init Duration | 112 ms | 166 ms | +54 ms
First Duration | 912 ms | 21 ms | -891 ms
Billed Duration total | 1,025 ms | 188 ms | -837 ms (82% reduction)

First Duration dropped from 912ms to 21ms. Init Duration increased by 54ms, but total Billed Duration fell from 1,025ms to 188ms, an 82% reduction.

The ListTables call during Init takes about 75ms on average, which includes DNS + TCP + TLS + API call. After warming, the SDK's connection pool likely has an established TLS connection, and the handler's GetItem reuses it to complete in 21ms.

Discussion

Should You Add Warming?

Connection warming delivers significant results (82% reduction), but consider these trade-offs.

Init phase 10-second timeout: Lambda's Init phase has a 10-second limit. The 75ms warming call is well within this, but network failures could cause delays. Using let _ = ... to ignore the result ensures Init succeeds even if warming fails.

Init failure retry behavior: If code during Init returns an error or panics, Lambda fails the Init phase and retries it. Discarding the warming call's Result with let _ = keeps a failed ListTables (for example, a transient network error or a missing IAM permission) from aborting Init. Note that let _ = ignores errors but does not catch panics; a panic during Init would still fail it.
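If you additionally want to bound how long warming may take, the "best effort with a deadline" idea can be sketched with std primitives. This is illustrative only; in the actual async handler you would wrap the SDK call in tokio::time::timeout and discard its result, rather than spawn a thread.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `f` on a worker thread, give up after `limit`, and ignore any
/// failure to deliver a result — analogous to wrapping the Init warm-up
/// call in a timeout and discarding its Result with `let _ =`.
fn warm_with_deadline<T: Send + 'static>(
    limit: Duration,
    f: impl FnOnce() -> T + Send + 'static,
) -> Option<T> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let _ = tx.send(f()); // receiver may already have given up
    });
    rx.recv_timeout(limit).ok()
}

fn main() {
    // A fast warm-up completes within the deadline.
    assert_eq!(warm_with_deadline(Duration::from_millis(500), || 42), Some(42));

    // A hung warm-up is abandoned instead of stalling Init.
    let hung = warm_with_deadline(Duration::from_millis(20), || {
        thread::sleep(Duration::from_secs(2));
        1
    });
    assert_eq!(hung, None);
    println!("deadline-guarded warm-up behaves as expected");
}
```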

Cold start frequency: In environments with rare cold starts (e.g., Provisioned Concurrency), warming benefits are limited. In environments with frequent scale-outs, the benefit is substantial.

Relationship with Memory Size

The 745ms TLS handshake is likely driven by cryptographic processing costs under 128MB's limited CPU. Increasing memory proportionally increases CPU, which should speed up TLS handshakes. Memory-size-specific measurements are a topic for future investigation.
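The linear-CPU assumption can be turned into concrete predictions for that future benchmark. These numbers are pure extrapolation from the 745ms baseline at 128MB; the actual values remain to be measured.

```rust
// Hypothetical projection, not a measurement: assume the TLS handshake is
// pure CPU time and that CPU scales linearly with memory up to one full
// vCPU at 1,769 MB. Handshake time then scales as 128 / memory_mb times
// the 745 ms measured at 128 MB.
fn predicted_tls_ms(memory_mb: f64) -> f64 {
    let baseline_ms = 745.0; // average measured at 128 MB
    baseline_ms * 128.0 / memory_mb.min(1769.0) // CPU is capped at 1 vCPU
}

fn main() {
    for mem in [128.0_f64, 256.0, 512.0, 1024.0, 1769.0] {
        println!(
            "{:>5} MB -> ~{:>3.0} ms predicted TLS handshake",
            mem,
            predicted_tls_ms(mem)
        );
    }
}
```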

Summary

  • TLS handshake accounts for 99% of the 900ms — DNS (7ms) and TCP (1ms) are not bottlenecks. Cryptographic processing under 128MB's limited CPU is likely the dominant factor
  • Connection warming reduces Billed Duration by 82% — A single list_tables().limit(1) call during Init drops first Duration from 912ms to 21ms
  • Warming cost is +54ms on Init Duration — Total Billed Duration goes from 1,025ms to 188ms. The trade-off is well worth it
  • Increasing memory may speed up TLS — The 128MB CPU constraint is the root cause; memory-size-specific benchmarks are a future topic

Cleanup

Resource deletion commands
Terminal
REGION="ap-northeast-1"
aws lambda delete-function --function-name rust-latency-breakdown --region $REGION
aws lambda delete-function --function-name rust-latency-warmup --region $REGION
 
# DynamoDB table and IAM role were created in the previous post
# See previous post's cleanup section if no longer needed

Shinya Tahara

Solutions Architect @ AWS

I'm a Solutions Architect at AWS, providing technical guidance primarily to financial industry customers. I share learnings about cloud architecture and AI/ML on this site. The views and opinions expressed on this site are my own and do not represent the official positions of my employer.