AWS Lambda battle 2021: performance comparison for all languages (cold and warm start)
Let’s compare the performance of all supported runtimes plus two custom runtimes (Rust and GraalVM).
We will compare both cold-start and warm performance.
The source code is here: https://github.com/Aleksandr-Filichkin/aws-lambda-runtimes-performance. It requires only minimal local setup (almost everything is Dockerized).
- Node.js (14.x)
- Python (3.9)
- Go (1.x)
- Ruby (2.7)
- .NET (3.1)
- Java (11)
- Rust (1.54.0)
- GraalVM (21.2)
Disclaimer:
All benchmarks were performed in September 2021.
I’m not an expert in all of these languages, and I’m happy to see merge requests in the GitHub repo with performance improvements. I’m going to maintain this repo and rerun the performance test every 3 months. I believe in open-source collaboration :)
Test scenario
We are going to test the API Gateway -> AWS Lambda -> DynamoDB flow.
We will test only the POST endpoint, which saves a book into a DynamoDB table in a known AWS Region (us-east-2).
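To make the flow concrete, here is a minimal Python sketch of the kind of handler under test. It is not the code from the repository: the `make_handler` factory, the `book` table name, and the generated `id` field are assumptions made for illustration. The `table` argument only needs a `put_item(Item=...)` method, like a boto3 DynamoDB `Table` resource.

```python
import json
import uuid


def make_handler(table):
    """Build a Lambda handler that stores the posted book in `table`.

    `table` is anything with put_item(Item=...), e.g. a boto3 DynamoDB Table.
    """

    def handler(event, context):
        # API Gateway (proxy integration) delivers the request body as a string.
        try:
            book = json.loads(event.get("body") or "")
        except json.JSONDecodeError:
            return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}
        if not isinstance(book, dict):
            return {"statusCode": 400, "body": json.dumps({"error": "expected an object"})}

        # Hypothetical schema: the primary key is generated on the server side.
        item = {**book, "id": str(uuid.uuid4())}
        table.put_item(Item=item)
        return {"statusCode": 200, "body": json.dumps(item)}

    return handler


# In a real deployment the wiring would happen once at module level, e.g.:
#   import boto3
#   table = boto3.resource("dynamodb", region_name="us-east-2").Table("book")
#   handler = make_handler(table)
```

Passing the table in keeps the request logic testable without AWS access; the commented wiring shows where the real DynamoDB resource would be plugged in.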
Cold start test
I did my best to reduce the cold start:
- Removed useless dependencies.
- Moved as much as possible to the initialization phase (for example, in Java, moved everything to static initialization) to use the CPU burst on startup.
- Specified the Region explicitly.
- Got rid of any DI frameworks.
For detailed information about cold starts, read here.
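The initialization-phase and explicit-Region points from the list above can be sketched in Python as follows. This illustrates the pattern rather than the repository's code: `FakeClient` is a hypothetical stand-in for an SDK client that is expensive to construct, and its counter exists only to show that construction happens once per execution environment, not once per request.

```python
class FakeClient:
    """Hypothetical stand-in for an SDK client that is costly to create."""

    instances_created = 0

    def __init__(self, region):
        FakeClient.instances_created += 1
        self.region = region

    def save(self, item):
        return {"saved": item, "region": self.region}


# Initialization phase: runs once per cold start, before the first invocation.
# The Region is passed explicitly instead of being resolved at runtime.
CLIENT = FakeClient(region="us-east-2")


def handler(event, context):
    # Invocation phase: reuse the client built during initialization.
    return CLIENT.save(event)
```

Every warm invocation then pays only for the `save` call, while the construction cost is moved to startup.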
Result:
- All languages (except Java and .NET) have a pretty small cold start.
- Java cannot even start with 128 MB. It needs more memory. But GraalVM can help in this case. Feel free to read a detailed page about GraalVM and AWS Lambda.
- Rust beats all runtimes for all setups; the only exception is 128 MB, where Python is the best.
- A bigger setup (more memory) helps only Java and .NET.
Warm test
The test sends 15,000 requests to each Lambda, one by one.
For the load test, I’m using JMeter. It looks like this:
Which metrics will we check?
- The average (per minute) duration for each language (256 MB setup; a short summary of the 128 MB results is at the end)
- The maximum (per minute) duration for each language (256 MB setup)
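As a rough sketch of how these two metrics can be derived from raw samples (not necessarily how the charts in this post were produced), the function below groups invocation durations by minute and reports the average and maximum for each bucket. The `(timestamp_seconds, duration_ms)` sample format and the function name are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def per_minute_stats(samples):
    """Group (timestamp_seconds, duration_ms) samples into one-minute buckets.

    Returns {minute_index: (average_ms, max_ms)}, ordered by minute.
    """
    buckets = defaultdict(list)
    for timestamp, duration_ms in samples:
        buckets[int(timestamp // 60)].append(duration_ms)
    return {
        minute: (mean(durations), max(durations))
        for minute, durations in sorted(buckets.items())
    }
```

The average shows the typical latency once the runtime has warmed up, while the per-minute maximum exposes outliers such as slow first invocations.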
Node.js
Node.js behaves as expected.
The first invocations are slow, but after JIT optimization it gets better:
Python
Python has stable performance: the 100th and the 15,000th invocations take the same time.
Ruby
I observe very weird behavior for Ruby: the average duration keeps growing (it looks like a memory leak or a bug in the code).
.NET
The first ~1k invocations are slow, but then it has very good performance:
Golang
Stable, brilliant performance:
Java
The first ~1k iterations are slow, then it becomes faster (JIT C1 helps).
For Java, I expected C2 JIT optimization after 10k iterations, but there is no further optimization even after 20k invocations, and the duration stays the same. See the screenshot below:
GraalVM
As expected, GraalVM has stable good performance from the very beginning.
Rust
Rust has consistently awesome performance.
All together
It’s very tricky to measure average performance because every new Lambda gives a slightly different result (I believe it’s because Lambdas are allocated on different hardware). I ran the test 3 times with a 30-minute delay between runs to get 3 different Lambda allocations.
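One simple way to combine such repeated runs is to report the mean of the per-run averages together with the spread between the fastest and the slowest run. This is an assumed approach, not necessarily the one used for the numbers in this post, and the function name and input shape are hypothetical.

```python
from statistics import mean


def summarize_runs(run_averages_ms):
    """Combine per-run average durations (ms) into a mean and a min-max spread."""
    if not run_averages_ms:
        raise ValueError("at least one run is required")
    return {
        "mean_ms": mean(run_averages_ms),
        "spread_ms": max(run_averages_ms) - min(run_averages_ms),
    }
```

A large spread relative to the mean suggests that differences between Lambda allocations dominate the measurement, and that small differences between runtimes should be read with caution.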
I also tested the same flow for a 128 MB Lambda, and here we can see a big difference.
I assume that for a CPU-intensive flow the difference between compiled and interpreted languages will be much bigger. I guess GraalVM doesn’t perform well with 128 MB because the native image still embeds a runtime with a garbage collector, which needs more memory than this setup provides, so GC runs too often.
Conclusion:
Cold start:
- All languages (except Java and .NET) have a pretty small cold start.
- Java cannot even start with 128 MB. It needs more memory. But GraalVM can help in this case.
- Rust beats all runtimes for all setups for cold start; the only exception is 128 MB, where Python is the best.
Warm start:
- Golang and Rust are the winners. They have the same brilliant performance.
- .NET has almost the same performance as Golang and Rust, but only after 1k iterations (after JIT).
- GraalVM has stable, great performance, almost the same as .NET and a bit worse than Rust and Golang. But it doesn’t perform well for the smallest setup.
- Java is next after GraalVM. Like .NET, Java needs some time (1–3k iterations) for JIT (C1). Unfortunately, for this particular use case, I was not able to achieve the expected great performance after JIT C2 compilation. Maybe AWS just disabled it.
- Python has stable, good performance but is too slow with 128 MB.
- Ruby has almost the same performance as Python, but the duration starts growing after 20 minutes of invocations (after 15k iterations).
- Node.js is the slowest runtime. After some time it gets better (JIT?) but is still not good enough. In addition, Node.js has the worst maximum duration.
Cold + warm start winners are Golang and Rust. They are almost always faster than the other runtimes and show very stable results.
Check my next performance comparison for AWS Lambda: x86 vs ARM https://filia-aleks.medium.com/aws-lambda-battle-x86-vs-arm-graviton2-perfromance-3581aaef75d9