gRPC - Google Remote Procedure Call
In this post, I want to consolidate my research on gRPC in a way that feels practical. I might be 10 years late to the party, but gRPC remains a widely used and mature framework, so it is still worth learning properly.
What Is gRPC
At its core, gRPC is an RPC framework built on top of HTTP/2 and Protocol Buffers (protobuf). Instead of thinking about “resources” like /beers/123, you think in terms of methods you can call on a service:
- GetBeerById
- CreateBeerBatch
- StreamInventory
From a single .proto file, gRPC generates strongly typed clients and servers for multiple languages like Java, Go, Python and many others. One file becomes the source of truth.
Protocol Buffers
Everything in gRPC starts with protobuf. Protobuf files define:
- Messages: the data structures exchanged between client and server
- Services: collections of callable RPC methods
- Methods: how information flows—unary, streaming, etc.
Here’s a piece of the project I made on gRPC:
syntax = "proto3";

option java_multiple_files = true;
option java_package = "com.palmeida.grpcplayground";
option java_outer_classname = "BeerServiceProto";

package beer;

// domain messages
message Beer {
  string id = 1;
  string name = 2;
  BeerStyle style = 3;
  double abv = 4;            // alcohol by volume
  uint32 ibu = 5;            // bitterness
  uint32 stock_quantity = 6;
}

enum BeerStyle {
  STYLE_UNKNOWN = 0;
  LAGER = 1;
  IPA = 2;
  STOUT = 3;
  PORTER = 4;
  PILSNER = 5;
  WHEAT = 6;
  SOUR = 7;
}

message CreateBeerRequest {
  string name = 1;
  BeerStyle style = 2;
  double abv = 3;
  uint32 ibu = 4;
  uint32 initial_stock = 5;
}

message BeerResponse {
  Beer beer = 1;
}

message GetBeerByIdRequest {
  string id = 1;
}

message ListBeersByStyleRequest {
  BeerStyle style = 1;
}

// client streaming message: client sends multiple beers to be created
message BeerBatchItem {
  string correlation_id = 1;
  CreateBeerRequest payload = 2;
}

message BeerBatchSummary {
  uint32 total_received = 1;
  uint32 total_created = 2;
  uint32 total_failed = 3;
  repeated string failed_names = 4;
}

// inventory streaming messages for bidirectional streaming
enum InventoryEventType {
  EVENT_UNKNOWN = 0;
  STOCK_INCREASED = 1;
  STOCK_DECREASED = 2;
}

message InventoryEvent {
  string beer_id = 1;
  InventoryEventType type = 2;
  int32 quantity_delta = 3;
}

message InventoryUpdate {
  string beer_id = 1;
  uint32 new_stock_quantity = 2;
  string message = 3;
}

// service definition using all four RPC types
service BeerService {
  // unary: get a beer by id
  rpc GetBeerById(GetBeerByIdRequest) returns (BeerResponse);

  // unary: create a beer
  rpc CreateBeer(CreateBeerRequest) returns (BeerResponse);

  // server streaming: list all beers by style
  rpc ListBeersByStyle(ListBeersByStyleRequest) returns (stream BeerResponse);

  // client streaming: client sends multiple beers to create, server returns summary
  rpc CreateBeerBatch(stream BeerBatchItem) returns (BeerBatchSummary);

  // bidirectional streaming: client sends inventory events (stock in/out),
  // server responds with continuous inventory updates
  rpc StreamInventory(stream InventoryEvent) returns (stream InventoryUpdate);
}
What I love about protobuf is that it’s engineered for change. The tag numbers (= 1, = 2, etc.) are what matter for the binary encoding. You can reorder fields, rename them, and evolve the schema as long as those tag numbers remain stable.
When compiled, protobuf generates language-specific classes with getters, builders, enums, etc…
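As an illustration of that stability, here is a hypothetical before/after pair (not from the project): the two versions are wire-compatible because the tag numbers match, even though a field was renamed and the declaration order changed:

```proto
// v1, as originally shipped
message Beer {
  string id = 1;
  string name = 2;
  double abv = 4;
}

// v2: field renamed and reordered, but each tag is unchanged,
// so payloads written by v1 still decode correctly under v2
message Beer {
  double abv = 4;
  string beer_name = 2; // new name, same tag -> same binary encoding
  string id = 1;
}
```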
How a gRPC Method Works Internally
I like to think of gRPC as “REST, but binary,” though the internal flow is quite different. A typical unary RPC call looks like this:
- Client code calls a method on a stub, e.g. beerServiceStub.getBeerById(request)
- The stub serializes the request using protobuf
- It sends the binary payload over an HTTP/2 stream
- The server receives it, deserializes it into a message
- The server implementation executes your business logic
- The response is serialized back and returned on the same stream
All of this happens behind the scenes. The user only sees strongly typed request/response objects.
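One concrete detail from step 3: the bytes on the HTTP/2 stream are not the raw protobuf payload. gRPC wraps each message in a length-prefixed frame: one compressed-flag byte followed by a 4-byte big-endian length, then the message itself. A minimal sketch of that framing in plain Java (the payload here is a stand-in byte array, not real protobuf output):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class GrpcFraming {
    // Wraps a serialized message in gRPC's length-prefixed frame:
    // 1 compressed-flag byte, 4 big-endian length bytes, then the payload.
    static byte[] frame(byte[] message, boolean compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(compressed ? 1 : 0);
        out.write(ByteBuffer.allocate(4).putInt(message.length).array());
        out.write(message);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = {0x0A, 0x03, 'i', 'p', 'a'}; // stand-in for protobuf bytes
        byte[] framed = frame(payload, false);
        System.out.println(framed.length); // 5-byte prefix + 5-byte payload = 10
        System.out.println(framed[0]);     // 0 = uncompressed
        System.out.println(framed[4]);     // low byte of the length = 5
    }
}
```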
Why gRPC Is Fast
From my research, I found three major reasons:
Protobuf > JSON
Binary serialization is significantly lighter—smaller payloads and faster encoding/decoding.
It stores data in byte formats instead of human-readable text, eliminating overhead like quotes, field names, and whitespace. Primitive values are written as fixed or efficiently packed bytes rather than long strings, and schemas allow binary formats to omit repeated metadata.
Since the data maps closely to how CPUs store and manipulate values, encoding and decoding requires far fewer parsing steps and conversions, making the process much faster while producing much smaller serialized messages.
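For instance, protobuf stores integers as varints: seven payload bits per byte, with the high bit signalling that more bytes follow, so small numbers fit in one or two bytes. A small sketch of that encoding (my own illustration, not the official library code):

```java
import java.io.ByteArrayOutputStream;

public class VarintDemo {
    // Encodes an unsigned value as a protobuf varint:
    // 7 bits per byte, high bit set while more bytes follow.
    static byte[] encodeVarint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // 300 encodes to two bytes (0xAC 0x02) instead of the four bytes of a
        // fixed int32, and with none of JSON's quotes, field names, or whitespace
        for (byte b : encodeVarint(300)) {
            System.out.println(String.format("0x%02X", b & 0xFF));
        }
    }
}
```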
HTTP/2
gRPC sits entirely on HTTP/2, inheriting benefits like multiplexed streams, persistent connections, and header compression.
The protocol was designed for streaming from the start.
Contract-generated code
There’s no string-based routing, reflection-heavy serialization, or dynamic payload parsing. Everything compiles down to direct method calls.
Request/response types in gRPC
gRPC defines four fundamental interaction models.
Unary
Pattern: single request → single response
rpc GetBeerById(GetBeerByIdRequest) returns (BeerResponse);
Used for:
- standard request/response operations (similar to a typical REST call)
- fetching or updating a single resource
In this project:
- GetBeerById – looks up a beer by its ID
- CreateBeer – creates a single beer (also unary)
Server Streaming
Pattern: single request → stream of responses
rpc ListBeersByStyle(ListBeersByStyleRequest) returns (stream BeerResponse);
Use cases:
- sending back a sequence of items without the client having to keep polling
- real-time updates, or “pages” of data where streaming is more efficient than pagination
In this project:
- the client sends one ListBeersByStyleRequest with a style like IPA
- the server streams each BeerResponse (every IPA beer) one by one
This is especially nice when:
- results might be large
- you want to start processing / displaying them before the full list is ready (better user experience)
Client Streaming
Pattern: stream of requests → single response
rpc CreateBeerBatch(stream BeerBatchItem) returns (BeerBatchSummary);
Use cases:
- client wants to upload or send a batch of information
- server wants to respond with a summary or aggregated result
In this project:
- the client sends multiple BeerBatchItem messages, each wrapping a CreateBeerRequest
- the server:
- processes each BeerBatchItem as it arrives
- aggregates counts for total_received, total_created, and failures
- returns a single BeerBatchSummary when the client closes the stream
This shows:
- handling of per-item validation errors without failing the entire stream
- batch processing via streaming rather than a massive single request
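The server-side aggregation can be sketched without any gRPC machinery. In this standalone illustration the stream is just a list, the record types stand in for the generated protobuf classes, and the validation rules are hypothetical, but the shape of the logic is the same: validate each item as it arrives, record failures, and keep going:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSummaryDemo {
    // Stand-ins for the classes protoc would generate
    record BeerBatchItem(String name, double abv) {}
    record BeerBatchSummary(int totalReceived, int totalCreated,
                            int totalFailed, List<String> failedNames) {}

    // Aggregates items the way a CreateBeerBatch handler would:
    // a per-item validation failure is recorded, never fatal to the stream.
    static BeerBatchSummary summarize(List<BeerBatchItem> stream) {
        int created = 0;
        List<String> failed = new ArrayList<>();
        for (BeerBatchItem item : stream) {
            if (item.name().isBlank() || item.abv() < 0) {
                failed.add(item.name()); // count the failure, continue the stream
            } else {
                created++;
            }
        }
        return new BeerBatchSummary(stream.size(), created, failed.size(), failed);
    }

    public static void main(String[] args) {
        List<BeerBatchItem> stream = List.of(
                new BeerBatchItem("Hoppy One", 6.5),
                new BeerBatchItem("", 5.0), // invalid: blank name
                new BeerBatchItem("Dark Stout", 8.0));
        BeerBatchSummary summary = summarize(stream);
        System.out.println(summary.totalReceived()); // 3
        System.out.println(summary.totalCreated());  // 2
        System.out.println(summary.totalFailed());   // 1
    }
}
```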
Bidirectional Streaming
Pattern: stream of requests ↔ stream of responses (independent)
rpc StreamInventory(stream InventoryEvent) returns (stream InventoryUpdate);
Use cases:
- real-time bidirectional communication
- chat, live telemetry, streaming analytics, collaborative apps, continuous control channels
In this project:
- the client streams InventoryEvent messages, such as:
- “stock increased by 20”
- “stock decreased by 10”
- the server:
- applies each event to the domain model
- immediately streams back an InventoryUpdate with the new stock and a human-readable message
This shows:
- full duplex communication
- stateful interactions over a single connection
- how to coordinate business logic over a long-lived stream
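The programming model behind this is a pair of observers: the server receives an observer for the response stream and hands back an observer for the request stream. The sketch below simulates that pattern with a minimal stand-in interface (modeled after, but not identical to, grpc-java's StreamObserver), so it runs with no gRPC dependency:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BidiSketch {
    // Minimal stand-in for the grpc-java StreamObserver callback interface
    interface StreamObserver<T> { void onNext(T value); void onCompleted(); }

    record InventoryEvent(String beerId, int quantityDelta) {}
    record InventoryUpdate(String beerId, int newStock, String message) {}

    // The server returns an observer for the client's events and pushes an
    // InventoryUpdate for each one: full duplex over a single logical stream.
    static StreamObserver<InventoryEvent> streamInventory(
            Map<String, Integer> stock, StreamObserver<InventoryUpdate> responses) {
        return new StreamObserver<>() {
            @Override public void onNext(InventoryEvent event) {
                int updated = stock.merge(event.beerId(), event.quantityDelta(), Integer::sum);
                responses.onNext(new InventoryUpdate(event.beerId(), updated,
                        "stock changed by " + event.quantityDelta()));
            }
            @Override public void onCompleted() { responses.onCompleted(); }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new HashMap<>(Map.of("ipa-1", 100));
        List<Integer> seen = new ArrayList<>();
        StreamObserver<InventoryUpdate> responses = new StreamObserver<>() {
            @Override public void onNext(InventoryUpdate u) { seen.add(u.newStock()); }
            @Override public void onCompleted() { System.out.println("done"); }
        };
        StreamObserver<InventoryEvent> requests = streamInventory(stock, responses);
        requests.onNext(new InventoryEvent("ipa-1", 20));  // stock increased by 20
        requests.onNext(new InventoryEvent("ipa-1", -10)); // stock decreased by 10
        requests.onCompleted();
        System.out.println(seen); // [120, 110]
    }
}
```

In real grpc-java, both observers are asynchronous and the two streams flow independently; here they are synchronous only to keep the sketch self-contained.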
Channels and Stubs: How Clients Connect
In gRPC:
- A channel is a long-lived connection to a server (e.g. localhost:9090)
- A stub is the client-side interface used to call remote methods
Example using Spring Boot + gRPC:
@GrpcClient("beerClient")
private BeerServiceGrpc.BeerServiceBlockingStub blockingStub;
Then you just call:
BeerResponse created = blockingStub.createBeer(createRequest);
No REST templates. No serialization code. No URL construction. Just strongly typed method calls, as if you were calling another class in the same JVM, even if the server is running in another country.
gRPC in the Real World
gRPC is widely used at companies that take performance and internal service reliability very seriously, such as Google and Netflix.
You’ll see it in:
- High-throughput microservices
- Polyglot server ecosystems
- Low-latency applications (search, recommendations, ads)
- Game backends
- Streaming analytics
It’s not a replacement for REST or GraphQL in public APIs; gRPC excels behind the scenes, powering internal networks where speed and efficiency matter more than human readability.
Security and Transport
gRPC is also very strong in terms of security. TLS is a first-class part of how services communicate rather than an afterthought (plaintext connections exist, but mainly for local development). In production environments, most teams rely on TLS or mutual TLS (mTLS) for strong authentication and encrypted traffic.
TLS / mTLS
- TLS encrypts traffic and protects data in transit.
- mTLS adds service-to-service authentication so both the client and server present certificates, which means each party can cryptographically verify the other’s identity. In zero-trust networks, this is the norm.
JWT / OAuth
For user-facing flows or API gateway level auth:
- gRPC headers support JWT tokens as metadata
- OAuth2 tokens flow exactly as they do in REST
- Interceptors can enforce auth logic at the application layer
Error Handling and Resiliency
gRPC has a clear error model that’s more structured than HTTP’s status codes, but the mindset is similar.
Deadlines
The client specifies how long it’s willing to wait:
BeerResponse response = blockingStub.withDeadlineAfter(deadlineMs, TimeUnit.MILLISECONDS).getBeerById(request);
The deadline covers the entire RPC: the clock starts when the client creates the call, and if the response has not arrived in time, the call fails with DEADLINE_EXCEEDED.
Deadlines travel across the call chain, which prevents downstream services from being overloaded by “stuck” upstream calls.
Cancellations
Clients can cancel requests to free up resources—especially useful with streaming RPCs or long-running operations.
Also, according to the documentation: “When an RPC is cancelled, the server should stop any ongoing computation and end its side of the stream. Often, servers are also clients to upstream servers, so that cancellation operation should ideally propagate to all ongoing computation in the system that was initiated due to the original client RPC call.”
Refer to https://github.com/grpc/grpc-java/tree/master/examples/src/main/java/io/grpc/examples/cancellation for a detailed example.
Schema Evolution and Versioning
Protobuf is designed to evolve gracefully, but only if you respect the rules:
- Tag numbers are forever. Do not reuse them.
- The name can change, the tag cannot.
If you remove a field, reserve its tag and name:
reserved 3;
reserved "old_field_name";
This prevents accidental reuse.
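In context, a hypothetical later revision of the project's Beer message that dropped the ibu field might look like this (protoc will then reject any attempt to reuse tag 5 or the name "ibu"):

```proto
message Beer {
  reserved 5;
  reserved "ibu";
  string id = 1;
  string name = 2;
  BeerStyle style = 3;
  double abv = 4;
  uint32 stock_quantity = 6;
}
```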
When to Use (and Not Use) gRPC
When gRPC Is a Great Fit
- Service-to-service communication inside a microservices architecture
- High-performance, low-latency environments
- Polyglot systems where shared types matter
- Real-time streaming (telemetry, analytics)
When gRPC Is Not Ideal
- Public APIs: REST or GraphQL are more approachable
- Browser clients: gRPC-Web works but is still limited compared to REST
- Human-readable debugging: binary payloads require tooling
- Large file uploads: sometimes REST with multipart/form-data is simply easier
I would choose gRPC for internal systems, not as a universal external API.