In this post, I want to consolidate my research on gRPC in a way that feels practical. I might be 10 years late to the party, but gRPC remains a widely used and mature framework.

What Is gRPC

At its core, gRPC is an RPC framework built on top of HTTP/2 and Protocol Buffers (protobuf). Instead of thinking about “resources” like /beers/123, you think in terms of methods you can call on a service:

  • GetBeerById
  • CreateBeerBatch
  • StreamInventory

From a single .proto file, gRPC generates strongly typed clients and servers in many languages, including Java, Go and Python. One file becomes the source of truth.

Protocol Buffers

Everything in gRPC starts with protobuf. Protobuf files define:

  • Messages: the data structures exchanged between client and server
  • Services: collections of callable RPC methods
  • Methods: how information flows—unary, streaming, etc.

Here’s the .proto contract from my gRPC playground project:

syntax = "proto3";

option java_multiple_files = true;
option java_package = "com.palmeida.grpcplayground";
option java_outer_classname = "BeerServiceProto";

package beer;

//domain messages
message Beer {
  string id = 1;
  string name = 2;
  BeerStyle style = 3;
  double abv = 4; //alcohol by volume
  uint32 ibu = 5; //bitterness
  uint32 stock_quantity = 6;
}

enum BeerStyle {
  STYLE_UNKNOWN = 0;
  LAGER = 1;
  IPA = 2;
  STOUT = 3;
  PORTER = 4;
  PILSNER = 5;
  WHEAT = 6;
  SOUR = 7;
}

message CreateBeerRequest {
  string name = 1;
  BeerStyle style = 2;
  double abv = 3;
  uint32 ibu = 4;
  uint32 initial_stock = 5;
}

message BeerResponse {
  Beer beer = 1;
}

message GetBeerByIdRequest {
  string id = 1;
}

message ListBeersByStyleRequest {
  BeerStyle style = 1;
}

//client streaming message: client sends multiple beers to be created
message BeerBatchItem {
  string correlation_id = 1;
  CreateBeerRequest payload = 2;
}

message BeerBatchSummary {
  uint32 total_received = 1;
  uint32 total_created = 2;
  uint32 total_failed = 3;
  repeated string failed_names = 4;
}

//inventory streaming messages for bidirectional streaming
enum InventoryEventType {
  EVENT_UNKNOWN = 0;
  STOCK_INCREASED = 1;
  STOCK_DECREASED = 2;
}

message InventoryEvent {
  string beer_id = 1;
  InventoryEventType type = 2;
  int32 quantity_delta = 3;
}

message InventoryUpdate {
  string beer_id = 1;
  uint32 new_stock_quantity = 2;
  string message = 3;
}

//service definition using all four RPC types
service BeerService {

  //unary: get a beer by id
  rpc GetBeerById(GetBeerByIdRequest) returns (BeerResponse);

  //unary: create a beer
  rpc CreateBeer(CreateBeerRequest) returns (BeerResponse);

  //server streaming: list all beers by style
  rpc ListBeersByStyle(ListBeersByStyleRequest) returns (stream BeerResponse);

  //client streaming: client sends multiple beers to create, server returns summary
  rpc CreateBeerBatch(stream BeerBatchItem) returns (BeerBatchSummary);

  //bidirectional streaming:
  //client sends inventory events (stock in/out),
  //server responds with continuous inventory updates
  rpc StreamInventory(stream InventoryEvent) returns (stream InventoryUpdate);
}

What I love about protobuf is that it’s engineered for change. The tag numbers (= 1, = 2, etc.) are what matter for the binary encoding. You can reorder fields, rename them, and evolve the schema as long as those tag numbers remain stable.

When compiled, protobuf generates language-specific classes with getters, builders, enums, and so on.

How a gRPC Method Works Internally

I like to think of gRPC as “REST but binary,” but the internal flow is quite different. A typical unary RPC call looks like this:

  1. Client code calls a method on a stub, e.g. beerServiceStub.getBeerById(request)
  2. The stub serializes the request using protobuf
  3. It sends the binary payload over an HTTP/2 stream
  4. The server receives it, deserializes it into a message
  5. The server implementation executes your business logic
  6. The response is serialized back and returned on the same stream

All of this happens behind the scenes. The user only sees strongly typed request/response objects.
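The six steps above can be sketched with plain Java I/O standing in for protobuf and HTTP/2. All names here are invented for the sketch; real gRPC generates the stub and server plumbing for you:

```java
import java.nio.charset.StandardCharsets;

// Toy walk-through of a unary call: UTF-8 bytes stand in for protobuf,
// and a byte array stands in for the HTTP/2 stream.
public class UnaryFlowSketch {

    // steps 1-2: the "stub" turns the typed request (a beer id) into bytes
    static byte[] serializeRequest(String beerId) {
        return beerId.getBytes(StandardCharsets.UTF_8);
    }

    // steps 4-5: the "server" decodes the request and runs business logic
    static byte[] handleOnServer(byte[] requestBytes) {
        String beerId = new String(requestBytes, StandardCharsets.UTF_8);
        String beerName = "Beer-" + beerId;            // stand-in business logic
        return beerName.getBytes(StandardCharsets.UTF_8);
    }

    // step 6: the client decodes the response back into a typed value
    static String deserializeResponse(byte[] responseBytes) {
        return new String(responseBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] wire = serializeRequest("123");         // step 3: bytes cross the network
        System.out.println(deserializeResponse(handleOnServer(wire)));
    }
}
```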

Why gRPC Is Fast

From my research, I found three major reasons:

Protobuf > JSON

Binary serialization is significantly lighter—smaller payloads and faster encoding/decoding.

It stores data in byte formats instead of human-readable text, eliminating overhead like quotes, field names, and whitespace. Primitive values are written as fixed or efficiently packed bytes rather than long strings, and schemas allow binary formats to omit repeated metadata.

Since the data maps closely to how CPUs store and manipulate values, encoding and decoding require far fewer parsing steps and conversions, making the process much faster while producing much smaller serialized messages.
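A big part of that compactness is the varint encoding protobuf uses for integers: 7 data bits per byte, with the high bit flagging that more bytes follow. A minimal sketch (the JSON comparison is just for scale; real protobuf also prefixes a tag byte):

```java
// Protobuf-style varint: small numbers take one byte, 300 takes two,
// versus the 22 characters JSON needs for the same field.
public class VarintSketch {

    static byte[] encodeVarint(long value) {
        java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {     // more than 7 bits left?
            out.write((int) ((value & 0x7F) | 0x80)); // emit 7 bits + continuation flag
            value >>>= 7;
        }
        out.write((int) value);             // final byte, continuation flag clear
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] binary = encodeVarint(300);
        byte[] json = "{\"stock_quantity\":300}"
                .getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println(binary.length + " bytes vs " + json.length + " bytes");
        // → 2 bytes vs 22 bytes
    }
}
```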

HTTP/2

gRPC sits entirely on HTTP/2, inheriting benefits like multiplexed streams, persistent connections, header compression and flow control.

The protocol was designed for streaming from the start.

Contract-generated code

There’s no string-based routing, reflection-heavy serialization, or dynamic payload parsing. Everything compiles down to direct method calls.

Request/response types in gRPC

gRPC defines four fundamental interaction models.

Unary

Pattern: single request → single response

rpc GetBeerById(GetBeerByIdRequest) returns (BeerResponse);

Used for:

  • standard request/response operations (similar to a typical REST call)
  • fetching or updating a single resource

In this project:

  • GetBeerById – looks up a beer by its ID
  • CreateBeer – creates a single beer (also unary)

Server Streaming

Pattern: single request → stream of responses

rpc ListBeersByStyle(ListBeersByStyleRequest) returns (stream BeerResponse);

Use cases:

  • sending back a sequence of items without the client having to keep polling
  • real-time updates, or “pages” of data where streaming is more efficient than pagination

In this project:

  • the client sends one ListBeersByStyleRequest with a style like IPA
  • the server streams each BeerResponse (every IPA beer) one by one

This is especially nice when:

  • results might be large
  • you want to start processing / displaying them before the full list is ready (better user experience)
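The server-side shape of this can be sketched without gRPC at all: each match is pushed to a callback the moment it is found, the way a gRPC server calls onNext(...) once per response message. Beer and the callback here are simplified stand-ins, not the generated protobuf classes:

```java
import java.util.List;
import java.util.function.Consumer;

// Sketch of the server logic behind ListBeersByStyle: matches are emitted
// one by one instead of being collected into a full list first.
public class ServerStreamingSketch {
    record Beer(String name, String style) {}

    static void listBeersByStyle(List<Beer> catalog, String style, Consumer<Beer> onNext) {
        for (Beer beer : catalog) {
            if (beer.style().equals(style)) {
                onNext.accept(beer); // streamed immediately, before the scan finishes
            }
        }
        // a real gRPC server would now call responseObserver.onCompleted()
    }

    public static void main(String[] args) {
        List<Beer> catalog = List.of(
                new Beer("Hop Bomb", "IPA"),
                new Beer("Dark Night", "STOUT"),
                new Beer("Hazy Days", "IPA"));
        listBeersByStyle(catalog, "IPA", b -> System.out.println(b.name()));
    }
}
```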

Client Streaming

Pattern: stream of requests → single response

rpc CreateBeerBatch(stream BeerBatchItem) returns (BeerBatchSummary);

Use cases:

  • client wants to upload or send a batch of information
  • server wants to respond with a summary or aggregated result

In this project:

  • the client sends multiple BeerBatchItem messages, each wrapping a CreateBeerRequest
  • the server:
    • processes each BeerBatchItem as it arrives
    • aggregates counts for total_received, total_created, and failures
    • returns a single BeerBatchSummary when the client closes the stream

This shows:

  • handling of per-item validation errors without failing the entire stream
  • batch processing via streaming rather than a massive single request
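The aggregation the server runs can be sketched like this. Field names mirror the proto; the validation rule (a non-blank name) is an assumption for the demo, and the incoming list stands in for the client’s stream of onNext calls:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the CreateBeerBatch server: counters accumulate per item,
// per-item failures don't abort the stream, and one summary comes back.
public class BatchSummarySketch {
    record Item(String correlationId, String name) {}
    record Summary(int totalReceived, int totalCreated, int totalFailed,
                   List<String> failedNames) {}

    static Summary createBeerBatch(List<Item> incoming) {
        int received = 0, created = 0;
        List<String> failed = new ArrayList<>();
        for (Item item : incoming) {               // in gRPC: one onNext(...) per item
            received++;
            if (item.name() == null || item.name().isBlank()) {
                failed.add(String.valueOf(item.name())); // record failure, keep going
            } else {
                created++;
            }
        }
        // in gRPC this is built and sent when the client half-closes the stream
        return new Summary(received, created, failed.size(), failed);
    }
}
```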

Bidirectional Streaming

Pattern: stream of requests ↔ stream of responses (independent)

rpc StreamInventory(stream InventoryEvent) returns (stream InventoryUpdate);

Use cases:

  • real-time bidirectional communication
  • chat, live telemetry, streaming analytics, collaborative apps, continuous control channels

In this project:

  • the client streams InventoryEvent messages, such as:
    • “stock increased by 20”
    • “stock decreased by 10”
  • the server:
    • applies each event to the domain model
    • immediately streams back an InventoryUpdate with the new stock and a human-readable message

This shows:

  • full duplex communication
  • stateful interactions over a single connection
  • how to coordinate business logic over a long-lived stream
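The stateful server side of this can be sketched as a small class that holds stock for the life of the stream and turns each incoming event into an immediate update (simplified stand-ins for the generated message types):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the StreamInventory server: state lives as long as the stream,
// and each event produces one update back to the client right away.
public class InventoryStreamSketch {
    enum EventType { STOCK_INCREASED, STOCK_DECREASED }
    record Event(String beerId, EventType type, int quantityDelta) {}
    record Update(String beerId, int newStockQuantity, String message) {}

    private final Map<String, Integer> stock = new HashMap<>();

    // called once per incoming event; in gRPC this is the request observer's onNext
    Update apply(Event event) {
        int delta = event.type() == EventType.STOCK_DECREASED
                ? -event.quantityDelta() : event.quantityDelta();
        int updated = stock.merge(event.beerId(), delta, Integer::sum);
        return new Update(event.beerId(), updated,
                "Stock for " + event.beerId() + " is now " + updated);
    }
}
```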

Channels and Stubs: How Clients Connect

In gRPC:

  • A channel is a long-lived connection to a server (for example, localhost:9090)
  • A stub is the client-side interface used to call remote methods

Example using Spring Boot + gRPC:

@GrpcClient("beerClient")
private BeerServiceGrpc.BeerServiceBlockingStub blockingStub;

Then you just call:

BeerResponse created = blockingStub.createBeer(createRequest);

No REST templates. No serialization code. No URL construction. Just strongly typed method calls, as if you were calling another class in the same JVM, even if the server is running in another country.

gRPC in the Real World

gRPC is widely used at companies that take performance and internal service reliability very seriously, such as Google and Netflix.

You’ll see it in:

  • High-throughput microservices
  • Polyglot server ecosystems
  • Low-latency applications (search, recommendations, ads)
  • Game backends
  • Streaming analytics

It’s not a public-API replacement for REST or GraphQL; gRPC excels behind the scenes, powering internal networks where speed and efficiency matter more than human readability.

Security and Transport

gRPC is also very strong in terms of security. Because it’s built on HTTP/2, TLS is a first-class part of how services communicate rather than an afterthought. In production environments, most teams rely on TLS or mutual TLS (mTLS) for strong authentication and encrypted traffic.

TLS / mTLS

  • TLS encrypts traffic and protects data in transit.
  • mTLS adds service-to-service authentication so both the client and server present certificates, which means each party can cryptographically verify the other’s identity. In zero-trust networks, this is the norm.

JWT / OAuth

For user-facing flows or API gateway level auth:

  • JWTs can be carried in gRPC metadata (headers)
  • OAuth2 tokens flow exactly as they do in REST
  • Interceptors can enforce auth logic at the application layer
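The check such an interceptor performs can be sketched with the metadata modeled as a plain map (grpc-java actually exposes it through io.grpc.Metadata; the header key and Bearer scheme follow common convention and are assumptions here):

```java
import java.util.Map;
import java.util.Optional;

// Sketch of an auth interceptor's core check: pull the bearer token out of
// the call's metadata before any business logic runs.
public class AuthSketch {

    static Optional<String> bearerToken(Map<String, String> headers) {
        String auth = headers.get("authorization");
        if (auth != null && auth.startsWith("Bearer ")) {
            return Optional.of(auth.substring("Bearer ".length()));
        }
        return Optional.empty(); // interceptor would reject with UNAUTHENTICATED
    }
}
```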

Error Handling and Resiliency

gRPC has a clear error model that’s more structured than HTTP’s status codes, but the mindset is similar.

Deadlines

The client specifies how long it’s willing to wait:

BeerResponse response = blockingStub.withDeadlineAfter(deadlineMs, TimeUnit.MILLISECONDS).getBeerById(request);

The deadline covers the entire RPC: from the moment the client starts the call until the response is received.

Deadlines travel across the call chain, which prevents downstream services from being overloaded by “stuck” upstream calls.
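The arithmetic behind that propagation is simple: the absolute deadline is fixed once at the edge, and each downstream hop only gets what remains. A toy sketch (grpc-java carries this automatically via its Deadline/Context machinery):

```java
// Sketch of deadline propagation: downstream services inherit the remaining
// budget, they never get a fresh timeout of their own.
public class DeadlineSketch {

    static long remainingMs(long deadlineEpochMs, long nowEpochMs) {
        return deadlineEpochMs - nowEpochMs;
    }

    public static void main(String[] args) {
        long deadline = 1_000;    // client allows 1000 ms total
        long afterFirstHop = 350; // 350 ms already spent upstream
        // the downstream call must finish within what's left, not a fresh 1000 ms
        System.out.println(remainingMs(deadline, afterFirstHop)); // 650
    }
}
```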

Cancellations

Clients can cancel requests to free up resources—especially useful with streaming RPCs or long-running operations.

Also, according to the documentation: “When an RPC is cancelled, the server should stop any ongoing computation and end its side of the stream. Often, servers are also clients to upstream servers, so that cancellation operation should ideally propagate to all ongoing computation in the system that was initiated due to the original client RPC call.”

Refer to https://github.com/grpc/grpc-java/tree/master/examples/src/main/java/io/grpc/examples/cancellation for a detailed example.

Schema Evolution and Versioning

Protobuf is designed to evolve gracefully, but only if you respect the rules:

  • Tag numbers are forever. Do not reuse them.
  • The name can change, the tag cannot.

If you remove a field, reserve its tag and name:

reserved 3;
reserved "old_field_name";

This prevents accidental reuse.

When to Use (and Not Use) gRPC

When gRPC Is a Great Fit

  • Service-to-service communication inside a microservices architecture
  • High-performance, low-latency environments
  • Polyglot systems where shared types matter
  • Real-time streaming (telemetry, analytics)

When gRPC Is Not Ideal

  • Public APIs: REST or GraphQL are more approachable
  • Browser clients: gRPC-Web works but is still limited compared to REST
  • Human-readable debugging: binary payloads require tooling
  • Large file uploads: sometimes REST with multipart/form-data is simply easier

I would choose gRPC for internal systems, not as a universal external API.