pk.org: CS 417/Lecture Notes

Web Services and gRPC

From SOAP to REST to gRPC

Paul Krzyzanowski – 2026-01-19

Goal: Provide a standard, web-native way for programs to communicate using Internet protocols (especially HTTP), so independently built components can interoperate reliably without custom, per-pair integration code.

“The nice thing about standards is that there are so many of them to choose from.”
– Andrew Tanenbaum (widely attributed)

Traditional RPC systems like ONC RPC, DCE RPC, and DCOM were designed primarily for enterprise or campus networks, where participants were usually inside one organization, connectivity was more predictable than the public internet, and security more easily managed. As the internet became the dominant platform for distributed computing, these systems faced significant challenges:

Interoperability: Different languages, operating systems, and hardware architectures needed to communicate seamlessly. Proprietary binary formats and platform-specific implementations created barriers.

Firewalls: Traditional RPC systems used dynamic ports assigned at runtime. Firewalls, configured to allow only specific ports like HTTP (80) and HTTPS (443), blocked these connections.

State management: Object-oriented RPC systems (for example, DCOM and Java RMI) encouraged long-lived remote objects. That makes replication, load balancing, and recovery harder because state is tied to a particular server process.

No asynchronous messaging: Traditional RPC followed a strict request-response pattern. Large streaming responses, notifications of delays, and publish-subscribe models were not supported.

These limitations drove the development of web services, designed to work over HTTP and provide language-independent, platform-neutral communication.

The Rise of Web Services

A web browser provides a dominant model for user interaction on the internet, but web pages are designed for human consumption. The user interface is a major component of the content, making programmatic access to data difficult. Web services emerged to provide machine-readable interfaces that applications could consume directly.

Web services are a set of protocols by which services can be published, discovered, and used in a technology-neutral form. They are language and architecture independent. Applications typically invoke multiple remote services, an approach called Service-Oriented Architecture (SOA).

Early web-services designs included automatic discovery and registries (for example, UDDI – Universal Description, Discovery, and Integration), but in practice most services are discovered out of band through documentation, code repositories, and configuration.

SOA treats an application as an integration of network-accessible services, where each service has a well-defined interface. Components are unassociated and loosely coupled, meaning neither service depends on the other’s implementation. This loose coupling lets services be developed, replaced, and scaled independently.

XML-RPC

XML-RPC, born in early 1998, was one of the first attempts at web services. It marshals data into XML messages transmitted over HTTP. The format is human-readable, uses explicit typing, and works through firewalls since it uses standard HTTP.

XML-RPC is still relevant and useful because Python ships with client and server support in its standard library (xmlrpc.client and xmlrpc.server), so you can experiment with RPC without any third-party packages.

XML-RPC supports basic data types: int, string, boolean, double, dateTime, base64, array, and struct. The specification is remarkably simple at about seven pages. WordPress historically used XML-RPC for remote publishing and integrations, and many sites still have it enabled, although modern WordPress deployments often prefer the REST API and may disable XML-RPC because it is a common attack surface.

<methodCall>
    <methodName>sample.sumAndDifference</methodName>
    <params>
        <param><value><int>5</int></value></param>
        <param><value><int>3</int></value></param>
    </params>
</methodCall>
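The same call can be sketched with Python’s bundled XML-RPC modules. The server and port here are local stand-ins for illustration; the method name mirrors the <methodCall> example above.

```python
# Minimal XML-RPC demo using only the Python standard library.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def sum_and_difference(a, b):
    # The returned dict is marshalled as an XML-RPC <struct>.
    return {"sum": a + b, "difference": a - b}

# Port 0 asks the OS for any free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(sum_and_difference, "sample.sumAndDifference")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

client = ServerProxy(f"http://127.0.0.1:{port}")
# Dotted method names work through attribute access on the proxy.
result = client.sample.sumAndDifference(5, 3)
print(result)
server.shutdown()
```

Behind the scenes, the proxy marshals the arguments into exactly the kind of XML request shown above and POSTs it over HTTP.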

However, XML-RPC failed to become widely adopted. It supported only a small set of data types and lacked strong extensibility. The ecosystem never matured to provide robust libraries or broad community buy-in compared to later platforms. It also lacked rich features for interface definitions and schemas that enterprises demanded. XML itself was bulky, slow to parse, and difficult to debug compared to later, simpler formats such as JSON. XML-RPC was designed for simple remote function calls rather than complex distributed systems.

SOAP

SOAP (originally “Simple Object Access Protocol”)1 emerged in 1998 with strong Microsoft and IBM support. XML-RPC began in 1998 as a simplified offshoot of an early SOAP draft. SOAP extends XML-RPC’s model with user-defined data types, message routing, and extensibility.

SOAP provides a stateless messaging model on top of which various interaction patterns can be built, including one-way notification, request-response, and asynchronous callbacks.

WSDL: Web Services Description Language

WSDL serves as an IDL for SOAP services. A WSDL document describes a service’s data types, messages, operations, and protocol bindings. Organizations exchange WSDL documents to describe their APIs.

WSDL is not meant for human consumption. It is verbose XML designed for tools to read. Development environments like Visual Studio and Java’s JAX-WS read WSDL documents and generate client stubs automatically, hiding the complexity from developers.

The Decline of SOAP

SOAP is still used in some enterprise and legacy systems, particularly where long-lived integrations, strict contracts, or formal standards are required. However, the dominant direction for public web APIs today is REST using JSON. SOAP has long been criticized for its complexity and tooling friction: verbose XML envelopes, heavyweight toolchains and code generation, and interoperability quirks between different vendors’ SOAP stacks.

Despite these drawbacks, SOAP’s strong emphasis on formal contracts and standards made it attractive for certain enterprise use cases.

Google dropped SOAP support from its public APIs in 2006, signaling an early shift toward simpler alternatives.

REST

REST was introduced in response to the growing complexity of earlier web service technologies such as SOAP.

As web applications scaled and diversified, developers needed an approach that was simpler, more interoperable, and better aligned with the existing web infrastructure. REST embraces the design principles of HTTP itself, using standard methods and resource-oriented URLs rather than custom protocols and complex message formats.

REST (REpresentational State Transfer) builds on these principles by treating the web as a collection of resources rather than remote procedures. Instead of defining custom message formats and operations, REST leverages HTTP’s existing methods to operate on resources.

A resource is any data the server manages, identified by a URL. HTTP methods operate on resources:

HTTP Method   Operation        Idempotent?   Description
POST          Create           No            Create a new resource
GET           Read             Yes           Retrieve a resource
PUT           Update           Yes           Replace a resource entirely
PATCH         Partial update   Depends       Modify part of a resource (may or may not be idempotent)
DELETE        Delete           Yes           Remove a resource

The core CRUD operations (Create, Read, Update, Delete) are represented by POST, GET, PUT, and DELETE. PATCH is an additional method used for partial updates when replacing an entire resource would be inefficient or undesirable.

An important property of many HTTP methods is idempotency: GET, PUT, and DELETE should produce the same result regardless of how many times they are called. POST is not idempotent because each call may create a new resource (although many APIs add idempotency keys so clients can safely retry). PATCH may or may not be idempotent depending on how the server applies the partial update.
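A toy in-memory store (illustrative only, not a real API) shows why repeating a POST changes server state while repeating a PUT does not:

```python
# Toy resource store illustrating idempotency of PUT vs. POST.
next_id = 0
users = {}

def post_user(data):
    # POST: each call creates a NEW resource, so repeating it is not idempotent.
    global next_id
    next_id += 1
    users[next_id] = dict(data)
    return next_id

def put_user(user_id, data):
    # PUT: replaces the resource entirely; repeating it leaves the same state.
    users[user_id] = dict(data)

post_user({"name": "Alice"})
post_user({"name": "Alice"})          # a duplicate resource is created
print(len(users))                     # 2

put_user(1, {"name": "Alice Smith"})
put_user(1, {"name": "Alice Smith"})  # no further change to server state
print(users[1])
```

This is why a retried POST needs extra machinery (such as an idempotency key) while a retried PUT is safe by construction.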

REST in Practice

REST services identify resources through URLs and return representations of those resources, typically in JSON format.

GET /api/users              # List all users
GET /api/users/123          # Get user 123
POST /api/users             # Create a new user
PUT /api/users/123          # Update user 123
DELETE /api/users/123       # Delete user 123

A response to the HTTP request GET /api/users/123 might look like:

{
    "id": 123,
    "name": "Alice Smith",
    "email": "alice@example.com",
    "created_at": "2024-01-15T10:30:00Z"
}
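As a sketch, the exchange can be reproduced end to end with Python’s standard library. The server below is a stand-in that serves only the hypothetical /api/users/123 resource from the text:

```python
# Minimal REST-style GET: a local stand-in server plus a urllib client.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

USER = {"id": 123, "name": "Alice Smith", "email": "alice@example.com",
        "created_at": "2024-01-15T10:30:00Z"}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/api/users/123":
            body = json.dumps(USER).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):   # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0 = any free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

with urllib.request.urlopen(f"http://127.0.0.1:{port}/api/users/123") as resp:
    user = json.load(resp)
print(user["name"])
server.shutdown()
```

Note that the client needs nothing beyond HTTP and JSON: no stubs, no IDL, no generated code.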

REST’s simplicity made it the dominant paradigm for web APIs. Major platforms including Twitter, Facebook, Amazon, and Google provide REST APIs.

REST vs. RPC Paradigms

The conceptual difference between REST and RPC lies in how they model and expose functionality:

RPC approach: Define procedures (operations) that act on data

getUser(123)
createUser(name, email)
updateUserEmail(123, "new@example.com")
deleteUser(123)

REST approach: Define resources and use HTTP methods

GET    /users/123
POST   /users
PUT    /users/123
DELETE /users/123

In REST, the API is organized around resources with a uniform set of operations (HTTP methods), whereas RPC treats the API as a collection of remote procedures that operate on data.

AJAX

AJAX (Asynchronous JavaScript And XML) brought web services to browser-based applications. Despite the name, modern AJAX typically uses JSON rather than XML.

AJAX allows JavaScript to make HTTP requests and process results asynchronously. The page does not need to reload; JavaScript can update portions of the page dynamically. This enabled rich web applications like Gmail, Google Maps, and collaborative editing tools.

const response = await fetch('/api/users/123');
const user = await response.json();
document.getElementById('username').textContent = user.name;

AJAX plus REST became the foundation of modern single-page applications.

API Versioning

As APIs evolve, changes may break existing clients. API versioning allows multiple versions to coexist, giving clients time to migrate.

Common versioning strategies include:

URL path versioning: /api/v1/users vs /api/v2/users. Simple and explicit, but changes the resource URL.

Header versioning: Clients specify the version in a header like Accept: application/vnd.myapi.v2+json. Keeps URLs clean but is less visible.

Query parameter: /api/users?version=2. Easy to test in a browser but clutters the URL.

Most public APIs use URL path versioning for its simplicity and visibility.
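One way to sketch URL path versioning (the handlers and routes here are hypothetical, assuming paths of the form /api/&lt;version&gt;/&lt;resource&gt;/&lt;id&gt;):

```python
# Sketch: route /api/v1/... and /api/v2/... to different handlers so
# both generations of clients keep working during a migration.
def get_user_v1(user_id):
    return {"id": user_id, "name": "Alice Smith"}

def get_user_v2(user_id):
    # v2 splits the name field; v1 clients are unaffected.
    return {"id": user_id, "first_name": "Alice", "last_name": "Smith"}

ROUTES = {
    ("v1", "users"): get_user_v1,
    ("v2", "users"): get_user_v2,
}

def dispatch(path):
    # Expects paths shaped like /api/<version>/<resource>/<id>
    _, _api, version, resource, obj_id = path.split("/")
    return ROUTES[(version, resource)](int(obj_id))

print(dispatch("/api/v1/users/123"))
print(dispatch("/api/v2/users/123"))
```

Real frameworks do this with routing tables rather than string splitting, but the principle is the same: the version in the URL selects which handler (and which response shape) the client gets.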

Sidebar: CORS (Cross-Origin Resource Sharing)

Browsers enforce a same-origin policy that prevents JavaScript on one domain from making requests to another domain. This security measure prevents malicious scripts from accessing sensitive data.

CORS is a mechanism that allows servers to specify which origins (combinations of scheme, host, and port) may access their resources. When a browser makes a cross-origin request that is not “simple” (for example, one with custom headers or a method other than GET, HEAD, or POST), it first sends a “preflight” OPTIONS request. The server responds with headers indicating allowed origins, methods, and headers.

CORS matters for browser-based applications consuming APIs from different domains. Server-to-server communication (like gRPC between microservices) is not affected. If you build browser-based API clients, you will encounter CORS errors; the solution is server-side configuration, not client-side workarounds.

This isn’t something you need to know for the class, but it’s something to be aware of in real environments.

The Limitations of REST

While REST dominated (and continues to dominate) web APIs for over a decade, it has limitations for internal service-to-service communication:

Overhead: HTTP/1.1 headers are text-based and repetitive. Each request carries cookies, content types, and other metadata, adding significant overhead for small payloads.

No streaming: HTTP/1.1’s request-response model does not support streaming data or bidirectional communication without workarounds like WebSockets.

Limited type safety: JSON lacks a schema, meaning any valid JSON document will be accepted regardless of whether it matches what your service expects. Errors from malformed data often surface at runtime rather than compile time.

Performance: Text-based formats like JSON require parsing, which is slower than binary formats. For high-throughput internal services, this overhead adds up.

Connection overhead: HTTP/1.1 allows only one outstanding request per TCP connection at a time. Pipelining exists in the HTTP/1.1 specification, but browser and server support is limited and it is rarely used in practice.

These limitations are why many systems keep REST for public-facing APIs but use gRPC for internal service-to-service calls.

Sidebar: Connection Pooling and Keep-Alive

Every TCP connection requires a three-way handshake before any data can be sent. If using TLS, additional round trips are needed for the TLS handshake. For a client making many requests, creating a new connection for each request adds significant latency.

HTTP/1.0 closed the connection after each request-response pair. Every request paid the full cost of connection establishment.

HTTP/1.1 introduced keep-alive (also called persistent connections), allowing multiple requests and responses to flow over a single TCP connection. The connection stays open until explicitly closed or a timeout occurs. This eliminates repeated handshake overhead.

However, HTTP/1.1 head-of-line blocking means requests are processed sequentially: the server must finish handling the first request before beginning the second. To work around this, browsers and HTTP clients use connection pooling, maintaining multiple simultaneous connections to the same server (typically 6-8 connections per domain).

HTTP/2 changes this by multiplexing many concurrent streams over one connection, which reduces the need for large connection pools and is one reason gRPC uses HTTP/2.

QUIC (used by HTTP/3) keeps the same goal as keep-alive and pooling: reuse a connection so you do not pay setup costs repeatedly. It also supports multiple concurrent streams within one connection, like HTTP/2.

For HTTP/1.1 services, connection pooling is critical for performance. Most HTTP client libraries handle this automatically, but you need to reuse client instances rather than creating new ones for each request. Creating a new HTTP client per request defeats connection pooling and forces full connection establishment overhead every time.

HTTP/2 and HTTP/3 reduce the need for large connection pools because one connection can support many concurrent streams, but connection reuse is still important for latency and for maintaining middlebox state.
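The effect of keep-alive can be sketched with Python’s standard library: two requests issued through one HTTPConnection reuse the same socket, assuming the server speaks HTTP/1.1 and sends Content-Length.

```python
# Demonstrate HTTP/1.1 keep-alive: two requests over ONE TCP connection.
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # advertise persistent connections

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", port)
conn.request("GET", "/")
conn.getresponse().read()           # drain the body before reusing the socket
first_socket = conn.sock

conn.request("GET", "/")            # second request on the SAME connection
body = conn.getresponse().read()
reused = conn.sock is first_socket  # True: no second TCP handshake occurred

print(reused, body)
conn.close()
server.shutdown()
```

If the handler used HTTP/1.0 (the default for BaseHTTPRequestHandler), the server would close the socket after the first response and the client would have to reconnect, paying the handshake cost again.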

gRPC

REST did not make RPC obsolete. REST is an architectural style built around HTTP semantics and resources, and it became the dominant model for public web APIs because it fits the web’s existing infrastructure. That success can make it sound like the story ends at REST.

For internal service-to-service communication, the problem looks different. You usually control both endpoints, you care more about latency and throughput than human readability, and you often want stronger contracts and streaming. That is where the industry largely circled back to RPC, but with a modern transport: gRPC is a modern RPC framework that runs over HTTP/2.

gRPC is a high-performance, open-source RPC framework developed at Google. It uses Protocol Buffers for serialization and HTTP/2 for transport, combining the benefits of efficient binary encoding with modern HTTP features.

The earlier notes on RPC introduced gRPC as a modern RPC framework. This section provides a deeper look at some of gRPC’s features and why it uses HTTP/2 for communication.

Why HTTP/2?

HTTP/2 keeps HTTP semantics but changes how requests and responses are carried. It is a binary protocol with explicit framing, which is faster to parse than HTTP/1.1’s text format.

The most important feature for internal RPC is multiplexing. Multiple independent streams can share one connection, so concurrent calls do not require multiple TCP connections and do not get serialized at the HTTP layer. This reduces the need for connection pooling and improves throughput when a request triggers many internal calls.

HTTP/2 also compresses headers to reduce repetitive metadata overhead and includes flow control so fast senders do not overwhelm slow receivers. HTTP/2 reduces head-of-line blocking at the HTTP layer, but TCP loss can still delay delivery across streams because TCP delivers bytes in order.

For microservices, where a single user request might require dozens of internal service calls, HTTP/2’s multiplexing provides significant speedup. All calls can happen concurrently over a single connection, eliminating the connection overhead of HTTP/1.1.

Protocol Buffers as IDL

gRPC uses Protocol Buffers both as the serialization format and as the interface definition language. Service definitions are schemas that describe the methods available, their parameter types, and return types. A schema specifies exactly what data structure each message must have:

syntax = "proto3";

service Telemetry {
  rpc GetDevice(DeviceRequest) returns (Device);
  rpc StreamReadings(ReadingsRequest) returns (stream Reading);
}

message DeviceRequest {
  string device_id = 1;
}

message Device {
  string id = 1;
  string model = 2;
  string firmware_version = 3; // can be added later without breaking old readers
}

message ReadingsRequest {
  string device_id = 1;
  repeated string sensor_ids = 2;
}

message Reading {
  string sensor_id = 1;
  int64 time_unix_ms = 2;
  double value = 3;
  Status status = 4;
}

enum Status {
  STATUS_UNSPECIFIED = 0;
  OK = 1;
  FAILED = 2;
}

The protoc compiler reads these schema definitions and generates client stubs and server interfaces for the target language. This provides compile-time type safety: if you change a message definition, code that uses incorrect types will fail to compile rather than failing mysteriously at runtime.
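Assuming the schema above is saved as telemetry.proto and the third-party grpcio-tools package is installed, Python messages and stubs can be generated with:

```shell
# Hypothetical invocation; file and output names depend on your project layout.
python -m grpc_tools.protoc -I. \
    --python_out=. --grpc_python_out=. telemetry.proto
# Produces telemetry_pb2.py (message classes) and
# telemetry_pb2_grpc.py (client stub and server base class).
```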

Beyond Request-Response: Streaming RPCs

Unary RPC looks like a traditional procedure call: one request produces one response. gRPC also supports RPCs where the “response” is not a single value but a stream of messages, and where the client can stream messages as well. This shifts the programming model from “call a function” to “open a session and exchange messages,” which fits feeds, pipelines, and interactive protocols.

gRPC supports four patterns:

  1. Unary RPC is the usual request-response call.

  2. Server streaming returns a sequence of messages in response to one request.

  3. Client streaming sends a sequence of messages and returns a single final response.

  4. Bidirectional streaming lets both sides send streams of messages independently.

Streaming is useful when results arrive over time, when the client needs to send data incrementally, or when the interaction is naturally conversational rather than a single operation with a single result.

Making RPC Work in Production

RPC frameworks live or die on the behavior around failure, time, and visibility. gRPC provides built-in support for setting time limits, carrying request information along a chain of calls, and adding shared features like logging and authentication in one place instead of repeating the same code in every method.

gRPC includes built-in support for these features:

Deadlines and cancellation: Every gRPC call can have a deadline. If the deadline passes, the call is automatically cancelled. Deadlines propagate across service boundaries, so downstream services know how much time remains.

Metadata: Key-value pairs can be sent with requests and responses, similar to HTTP headers. This enables passing context like authentication tokens, request IDs, and tracing information.

Interceptors: Middleware that can process requests and responses, useful for logging, authentication, and monitoring.

Load balancing: gRPC supports client-side load balancing, distributing requests across multiple server instances.

Health checking: A standard protocol for services to report their health status.
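Deadline propagation can be sketched without gRPC itself. The helpers below are hypothetical, but they show the key idea: pass one absolute deadline down the call chain rather than a fresh timeout per hop, so time already spent upstream counts against the overall budget.

```python
# Sketch of deadline propagation across a chain of service calls.
import time

def remaining(deadline):
    # How much of the caller's time budget is left, in seconds.
    return deadline - time.monotonic()

def storage_service(deadline):
    # Downstream service: refuse work it cannot finish in time.
    if remaining(deadline) <= 0:
        raise TimeoutError("deadline exceeded before storage call")
    return "record"

def api_service(deadline):
    # Forward the SAME absolute deadline, not a new timeout.
    time.sleep(0.01)            # simulate local work eating into the budget
    return storage_service(deadline)

deadline = time.monotonic() + 0.5   # caller allows 500 ms end to end
print(api_service(deadline))
```

In gRPC this bookkeeping is built in: the client sets a deadline on the call, and each downstream service can query how much time remains.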

gRPC vs. REST

Aspect              gRPC                              REST
Serialization       Protocol Buffers (binary)         JSON (text)
Transport           HTTP/2                            HTTP/1.1 or HTTP/2
Contract            Required (.proto files)           Optional (OpenAPI)
Streaming           Native support                    Requires workarounds
Browser support     Limited (gRPC-Web)                Universal
Human readability   Poor                              Good
Performance         Often faster for internal calls   Often simpler, sometimes slower at scale
Type safety         Strong (compile-time)             Weak (runtime)

gRPC excels for internal service-to-service communication where performance matters and both sides are under your control. REST remains preferred for public APIs where browser compatibility and human readability are important.

gRPC-Web

Browsers cannot make native gRPC calls because browser networking APIs do not give JavaScript the low-level control over HTTP/2 framing that gRPC requires. They expose higher-level HTTP request interfaces, not the underlying streams. gRPC-Web is a JavaScript implementation that works in browsers, but with limitations: it requires a proxy (commonly Envoy) to translate between gRPC-Web and native gRPC, and it supports unary and server-streaming calls but not client-side or full bidirectional streaming.

For browser applications needing gRPC features, gRPC-Web provides a partial solution. Full bidirectional streaming in browsers typically requires WebSockets or emerging standards like WebTransport.

Microservices Architecture

Microservices architecture decomposes an application into small, independently deployable services, each responsible for a specific business capability. Services communicate through well-defined APIs, and teams often use REST for public interfaces and gRPC for internal service-to-service communication, but real systems frequently mix these with messaging or internal REST depending on requirements.

Microservices are sometimes described as “SOA done right,” but the distinction is operational as much as architectural.

Classic service-oriented architecture (SOA) often relied on shared enterprise infrastructure such as an enterprise service bus (ESB), centralized governance, and organization-wide schemas and contracts.

Microservices push in the opposite direction: decentralized ownership, independent deployment, and simple communication patterns, aiming for “smart endpoints, dumb pipes.” This usually means minimizing cross-service coupling, including shared schemas and shared databases, and avoiding centralized middleware that performs business logic in the middle.

Microservices can be attractive when different parts of a system need to evolve at different speeds, scale independently, or be owned by separate teams. The main advantages are independent deployment, independent scaling, clear team ownership, and the freedom to choose different technologies for different services.

The Reality of Microservices

Microservices are not “free modularity.” They make some problems easier, but they also create new ones that show up quickly once the system has real traffic and real failures. These are costs that can be planned for and engineered around, but they do not go away.

Microservices can be the right choice, but they introduce overhead in engineering and operations that only pays off when independent ownership and independent deployment are genuine requirements.

When Microservices Make Sense

Microservices make sense when independent change is the real problem you are trying to solve. If different parts of the system are owned by different teams, need to ship on different schedules, or have very different scaling and reliability requirements, separate deployable services can be the cleanest way to prevent everything from becoming coupled through one release process.

Microservices are a poor fit when the main problem is still product iteration and feature discovery, when the team is small, or when you cannot afford the operational overhead. In those cases, a modular monolith usually delivers the best cost-to-benefit ratio, and it keeps the option open to extract services later once boundaries are stable and the need for independent deployment is proven.

The Modular Monolith Alternative

A modular monolith keeps a single deployable unit while enforcing internal module boundaries. It aims to capture many of the organizational benefits of microservices, such as clearer ownership boundaries and cleaner interfaces, without paying the full cost of distributed communication.

This approach shows up in real systems as consolidation.

Twilio Segment documented moving away from a microservices-heavy design to reduce operational complexity and improve developer productivity.

Amazon Prime Video also published a case where a distributed architecture was consolidated into a single process to reduce cost and simplify operation, and later clarifications emphasized that this was a targeted change, not a blanket claim that “Prime Video is a monolith.”

A common path is to start with a modular monolith and extract services only when a boundary has proven stable and the benefits of independent deployment outweigh the added complexity.

Observability in Distributed Systems

When a request crosses multiple services, failures and slowdowns rarely show up where the problem actually is. A timeout in one service might be caused by a slow dependency two calls away, and retries can mask the original error while increasing load elsewhere. Observability is the ability to reconstruct what happened from the signals the system emits.

Logs record discrete events. They are most useful when they include a request ID so you can follow one request across services.

Metrics summarize behavior over time, such as request rate, error rate, and latency percentiles. They tell you what changed and when, but they do not tell you what happened for one specific request. Metrics work best when they track a small set of categories, such as endpoint name and status code, rather than a unique value per request. Request IDs belong in logs and traces, not in metric labels.

Traces record the path of a single request through the system. A trace is made of spans, where each span represents one operation, such as handling an incoming request or calling a dependency, and includes timing and parent-child relationships. Tracing is what lets you answer, “Where did the time go?” across multiple services.

Request IDs

A request ID (also called a correlation ID) is generated when a request enters the system and is propagated through all downstream calls. The ID is included in logs and trace context so you can reconstruct the full request path during debugging. HTTP services commonly carry this ID in headers such as X-Request-ID, and gRPC commonly carries it in metadata.
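A minimal sketch of request ID propagation (the handler and service names are hypothetical; X-Request-ID is a common but informal header convention):

```python
# Sketch: mint a request ID at the edge, reuse it on every downstream call.
import uuid

def log(request_id, message):
    # Every log line carries the ID so one request can be traced end to end.
    print(f"[{request_id}] {message}")

def call_downstream(headers):
    # A downstream service sees the same ID the edge service minted.
    log(headers["X-Request-ID"], "calling user-store service")

def handle_incoming(headers):
    # Reuse the caller's ID if present; otherwise this is the entry point.
    request_id = headers.get("X-Request-ID") or str(uuid.uuid4())
    log(request_id, "handling /api/users/123")
    call_downstream({"X-Request-ID": request_id})
    return request_id

rid = handle_incoming({})                      # edge request: new ID minted
same = handle_incoming({"X-Request-ID": rid})  # downstream hop reuses it
```

The same pattern applies to gRPC, with the ID carried in call metadata instead of an HTTP header.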

Health checks and circuit breakers

Services expose health checks to indicate whether they can serve traffic, and infrastructure uses those checks to avoid routing to unhealthy instances. Circuit breakers prevent cascading failures by failing fast when a dependency is unhealthy, rather than letting requests pile up waiting for timeouts.
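A circuit breaker can be sketched as a small state machine. This is an illustrative toy, not a production library; real implementations add half-open trial limits, per-endpoint state, and metrics.

```python
# Minimal circuit-breaker sketch: fail fast once a dependency looks dead.
import time

class CircuitBreaker:
    def __init__(self, threshold=3, reset_after=30.0):
        self.threshold = threshold      # consecutive failures before opening
        self.reset_after = reset_after  # seconds before a half-open trial
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None       # half-open: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()   # trip the breaker
            raise
        self.failures = 0               # success closes the circuit
        return result

breaker = CircuitBreaker(threshold=2, reset_after=60.0)

def flaky():
    raise ConnectionError("dependency down")

for _ in range(2):
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass

# The third call never reaches the dependency: the breaker is open.
try:
    breaker.call(flaky)
except RuntimeError as e:
    print(e)
```

The point of failing fast is that callers spend no time waiting on timeouts for a dependency that is known to be down, which keeps queues from backing up and failures from cascading.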

Choosing the Right Technology

The right choice depends on your specific context:

REST over HTTP remains the default for public-facing APIs, browser-based clients, and cases where human readability and broad tooling support matter more than raw performance.

gRPC is preferred for internal service-to-service communication, performance-sensitive paths, streaming workloads, and systems that benefit from strong, compiler-checked contracts.

GraphQL (not covered in detail here) is useful when clients need flexible combinations of data in a single request and want to avoid the over-fetching and under-fetching of fixed REST endpoints.

Message queues (Kafka, RabbitMQ, etc.) are appropriate for asynchronous event processing, decoupling producers from consumers, and absorbing bursts of load.

We will discuss message queues and their architecture later.

Many systems use multiple technologies. A typical architecture might expose REST APIs to external clients, use gRPC for synchronous internal communication, and use message queues for asynchronous event processing.

Summary

The evolution from traditional RPC to modern web services reflects the changing landscape of distributed computing: from tightly coupled binary protocols on private networks, through XML-RPC and SOAP, to REST over HTTP, and back to efficient RPC with gRPC for internal systems.

Key design decisions include text versus binary serialization, resource-oriented versus procedure-oriented interfaces, the choice of transport protocol, and how APIs are versioned as they evolve.

Operational concerns are critical for production systems: deadlines and cancellation, observability through logs, metrics, and traces, request ID propagation, health checks, and circuit breakers.

The technology landscape continues to evolve. Understanding these fundamentals enables evaluating new technologies as they emerge and making appropriate choices for specific contexts.

A note on hype and measurement

Architecture trends attract hype. Vendor marketing, conference talks, and copycat engineering can make an approach feel inevitable long before it is justified. The industry has cycled through “obvious” answers before: distributed objects (with insanely complex frameworks like CORBA), then XML-heavy web services and SOAP. Today, the microservices label is often applied to designs that fragment systems into too many moving parts, where communication overhead dominates, failure modes multiply, and the resulting architecture becomes harder to understand and operate than what it replaced.

Treat any framework or architectural style as a tradeoff, not a virtue. Measure it. Run benchmarks that resemble your real environment: realistic request mixes, tail latency, failure scenarios, deployment constraints, and operational load. A microbenchmark that shows “service X is fast” is not the same as a system benchmark that includes networking, retries, timeouts, tracing, load balancing, and rollouts.

Also account for complexity. New frameworks can simplify one layer while adding requirements elsewhere: schema evolution, versioning, operational tooling, debugging distributed failures, and the overhead of running and securing more components. The right choice is the one that meets the system’s goals under real constraints, not the one that is currently fashionable. We will reiterate these messages throughout the semester.


Next: Study Guide

Back to CS 417 Documents


  1. The acronym was eventually dropped because calling SOAP “Simple” was laughable. SOAP is still around but is no longer an abbreviation for anything.