We'd like to have more than one computer,
connected together via some kind of network,
and use this to provide a shared, interactive,
real-time simulation...
...to an arbitrary set of users, who are
possibly dispersed world-wide.
Easy Peasy, Lemon Squeezy?
NO! It's Difficult Difficult, Lemon Difficult.
Why So Difficult?
Networks are intrinsically unreliable. (i.e. your messages may not arrive)
They have limited capacity. (i.e. yr msgs wll hv 2b cmprsd)
Networks are so complicated, unreliable and diverse that, nowadays at least, we rarely program to them "directly".
Instead, we access them via system-level libraries which implement various somewhat abstract "protocols" that are intended to make the whole process slightly less awful and maddening.
UDP (User Datagram Protocol) is part of the Internet Protocol Suite (aka TCP/IP), and it provides a relatively "raw" form of access between an Application and The Internet.
As such, it exposes the App to all the craziness of the underlying network, and requires the App to be written to handle unreliability/congestion/etc.
On the plus side, it is simple, low-overhead and "fast".
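To give a flavour of how "raw" UDP is, here's a minimal sketch in Python (standard library only); the address and port are placeholders invented for the example:

    # Minimal UDP sketch: "fire and forget", no delivery guarantee, no ordering.
    import socket

    ADDR = ("127.0.0.1", 9999)   # placeholder address/port

    # Receiving side: bind and wait for one datagram.
    def udp_listen():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(ADDR)
        data, sender = sock.recvfrom(1024)   # may wait forever if the packet is lost
        print("got", data, "from", sender)

    # Sending side: one datagram out the door, and that's it.
    def udp_send():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.sendto(b"player_input: jump", ADDR)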
Unlike UDP, TCP (Transmission Control Protocol) is a non-trivial protocol which aims to insulate Applications from the unreliability of the network, by having various built-in error checking and correction mechanisms.
This allows it to provide a reliable, ordered and error-corrected communication stream between Apps.
In essence, this approach means that all messages on the network are split up into smallish units (e.g. "packets", "datagrams", "frames" or whatever) which are then sent individually by whatever route is available at the time.
It's like putting your "electronically mailed letter" onto a series of postcards, and sending those individually.
Your Postcards In The Tubes
...and, with UDP, that is all you get!
UDP is 100% "fire and forget" --- your postcard may arrive, or it may not. If it arrives, it might be smudged to illegibility when it eventually gets there. Even if it was sent as part of an ordered batch, it may not arrive in any particular order, or by any particular route.
TCP, on the other hand, puts sequence numbers on each postcard and sends replies to acknowledge their receipt (or to request the resending of missing ones)
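For comparison, here's an equally minimal TCP sketch (again with a made-up port), where the protocol stack does the sequencing, acknowledgement and resending for you:

    # Minimal TCP sketch: the OS handles ordering, ACKs and retransmission.
    import socket

    ADDR = ("127.0.0.1", 9998)   # placeholder address/port

    def tcp_server():
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.bind(ADDR)
        srv.listen(1)
        conn, _ = srv.accept()
        data = conn.recv(1024)          # arrives in order, or the connection errors out
        conn.sendall(b"ack: " + data)
        conn.close()

    def tcp_client():
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.connect(ADDR)
        sock.sendall(b"player_input: jump")
        print(sock.recv(1024))
        sock.close()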
Which Protocol To Use?
You might (naively) assume that, because TCP does a lot of useful "corrective" work for you, you should always adopt it.
Unfortunately, the downside of all that "fake reliability" is that it has to do a lot of buffering and, where required, re-sending of the data... which uses memory and, perhaps more importantly, takes time.
i.e. it makes the "latency" even worse.
And latency is frequently The Enemy in games.
Simultaneity?
Recall that, in a networked game, we're usually trying to create the sense of a shared, real-time simulation --- this implies that the actions of all participants should, ideally, be treated as happening simultaneously, and be made visible to other participants instantaneously.
The reality of network latency gets in the way of this, and it's usually up to us to minimise its effect.
Typical internet latencies might be around 100 milliseconds (for international round-trips).
Let's Look At Latency
The speed of light presents an interesting theoretical limit on latency. If you assume no packet switching or routing delays, you could calculate the theoretical minimum signalling time from Berkeley (CA) to Boston (MA) as follows:
Distance from Berkeley to Boston: 4983 km
Speed of light in a vacuum: 3 * 10^8 m/s
Speed of light in fibre: 0.66 * 3 * 10^8 m/s ≈ 2 * 10^8 m/s
Time to go from Berkeley to Boston: 4983 km / (2 * 10^8 m/s) ≈ 24.9 ms
Round trip time: 2 * 24.9 ms ≈ 49.8 ms
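The arithmetic is easy to check; a couple of lines of Python reproduce the ~50 ms theoretical round trip:

    # Back-of-the-envelope check of the Berkeley-to-Boston numbers above.
    distance_m = 4983e3              # 4983 km
    c_fibre    = 2e8                 # 0.66 * 3e8 m/s, rounded, as above
    one_way_ms = distance_m / c_fibre * 1000
    print(round(one_way_ms, 1), "ms one way")          # ~24.9 ms
    print(round(2 * one_way_ms, 1), "ms round trip")   # ~49.8 ms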
Plus Delays...
Realistically, you can expect to add a fudge factor of about 20 ms for switching delays, imperfect transmission media, cosmic rays, etc., so you'd typically see round-trip times of maybe 70 ms or so across the U.S.A., between the West Coast and the East Coast.
I've found some online sources which suggest that these numbers are in the right ball-park, at least.
Some combinations of game and network are simpler to deal with than others. For example, a non-"twitch" turn-based game (e.g. Chess) is relatively insensitive to latency, and requires very little bandwidth.
Also, a game being played over a LAN or, better yet, between two computers directly connected by a cable (e.g. a "null modem" link) is much less of a challenge than one which has to deal with the full chaos of the Internet.
Real-time, twitchy, internet games are the Hard Cases.
What To Send
One of the central questions when designing the network architecture for a game, is "what do we send?". There are several possible answers. You could potentially send:
raw user-inputs (e.g. key-presses)
logical "actions" (e.g. each "move")
states (e.g. where things are)
state-deltas (e.g. how things change)
pictures (e.g. actual rendered images!)
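To make those options a bit more concrete, here's a purely illustrative sketch of what each kind of message might look like; every field name here is invented for the example:

    # Illustrative message shapes for the options above (field names invented).
    raw_input   = {"type": "input",  "frame": 412, "keys": ["W", "SPACE"]}
    action      = {"type": "action", "actor": 7, "verb": "move", "to": (3, 5)}
    state       = {"type": "state",  "entity": 7, "pos": (10.0, 2.5), "vel": (1.0, 0.0)}
    state_delta = {"type": "delta",  "entity": 7, "pos": (0.2, 0.0)}   # the change only
    # "pictures" would just be a compressed video frame -- see "Cloud Gaming" later.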
How To Send
Some of the practicalities are also dictated by the expected network infrastructure that your game will be running on. You don't always have much choice over this, but options include:
Direct cabling
Direct wireless
A Local Area Network
A Campus Area Network
A Wide Area Network
Topology
There is also the matter of "Network Topology" i.e. the "geometry" of how the nodes in a network are actually connected. e.g.
This game from 1996 supported up to 6 players over a LAN-type connection, using a peer-to-peer topology.
Each "peer" computes and sends local state information (e.g. position, velocity etc.) about the entities over which it has authority. Remote entities are interpolated/ extrapolated from the received network inputs.
i.e. loosely-coupled "Distributed Simulation"!
Lots of problems: big corrections, even divergent outcomes!
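Here's a sketch of that kind of extrapolation ("dead reckoning") for a remote entity, assuming each peer periodically broadcasts a timestamp, position and velocity for the entities it owns:

    # Dead-reckoning sketch: extrapolate a remote peer's entity from its last update.
    class RemoteEntity:
        def __init__(self):
            self.last_time = 0.0
            self.last_pos = (0.0, 0.0)
            self.last_vel = (0.0, 0.0)

        def on_network_update(self, t, pos, vel):
            self.last_time, self.last_pos, self.last_vel = t, pos, vel

        def position_at(self, now):
            # Linear extrapolation from the most recent update; if an update is
            # late or lost, the guess (and the eventual correction) gets bigger.
            dt = now - self.last_time
            return (self.last_pos[0] + self.last_vel[0] * dt,
                    self.last_pos[1] + self.last_vel[1] * dt)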
Lock Step
The standard approach to limiting peer-to-peer divergence is to use a "lock-step" system to keep everyone in sync.
i.e. each machine waits for the others to take their turn, and send their results, before any further computation is done.
This guarantees sync... but at the expense of imposing "weakest link" latency on all participants. The resulting system is also very susceptible to connection problems.
It doesn't scale well (maybe up to 4 or 6 players)
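In outline, a single lock-step turn looks something like this sketch; the send/receive/simulate callables are stand-ins for whatever transport and game code you actually have:

    # Lock-step outline (a sketch, not a full netcode layer).
    def lockstep_frame(turn, local_id, local_input, peers, send, receive, simulate):
        for peer in peers:
            send(peer, {"turn": turn, "input": local_input})

        inputs = {local_id: local_input}
        for peer in peers:
            # Block until every peer's input for this turn has arrived: the whole
            # simulation advances at the pace of the slowest participant.
            msg = receive(peer)              # assumed to return that peer's turn message
            assert msg["turn"] == turn
            inputs[peer] = msg["input"]

        # Everyone applies the same inputs in the same order -> identical state.
        simulate(inputs)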
Local Buffered Input
One of the problems with lock-stepping is the need to wait for the messages from all the other machines to be received before you can proceed.
It's possible to work around this by delaying the local inputs (by approx. the network latency) and having everyone agree to run the simulation based on "old data" in the meantime.
This might sound a bit mad, but it can be made to work, and provides a nice, simple, low-bandwidth solution in some cases e.g. Crackdown (2007)
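A sketch of the "delay your own input" idea: every local input is queued for a few frames before it is applied, giving the network time to deliver it to everyone else (the delay figure here is made up):

    from collections import deque

    # Buffered-input sketch: apply local inputs a fixed number of frames late,
    # so that (with luck) the matching remote inputs have arrived by then.
    INPUT_DELAY_FRAMES = 6   # invented figure, roughly covering the expected latency

    class InputDelayBuffer:
        def __init__(self):
            self.queue = deque()

        def push(self, frame, local_input):
            # The input is also sent over the network immediately (not shown), but is
            # only scheduled for local execution INPUT_DELAY_FRAMES in the future.
            self.queue.append((frame + INPUT_DELAY_FRAMES, local_input))

        def pop_due(self, current_frame):
            due = []
            while self.queue and self.queue[0][0] <= current_frame:
                due.append(self.queue.popleft()[1])
            return due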
(End of Part One)
Authoritative Server
Peer-to-peer networking definitely has its limits though (e.g. security, scalability). So, in many cases, it's generally more practical to use a client-server model instead.
The idea here is to put one machine "in charge" of the overall true state of the simulation, and have it act as a comms hub which tells all the clients what to do.
In its simplest variant, the clients are entirely "dumb" and simply send control inputs up to the server, and then let it inform them of what to render, and where.
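In sketch form, the "dumb client" loop really is that simple; the helpers passed in here are placeholders for the real transport and rendering code:

    # "Dumb client" outline: send inputs up, draw whatever the server says.
    def dumb_client_frame(send_to_server, receive_snapshot, sample_input, draw):
        send_to_server(sample_input())      # e.g. {"keys": ["W"], "aim": (0.1, 0.3)}
        snapshot = receive_snapshot()       # authoritative entity data from the server
        for entity in snapshot["entities"]:
            draw(entity["mesh"], entity["pos"], entity["rot"])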
Partial Information
The C-S model is often used in simulations where the players are somewhat dispersed "in game", and do not generally have perfect information about each other.
It is also used in scenarios where the individual clients may not be powerful enough to compute the entire simulation state themselves.
As such, they rely on the server to tell them what is going on, rather than calculating it locally from all the user-inputs...
Sending State
The lack of "perfect information" on the part of clients means that they need to be explicitly informed about the state of relevant entities, which is a big shift from our early ideas of "broadcasting inputs".
However, although sending states is, in principle, more bandwidth-intensive than sending inputs, if the subset of relevant entities is small enough compared to the total, this approach can actually turn out to be a saving overall.
(especially when combined with the careful use of deltas)
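A sketch of that trade-off: send the full state of a newly relevant entity once, then only the fields that have changed since the last update this client received (the dictionary layout is just for illustration):

    # Delta-compression sketch: compare an entity's new state against the last
    # state actually sent to this client, and transmit only the changed fields.
    def make_delta(last_sent, current):
        return {k: v for k, v in current.items() if last_sent.get(k) != v}

    last_sent = {"pos": (10.0, 2.5), "hp": 80, "anim": "run"}
    current   = {"pos": (10.4, 2.5), "hp": 80, "anim": "run"}

    delta = make_delta(last_sent, current)
    # delta == {"pos": (10.4, 2.5)}  -- a fraction of the full state, most frames.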
Client-Server Downsides
One downside of the C-S model is that we now have to deal with round-trip latency (from client to server and back) instead of the direct one-way latency of Peer-to-Peer.
This can also introduce a degree of "unfairness", where players with faster links to the server enjoy a significant advantage (especially the player on the server).
Also, by having the server do "all" of the work, the simulation is no longer distributed, and the server itself becomes a significant bottleneck (and a Single Point Of Failure).
Client-Side Prediction
One approach to the C-S latency problem is to allow the clients to make local predictions about the future state of the simulation, thereby "hiding" the network delay for any locally predictable actions.
Of course, the clients have incomplete/imperfect data about the simulation and (usually) no "authority" over it, so these predictions are merely "educated guesses", which can potentially be over-ruled by the server.
Among other things, it's hard to predict the other players.
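In outline, client-side prediction just means applying your own input locally straight away, while also sending it up to the server; a sketch, with invented helper names:

    # Client-side prediction sketch: don't wait for the round trip before moving.
    # apply_input() and send_to_server() stand in for the real game/network code.
    def predicted_move(player, local_input, sequence, apply_input, send_to_server):
        send_to_server({"seq": sequence, "input": local_input})
        apply_input(player, local_input)   # optimistic local guess; the server's
                                           # later, authoritative answer may disagree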
Prediction is Difficult
...difficult, lemon difficult
...especially about the future
One of the consequences of client-side prediction is the creation of a weird time-warp effect, in which the state of locally predicted entities runs ahead of the true situation on the server, while the state of unpredicted remote/foreign entities lags behind the truth by a similar amount.
This means that interactions between local and foreign entities tend to be unreliable e.g. collision with other players can be problematic (as can line-of-sight, aiming, shooting). :-(
When Predictions Go Wrong
If a client-side prediction turns out to be wrong, it must somehow be corrected.
The corrections often involve a significant change in the apparent state of some entities and, if these happen abruptly, the result is noticeably ugly.
To address this, clients will typically blend towards the corrected state... in fact, to avoid compounding the latency problem, they must actually blend towards an extrapolated future state! This can be very "lemon difficult" indeed...
Prediction Buffers
A particular difficulty with implementing prediction-correction is handling the knock-on consequences of everything that has happened since the incorrect prediction was made!
Nowadays, a common approach to this is to have the client retain a buffer of the recent history of local player inputs, such that it can "replay" these inputs on top of any corrections that it receives, to provide a kind of semi-reliable extrapolation.
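A sketch of that input-history-and-replay scheme: keep recent local inputs keyed by sequence number, and when an (older) authoritative state arrives, rewind to it and re-apply everything the server hasn't yet acknowledged:

    # Prediction-buffer sketch: reconcile an authoritative (but late) server state
    # by replaying the local inputs the server hasn't processed yet.
    class PredictionBuffer:
        def __init__(self):
            self.pending = []   # list of (sequence_number, input), in order

        def record(self, seq, local_input):
            self.pending.append((seq, local_input))

        def reconcile(self, server_state, last_acked_seq, apply_input):
            # Drop everything the server has already accounted for...
            self.pending = [(s, i) for (s, i) in self.pending if s > last_acked_seq]
            # ...then rebuild the predicted present on top of the corrected past.
            state = dict(server_state)
            for _, local_input in self.pending:
                apply_input(state, local_input)
            return state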
Have I mentioned that this is difficult?
The Limits Of Prediction
Because a prediction can potentially be wrong, and therefore be corrected or revoked at a later stage, it's not possible (or, at least, not advisable) to predict things which cannot be cleanly undone again.
For example, it is usually unwise to "predict" something such as a death or a pick-up event.
MMOGS!
OMG!
Massively-Multiplayer Online Games take most of the previous "network gaming" difficulties... and extend them to ludicrous extremes.
Generally speaking, the architecture used is a client-server one, typically with strong server-side authority (for multiple reasons: including in-game "anti-cheating" and, quite frankly, real-world "anti-piracy"). The resulting centralisation creates significant server-side burdens, which can be ameliorated by some client-side trickery and clever optimisation techniques.
MMO Scaling
When you have 100s or 1000s of players on a single "server", the per-player costs (for CPU, RAM and Bandwidth usage) start to become very significant, even if they are relatively small on an individual basis.
Also, any kind of worse-than-linear scaling becomes a real problem. A big part of designing the engine-tech for an MMO involves tackling all those O(N^2) logic hotspots, which can be found in "naive" collision-detection and object-relevance computations: smart spatial data-structures, selective object activation and clever caching can all help here.
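As one example of killing an O(N^2) hotspot: a simple uniform grid (spatial hash) means each entity only ever checks its own cell and the neighbouring ones, rather than every other entity. A minimal sketch:

    from collections import defaultdict

    # Spatial-hash sketch: bucket entities by grid cell so "who is near me?"
    # no longer means scanning all N entities (the naive O(N^2) approach).
    CELL_SIZE = 32.0   # arbitrary; tune to the typical interaction radius

    def cell_of(pos):
        return (int(pos[0] // CELL_SIZE), int(pos[1] // CELL_SIZE))

    def build_grid(entities):              # entities: {entity_id: (x, y)}
        grid = defaultdict(list)
        for eid, pos in entities.items():
            grid[cell_of(pos)].append(eid)
        return grid

    def nearby(grid, pos):
        cx, cy = cell_of(pos)
        result = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):          # own cell plus the 8 neighbours
                result.extend(grid.get((cx + dx, cy + dy), []))
        return result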
Sharding
Another standard approach in MMOs is to split up the (potentially very large) player-base into a number of "shards"/"parallel universes", each of a limited capacity.
This allows most of the "scaling" to be done in simple linear terms... just by adding more shards --- which, in practice, means buying (or, these days, renting) more servers.
This is, for the most part, a nice, simple, economical way to handle the large-scale scaling issues... albeit at the expense of splitting up the player base in a somewhat limiting manner.
We had to support 100 players (each with a uniquely customised character mesh and a unique vehicle) in a single shared shard, using the Unreal 3 Engine ---
which, in its native form, struggles to deal with around 16 (generally non-unique) players.
We killed all the O(N^2) logic, overhauled the "octree", rewrote the "relevance" system, and added prediction, interpolation, compression and a bunch of other stuff.
...it's a pity that no-one actually bought it, really. :-(
EVE has somewhere around half a million active accounts world-wide, with a PCU (peak concurrent users) count of approximately 65 thousand.
And they are all in a single logical shard!
The "EVE Way" is very unusual: The CCP system operates as a Distributed Computing Cluster, comprised of a "mesh" of many high-powered physical and logical "server" machines, among which the overall load is distributed at runtime.
A Picture of EVE
What data does EVE send?
(from client to server)
Not user-inputs
Not states
Not state-deltas
It sends function calls!
i.e. EVE is actually an RPC (Remote Procedure Call) system.
Such RPCs have significant latency though, so what should the client do while it's waiting for the reply from the server?
The answer is, "it should do something else". ;-)
Something Else
The "something else" is achieved by switching over to some other independent task -- a bit like what your operating system does when a process is "blocked" waiting on I/O.
In the EVE code-base, this is done via something called "Tasklets", which are a kind of lightweight thread-like facility (sometimes called "green threads") that is managed by the application itself (instead of the operating system).
Only one Tasklet is actually running at any given time, but they allow us to keep active during our "async" RPC delays.
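EVE's real Tasklets come from Stackless Python, but the underlying idea can be sketched with ordinary Python generators: a task yields while it's "waiting" on an RPC, and a tiny cooperative scheduler runs whatever else is ready. This is just an illustration of the concept, not CCP's actual code:

    from collections import deque

    # Cooperative "green thread" sketch (illustrative only -- EVE uses Stackless
    # Python Tasklets, not this). A task yields whenever it would otherwise block.
    def scheduler(tasks):
        ready = deque(tasks)
        while ready:
            task = ready.popleft()
            try:
                next(task)            # run the task until its next yield
                ready.append(task)    # still alive: put it back in the queue
            except StopIteration:
                pass                  # task finished

    def fire_rpc(name):
        # Stand-in for sending an RPC; the reply would arrive some time later.
        print("sent RPC:", name)

    def orbit_request():
        fire_rpc("Orbit(ship=42, target=7)")
        yield                         # "await" the reply; other tasklets run meanwhile
        print("orbit confirmed")

    def ui_tick():
        for frame in range(3):
            print("UI frame", frame)
            yield

    scheduler([orbit_request(), ui_tick()])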
EVE Server-to-Client
In addition to responses to RPCs, the other main traffic from the EVE server to the client is a description of what's happening in the physics simulation of nearby entities (a region known internally as the "ball park").
This takes the form of a relatively low-bandwidth "command stream", containing info such as "Ship <x> is now orbiting Object <y>"
This info is sent reliably, and is used to implement a deterministic simulation on both server and client(s).
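As a sketch of what such a reliable command stream might look like (the format and field names here are invented, not EVE's actual wire format), both server and client feed the same ordered commands into the same deterministic simulation:

    # Command-stream sketch: identical commands, applied in identical order,
    # drive identical "ball park" simulations on server and client.
    def apply_command(sim, cmd):
        if cmd["op"] == "orbit":
            sim[cmd["ship"]] = {"mode": "orbit", "target": cmd["target"], "range": cmd["range"]}
        elif cmd["op"] == "warp_to":
            sim[cmd["ship"]] = {"mode": "warp", "dest": cmd["dest"]}

    sim_state = {}
    stream = [
        {"op": "orbit",   "ship": "x", "target": "y", "range": 5000},
        {"op": "warp_to", "ship": "z", "dest": (1.0e9, -2.0e8, 0.0)},
    ]
    for cmd in stream:            # delivered reliably, applied exactly once, in order
        apply_command(sim_state, cmd)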
"Cloud Gaming"
Back at the start, one of the possible answers on the "What To Send" slide was "actual rendered images"...
and we treated this proposal with the derision it so clearly deserves.
BUT!
This may be the seed of a plan so crazy that it Just Might Work!
The Cloud In Your Pocket?
Today, in 2021, we have lots of cute little network-equipped mobile devices with pretty decent graphics -- but, in certain respects, limited overall power. Certainly, they aren't really able to run "full quality" console-like experiences.
Also, even if they were, the practical difficulties of producing games for all of these different (and fragmented) platforms, including the large memory requirements of many games, and the hassles associated with installation and upgrading, make the prospect unattractive in many ways.
A Universal Client
...but what if you only needed a single, rather simple, app to be installed on your device (essentially a kind of custom video-streaming thing) to let you play any game, by having all the real work take place on servers in "The Cloud"?
i.e. You'd have a "dumb client" which sent control inputs up to the server, and received rendered images as a result!
It sounds sort-of-cool, but also seems very impractical...
Surely this sucks, for both bandwidth and latency?
Not Necessarily!
Most devices (and networks) can already handle pretty decent real-time video streaming, and modern super-powered server hardware can perform the necessary encoding and compression in real-time too!
So, the real problem is latency.
...which can sometimes be hidden, at least partially.
Latency Hiding
One of the really cool ideas about latency hiding comes from the realisation that much of what happens in a simulation doesn't depend directly or immediately on the user's input, so a little bit of "cause and effect lag" is often invisible.
What you really notice is the lag in your own movements...
And, clearly, your own movements do have an immediate effect, but it's actually quite a narrow one... it just changes your "point of view" (aka "view matrix").
Time Warping!
The details of this are complicated, but the basic idea is that you can draw an image based on an old position and then, at the last minute, "warp" it to match a new position.
So, the server can render, and send, an image based on your "lagged" position, but your client can then tweak it a little to match your instantaneous local position -- lag begone! (ish)
(This requires sending the scene's depth information in addition to the standard colour components, and it isn't trivial, but "smart people" are looking into it).
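To give a flavour of it, here's a sketch of re-projecting a single pixel (with known depth) from the old camera to the new one, using standard view-projection matrices; a real implementation does this per-pixel on the GPU, and has to cope with holes, occlusion and so on:

    import numpy as np

    # Time-warp sketch: re-project one pixel (with depth) rendered from an old
    # camera pose so it lands where the *new* camera pose would see it.
    # old_vp / new_vp are 4x4 view-projection matrices; ndc = normalised device coords.
    def reproject_pixel(ndc_xy, ndc_depth, old_vp, new_vp):
        clip = np.array([ndc_xy[0], ndc_xy[1], ndc_depth, 1.0])
        world = np.linalg.inv(old_vp) @ clip   # back into world space...
        world /= world[3]                      # ...undoing the perspective divide
        new_clip = new_vp @ world              # ...then forward through the new camera
        return new_clip[:3] / new_clip[3]      # new NDC position for this pixel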
The Time War
Notably, John Carmack worked on techniques of this kind as part of the War On Latency that he's been waging, especially in the context of 3D "Virtual Reality" gaming (which is particularly sensitive to latency problems), during his work at Oculus (now "Meta").
Will any of this pan out? Is "Cloud Gaming" (with or without crazy 3D stuff) The Future?
I dunno! -- it could go either way, but it's not as completely insane as it first appears, and I thought it might interest you.