Over the next few posts, we will look at how to (more effectively) synchronize players' motion between a custom server (Java Micronaut in our case) and an Unreal Engine client. Here are the posts connected to this overview:
- How to make effective comms between UE and custom server (This post)
- Implementing a good backend system design
- Connecting your Unreal Engine to your websockets (synchronize players)
- Motion smoothing for your actors
- How to spawn and control mobs using UE connected to our MMO server
We’ve looked at something similar in the past; for example, in this post we explored using WebSockets to get nearby players' information.
We will be using WebSockets to achieve this synchronization again, but it will work quite differently from before.
We will be achieving the following:
- Push only updates from your client
- Receive only updates on your client
Pretty simple? Not really, but it is perhaps simpler on the client side, as the Micronaut server will have to do a lot more work.
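To make the "push only updates" idea concrete, here is a minimal sketch of the client-side logic: a motion message is only pushed over the socket when the actor has actually moved a meaningful amount since the last send. All names and the threshold value are illustrative assumptions, not the series' actual protocol.

```java
// Sketch: decide whether a position is worth pushing over the socket.
// Hypothetical class, not the actual client code used in this series.
public class MotionPublisher {
    public record Position(double x, double y, double z) {}

    private Position lastSent;            // last position we pushed to the server
    private final double threshold;       // minimum movement (in units) worth sending

    public MotionPublisher(double threshold) {
        this.threshold = threshold;
    }

    /** Returns true if this position differs enough from the last one sent. */
    public boolean shouldSend(Position current) {
        if (lastSent == null || distance(lastSent, current) >= threshold) {
            lastSent = current;
            return true;
        }
        return false;                     // no meaningful movement: push nothing
    }

    private static double distance(Position a, Position b) {
        double dx = a.x() - b.x(), dy = a.y() - b.y(), dz = a.z() - b.z();
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }
}
```

The same filter works in reverse on the server, which only broadcasts updates it actually received, so idle actors cost nothing on the wire.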
In this post we will explore the general approach we will take and how it will work, alongside a quick demo.
The Micronaut server code can be found on GitHub for reference.
Why this design and not the replication model?
The ‘out-of-the-box’ replication model from UE is great – I want to get that out of the way.
Furthermore, there’s fantastic work being done to leverage it and extend it further, such as Open World Server.
Note that I would probably use that for my own game if I decide to make it! The work required to build a complete custom server architecture is very significant.
But with that out of the way, what are the issues with the replication model?
Simply put, it is the player limit per shard.
With UE server replication it is very difficult to exceed roughly 100 players in a particular zone, which I consider a big flaw. Granted, the game can handle millions of concurrent players overall, but it will never allow 100+ (or thereabouts) to interact with each other simultaneously in one zone.
The reason is that the UE server is a monolithic design which does too much, and it is very difficult to split that workload.
This is essentially the architecture that I am drawing up, where you can split those responsibilities and concerns.
Client – Server communication design
Here’s a basic diagram of what will be achieved, abstracting the DB models. It demonstrates how the UE server, the UE player client, and the Micronaut server communicate with each other.
Sockets are session-based and are great for broadcasting message updates to multiple players at once. The issue is that when you scale your service, some sessions will not know about the broadcast message. To overcome this, we use Kafka to publish messages; the other servers running WebSocket sessions listen to those messages and can push updates when necessary.
Perhaps it will be easier with some examples.
Let’s say we have a UE server controlling 1 mob (id=mob1) connected to Micronaut instance 1.
We also have a player on UE client connected to Micronaut instance 2.
If the player is in range of a monster controlled by the UE server, we should subscribe to that monster's events, so that whenever the mob moves we get the update (almost) immediately. This even allows us to bypass DB storage, but not Kafka, as we'll soon see why.
On a socket, I can add session parameters to say I want to subscribe to mob1 events.
Each time the mob moves I can broadcast to anyone subscribed to this event.
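The subscribe-then-broadcast mechanic above can be sketched as a small registry keyed by mob id. Here each WebSocket session is represented by a plain `Consumer<String>` for simplicity; in the actual Micronaut server the session object and broadcaster would come from the framework, so treat this as an illustrative assumption.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Sketch: track which sessions are subscribed to which mob's events,
// and fan a mob's motion update out to exactly those sessions.
public class MobEventRegistry {
    // mobId -> sessions subscribed to that mob's events
    private final Map<String, Set<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    public void subscribe(String mobId, Consumer<String> session) {
        subscribers.computeIfAbsent(mobId, k -> ConcurrentHashMap.newKeySet()).add(session);
    }

    public void unsubscribe(String mobId, Consumer<String> session) {
        subscribers.computeIfPresent(mobId, (k, set) -> { set.remove(session); return set; });
    }

    // Called each time the mob moves: push the update to every subscriber.
    public void broadcast(String mobId, String updateJson) {
        subscribers.getOrDefault(mobId, Set.of()).forEach(s -> s.accept(updateJson));
    }
}
```

A player entering mob1's range calls `subscribe("mob1", session)`; players subscribed to other mobs receive nothing.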
However, this will not work on its own when the two Micronaut servers are running separately.
This is where Kafka comes in. It allows us to distribute the work between any available nodes and push updates back down the socket.
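The fan-out between instances can be pictured with an in-memory stand-in for a Kafka topic: instance 1 (holding the UE server's connection) publishes the mob's movement, every registered instance receives it, and each one relays it to the WebSocket sessions it holds locally. This toy class only illustrates the pattern; the real server would use micronaut-kafka listeners, and the class and method names here are assumptions.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Sketch: an in-memory stand-in for a Kafka topic carrying motion updates.
// Each Micronaut instance registers a listener; a publish reaches them all,
// so each instance can forward the update to its own local sessions.
public class MotionTopic {
    private final List<Consumer<String>> instances = new CopyOnWriteArrayList<>();

    // Each server instance registers itself as a consumer of the topic.
    public void register(Consumer<String> instanceListener) {
        instances.add(instanceListener);
    }

    // The instance connected to the UE server publishes the mob's movement;
    // every instance sees it, regardless of which node the player landed on.
    public void publish(String updateJson) {
        instances.forEach(l -> l.accept(updateJson));
    }
}
```

In real Kafka terms, giving each instance its own consumer group produces exactly this "everyone receives the message" behaviour, which is what lets instance 2 deliver mob1's updates to its player.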
Note that the socket will carry more than just motion. Motion updates are simply the first kind; later, all comms (inventory updates, for example) can leverage the socket.
General architecture
The architecture for this can be found here: Real-Time Gaming Infrastructure for Millions of Users with Apache Kafka, ksqlDB, and WebSockets (confluent.io)
Of course, the application is different, but it demonstrates a very scalable approach to solving the problem.
In particular, this diagram from the article is great: