In mobile communication, simply put, latency is the time user traffic takes to travel between the user's smartphone and the service in the network (e.g., a video streaming server). Depending on the type of application, the shorter this time, the better the experience from the user's point of view: what if you had to wait 10 seconds for your video to start? You would probably not appreciate it!
But in today’s communications there are scenarios in which latency is way more important than in a human streaming experience. Think about a safety-critical process, for instance in an industrial plant, where an alert must be raised within a very short time after a problem occurs, so that emergency systems can react quickly and major incidents are avoided. Or imagine you are in a self-driving vehicle and a sensor on the road, connected through the network, detects a pedestrian suddenly crossing at the next corner (maybe a child chasing a ball). In this situation, it is imperative that the alarm message reaches the car in a very short time, so that the automotive system can immediately react and stop the vehicle within a few seconds (consider that most of this time is needed by the braking system to take effect, so the communication between sensor and car must happen in a much shorter time, say a few milliseconds).
An exhaustive list of different use cases requiring low latency can be found here.
Considering the overall mobile communication system, two main parts are involved: the Radio Access Network (RAN), composed of all the antennas providing connectivity to the mobile nodes (e.g., smartphones, tablets, but also sensors, cars, etc.), and the Core Network (CN), composed of all the functions managing user authentication, session establishment, mobility management, etc.
Latency in the access network basically depends on the base station type and on the associated radio technology being used. 4th Generation (4G) base stations currently cannot ensure an average latency below 10-20 ms, while 5G New Radio should ensure far lower latency (~1-2 ms).
But optimising the end-to-end latency involves not only the radio part: it also requires optimising latency in the core network, since the core sits between the service and the radio network.
In the core, latency mainly depends on how far the service to be accessed is from the mobile user. This is quite intuitive: imagine you want to access a video on some server; of course, the time to reach the server will be much smaller if the server is located just down the street from where you live than if it is somewhere on the other side of the world!
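To get a feel for the numbers, here is a back-of-the-envelope sketch. The distances, RAN latencies and fibre speed are illustrative assumptions, and queueing, processing and transport-network overheads are ignored:

```python
# Rough one-way latency estimate: RAN delay plus fibre propagation delay.
# Assumption: light in optical fibre travels at roughly 200,000 km/s
# (about 2/3 of c), a common rule of thumb.

FIBRE_SPEED_KM_PER_MS = 200

def one_way_latency_ms(ran_latency_ms: float, distance_km: float) -> float:
    """RAN latency plus propagation delay to a server distance_km away."""
    return ran_latency_ms + distance_km / FIBRE_SPEED_KM_PER_MS

# 4G to a far-away central cloud vs. 5G to a nearby edge cloud
print(one_way_latency_ms(ran_latency_ms=15, distance_km=2000))  # 25.0 ms
print(one_way_latency_ms(ran_latency_ms=1.5, distance_km=20))   # 1.6 ms
```

Even with these crude assumptions, the message is clear: once the radio latency drops to ~1-2 ms, the distance to the server dominates, which is exactly why the placement of the service matters.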
In Spotlight, among many other topics, we are addressing this problem and looking for feasible solutions to bring the service to the edge (near the mobile user), in order to achieve low latency.
Bringing the service to the edge is, however, not sufficient by itself. It is also necessary to at least partially re-engineer the core network architecture. For instance, there is a component called the UPF (User Plane Function) which plays the role of session anchor for the user traffic: this means that traffic towards and from the service (e.g., our YouTube server) has to pass through the session anchor to be correctly routed to the mobile device. So, if we place the service at the edge (near the user) and the session anchor at the centre of the network (far from the user), we are basically forcing our traffic to follow a long and inconvenient path, which will not give any benefit in terms of total latency, as shown in Figure 1.
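A toy model makes the anchoring problem concrete (the distances are made up for illustration; the only point is that traffic must traverse the UPF):

```python
# The path is user -> UPF -> service, not the direct user -> service line.

FIBRE_SPEED_KM_PER_MS = 200

def path_latency_ms(user_to_upf_km: float, upf_to_service_km: float) -> float:
    """Propagation delay of the anchored path through the UPF."""
    return (user_to_upf_km + upf_to_service_km) / FIBRE_SPEED_KM_PER_MS

# Edge service only 20 km from the user, but central UPF 500 km away:
# traffic detours to the centre and back, so edge placement buys nothing.
print(path_latency_ms(user_to_upf_km=500, upf_to_service_km=520))  # ~5.1 ms

# Same edge service with an edge UPF sitting on the direct path:
print(path_latency_ms(user_to_upf_km=10, upf_to_service_km=10))    # ~0.1 ms
```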
What we are proposing and working on, therefore, is an architecture in which the session anchor (UPF) is moved towards the edge, and some new elements (DPNs, Data Plane Nodes, plus elements to enforce policies on them) are added after it in order to properly route user traffic. In Figure 2, for instance, the mobile user is a connected car, which may request services from an Edge Cloud, achieving low latency, or from a Central Cloud. Note that this proposal is designed to integrate with the current 5G core network standard (defined by the 3GPP standardisation body).
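As a rough illustration of what "enforcing policies on DPNs" could look like, here is a minimal sketch; all names and fields are hypothetical and not taken from any 3GPP or IETF specification:

```python
from dataclasses import dataclass

# Hypothetical forwarding policy that a control element might push to the
# DPNs sitting behind the edge UPF. Field names are illustrative only.

@dataclass
class ForwardingRule:
    match_service: str   # which service class the rule applies to
    next_hop: str        # where the DPN should steer matching traffic

POLICY = [
    ForwardingRule(match_service="hazard-alerts", next_hop="edge-cloud-1"),
    ForwardingRule(match_service="infotainment", next_hop="central-cloud"),
]

def route(service: str) -> str:
    """Return the cloud a DPN should forward this traffic to."""
    for rule in POLICY:
        if rule.match_service == service:
            return rule.next_hop
    return "central-cloud"  # default: fall back to the central cloud

print(route("hazard-alerts"))  # latency-critical traffic stays at the edge
```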
The drawback of using Edge Clouds is mainly related to user mobility. As in the previous figure, the mobile user could be, for instance, a connected car, which continuously changes position and, therefore, the physical antenna to which it is attached (this process of changing antenna is called handover).
But then, when the car changes its location, the previous Edge Cloud (chosen near the car's previous location) will no longer be near the car's new position.
Of course, we can attach to a new Edge Cloud near the new location. But this means we need to perform UPF relocation, i.e., choosing a new UPF near the new Edge Cloud (because the UPF is our session anchor, and keeping the old UPF as in Figure 3 would mean forcing user traffic to detour through the previous location, increasing latency!). So we are studying mechanisms to perform UPF relocation in the simplest and least invasive way for system performance.
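A simplified sketch of such a relocation decision might look like this; the antenna-to-UPF mapping and the session structure are hypothetical assumptions:

```python
# Assumption: the network knows which edge UPF is closest to each antenna.

NEAREST_UPF = {
    "antenna-A": "upf-edge-1",
    "antenna-B": "upf-edge-2",
}

def on_handover(session: dict, new_antenna: str) -> dict:
    """After a handover, relocate the session anchor if a closer UPF exists."""
    target_upf = NEAREST_UPF[new_antenna]
    if target_upf != session["upf"]:
        # Without this step, traffic keeps detouring via the old UPF
        # (the Figure 3 situation), increasing latency.
        session = {**session, "upf": target_upf}
    return session

session = {"user": "car-42", "upf": "upf-edge-1"}
print(on_handover(session, "antenna-B"))  # anchor moves to upf-edge-2
```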
What we also keep in mind while designing a relocation solution is to avoid breaking the ongoing sessions between the user and the service: this is generally referred to as Session and Service Continuity (SSC).
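One way to preserve continuity is a make-before-break relocation, in the spirit of 3GPP's SSC mode 3; the sketch below is heavily simplified and the connect/release stubs are hypothetical:

```python
# Establish the new anchor before tearing down the old one, so the user
# never loses connectivity while the session anchor is relocated.

def relocate_with_continuity(session, target_upf, connect, release):
    old_upf = session["upf"]
    connect(session, target_upf)   # bring up the path via the new UPF first
    session["upf"] = target_upf    # then switch traffic to the new anchor
    release(session, old_upf)      # only now release the old path
    return session

session = {"user": "car-42", "upf": "upf-edge-1"}
relocate_with_continuity(
    session, "upf-edge-2",
    connect=lambda s, u: print(f"connect {s['user']} via {u}"),
    release=lambda s, u: print(f"release path via {u}"),
)
```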
For more details on what is described in this article, have a look at our work presented at IEEE CAMAD 2018 in Barcelona or at our latest contribution to standards (video of the presentation by Dr. Liebsch at IETF 103 here), and feel free to contact me with any further questions.