Due to an issue with a software update, the Policy and Charging Rules Function (PCRF) incorrectly enforced prepaid balance checks on postpaid customer organizations. Consequently, organizations were erroneously blocked, initiating a process that disconnected (off-boarded) devices previously online.
Before the erroneous process was identified and halted, some devices had already been off-boarded. Subsequent attempts by these disconnected devices to reconnect triggered a signalling storm, which overloaded one of our core network functions (the Authentication Centre, AuC) and delayed recovery. Recovery resumed once the overloaded network function had been scaled, after which off-boarded devices could reconnect.
Full recovery occurred 69 minutes after the fault was introduced.
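The fault described above, a prepaid balance check applied to postpaid organizations, can be illustrated with a minimal sketch. It is not the PCRF implementation: the `Organization` type, the `BillingType` enum, and the `should_block` function are hypothetical names introduced only to show the kind of guard that keeps a balance check scoped to prepaid organizations.

```python
from dataclasses import dataclass
from enum import Enum


class BillingType(Enum):
    PREPAID = "prepaid"
    POSTPAID = "postpaid"


@dataclass
class Organization:
    # Hypothetical record; the real data model is not described in this report.
    name: str
    billing_type: BillingType
    balance: float  # only meaningful for prepaid organizations


def should_block(org: Organization) -> bool:
    """Return True only when a prepaid organization has no balance left."""
    if org.billing_type is not BillingType.PREPAID:
        # Postpaid organizations are never subject to the balance check.
        return False
    return org.balance <= 0
```

Under these assumptions, a postpaid organization is never blocked by the balance check, whatever value its balance field holds.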
A number of postpaid organizations received an organization-blocked event and email notification.
Some organizations were effectively blocked, preventing new device connections from approximately 08:33 UTC until all incorrect blocks were lifted at 09:08 UTC (35 minutes).
A small number of organizations experienced disconnection of active device data sessions, impacting a portion of the devices connected at the time of the incident. These disconnections occurred between 08:33 and 08:43 (UTC).
The vast majority of online devices continued to operate normally, provided that local network conditions did not change and the devices did not disconnect on their own.
New connection attempts and reconnection attempts of previously disconnected devices encountered difficulties registering on the network due to the signalling storm. This resulted in significantly reduced success rates for Update Location and Create PDP Context procedures. The overload condition persisted from 08:35 until 09:42 (UTC), when additional AuC capacity was provisioned.
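The durations quoted in this report can be cross-checked from the timestamps above. The sketch below assumes that the fault was introduced at 08:33 UTC, when the incorrect blocks began, and that full recovery coincided with the provisioning of additional AuC capacity at 09:42 UTC. The `minutes_between` helper is introduced here for illustration only.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two same-day HH:MM timestamps (UTC)."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Timestamps taken from the impact description; the mapping of
# "fault introduced" to 08:33 and "full recovery" to 09:42 is an assumption.
durations = {
    "incorrect blocks in effect": minutes_between("08:33", "09:08"),
    "session disconnections": minutes_between("08:33", "08:43"),
    "AuC overload": minutes_between("08:35", "09:42"),
    "fault to full recovery": minutes_between("08:33", "09:42"),
}

for label, minutes in durations.items():
    print(f"{label}: {minutes} min")
```

With these inputs, the block duration comes to 35 minutes and the fault-to-recovery interval to 69 minutes, matching the figures stated in the report.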
The following corrective measures were undertaken to resolve the incident:
The severity of this incident was exacerbated by the following factor:
Detailed investigations will continue over the coming days; however, the following corrective actions are currently planned: