Magma: v1.4.0 Release

Release date: March 16, 2021
Previous version: v1.3.3 (released January 12, 2021)
Magnitude: 1,223 Diff Delta
Contributors: 16 total committers

18 Commits in this Release

Top Contributors in v1.4.0

prabinakpattnaik
karthiksubraveti
ashish-acl
panyogesh
VinashakAnkitAman
emakeev
uri200
interfan7
tmdzk
AndyLKhuu

Release Notes Published

Etna Release Notes

Introduction

The 1.4.0 (Etna) Magma release contains support for some new features and fixes for other known issues. See the release page and test report for more information.

Key Features and Highlights

Stateless AGW

Magma v1.4.0 enables stateless mode for the AGW by default. This mode improves resilience to crashes in the core Magma services on the AGW, with the exception of sctpd. It also enables hitless software or configuration updates for those same core services.

FWA HA

Magma AGWs can now be paired in an HA pool via new Orc8r APIs. The Orc8r designates the AGW with the higher relative capacity as the primary and the other AGW as the secondary. The recommended relative capacity is 255 (the maximum allowed value) for the primary AGW and 1 for the secondary AGW.

The HA feature requires that the eNBs support MME pooling (a.k.a. S1-Flex) and that the operator configures the same pair of AGWs in the MME pool using the vendor-provided management tools. With the recommended configuration, the secondary AGW in the pool is used primarily when the eNB can no longer reach the primary AGW. UEs camped on the secondary AGW are eventually offloaded to the primary AGW once the Orc8r marks the primary AGW as healthy.

If the secondary AGW and the eNB do not have routable S1-U IP addresses, the HA feature is supported for only one eNB per site. If the secondary AGW and the eNB are in the same private network, the HA feature can be used for multiple eNBs at the same site.
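The primary/secondary designation described above can be illustrated with a minimal Python sketch. It is not the Orc8r API: the `PooledGateway` record, the `assign_roles` function, and the decision to reject equal capacities are all assumptions made for illustration. Only the rule "higher relative capacity becomes primary" and the maximum value of 255 come from the notes.

```python
from dataclasses import dataclass

MAX_CAPACITY = 255  # maximum allowed relative capacity, per the release notes


@dataclass(frozen=True)
class PooledGateway:
    """Hypothetical record for one AGW in an HA pool (not an Orc8r type)."""
    gateway_id: str
    relative_capacity: int


def assign_roles(a: PooledGateway, b: PooledGateway) -> dict:
    """Return the primary and secondary gateway IDs by relative capacity."""
    for gw in (a, b):
        if not 0 <= gw.relative_capacity <= MAX_CAPACITY:
            raise ValueError(f"{gw.gateway_id}: capacity outside 0..{MAX_CAPACITY}")
    # Assumption: a tie gives no basis for choosing a primary, so reject it.
    if a.relative_capacity == b.relative_capacity:
        raise ValueError("relative capacities must differ")
    primary, secondary = sorted(
        (a, b), key=lambda gw: gw.relative_capacity, reverse=True
    )
    return {"primary": primary.gateway_id, "secondary": secondary.gateway_id}


# Recommended configuration: 255 for the primary, 1 for the secondary.
roles = assign_roles(PooledGateway("agw-a", 255), PooledGateway("agw-b", 1))
```

With the recommended values, `roles["primary"]` is `"agw-a"` regardless of the order in which the two gateways are passed.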

Header Enrichment

This feature allows operators to enable header enrichment for UE HTTP traffic. The AGW service adds subscriber information to HTTP requests so that the receiving server can put each request in context. Because this feature may have privacy implications, operators are encouraged to check local laws before using it.
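As a rough illustration of what header enrichment means, the Python sketch below adds subscriber identifiers to a request's header map. The header names and the choice of IMSI and MSISDN as the subscriber information are assumptions for illustration; the release notes do not specify which fields or header names the AGW uses.

```python
def enrich_headers(headers: dict, imsi: str, msisdn: str) -> dict:
    """Return a copy of `headers` with subscriber identifiers added.

    The header names below are hypothetical placeholders.
    """
    enriched = dict(headers)  # leave the caller's mapping untouched
    enriched["X-Example-IMSI"] = imsi
    enriched["X-Example-MSISDN"] = msisdn
    return enriched


original = {"Host": "example.com"}
enriched = enrich_headers(original, imsi="001010000000001", msisdn="15550100")
```

The server receiving the request can then read these headers to identify the subscriber, which is also why the privacy caveat above applies.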

NMS Improvements

  • Subscriber state view in LTE and Federated LTE networks
    • Enhanced the subscriber table to display subscribers set up through federation in addition to the configured subscribers. The subscriber table comes with a drop-down that displays detailed session information for the subscriber. Additionally, the “View JSON” option displays the subscriber's JSON state in its entirety.
    • The subscriber table also displays IP Address and Active APNs.
  • Call tracing feature
    • Added support for basic traffic capture for troubleshooting purposes. Monitoring of control messaging flow and other traffic between the Magma access gateway and eNodeB devices is possible with this feature.
  • There is a new UI for adding and modifying APN information
  • There is also a new UI for adding and modifying policies, rating groups and QoS profiles.
  • Metrics Explorer
    • A new Metrics Explorer lets users view all the metrics exposed by Magma. It also comes with a drop-down that opens a metric in Grafana so the user can build queries based on that metric.
  • Logs and events aggregation is enabled on the gateways by default.
  • Event table enhancements
    • The event table now displays events across all event streams; previously it was restricted to the magmad and sessiond streams.
    • Users can filter the event table by event stream (mme, sessiond, magmad, etc.), event type, or tag.
  • Gateway Log Table enhancements
    • Users can now filter logs based on service and tags
  • Alarm Table enhancements
    • Added the ability to synchronize predefined alerts from the Alarm component
    • Cleaned up the dashboard alert table: alerts are displayed with their severities in their respective tabs, and alert labels are shown in a separate column of the table.

FeG

  • Additional ability to filter by charging characteristics to apply virtual APNs (#4164)

Debugging Tools

  • The show-tech tool lets operators capture the essential state of the gateway (currently supported only on the AGW), package it, and dump it in a pre-configured destination directory. The collected data can then be shared with support teams to help identify and resolve issues quickly. Check out the associated docs.

Known Issues

  • The Access Gateway sctpd process may show the following error message under moderate load. In most instances this is auto-remediated through an sctpd service restart, which raises an unexpected service restart alert in the NMS. In rare situations, however, repeated service restarts may not recover the sctpd service, and the only remediation is a manual Access Gateway reboot. A critical alert is raised in the NMS for repeated restarts under 5 minutes for all impacted services.
 util.cpp:58] sctp_bindx ADD error error (98): Address already in use
  • When gateways or eNodeBs are added and the top-level gateway or eNodeB page doesn’t auto-refresh, users may see stale values when they click to view the gateway or eNodeB detail page. Users might have to refresh the page to see the latest values. (#4985)
  • Stateless vs. stateful performance expectations. The stateless feature uses Redis to persist state. This brings increased stability against failures and supports hitless software upgrades of the magma package; however, control plane performance can degrade in terms of the attach/detach rate per second. Since stateless mode is enabled by default, users need to disable it per AGW if higher performance is desired. Our scaling tests were done with attach/detach rates of 10 UEs per second.
  • NMS
    • Read-only users are not supported in NMS (#5477)
    • The call tracing feature doesn’t work with lower timeouts. The timeout needs to be set to at least 300 seconds for the feature to function properly (#5478)
    • Unexpected service restart alerts might not be triggered for services that restart too quickly. As a result, the sctpd unexpected restart alert may not be triggered. (#5479)
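The "repeated restarts under 5 minutes" condition mentioned in the sctpd known issue can be pictured with the Python sketch below. It is only an interpretation: the actual NMS alert rule is not given in these notes, and the function name, the severity labels, and the threshold of two restarts are assumptions.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)  # "under 5 minutes", per the known issue above


def classify_restarts(restart_times: list, min_repeats: int = 2) -> str:
    """Classify a service's restart history.

    Returns "critical" if at least `min_repeats` restarts (an assumed
    threshold) fall within less than WINDOW of each other, "warning" if
    there was any restart at all, and "ok" otherwise.
    """
    times = sorted(restart_times)
    if not times:
        return "ok"
    # Slide over every run of `min_repeats` consecutive restarts.
    for i in range(len(times) - min_repeats + 1):
        if times[i + min_repeats - 1] - times[i] < WINDOW:
            return "critical"
    return "warning"


t0 = datetime(2021, 3, 16, 12, 0)
close_together = classify_restarts([t0, t0 + timedelta(minutes=2)])
far_apart = classify_restarts([t0, t0 + timedelta(minutes=10)])
```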

Upgrade Notes

  • As of v1.3.2, the apt source needs to be updated in order to get the latest tagged AGW build. This requires modifying /etc/apt/sources.list.d/packages_magma_etagecom_io.list on the gateway: replace "stretch stretch-1.3.3 main" with "stretch-1.4.0 main". After that, an apt update and apt upgrade magma are required to finish the AGW upgrade.
  • The desired AGW tag is 1.4.0-xxxxxxxxxx-yyyyyyyy.
  • Upgrading the Access Gateway from v1.3.x to 1.4.0 requires operators to use a force option; otherwise the upgrade may fail due to unmet dependencies. To mitigate this issue, it is recommended that the upgrade be done as follows:
apt update
apt upgrade magma -o Dpkg::Options::="--force-overwrite"
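The apt source edit described above could be scripted along the lines of the following Python sketch. It only swaps a versioned `stretch-1.3.x` token for `stretch-1.4.0` in text passed to it; the sample line layout (`deb <url> <distribution> main`) and the placeholder URL are assumptions, so compare the result with the replacement string given in the notes before running apt.

```python
import re

# Matches a versioned distribution token such as "stretch-1.3.3".
_OLD_DIST = re.compile(r"\bstretch-1\.3\.\d+\b")


def retarget_sources(text: str, new_dist: str = "stretch-1.4.0") -> str:
    """Replace any stretch-1.3.x distribution token with `new_dist`."""
    return _OLD_DIST.sub(new_dist, text)


# Hypothetical sources line; the real repository URL is not shown here.
sample = "deb https://example.invalid/repo stretch-1.3.3 main\n"
updated = retarget_sources(sample)
```

To apply it to a gateway, read the contents of the sources file named above, pass them through `retarget_sources`, and write the result back before running the apt commands.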

Orc8r Upgrade (service mesh changes)

  • The upgrade steps are very similar to past upgrades, with the following modifications. These are also discussed in detail in the v1.4 upgrade notes in the GitHub documentation.
    • Set cluster_version in module orc8r in the main.tf file. This should be the current K8s version that was deployed with the v1.3.x orc8r deployment.
    • Terraform should be upgraded to at least 0.14.0 on the host from which the Terraform commands will be run.
    • orc8r_deployment_type needs to be specified in module orc8r-app in the main.tf file. See upgrade documentation for more specifics.
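Before running the upgrade, the Terraform version requirement can be checked with a small helper like the Python sketch below. The function names are hypothetical, and the sketch assumes the version is reported in a string of the form "Terraform v0.14.7"; only the 0.14.0 minimum comes from the notes above.

```python
import re

MIN_TERRAFORM = (0, 14, 0)  # minimum version stated in the upgrade notes


def parse_version(text: str) -> tuple:
    """Extract the first MAJOR.MINOR.PATCH triple from `text`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def terraform_is_supported(version_output: str) -> bool:
    """Return True if the reported version is at least MIN_TERRAFORM."""
    return parse_version(version_output) >= MIN_TERRAFORM


new_enough = terraform_is_supported("Terraform v0.14.7")
too_old = terraform_is_supported("Terraform v0.13.5")
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "0.9.0" as newer than "0.14.0".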

Compatibility and Interoperability

Supported and validated setups (Static, DHCP, NAT, Non-NAT)

Critical bug fixes

AGW

  • ZMQ fix for thread safety for shared zmq contexts (#5307)
  • Subscriberdb db lock mitigation for concurrent S6a calls at high UE attach rates (#4698)
  • QoS performance improvement replacing subprocess calls with pyroute2 (#5240)
  • Pipelined getting stuck and hanging fix (#5255)
  • Pipelined performance and recovery Redis fix (commit: https://github.com/magma/magma/commit/f9a78ea113c19fff52580fd47ad6043667240c09)
  • Mobilityd stateless performance improvement (#5247)
  • Fixed an SPGW state race condition during IP allocation / create session through ITTI messaging (#5416, #5456)
  • Removed thread unsafe MME NAS state synchronization from non-owner threads (#5426)
  • Add proper handling of failed IP address allocation (#5200)
  • Avoid stale packet buffering on sctpd during mme restarts (#5355)
  • Unaligned memory fixes (#5262)
  • Clean up stale session metrics (#5306)
  • Restore metering metrics on SessionD service restart (#5290)
  • Fix race condition while handling duplicate attach requests (#4769)
  • Added a service restart check if SGi port is part of uplink-br0 before resetting interface IP address (#5495)

Other fixes

AGW

  • Timer ID update during resuming timers (#5303)
  • Reset hash table size to zero upon table destruction (#5108)
  • Fix null eNB context access (#5084)
  • Fix null pointer access in delete session response (#4991)
  • Fix eNB id initialization that leads to s1 setup failures for eNB ids that have a zero value (#4952)
  • Remove redundant call for handling modify bearer response (#4946)
  • Clean up stale eNB state if SCTP event is missed by stateless MME (#4677)
  • Fixed a pyroute2 version dependency (#5460)

NMS

  • Fixed the subscriber usage metrics (#4222)
  • Fixed adding a network from the network selector component, which was broken (#3491)
  • Added back the ability to invoke gateway commands from NMS