Magma: v1.6.1 Release

Release date: November 17, 2021
Previous version: v1.6.0 (released September 7, 2021)
Magnitude: 2,111 Diff Delta
Contributors: 13 total committers

62 Commits in this Release

Top Contributors in v1.6.1

ardzoht
amarpad
uri200
themarwhal
tmdzk
pshelar
ulaskozat
ymasmoudi
koolzz
ssanadhya

Release Notes Published

Introduction

The 1.6.1 release of Magma is focused on improving the product stability and addressing known issues/bugs. Please note this version was previously only a Release Candidate, but it has now been more extensively tested. For more details on the project’s improved level of release testing, please refer to the test report and TeraVM logs.

Key Features and Improvements

S1 Mobility

S1 mobility fixes are backported to this release. On v1.6.0, feature tests were failing to send handover notification messages in a real RAN environment.

MME Service Improvements

  • Fixed a bug in servicing sessiond-triggered session terminations.
  • Reduced the load on sessiond and pipelined by suppressing TEID updates during idle-active transitions, which lowers the risk of sessiond-triggered session terminations and detaches.
  • Fixed a race condition in congested scenarios where a dependent library recycles timer IDs on one thread while an expired timer has not yet been processed on another thread.
  • When handlers for ZMQ timers caught error events, a return value of -1 caused individual threads to exit without stopping the service itself. The handlers now log the exception and return OK.
  • As a failsafe, individual thread exits from ZMQ event loops are caught and asserted to force the MME service itself to restart. With the fix above, this failsafe is not expected to be triggered.
  • Fixed the T3485 timer handler for default bearer activation, which was not resetting the timer ID and so left retransmissions after the first one unfulfilled.
  • S1AP UE states were not committed to Redis during IDLE/ACTIVE transitions; this is fixed in v1.6.1.
  • Fixed issues for attach requests with unknown GUTI.
  • Fixed issues with the MME state synchronization optimization, which was previously applied only to registered users.
  • Memory leak fixes during NAS and session procedures.
  • Removed unchecked and hence unsafe pointer access.
  • Stop service303 timer in the right thread during service exits and restarts.
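
The timer fixes above can be illustrated with a minimal Python sketch. It is not Magma's MME code (which is not Python), and `TimerRegistry`, `on_timer_expiry` and `RETURN_OK` are hypothetical names chosen for illustration. It assumes the approach named in the commit list, tracking each timer as a pair of UE ID and timer ID, so that an expiry arriving after the ID has been recycled is recognized as stale, logged, and answered with an OK status instead of an error that would end the event loop thread.

```python
import threading


class TimerRegistry:
    """Tracks live timers as (ue_id, timer_id) pairs behind one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._live = set()

    def start(self, ue_id, timer_id):
        with self._lock:
            self._live.add((ue_id, timer_id))

    def stop(self, ue_id, timer_id):
        with self._lock:
            self._live.discard((ue_id, timer_id))

    def claim(self, ue_id, timer_id):
        """Atomically consume the pair; False means the expiry is stale."""
        with self._lock:
            key = (ue_id, timer_id)
            if key in self._live:
                self._live.remove(key)
                return True
            return False


RETURN_OK = 0  # hypothetical status code; the event loop keeps running


def on_timer_expiry(registry, ue_id, timer_id, events):
    """Expiry handler that never reports an error back to the event loop."""
    if not registry.claim(ue_id, timer_id):
        # The ID was stopped, or recycled for another UE, before this
        # expiry was processed: record it and carry on.
        events.append(("stale", ue_id, timer_id))
        return RETURN_OK
    events.append(("expired", ue_id, timer_id))
    return RETURN_OK


registry = TimerRegistry()
events = []
registry.start(ue_id=1, timer_id=7)
registry.stop(ue_id=1, timer_id=7)   # timer 7 is freed ...
registry.start(ue_id=2, timer_id=7)  # ... and recycled for another UE
on_timer_expiry(registry, 1, 7, events)  # late expiry for the old owner
on_timer_expiry(registry, 2, 7, events)  # genuine expiry for the new owner
print(events)  # [('stale', 1, 7), ('expired', 2, 7)]
```

Keying on the pair rather than on the timer ID alone is what lets the late expiry for UE 1 be told apart from the live timer that UE 2 now holds under the same ID.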

Data Path and Stability Improvements

  • Pipelined now returns an error status in the gRPC response when the local event for flow activation times out.
  • Service restarts previously caused loss of SGi connectivity on bridged-mode AGWs. Flows are now restored properly on the SGi bridge on service restarts.
  • Enabled GTP-U checksum.
  • Fixed the IPFIX sampling port.
  • Fixed VLAN matching, where the absence of a wildcard in the VLAN match rule was leading to match failures.
  • Avoided the need for a service restart when the uplink bridge is reconfigured due to port addition.
  • Enabled GTP-U echo by default and made it configurable.
  • Enabled GTP-U echo response.
  • Removed auto interface bring-ups to eliminate recursions in the ifup script.
  • Eliminated continuous deletion of the default drop rule to improve performance.
  • Fixed issues bringing up the SGi interface on AGW reboot when the interface file is missing.

FEG Improvements

  • Some vendors require the Service Requested Unit AVP to be empty. A configuration option has been added to the environment file to support this.
  • The default config value for the credit request unit is changed from 200 Kbyte to 10 Mbyte to prevent spamming.
  • Removed origin-state-id from S6a messages as per spec.

Other Improvements

  • Fixed sessiond race conditions that would lead to failed service requests and stale states in sessiond.
  • Fixed monitord failures due to “too many opened files”. The issue was caused by holding onto inactive subscriber IPs and making a large number of subprocess calls.
  • Subscriberdb had a bug, introduced during the optimizations done for v1.6.0, that would lead to failed subscriber synchronization if services restarted at the exact time of data sync from the cloud. The issue is fixed by changing the order in which subscriber data and digest data are processed.
  • Default log levels for directoryd and state services are changed from DEBUG to INFO.
  • Improved the service health watchdog to catch all situations where a particular service is down.
  • Improved stability of the AGW bootstrapping service when certs are revoked.
  • Changed alert rules and thresholds for S1 setup failure and cert expiry. NOTE: Alerts have to be re-synced to work properly with the certificate expiry time. The previous logic was inverted: the alert was triggered when the time to certificate expiry was greater than 720 hours. The fix inverts the comparison.
  • Updated OVS dependency.
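
The certificate expiry alert fix noted above comes down to the direction of one comparison. The Python sketch below only illustrates that comparison; the function names are hypothetical and it is not the alert rule that ships with Magma. The 720-hour threshold is taken from the note above.

```python
from datetime import datetime, timedelta, timezone

EXPIRY_THRESHOLD = timedelta(hours=720)  # 30 days, from the release note


def cert_expiry_alert(not_after, now, threshold=EXPIRY_THRESHOLD):
    """Fire when the certificate expires in less than `threshold`."""
    return (not_after - now) < threshold


def cert_expiry_alert_inverted(not_after, now, threshold=EXPIRY_THRESHOLD):
    """The inverted comparison described in the note, kept for contrast."""
    return (not_after - now) > threshold


now = datetime(2021, 11, 17, tzinfo=timezone.utc)
expiring_soon = now + timedelta(hours=100)
long_lived = now + timedelta(hours=2000)

print(cert_expiry_alert(expiring_soon, now))        # True
print(cert_expiry_alert(long_lived, now))           # False
print(cert_expiry_alert_inverted(long_lived, now))  # True (the old, wrong alert)
```

With the inverted comparison, a healthy long-lived certificate raises the alert while one about to expire stays silent, which is why alerts need to be re-synced after upgrading.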

Known Issues

  • HIGH BW traffic test fails to match expected data rate (9591)
  • QoS Flow 200UE Test Fail (9587)
  • In scale tests involving active/idle transitions, in a few instances, service rejects and timeouts ranged from 1-2% of total service requests. The raw count of total service requests was greater than 12,000 in a 120-minute test period. (9690)
  • Header Enrichment enablement issue on non-NAT setups (10338)
  • Connectiond service restarts frequently (10081)
  • The Orc8r to AGW relay function can be severed for up to 30 seconds at a time due to a known bug, which may elevate failures for procedures requiring a federation gateway. This does not, however, impact configuration management of the gateways.

Bare Metal Install

For new gateway installations on an Ubuntu 20.04 Server, please use the following install script:

  • wget https://raw.githubusercontent.com/magma/magma/master/lte/gateway/deploy/agw_install_ubuntu.sh

Bare Metal Install Log

Upgrade Notes

To upgrade an existing AGW, please run the following upgrade script:

  • wget https://raw.githubusercontent.com/magma/magma/master/lte/gateway/release/upgrade_magma.sh

Upgrade Logs

Image Versions

  • Ubuntu - 1.6.1-1636529012-5d886707

Orc8r

  • Previous upgrade instructions published for v1.6.0 are still valid and relevant for the upgrade from v1.5 to v1.6.1. Note: Terraform v0.15.0 was used to do the upgrade.
  • Orc8r 1.5.0 → 1.6.1 upgrade logs (using terraform commands)
  • Orc8r 1.6.1 Fresh Install logs (using cloudstrapper)

Critical bug fixes

  • fix(sessiond): fix race condition on handle_activate_ue_flows_callback (#10213)
  • fix(mme): Backport PR 10186 (#10202)
  • fix(pipelined): Add grpc error code for timeouts (#10155)
  • feat(session_proxy): add a flag to disable Service Requested Unit AVP (#8697)
  • fix(mme): Using pair of mme_ue_id and timer_id and storing on set for (#9646)
  • fix(agw): enable checksum for GTPU (#9359)
  • feat(mme): Adding mme_app_imsi_timer_map and MME ue id validation che… (#9593)
  • fix(mme): Fixing guti unknown attach and adding s1ap test case (#9339)
  • fix(mme): Replace RETURNerror with RETURNok in timer handlers (#9416)
  • fix(mme): Fix for handling unknown GUTI attach (#9304)
  • fix(agw): Fixing monitord too many opened files error (#8810)
  • fix(mme): Fix NAS common procedure check segfaults (#8808)
  • perf(pipelined): backport-v1.6: Avoid service restart on bridge recon… (#7851) (#8428)
  • fix(agw): enable GTP echo (#8328) (#8649)
  • fix(mme): add config flag to control GTP-U echo (#7980)
  • feat(agw): enable GTP-U echo response. (#7885)
  • fix(mme): Updating S1AP UE state IMSI write on connection_establishment... (#9334)
  • fix(mme)(s1_handover) Send UE ctx release cmd to the source eNB after (#8046) (#8462)
  • fix(mme)(s1_handover) fix bugs in security context for HandoverReques… (#7985) (#8461)
  • fix(mme): Mitigate timer expiration handler with invalid UE context c… (#8648)
  • fix(agw): on AGW reboot bring up SGi interface. (#8169)
  • fix(mme): Stop timer in right service303 thread (#7952)
  • fix(agw): Process subscriber data before digest data (#7983)

Other fixes

  • fix: Fix continuous deletion of default drop flow (#10128)
  • fix(orc8r): Set correct default image tag for v1.6 branch (#10051)
  • fix(nms): Adjust alert rule for S1 setup failure (#9896) (#9899)
  • chore(mme): Update TEIDs on LTE calls only during successful session (#10016)
  • fix(mme): Backport PR 9968 (#9988)
  • fix(agw): Compensate for eNB firmware param inversion (#9967) (#9982)
  • fix(agw): Fixed memleak for s1 handover cancel (#9938) (#9966)
  • chore(magmad): Expand magmad service restart error list (#9857) (#9873)
  • fix(nms): Adjust certificate expiry alert threshold (#9890)
  • fix(mme): memory leak in duplicate attaches (#9835) (#9844)
  • fix(agw): Fix bootstrapper when certs rejected at app level (#9684)
  • fix(mme): remove memory leaks in attach rejects (#9804) (#9810)
  • bug(agw): V1.6 check if UE context exists when timer expires (#9663)
  • fix(pipelined): add missing test files for UplinkBridgeTestFlowRestore (#9654)
  • chore(mme): Updating attach reject congestion control config value to… (#9643)
  • chore: add a new sentry project to upload artifacts to (#9568) (#9597)
  • fix(ci): Fix root ca expired in deployment of agw packages (#9585) (#9598)
  • fix(agw): force upgrade ca-certs (#9596)
  • fix(mme): Removing stop_timer extra call on service303 thread exit (#9516)
  • fix(agw): backport-v1.6: restore SGi bridge flows to keep connectivit…(#8146) (#9465)
  • chore(agw): Fix python formatting (#9379) (#9413)
  • fix(agw): Fix default logging level (#9409) (#9438)
  • fix(mme): Fix default bearer timer expiry handler (#9418)
  • chore: refactor circleci config for sentry upload so we can add more… (#9376) (#9390)
  • fix(orc8r): Run cloud on 1.6 (link)
  • fix(orc8r): Move cloud test to Github Action (link)
  • fix(orc8r): Move cloud test to Github Action (link)
  • fix(orc8r): Move cloud test to Github Action (link)
  • fix(orc8r): cloud tests (link)
  • fix(agw): remove fluentbit registry using ansible(hotfix) (link)
  • fix(agw): remove fluentbit registry using ansible(hotfix) (link)
  • chore(agw): backport-v1.6: update OVS dependency (#9385)
  • fix(agw): Check for unset params while extracting status (#9288) (#9365)
  • Backport Ignore nosetest to unblock CI (#9369)
  • fix(session_proxy): fix linter issue using make precommit (#9358)
  • fix(agw): brackport-1.6: gtp-br0 definition. (#9355) (#9357)
  • backport: PR8720 and PR8754 (#9361)
  • fix(agw): lte-test fix (#9362)
  • feat: add hostname tag to sentry events (#9329) (#9337)
  • fix(mme): Closing latest OF connection on OpenFlowController (#9310)
  • fix(mme): Adding write of MME UE state for NAS uplink data ind handle…(#9313)
  • fix(mme): Free bstring field for ICS response (#9265)
  • fix(pipelined): fix ipfix sampling port (#9247) (#9252)
  • feat(orc8r): Add FreedomFi One as a supported eNB (#9052) (#9094)
  • feat(agw): Cherry pick FreedomFiOne enodebd changes (#9056)
  • fix(sessiond): increase default_requested_units to 10Mb (#8671)
  • fix(agw): handle zero value for logging level. (#8655)
  • fix(mme): Frees esmmessagecontainer on attach reject proc (#8594)
  • fix(s6a_proxy): remove origin-state-id avp (#8582)
  • fix(agw): remove auto interface bring-up. (#8477)
  • fix(mme): assert on single thread exits (#8455) (#8468)
  • Fixed memory leaks (#8414)
  • Backport fix for TAU memory leak (#8324)
  • fix: backport (#7847)(#7850) to fix 1.6 branch build (#8269)
  • fix(pipelined): backport-1.6 (#8974): fix vlan match (#8975)