
RUCKUS WIRELESS: Zone Director vs. SmartZone WLAN Architecture Review


I work for Ruckus Wireless Inc. (now a part of Brocade!).  This article is intended to help any Ruckus Wireless channel partner or customer select the right product.  It is intended neither as a “plug” nor as a critique of the products; it is simply a description of how the products work and of the relative advantages and disadvantages of each platform.  I hope you find this useful!


Ruckus Wireless’ venerable line of Zone Director WLAN controllers is arguably one of the main products that helped build Ruckus from a small 100-and-something employee start-up in 2008 into the world’s third largest enterprise WLAN vendor today.  It has been used by tens of thousands of customers for deployments across multiple verticals and provides an easy interface for configuring and controlling wireless networks.  I certainly remember it as a revelation when I started my first one up in 2010.  Better yet was when my sales colleague did the same and did not have to phone me for help configuring it!

But with all things, sometimes it becomes time to move on.  Approximately a year ago Ruckus Wireless introduced the world to its new SmartZone WLAN control platform, a new generation of controllers aimed at helping Ruckus Wireless address shifting market needs/trends and correcting many of the shortcomings of the older Zone Director platform.

In this article I intend to present a comparison of the architecture between the Zone Director and the SmartZone control platforms and look at how that affects the kinds of networks we can design.  I’ll start by looking at WLAN MAC architectures in general to build a framework of our understanding.

WLAN MAC Architectures

Generally speaking there are three distinct MAC architectures available to 802.11 Wireless LAN and Radio Access Network vendors: Remote MAC, Split MAC and Local MAC:

Remote MAC:

  • APs are PHY radio only.  Centralized control function performs ALL MAC layer processing.
  • All real time processing and non-real time processing is performed at the controller.
  • PHY layer data connection between the controller and AP
  • Least common architecture for WLANs.
  • Good examples of Remote MAC architecture would be modern LTE active DAS systems, or these guys, who implement a proprietary Software Defined Radio solution for LTE in a datacenter, handling everything from the LTE protocol upwards.  From what I can see, their radios are simply PHY-layer transceivers.

Split MAC:

  • MAC Layer processing is split between the AP and the Control Function.
  • Real Time MAC Functions are performed at the AP
  • Control is performed via LWAPP or CAPWAP to a centralized controller.
  • Non-real time / Near Real Time Processing performed by the Controller (the actual details of this depend largely on the vendor!)
  • Integration Service (802.11 WLAN to 802.3 Ethernet Frame Conversion) performed at either the AP or the Controller.
  • Layer 2 / Layer 3 Connection from the AP to the Controller.
  • Most Common WLAN architecture.
  • Implemented by the Ruckus Wireless Zone Director

Local MAC:

  • All MAC layer processing performed at AP
  • Real-time, near real-time and non-real-time MAC processing performed at the AP
  • ALL essential functions are performed at the AP – resiliency provided by a lack of dependence on a centralized controller.
  • Additional services can be implemented by the control function (Config Management, etc)
  • Architecture of choice for distributed deployments with a “cloud” controller.
  • Implemented by the SmartZone control platform

Reading what the LWAPP RFC has to say is interesting too.  According to RFC 5412 these are the definitions of SPLIT vs LOCAL MAC:

Function | Split MAC | Local MAC
Distribution System | Controller | Access Point
Integration Service | Controller | Access Point
Beacon Generation | Access Point | Access Point
Probe Responses | Access Point | Access Point
Pwr Mgmt / Packet Buffering | Access Point | Access Point
Fragmentation / De-Fragmentation | Access Point | Access Point
Association / Disassociation / Reassociation | Controller | Access Point
Classifying | Controller | Access Point
Scheduling | Controller / Access Point | Access Point
Queuing | Access Point | Access Point
802.1X / EAP Authenticator | Controller | Controller
Key Management | Controller | Controller
Encryption / Decryption | Controller / Access Point | Access Point
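
The RFC 5412 split can also be captured as a small lookup table. Here is a minimal Python sketch (the assignments come from the table above; the function name and abbreviated keys are mine):

```python
# RFC 5412 division of MAC-layer responsibilities.
# Values: who performs the function in (Split MAC, Local MAC) mode.
RFC5412_MAC = {
    "Distribution System":          ("Controller", "Access Point"),
    "Integration Service":          ("Controller", "Access Point"),
    "Beacon Generation":            ("Access Point", "Access Point"),
    "Probe Responses":              ("Access Point", "Access Point"),
    "Pwr Mgmt / Packet Buffering":  ("Access Point", "Access Point"),
    "Fragmentation":                ("Access Point", "Access Point"),
    "Assoc / Disassoc / Reassoc":   ("Controller", "Access Point"),
    "Classifying":                  ("Controller", "Access Point"),
    "Scheduling":                   ("Controller / Access Point", "Access Point"),
    "Queuing":                      ("Access Point", "Access Point"),
    "802.1X / EAP Authenticator":   ("Controller", "Controller"),
    "Key Management":               ("Controller", "Controller"),
    "Encryption / Decryption":      ("Controller / Access Point", "Access Point"),
}

def where(function: str, mode: str) -> str:
    """Return who performs `function` in 'split' or 'local' MAC mode."""
    split, local = RFC5412_MAC[function]
    return split if mode == "split" else local

print(where("Assoc / Disassoc / Reassoc", "split"))  # Controller
print(where("Assoc / Disassoc / Reassoc", "local"))  # Access Point
```

Notice that association handling is the key difference between the two modes, while the 802.1X authenticator and key management stay on the controller in both RFC definitions.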

Zone Director MAC Architecture:

The Ruckus Wireless Zone Director platform was developed as a platform targeting small to medium sized enterprise customers.  It implements a customized version of the Split MAC architecture defined in RFC 5412.  Here is a breakdown of services implemented at the AP vs the Zone Director:

Zone Director Split MAC Roles & Responsibilities

Function | Access Point | Zone Director
Probe Requests/Responses | ✓ |
Control Frames (ACK + RTS/CTS) | ✓ |
Encryption / Decryption of 802.11 Frames | ✓ |
Distribution & Integration Services | ✓ |
Fragmentation / De-Fragmentation | ✓ |
Packet Buffering / Scheduling / Queuing (802.11e) | ✓ |
WMM-Admission Control | ✓ |
Background Scanning | ✓ |
802.11h DFS Processing | ✓ |
Wireless Distribution Services (MESH) Control | | ✓
802.11 Authentication & Association | | ✓
802.1X EAP Authenticator | | ✓
Encryption Key Management | | ✓
WPA/WPA2 – 4-Way Handshake | | ✓
RADIUS Authentication & Accounting | | ✓
Rogue AP – Wireless Intrusion Detection / Prevention | | ✓
Additional Services (Hotspot / Captive Portal / Guest Access etc.) | | ✓
Configuration Management | | ✓
Management / OSS/BSS Integration (SNMP etc.) | | ✓


It is important to understand the implications of the way the roles and responsibilities are separated between the AP and Zone Director when designing a customer’s network.  Here are some of the most important points that will affect your design:

Resiliency to Failure

From reading the table above you can easily understand which services you will lose when a Zone Director controller fails and you don’t have Smart Redundancy enabled.  Your existing connected clients will remain connected to the network, but no new clients will be able to associate if the controller is not available.  You will also not be able to roam between APs as all association / re-association requests must be sent to the controller.

The Ruckus SmartMesh algorithm runs on the APs enabling each AP to calculate a path cost associated with each uplink candidate.  However, SmartMesh topology changes between APs cannot occur without a controller to allow Mesh APs to re-associate to a new uplink.  The Zone Director also controls which mesh uplinks can be selected by APs.

In addition to preventing any new associations and re-associations, losing connectivity with the Zone Director will also affect encryption key management, RADIUS authentication, Rogue AP Detection / Prevention and any other additional WLAN services that require the controller to be present.

The AP is designed to work in a local breakout mode by default and implements the integration service onto the LAN.  Anything like L3/L4 Access Control Lists and client isolation settings that are stored on the AP will continue to work for associated clients.  L2 Access Control (MAC Address based Access Control) is implemented on the Zone Director at association.

Scale & Redundancy

The Zone Director platform allows for only 1+1 Active/Standby redundancy (called Smart Redundancy) with no option of clustering controllers to increase scale.  This can cause problems when implementing networks of more than 1000 APs.  N+1 redundancy can be achieved using Primary and Secondary Zone Director failover settings in place of the Smart Redundancy option, but this does not provide stateful failover.  It is also not possible to fail over between HA pairs of Zone Directors.

Authentication / Association

The MAJOR limitation of a Zone Director AP is that all 802.11 Authentication and Association requests must go via the controller. This is the case for all WLAN types except for the WLANs using the “Autonomous WLAN” feature introduced in ZoneFlex version 9.7.

The Zone Director also acts as the RADIUS client for all RADIUS authentication and accounting messages and fulfills the role of the 802.1X authenticator in the 802.1X framework.  The Zone Director is responsible for encryption key management and derives and distributes the Master Session Key (MSK), Pairwise Master Key (PMK), and other temporal keys used for encrypting the air interface to the APs in the network.

Integration with other authentication types and their services including LDAP / Active Directory, Captive Portal Access, Guest Access Services, Zero-IT client provisioning, Dynamic Pre-Shared Keys and Hotspot Access etc are managed by the control function and reside on the Zone Director.

The requirement that all 802.11 authentications / associations must traverse the Zone Director places some limits on the way you can design large networks with respect to:

  • 802.11 Authentication/Association request latency.
  • 802.1X Authentication latency / packet loss.
  • WPA / WPA2 4-Way handshake
  • AP roaming delays
  • Distributed and Branch Office Environments


802.11 Authentication/Association latency:

Some 802.11 client devices place a limit on the acceptable delay between an 802.11 authentication or association request and the expected response.  A known issue exists with specific barcode scanners in which the scanner will fail to join a WLAN unless it receives an association response within 100ms of its request.  Testing conducted in 2013 by Ruckus field engineers showed that most modern enterprise clients with updated drivers had no problem with latencies of several hundred milliseconds.  The longest latency tested between an AP and the controller was >400ms (from Sunnyvale to South Africa / Beijing) with no adverse effects on WLAN association or other services.  However, if you ask a Ruckus employee for the official number here, you will receive an answer of “150ms”, mostly because we aren’t sure which clients you are using, and for other reasons that will become clear as you read on.

The only exception to this is the Autonomous WLAN feature introduced in ZoneFlex version 9.7 that will allow the AP to directly respond to a client’s authentication and association request.

802.1X Authentication with latency/packet loss:

In addition to the 802.11 Authentication and Association messages, EAPOL messages sent between the Supplicant (client device) and the Authenticator (Zone Director) can also run into trouble when being transmitted across a high latency WAN link with unpredictable packet loss.  Remember that LWAPP tunnels use UDP.  In testing, Ruckus engineers observed that it became difficult for 802.1X clients to successfully complete the EAPOL key exchange over a high latency link due to out of order frames and increasing EAPOL replay counts.
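
Why reordering breaks the exchange can be seen from the EAPOL-Key replay counter rule. The sketch below is my own simplified model of the authenticator's behavior, not Ruckus code: a reply is only accepted if it echoes the replay counter of the *latest* retransmission, so late frames arriving out of order are silently discarded.

```python
# Minimal model of the 802.11 EAPOL-Key replay counter check.

def accept_reply(last_sent_counter: int, reply_counter: int) -> bool:
    """The authenticator only accepts a client reply that echoes the
    replay counter of its most recently transmitted EAPOL-Key frame."""
    return reply_counter == last_sent_counter

# Over a high-latency, lossy link the authenticator times out and
# retransmits, incrementing the replay counter each time:
sent_counters = [1, 2, 3]      # original frame + two retransmissions
last = sent_counters[-1]

# The client's reply to the *first* copy finally arrives, too late:
print(accept_reply(last, 1))   # False -> discarded, handshake stalls

# Only a reply matching the latest retransmission is accepted:
print(accept_reply(last, 3))   # True
```

Each timeout/retransmission cycle raises the counter again, which is why the exchange can fail repeatedly rather than simply running slowly.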

WPA / WPA2: 4-Way Handshake

In any Robust Security Network (RSN) the WPA/WPA2 4-Way handshake to establish the Pairwise Transient Key (PTK) and Group Temporal Key (GTK) for encrypting the air interface is conducted between the client STA and the Zone Director.  Once the Zone Director has established the PTK for encrypting client station unicast traffic and sent the GTK to the client for multicast traffic, it informs the AP of the key values, allowing the AP to perform encryption/decryption of the air interface at the edge of the network.  Key caching is handled centrally at the Zone Director; APs are not typically made aware of the PMK values or required to derive any transient keys.

In ZoneFlex version 9.7 Ruckus Wireless released a new feature called “Autonomous WLAN” that allows a WLAN to use open authentication with the option of WPA/WPA2-Personal for encryption of the air interface.  This is the only WLAN type in which the AP will derive the PTK from the PMK and store the PMK on the AP.

AP Roaming Delays:

Another aspect of your design that must be considered is the issue of roaming delay when moving between access points.  Every time you re-associate to a new AP, the re-association request must be passed to the Zone Director for approval.  If encryption is being used then you will also be required to wait for the 4-way handshake between the Client STA and Zone Director to take place.  This will introduce considerable latency if the control function is placed far away from the client device.

Even with the use of fast roaming techniques like PMK Caching or Opportunistic PMK Caching, you may find that roaming times are too long for certain applications, purely because of the time taken to complete the association and the necessary 4-way handshake with the Zone Director.

As an example: if a Zone Director is placed only 50ms RTT away from an AP, it will take >200ms to perform an AP roam, excluding any processing time for generating a response or RADIUS authentication messages between the Zone Director and the Authentication Server (AS).
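
That arithmetic can be sketched as a back-of-the-envelope model (my own simplification, assuming four controller round trips per roam: 802.11 authentication, re-association, and the two exchanges of the 4-way handshake):

```python
# Back-of-the-envelope roam timing against a remote controller.
ROUND_TRIPS_PER_ROAM = 4  # 802.11 auth + re-assoc + 2 exchanges of the 4-way handshake

def min_roam_time_ms(rtt_ms: float) -> float:
    """Lower bound on roam time, ignoring processing and RADIUS delays."""
    return ROUND_TRIPS_PER_ROAM * rtt_ms

def max_rtt_for_budget_ms(roam_budget_ms: float) -> float:
    """Largest controller RTT that still fits a given roam-time budget."""
    return roam_budget_ms / ROUND_TRIPS_PER_ROAM

print(min_roam_time_ms(50))       # 200  -> a 50ms RTT already costs 200ms per roam
print(max_rtt_for_budget_ms(50))  # 12.5 -> a 50ms roam budget allows only 12.5ms RTT
```

The same model explains the VoWiFi figure further down: a sub-50ms roam target leaves only 12.5ms of RTT headroom to the controller.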

When using 802.11r Fast-BSS Transition, it is important to realize that the Zone Director acts as the Pairwise Master Key R0 Key Holder (R0KH) as well as the Pairwise Master Key R1 Holder (R1KH) and is involved in each and every roam event!

Distributed and Branch Office Deployments.

Imagine a large enterprise customer with multiple large office buildings in geographically dispersed locations.  Each large office would most likely have all of its own IT infrastructure including Active Directory, LDAP, RADIUS, DHCP, DNS and some local servers.  Smaller branch office locations would be connected back to the main offices via lower throughput and potentially high latency WAN links.  Some enterprises make use of MPLS from an Internet service provider, others might simply have a network of VPNs connecting their offices over the open internet.

Each main office will require its own Zone Director controller to integrate with the local IT infrastructure.  Sure, you can place a single Zone Director in a data center and build connectivity from there to all the main offices.  But all of your association and AAA authentication/authorization requests will be forced to hairpin through the controller! Now don’t get me wrong here.  I am not saying it won’t work.  But think on this: in order to achieve a roam time of less than 50ms in a Vo-Wi-Fi project, RTT to the Zone Director would not be able to exceed 12.5ms!

Branch offices will have exactly the same problem, but it becomes unwieldy to place a Zone Director at every branch office.  So you’re stuck with a conundrum.

Layer 3 Networks:

The Ruckus implementation of LWAPP disabled Layer 2 LWAPP tunnels by default in ZoneFlex 9.4 onwards.  Ruckus supports L3 LWAPP tunnels making it possible to place the Zone Director and APs in different subnets.

NAT Traversal:

The Access Point is the source of all LWAPP communication to the controller, and it is therefore possible to place APs behind a NAT with no issue.  Deployments running ZoneFlex 9.2 or later also support Zone Directors behind a NAT, provided that the APs are pointed to the public address and the necessary port forwarding is set up using inbound NAT rules.  Smart Redundancy will also work provided that each Zone Director is given a separate public IP address or located behind separate firewalls.

Centralized Data Forwarding

The Zone Director’s Split MAC architecture allows for centralized data forwarding using LWAPP tunnels (UDP port 12222).   Similarly to CAPWAP, LWAPP does not support differentiated handling of the control and data planes.  Both LWAPP Control and LWAPP Data tunnels must terminate on the same interface of the Zone Director. This is not really a problem in most enterprise environments.  Most designs would typically just place the controller somewhere in the core and tunnel all of the data to it and break traffic out from there.  I mean that is where everything goes anyway, right?

This is not always the case with service providers though.  Most large operators and service providers actually prefer to keep network control and subscriber data separate and forward them to parts of the network optimized for dealing with the specific traffic types.

The other major pain point here is that because data and control planes are inherently entwined, losing a control function will result in an interruption of the flow of subscriber data through the system.

The final straw comes when you realize that the Zone Director or control function will also be performing large amounts of MAC layer processing of subscriber data for centralized forwarding.  It is no wonder that enterprise solutions implementing a Split MAC architecture typically max out at somewhere around several thousand APs per controller.


We’ve gone through the implementation of Split MAC architecture on the Ruckus Wireless Zone Director in fair detail.  We have also covered some of the design constraints and considerations when implementing a Zone Director based network.  The 802.11 Client State Machine is implemented on the Zone Director and all 802.11 Authentication, Association, and Key derivation is done at the controller.  This can introduce long AP roam times and create problems in large geographically distributed deployments where integration to multiple AAA servers may be necessary.  Placing Zone Directors locally at each site is the recommended solution to the problem but can make the solution expensive and harder to manage.

SmartZone MAC Architecture

Since it began development in early 2011, the SmartZone platform has represented a shift away from the widely accepted Split MAC architecture of enterprise wireless LANs.  The SmartZone platform implements a customized Local MAC architecture using separate control and data planes with lightweight protocols best suited to each task.  Most importantly, the 802.11 client state machine and other services are implemented directly on the AP.  Below is a breakdown of the services implemented on the AP and on the SmartZone platform:

Function | Access Point | SmartZone Controller
Probe Requests/Responses | ✓ |
Control Frames (ACK + RTS/CTS) | ✓ |
Encryption / Decryption of 802.11 Frames | ✓ |
Distribution & Integration Services | ✓ |
Fragmentation / De-Fragmentation | ✓ |
Packet Buffering / Scheduling / Queuing (802.11e) | ✓ |
WMM-Admission Control | ✓ |
Background Scanning | ✓ |
802.11h DFS Processing | ✓ |
Wireless Distribution Services (MESH) Control | ✓ |
802.11 Authentication & Association | ✓ |
802.1X EAP Authenticator | ✓ |
Encryption Key Management | ✓ (PMK Caching) | ✓ (Opp. Key Caching)
WPA/WPA2 – 4-Way Handshake | ✓ |
RADIUS Authentication & Accounting | ✓ | ✓ (Proxy Function)
Wireless Intrusion Detection / Prevention | ✓ |
Configuration Management | | ✓
Management / OSS/BSS Integration (SNMP etc.) | | ✓

Additional value added services implemented on the SmartZone Controller and the AP are shown on the table below:

Function | Access Point | SmartZone Controller
Active Directory / LDAP Integration | ✓ | ✓
Captive Portal Authentication | ✓ | ✓
Guest Access Portal Redirect | ✓ |
Guest Access Pass Creation / Storage / Authentication | | ✓
Social Media Login – Signup | | ✓
Hotspot 2.0 / Passpoint Online Sign-Up Server | | ✓
WISPr Hotspot | ✓ | ✓

Resilience to Failure

As you can see from the tables above, the SmartZone Platform is highly resilient in the event of a controller failure.  All essential WLAN services and some additional services are implemented at the AP.  Loss of connectivity with the controller will have minimal impact on the operation of normal WLAN services.

Value added services like Social Media Login or Guest Access require the use of the local user database on the SmartZone controller to authenticate subscribers.  However, APs keep a cache of users who have already connected and their state, ensuring that already-connected users are not affected.

Authentication via the AP internal captive portal to any external database is unaffected by connectivity to the controller.

The WISPr Hotspot function requires the SmartZone controller for HTTP/HTTPS redirect support, client proxy handling, and integration with the landing portal and RADIUS authentication.  The walled garden of the hotspot function is implemented on the AP via a DNS cache, allowing subscriber traffic to pass directly from the AP to the internet without passing through the controller.
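
The walled-garden mechanism can be sketched roughly as follows (my own simplified model, with hypothetical domain names, not Ruckus code): the AP snoops DNS answers for whitelisted domains and only bridges pre-authentication traffic toward those cached addresses.

```python
# Simplified model of an AP-side walled garden: DNS answers for
# whitelisted domains populate a cache, and unauthenticated clients may
# only reach cached addresses; everything else hits the portal redirect.

WALLED_GARDEN_DOMAINS = {"portal.example.com", "payments.example.com"}  # hypothetical
dns_cache: set[str] = set()

def on_dns_answer(domain: str, ip: str) -> None:
    """AP snoops DNS responses; whitelist hits populate the cache."""
    if domain in WALLED_GARDEN_DOMAINS:
        dns_cache.add(ip)

def allow_preauth(dst_ip: str) -> bool:
    """Pre-authentication traffic passes only toward cached garden IPs."""
    return dst_ip in dns_cache

on_dns_answer("portal.example.com", "203.0.113.10")
on_dns_answer("news.example.org", "198.51.100.7")   # not in the garden

print(allow_preauth("203.0.113.10"))  # True  -> bridged locally by the AP
print(allow_preauth("198.51.100.7"))  # False -> redirected to the portal
```

The important point is that the decision is made entirely at the AP, so garden traffic never hairpins through the controller.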

Scale & Redundancy

The SmartZone platform supports N+1 active/active clustering, allowing up to 4 nodes to be clustered together for scale and redundancy.  APs are automatically load balanced across the cluster in a randomized order, ensuring that no single node is overloaded.

All client state information is shared and replicated in each node’s memcache.  All configuration information and other permanent records are striped and mirrored across the cluster database, allowing up to one node to fail at any given moment with no service impact.

A single SmartZone controller can support as many as 10,000 APs, while each cluster can scale to support as many as 30,000 APs.  Should an entire cluster fail, the SmartZone platform also supports failover between SmartZone clusters.  Configurations between clusters must be manually synchronized in the current release.
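
A quick sanity check on those numbers (my own inference, not an official statement: the cluster ceiling appears to be sized so that the three surviving nodes can absorb a failed node's APs):

```python
APS_PER_NODE = 10_000      # single-controller maximum (from the text)
MAX_CLUSTER_NODES = 4
CLUSTER_MAX_APS = 30_000   # cluster maximum (from the text)

# Raw capacity of a full cluster vs. capacity with one node down:
raw = MAX_CLUSTER_NODES * APS_PER_NODE                   # 40,000
after_failure = (MAX_CLUSTER_NODES - 1) * APS_PER_NODE   # 30,000

# The published cluster limit matches N-1 capacity: a fully loaded
# 30,000-AP cluster can lose one node without exceeding per-node limits.
print(after_failure == CLUSTER_MAX_APS)  # True
```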

802.11 Association

The SmartZone architecture places the 802.11 client state machine at the Access Point.  All 802.11 authentication & association tasks are handled by the Access Point directly and the results are simply reported to the SmartZone controller (in real time) for storage in the events logs.

There is no latency limitation placed between the SmartZone Controller and the Access Point for successful 802.11 authentication and association.

In controlled access networks using a captive portal method to authenticate a subscriber, the Access Point allows the subscriber to associate and simultaneously performs a memcache lookup on the SmartZone controller to establish whether the subscriber state is set to “authorized”.  If authorized, the AP knows immediately not to implement the captive portal and simply allows client traffic through.

In WPA2-Personal networks the Pre-Shared Key is stored on every AP as part of the network configuration allowing immediate interaction with the client.  Similarly L2 Access Control Lists for MAC Address based access control are stored on the AP.

The SmartZone platform’s Local MAC architecture eliminates many of the problems Split MAC architectures face with regard to:

  • 802.11 Authentication/Association request latency issues.
  • 802.1X Authentication latency / packet loss.
  • AP roaming delays
  • Distributed and Branch Office Environments

802.1X Authentication & Encryption Key Management

The AP also acts as the RADIUS client for all RADIUS Authentication and Accounting messages with the option of allowing the SmartZone to act as a proxy.

The AP fulfills the role of 802.1X Authenticator in the 802.1X framework and is responsible for all encryption key management deriving the MSK, PMK and other temporal keys used for encrypting the air interface.  Once a PMK is derived it is cached at the AP to enable PMK-caching enhancing the user experience for 802.1X roaming.  The PMK is also sent by the AP to the SmartZone controller where it is stored in the memcache of each control node to enable Opportunistic PMK Caching.  If a subscriber roams to a new AP and supplies the PMK-ID in its re-association request, the AP will lookup the PMK-ID on the SmartZone memcache avoiding a full 802.1X re-authentication.  The WPA-4 way handshake is completed directly between the client and the AP eliminating any latency requirement to the controller.
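
The lookup sequence just described can be sketched as follows (a simplified model with invented names, not Ruckus code): on re-association the AP first checks its own PMK cache, then falls back to a single lookup in the controller's memcache, and only then resorts to a full 802.1X exchange.

```python
# Simplified model of SmartZone PMK handling on a roam:
# 1. local PMK cache hit -> straight to the 4-way handshake
# 2. controller memcache hit (Opportunistic Key Caching) -> one lookup RTT
# 3. miss -> full 802.1X re-authentication

local_pmk_cache = {}        # pmkid -> PMK, held per AP
controller_memcache = {}    # pmkid -> PMK, replicated across the cluster

def on_reassociation(pmkid: str) -> str:
    if pmkid in local_pmk_cache:
        return "4-way handshake (local PMK cache hit)"
    pmk = controller_memcache.get(pmkid)   # single lookup over the WAN link
    if pmk is not None:
        local_pmk_cache[pmkid] = pmk       # cache for subsequent roams
        return "4-way handshake (OKC hit via controller memcache)"
    return "full 802.1X re-authentication"

controller_memcache["pmkid-123"] = b"\x00" * 32   # cached by a previous AP
print(on_reassociation("pmkid-123"))  # OKC hit: one memcache lookup, then local 4-way
print(on_reassociation("pmkid-123"))  # now served from the local PMK cache
```

Note that the WAN link appears in only one step, which is why a long-latency controller hurts far less here than in the Split MAC case.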

There is still the possibility here that a long latency link between the SmartZone and AP could cause an unacceptable roaming delay during the memcache lookup of the necessary PMK but the potential impact of a long latency link is dramatically lowered.  The long latency link is only used for a single lookup in the memcache straight after association.  After that all further interactions will occur directly between the Access Point and Client Device.

The SmartZone platform does not currently support 802.11r Fast BSS Transition (current release: SmartZone version 3.2), but in any future implementation it is reasonable to expect that the APs would fulfill the role of the Pairwise Master Key R0 Key Holder (R0KH) and be responsible for deriving the PMK-R1 and distributing it to other APs in the defined Mobility Domain.  Thus we can deduce that fast roaming messaging would take place directly between the client and the AP, providing great improvements in performance over techniques like Opportunistic Key Caching.  I guess we will have to wait for a future release to see if my deduction is correct.

Distributed and Branch Office Deployments.

The SmartZone platform enables Branch office deployments and direct integration between APs and local IT infrastructure (Active Directory, LDAP, RADIUS, DHCP, DNS, Firewalls etc.).  This means a single SmartZone controller can manage multiple sites without forcing authentication requests or client data to hairpin through the SmartZone controller. Of course, you can still set the SmartZone to be a proxy if that is what you want to do.

Layer 3 Networks:

Ruckus SmartZone APs use a proprietary SSH-based control protocol to communicate with the SmartZone controller, allowing all control traffic to traverse Layer 3 networks.

NAT Traversal:

The SmartZone AP is the source of all communication messages to the SmartZone controller, allowing APs to be placed behind a NAT with no issue[1].  The virtual SmartZone (vSZ-E, vSZ-H) and SmartZone 100 controllers all support being placed behind a NAT.  In the case of a cluster, it will be necessary to use a separate public IP for each node.

Centralized Data Forwarding

The SmartZone Platform makes use of proprietary SSH-based control and GRE-based data protocols to separate the control and data planes.  The SmartZone platform allows for centralized forwarding to a SmartZone controller appliance or to a virtual Data Plane using RuckusGRE.


RuckusGRE is a customized version of L2oGRE (also known as Ethernet over GRE or SoftGRE) with an added Transport Layer header in the outer IP Packet.  This added UDP header allows the AP to originate GRE tunnels from behind a NAT router.  Recall that in standard GRE, there is typically no UDP/TCP header in the outer packet and therefore no way for NAT routers to track NAT sessions based on TCP/UDP port number.
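
The difference is easiest to see from the header stacks. The sketch below is a schematic illustration only (not an actual frame layout): standard L2oGRE gives a NAT router no transport header to rewrite, while RuckusGRE inserts a UDP header directly after the outer IP header.

```python
# Schematic header stacks, listed outer to inner (illustration only).
L2OGRE     = ["Outer IP", "GRE", "Ethernet", "Client IP", "Payload"]
RUCKUS_GRE = ["Outer IP", "UDP", "GRE", "Ethernet", "Client IP", "Payload"]

def nat_friendly(stack: list[str]) -> bool:
    """A NAT router needs a TCP/UDP header directly after the outer IP
    header so it can track the session by port number."""
    return len(stack) > 1 and stack[1] in ("UDP", "TCP")

print(nat_friendly(L2OGRE))      # False -> AP cannot originate the tunnel behind a NAT
print(nat_friendly(RUCKUS_GRE))  # True  -> UDP ports give the NAT a session key
```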

RuckusGRE uses TLS certificates for authentication and optional AES 128 bit tunnel payload encryption.

Differentiated Handling

The SmartZone implements separate control and data planes allowing differentiated handling of control traffic and subscriber data.  The data plane is designed to take its configuration from the control plane, but to maintain a separate engine for processing subscriber data tunnels.

The SmartZone 100 and SCG 200 physical appliances of the SmartZone platform both implement a data plane on the appliance allowing traffic to be tunneled to the same physical location as the SmartZone controller.

The data plane on the SmartZone 100 can use its own IP address and subnet or it can use the same IP address as the SmartZone 100 control plane.  The SCG 200 requires that each of the data planes (it has two) maintain their own IP addresses.

The Virtual SmartZone Data Plane (vSZ-D) is available as a separate virtual appliance and is designed for use with the virtual SmartZone controllers.  Each virtual control plane can manage 2 vSZ-D instances, with up to 8 vSZ-D instances supported per cluster.

Data Plane NAT Traversal

The SmartZone 100 supports placing the data plane IP behind a NAT.  The SCG 200 data plane does not support being placed behind a NAT in the current release (SmartZone 3.2).

The vSZ-D supports NAT traversal between itself and the vSZ control plane and APs allowing it to be placed behind a NAT.  Since all incoming connections would use the same port numbers each vSZ-D instance would have to get its own Public IP.

Latency Requirement

The vSZ-D has a recommended maximum RTT latency of 150ms to the vSZ control plane.  This is not a concern in deployments using the SmartZone 100 or SCG 200 appliances, as the data plane is co-located with the controller in the appliance.

802.11 Frame Processing

APs are still responsible for all integration services, converting 802.11 frames to 802.3 before they are encapsulated in the GRE header.  This reduces processing requirements, improves performance, and increases scalability on the SmartZone data plane.


The SmartZone platform implements a Local MAC architecture – placing the 802.11 client state machine on the AP.  We have reviewed how the SmartZone platform enables simpler deployment of WLAN networks spread across multiple sites and improved user experience without the need for additional controllers at each site.  This is central to enabling business models like cloud control, managed services and branch office deployments.

We have also seen how the SmartZone architecture provides differentiated handling of control traffic and tunneled subscriber data.  The ability to place the vSZ-D in its own subnet allows for separation of control and subscriber traffic in carrier networks.  The ability to place the vSZ-D in a custom physical location (or several locations) increases flexibility when forwarding client data traffic.  The added ability to support AES encrypted tunnels opens the potential for use as a concentrator for remote AP deployments.

[1] There is a limitation here when integrating directly with 3rd Party WLAN gateways using L2oGRE / Soft-GRE.  SoftGRE/L2oGRE does not implement a transport layer header in the outer packet.  This is a limitation of SoftGRE, not the SmartZone Architecture.