
My Ruckus Laboratory – Virtual Network Topology

This post is part of a series on building my Ruckus home laboratory environment. Previous articles in this series include:

  1. Building a Ruckus Home/Office Laboratory on a Budget – Part 1 
  2. Building a Ruckus Home/Office Laboratory on a Budget – Part 2
  3. My Ruckus Laboratory – Home Network Architecture & Limitations
  4. My Ruckus Laboratory – Physical Network

In this post I document the high level network topology inside my Dell R610 server and share some of my thoughts on why it is designed the way it is.

Choosing a Topology

A Flat Network

The simplest possible network topology is a flat network with all of the virtual machines, APs and clients in the same subnet. Once you're done building out the physical network, simply place your virtual machines inside the host on a single subnet behind a virtual switch and you're good to go, right? In one way you are: this is the simplest possible deployment and the quickest way to get your hands dirty with the various machines, but it is also the most limited approach.

A flat network does not allow you to test scenarios where a SmartZone controller is deployed using 3 separate interfaces for Management, Control and Clustering. It cannot test scenarios where APs and controllers communicate through NAT. It cannot test scenarios where clients require a tunneled connection for mobility between subnets. It prevents you from testing things like dynamic VLANs and policy control. It also removes the ability to learn about the many other entities and technologies that are at play in a network and how they interact with the Ruckus products, and that is really one of the fundamental reasons I am building this laboratory.

A Layer 3 Network

Overall, it makes the most sense to build a layer 3 network for the virtual network, but what should this network look like?  I could simply slap down a virtual Router and run my laboratory network with all the entities connected to that.  This is a great deal better than the flat network we described above and will enable you to test many more features, but it still doesn’t quite hit the mark in my book.

Before I dive into the topology that I have chosen for my virtual network, let’s consider the three primary requirements of the virtual network topology.

First, the chosen network topology must allow testing of as many features and technologies as possible without requiring major architectural changes. I want this laboratory to be as productive as possible, and for that I need to be able to test a feature or a configuration whilst making only limited changes to the underlying network.

Second, if the laboratory is really to have any value, it must be roughly analogous to most customer network topologies in the field. That way, when I test a feature or a configuration, it can be mapped onto an as-yet-unknown production network as easily as possible. This is key for anyone reading about testing I have done who wishes to use the same approach in their own network.

Third, the network topology should act as a template that enables best-practice network design, segmentation and security. The motivation here is to reinforce network design best practices and address topics including network performance and security.

The Triangle

The shape I have chosen for the virtual network topology is that of a triangle, hardly an original choice but there it is. The points at the base of the triangle represent two remote locations, separated from the network core at the apex of the triangle by Layer 2 / Layer 3 connections.

Ruckus Laboratory Virtual Network Topology

Whether you are looking at an enterprise network with a head office and several remote offices/locations, a carrier network with multiple sites connected into a regional core network, or a campus network with multiple buildings, you will always be able to discern this shape. Even if we think of a scenario in which a hotel group chooses to manage multiple properties from a single controller in a datacenter available over the Internet, this shape persists in some form. Of course, the exact protocols that run between the sites will differ between the different scenarios. In a campus network you would be likely to encounter OSPF, or perhaps some campus fabric implementation; in an enterprise network you may encounter SD-WAN or MPLS. In our last example, you'd be likely to encounter connectivity directly over the Internet with NAT traversal on either side. Either way though, the shape holds. The additional benefit of this shape is that it enables you to test communications from the edge to the core network and between two edge locations.

Virtual Network Topology

The final virtual network topology showing the segmentation of the laboratory network according to services is shown below.

Virtual Network Topology & Segmentation

Network Segmentation

Abstraction & Simplification

In order to escape the exponential complexity of assigning and tracking subnets on the fly, I have subdivided my virtual network into functional groups.  The functional group into which an entity is placed informs me what services the entity should be providing and which other entities it should be communicating with.  For instance, I don’t want any of the client subnets to be able to reach the Core Network Services or OSS Subnets that manage and run the network.  I also only want subscribers to be able to use specific services/protocols in the Subscriber Core Services and Subscriber Services subnets.  The only entities that should be able to manage the network are those in the Management Access group.

Organizing the network entities this way also allows me to get an idea of how many subnets I may need in each functional group and gives me the ability to assign IP address ranges at a functional level in a predictable manner that allows for future customization and expansion.

Security & Role Based Design

Each entity within a functional group will also have a customized security profile based on its role. For instance, both Super and Network Admin class users (separate subnets) have connectivity to the OSS Network and the entities that provide services in the Subscriber Services and Subscriber Core Services subnets. An entity in the network attempting to relay API commands to the SmartZone Controller would have to reside in one of the Management Access subnets and have access to the controller API with a valid username and password.

Only Super Admins have the ability to even reach the Core Network Services subnet to manage the entities there, let alone attempt a login. In addition, the only entities capable of receiving services from the Core Network Services subnet are network devices on their own management VLANs.

OSS Network Subnets

The OSS network provides a management interface to the OSS infrastructure including the SmartZone Controller, SmartCell Insight Analytics, SPoT location services and any other NMS (SNMP / Syslog etc) that I have chosen to learn about.   It is useful to note that if the SmartZone controller is deployed with 3 separate interfaces, the Management subnet will be the subnet that interfaces with Management Access.  The AP Ctrl interface will be given its own policy.

vSZ-D Placements

Ruckus’ SmartZone platform provides a wide range of options for deployment and provides features such as CALEA mirroring for lawful intercept and roaming of clients between Data Planes.  I have placed the virtual SmartZone – Data Planes asymmetrically in the network to quickly demonstrate the various deployment options and test the software’s features.  One is placed in the core network, whilst the other is placed locally in Edge Network B.  

 

That’s all for now!

 


Ruckus ICX Switches – Power over Ethernet

In my previous post we covered the configuration of some basic layer 3 services for my Ruckus ICX 7150-C12P which is going into my home laboratory.  But now we actually want to start plugging things in and turning APs ON!  In this post I dig into the details of working with PoE on the ICX Switches.

When I started this post I really thought it would be quite trivial, at best a purely supplemental post.  Turns out, I was a little wrong, this really does deserve its place as a standalone post!

A Quick Refresher

Yes, I know, you’ve been working in this field for ages, or are just starting out, and either way, you, like me, think that you know pretty much everything there is to know about boring old PoE. Regardless, let’s revise a little:

Terminology

The PSE is the Power Sourcing Equipment, and is located on the switch ethernet interface / PoE injector “out” interface.  The PD is the Powered Device, and is located on the other end of the Cat5e/Cat6 ethernet cable.

Power at the PSE vs Power at the PD

The Ethernet cable is made of copper wires and has a certain resistance and impedance specified by the standard to which it adheres (Cat 6 characteristics here).  This means that not all power at the PSE will reach the PD.  We must accommodate power losses for up to 100m of cable.

PoE Standards

802.3af allows a maximum of 15.4 Watts at the PSE / 12.95 Watts at the PD. 802.3at allows a maximum of 30 Watts at the PSE / 25.5 Watts at the PD. You may see 60 Watt "Ultra Power PoE" or "PoE++" injectors or similar; these use a modified version of 802.3at but are typically not standardized. Power over HDBaseT (PoH) allows a maximum of 100 Watts at the PSE. PoH enables the PD to detect cable length/resistance and to maximize power draw whilst remaining below the limit of 100 Watts at the PSE. There is also a newer standard called 802.3bt which aims to update the 802.3af/802.3at standards and allow operation with up to 90 Watts of power at the PSE / 71 Watts at the PD. The latest timeline states it will be released as a standard in 2018.

For more on the various PoE standards check out the wikipedia article which is pretty useful!

Which Twisted Pairs Carry Power?

Ethernet cables have 4 twisted pairs of copper wire. A 100/1000 Ethernet link using 802.3af or 802.3at uses only 2 of the twisted pairs in the Cat5e / Cat6 cable to transfer power and data, and 2 twisted pairs to transfer data only. Originally, in the days of 10/100 Ethernet, only 2 twisted pairs were used for data transfer. The decision of which two pairs to use for power transfer depended on the type of device injecting the power. For an end-span device like a PoE switch, power is transferred on the same pairs used for data transfer (pins 1, 2, 3 & 6), called Mode A. For a mid-span device like a PoE injector, power is injected on the unused twisted pairs (pins 4, 5, 7 & 8), called Mode B. When selecting 802.3af/at compatible devices, be aware that they typically only support Mode B operation; 802.3af/at compliant devices, however, support BOTH Mode A and Mode B. By comparison, PoH and 802.3bt both use all 4 twisted pairs to transfer power and data, as do the 60 Watt "Ultra Power PoE" injectors mentioned earlier (which, again, are typically not standardized).

If you want to dig into this specific hole a bit more:

What Can I Power with PoE?

All the things that you can power with PoE. And I am sure there is more, just keep looking.

Ruckus ICX7150 PoE Capabilities

First, let's just talk quickly about the main PoE capabilities of the Ruckus ICX switching range when it comes to powering devices like APs.

802.3at Power – For Everyone

All Ruckus ICX switches are capable of 802.3at power at the PSE.  The number of ports that can handle simultaneous 802.3at power depends on the power budget of the switch.  A summary of the limits on these switches (you can check it in the datasheets yourself) is presented in the table below.

Model | PoE+ Ports | PoH Ports | PoE Power Budget | Simultaneous 802.3af Ports | Simultaneous 802.3at Ports | Simultaneous PoH Ports
ICX7150-C12P | 12 | – | 124 Watts | 8 | 4 | –
ICX7150-24P | 24 | – | 370 Watts | 24 | 12 | –
ICX7150-48P | 48 | – | 370 Watts | 24 | 12 | –
ICX7150-48PF | 48 | – | 740 Watts | 48 | 24 | –
ICX7150-48ZP | 32 | 16 | 1480 Watts (2 PSU) | 48 | 48 (2 PSU) | 16 (2 PSU)

As you can see from the table, the ICX7150-C12P I am using in the lab at least gets me over the hump on 802.3at for up to four APs.

Power by Class

PoE devices are separated into classes by how much power they require.  You can configure a port to only support a specific class of power.  For instance:

RobLab_7150_C12P_1(config)#inline power ethernet 1/1/12 power-by-class 2

This command limits the port to operating on class 2 only (3.84 to 6.49 Watts at the PD, 7 Watts at the PSE).
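
Whichever of these knobs you end up using, it helps to be able to see what the switch has actually allocated and what each PD is drawing. As far as I can tell, the relevant show commands on the ICX are the two below (run from the privileged EXEC prompt; output omitted here):

RobLab_7150_C12P_1#show inline power
RobLab_7150_C12P_1#show inline power detail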

Power Adjust Class

You can also adjust the amount of power being fed to devices that belong to a certain class. For instance, consider a batch of access points that should work in class 0 (0 to 12.95 Watts), but for some reason end up drawing just a bit more than that at boot-up or in some other scenario. In these cases a switch will often simply shut the port down! Well, you can work around that with the commands shown in the example below:

RobLab_7150_C12P_1(config)#inline power adjust class 0 
  delta     delta power to be allocated over the LLDP request for the PD class
  minimum   Minimum power to be allocated for the PD class
RobLab_7150_C12P_1(config)#inline power adjust class 0 delta 1000
RobLab_7150_C12P_1(config)#inline power adjust class 0 minimum 16400
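
If I am reading the command reference correctly, these values are expressed in milliwatts, so the delta of 1000 above grants 1 Watt of headroom on top of whatever the PD requests via LLDP, while the minimum of 16400 guarantees at least 16.4 Watts to any class 0 device.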

Power Limiting

In some scenarios you may have to budget your power quite carefully.  The last thing you want is someone plugging in a new PoE+ or PoH capable device and exceeding the PoE power budget of the switch.  To protect yourself against this possibility, you can set power limits on specific PoE interfaces of the switch.  Limiting the maximum power budget on specific interfaces will prove useful in scenarios where you are using equipment that is capable of running on multiple PoE standards like the Ruckus R720 / R710 / R610 APs.  Below is a table of recommended PoE power limits for different Ruckus AP models that I have gathered from the product data sheets.  In my home laboratory, I know that the maximum power draw for the mesh AP into my home network should be no more than 15.4 watts at the switch.

Ruckus AP Model | 802.3af PoE (15.4 Watts) | 802.3at PoE+ (30 Watts) | PoH Power (< 90 Watts)
H510 | PoE Output 4 Watts; Peak 9.2 Watts (no PoE out); Typical 7.3 Watts | Recommended; PoE Output 12.95 Watts | –
R310 | Recommended; Peak 11 Watts; Typical 7.8 Watts | – | –
R510 | Recommended; Peak 12.6 Watts; Typical 7.5 Watts | – | –
R610 | 2 Chain Transmit (2.4 & 5 GHz); 3 Chain Receive; 18 dBm/chain (2.4 & 5 GHz); USB Disabled; Secondary Ethernet Disabled | Recommended; Peak 18.8 Watts; Typical 10.4 Watts | –
R710 | 2 Chain Transmit (2.4 GHz only); 4 Chain Receive; 16 dBm/chain (2.4 GHz only); USB Disabled; Secondary Ethernet Disabled | Recommended; Peak 25 Watts (with USB); Typical 9.4 Watts (no USB) | –
R720 | 1 Chain Transmit (2.4 & 5 GHz); 4 Chain Receive; 18 dBm/chain 2.4 GHz; 20 dBm/chain 5 GHz; USB Disabled; Secondary Ethernet Disabled | 4 Chain Transmit; 4 Chain Receive; 18 dBm/chain 2.4 GHz; 20 dBm/chain 5 GHz; USB Disabled; Secondary Ethernet Disabled | Recommended; Peak 35 Watts (with USB); Typical 11.4 Watts (no USB)
T300 | Recommended; Peak 11 Watts; Typical 7.5 Watts | – | –
T610 | 2 Chain Transmit; 2 Chain Receive; USB Disabled; Secondary Ethernet Disabled | Recommended; Peak 25 Watts (with USB); Typical 10.4 Watts (no USB) | –
T710 | – | Recommended; Peak 25 Watts; Typical 10.4 Watts (PoE Output Disabled) | 802.3at PoE Output Enabled; Peak 60 Watts
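
Using the mesh AP mentioned earlier as an example (it should never need more than 15.4 Watts at the switch), a minimal sketch of a per-port limit is shown below. The port number is an assumption for illustration, and the value should be in milliwatts if I'm reading the FastIron PoE guide correctly, so double-check the exact syntax and allowed range against the command reference for your release:

RobLab_7150_C12P_1(config)#inline power ethernet 1/1/12 power-limit 15400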

Power Prioritization

Another tool in your arsenal is power prioritization. This is especially useful if you are running 2 PSUs in a switch and need to plan for a failure of one of the PSUs. Or what about if you are uncertain of the power requirements of the PoE devices that will be connected? When the power budget of the switch is exceeded, what do you keep running and what do you kill? In my laboratory environment, I want to make sure that the mesh AP into the home network stays up in case anything maxes out the power budget of the switch. Priority levels can be set between 1 (highest) and 3 (lowest); all interfaces are set to priority 3 by default.
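
Assuming the priority keyword follows the same pattern as the other inline power options shown earlier (worth confirming in the command reference), pinning the mesh AP's port, again a hypothetical port number, to the highest priority would look something like this:

RobLab_7150_C12P_1(config)#inline power ethernet 1/1/12 priority 1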

Decoupled PoE & Data Link Operations

Heads up WLAN people, this one is for you!  There are some scenarios where editing an Ethernet interface’s settings can cause power delivery on those interfaces to be affected.  Scenarios include adding / removing a PSE port from a LAG, adding / removing a tagged PSE port from a VLAN or VLAN group, or enabling / disabling the Ethernet port.  In these scenarios, you may want the data link to go up and down, but not the power!

OverDrive

This is a VERY cool feature. Overdrive allows Class 0 & Class 4 powered devices to negotiate for more than 30 Watts using LLDP on a normal PoE+ port! For example, a Ruckus R720 requires about 35 Watts (peak) to enable all of its features. The Overdrive feature allows a powered device (like an AP) to request more than 30 Watts from a standard 802.3at, PoE+ capable PSE, up to the maximum rated power for the PSE. This feature is available on the PoE+ ports of the switches in the table below:

Switch Model | Max PSE Power (PoE+)
ICX7150-48ZP | 47 Watts
ICX7450-24P | 42 Watts
ICX7450-32ZP | 35 Watts
ICX7450-48P | 35 Watts

Notes:

  • PoE Overdrive was only introduced in FastIron 08.0.61; make sure to update the PoE firmware on your switch to ensure this feature works!
  • PoE Overdrive is only supported on PoE+ / PoH capable ports.
  • Maximum power output is limited by the hardware limitation of the PSE on the switch port.
  • Overdrive is only valid on ports that use 2 pairs for power, or on 4-pair ports configured for 2-pair operation. I.e. Overdrive is supported by default on PoE+ ports, but you will need to configure any PoH ports to use only 2 twisted pairs for power and data.

Inline Power on Secondary LAG Ports

You can also configure the switch to deliver power over multiple interfaces that make up a LAG. For example, you can add several Ethernet interfaces to a LAG and supply power over one, two or all of them. This is a useful feature if you require redundant power to a device, or perhaps a large amount of power delivered, for example to a 200 Watt digital signage display.

That's all for now!

(PS: have fun reading the command guide!)


My Ruckus Laboratory – Physical Network

This post is part of a series on building my Ruckus home laboratory environment. Previous articles in this series include:

  1. Building a Ruckus Home/Office Laboratory on a Budget – Part 1 
  2. Building a Ruckus Home/Office Laboratory on a Budget – Part 2
  3. My Ruckus Laboratory – Home Network Architecture & Limitations

In this post I discuss the physical network components and the physical / logical connectivity of the laboratory equipment.

Physical Network Overview

The image below gives you a view of the physical components of the laboratory network and their connections to one another.

Home Laboratory – Physical Connectivity

Dell PowerEdge R610

The Dell PowerEdge R610 is connected to the ICX7150-C12P using 2 Ethernet interfaces configured in a LAG to provide 2 Gb/s full-duplex connectivity. This connection will provide for Layer 2 communications between the physical hosts using untagged frames (placed onto VLAN 100 inside the ICX 7150). Additional tagged VLANs for Access Points and client subnets will be enabled on the LAG interface on an as-needed basis.
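
For reference, a minimal sketch of what that LAG could look like on the ICX side is shown below. The LAG name, ID and port numbers are my own assumptions for illustration, and the exact keywords differ slightly between FastIron releases, so treat it as a starting point rather than the final configuration:

! Hypothetical 2-port dynamic (LACP) LAG towards the Dell R610
lag R610 dynamic id 1
 ports ethernet 1/1/1 to 1/1/2
 primary-port 1/1/1
 deploy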

For the specifications of the Dell R610 server that I am using, check out one of my earlier posts on the topic!

Laboratory Access Points

Ethernet interfaces 7 through 11 are reserved for use by Access Points in the laboratory. These interfaces are not configured yet but will provide L2 services between the APs and the virtual routers only. This will maintain the logical separation between the physical and virtual network environments.
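
As a rough sketch of the intent (the VLAN ID below is a placeholder of my own), an AP VLAN would span the AP-facing ports and be tagged towards the LAG's primary port, with no router-interface so the switch keeps it at Layer 2:

! Hypothetical Layer 2-only VLAN carrying AP traffic towards the virtual routers
vlan 200 name Lab-APs by port
 untagged ethernet 1/1/7 to 1/1/11
 tagged ethernet 1/1/1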

Management AP

Port 12 of the switch is reserved for access to the network management subnet.  This interface is configured as an access port on VLAN 101.  That is to say, it will place all untagged traffic onto the management subnet on VLAN 101 and will drop any tagged VLAN traffic entering the interface.  This makes it easy to connect to the management network via an Ethernet cable directly or via a dedicated wireless access point using a WPA2-Personal SSID, as shown in the diagram.
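
On the ICX that translates to a short VLAN stanza along these lines (the VLAN name is simply my own label):

vlan 101 name Mgmt-Access by port
 untagged ethernet 1/1/12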

Mikrotik HAP AC

Originally, I intended to use an 802.11ac-capable DD-WRT router that I had in the back of my cupboard from about 2014. After a week of fist-clenching frustration and dealing with a web UI that was unresponsive and didn't correlate to the actual settings in the box, I decided I had had enough and went out and bought the Mikrotik HAP AC (you should be able to get one for about $130.00).

The Mikrotik HAP AC is a dual-band, 802.11ac, 3×3:3 capable Access Point running Mikrotik's RouterOS (including a Level 4 license), capable of fulfilling just about all of my needs in the laboratory. The primary purpose of the Mikrotik HAP AC router is to provide the physical and virtual laboratory networks with connectivity to external networks whilst keeping both as isolated as possible.

The Mikrotik provides me with options for connecting the laboratory to the Internet via Wireless LAN, Ethernet or even a USB 3G/4G modem in a pinch.  It also supports dynamic routing protocols such as OSPF and BGP if I decide to start toying around with those…

In my home laboratory, the Mikrotik is configured as a wireless client that connects to my Home WLAN.  Traffic from the physical hosts and management subnets is routed to the Mikrotik via uplink port 1/2/2 on the ICX7150-C12P switch.  Traffic from the virtual environment is routed directly from Ethernet Interface 3 on the Dell R610 (by a virtual router) to the Mikrotik.

The traffic from each environment is thoroughly isolated using the built-in firewall and routed to the Internet. The firewall allows connectivity from the management subnet into the virtual environment, but not the other way around. The firewall also prevents laboratory traffic from reaching devices in my Home WLAN as it traverses that network on its way out to the Internet!

Ruckus ICX7150-C12P

The ICX7150 holds the entire network together. It provides L3 services to the physical hosts, management subnet and NAT router. It also provides Layer 2 services to the laboratory access points and the virtual environment. More detailed switch configurations are given in future posts.
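
As a small taste of what those L3 services look like in practice, the ICX routes between VLANs using virtual interfaces (ve). A minimal, hypothetical example for the physical-hosts VLAN mentioned above might look like the snippet below; the IP addressing is purely illustrative, and VLAN membership is shown on the LAG's primary port:

! Attach a routed virtual interface to the physical-hosts VLAN
vlan 100 name Hypervisor-Hosts by port
 untagged ethernet 1/1/1
 router-interface ve 100
!
interface ve 100
 ip address 192.168.100.1 255.255.255.0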

Layer 3, Logical Network Diagram

The logical structure of the Layer 3 network, including part of the virtual environment, is shown below. Connectivity between the ICX7150 switch and the laboratory Access Points via Ethernet interfaces 1/1/7-11, and their connection into the virtual environment via Ethernet 1/1/1-6, is excluded since those links are limited to Layer 2 only.

Home Laboratory – Logical, Layer 3, Network Diagram

 

 

 

That's all for now!


My Ruckus Laboratory – Home Network Architecture & Limitations

This post is a continuation of the two-part series Building a Ruckus Home/Office Laboratory on a Budget. This post starts with a new series name because, from this point on, the topics discussed are more specific to my own laboratory environment. This post details some of the limitations I have run into when designing my home laboratory, specifically around my Internet service provider and home LAN.

The Internet Connection Wrinkle

My home-office environment is sadly complicated by my Internet service provider’s lack of features (the irony).  I have a fiber connection installed at home,  which (considering how much the connection costs me per month) is connected to an offensively cheap home gateway.

The home gateway provides connectivity to the home LAN via four 10/100/1000 RJ-45 Ethernet interfaces on the back.  The home gateway performs basic DHCP and DNS forwarding for a single subnet (10.0.0.0/24) which is sent to the Internet via a NAT behind a dynamically assigned Public IP address.

I am not about to pay the ISP a premium for the luxury of a static IP address, and I am blocked from altering any of the settings on this device.  These two factors become material if I ever want remote access to my lab via an inbound NAT.  I don’t really trust the home gateway to do anything more than what it does already, and want my laboratory network to have as little to do with it as possible.  When they installed this plastic shoebox, at least they did me the favor of disabling the even cheaper Wi-Fi radio.  So my life is not all bad.

A Hidden Advantage

An unexpected advantage is that my debilitated home network has given me the impetus to design a laboratory environment that can fit in almost anywhere.  The entire laboratory environment including both physical and virtual components will be able to function behind a NAT inside almost any LAN.  More on this later!

The Home Network

The LAN provided by the home gateway router is still useful for running a Home Wireless LAN with Ruckus Cloud.  I use this service at home to keep up to date with new features and enhancements, and to perform continuous testing and monitoring of the service, allowing me to eat my own dog food so to speak.  The other advantages of this setup are that my home WLAN is remotely manageable/demo-able and is kept separate from my laboratory environment.

Overview

The image below is an illustration of my home WLAN.  The Ruckus APs are R600 (Root Node) / R500 (Mesh Node) connected over a 5GHz mesh link.  The R600 plugs into the Home Gateway and is the primary wireless connection to the Internet.  The home entertainment goodies are connected by the R500 mesh AP.  The two port switch on the R500 mesh node has proven useful for connecting the Raspberry Pi Home Media Server with the 3TB NAS for quick media playback & reducing unnecessary wireless utilization.  The Apple TV connects directly to the Home WLAN.

 

Full Disclosure: The Mesh links in the picture imply that I am using only 40MHz channels, but I decided to go with 80 MHz channels as none of my neighbors are using 5GHz yet, poor souls!

Limiting BSSIDs

The home SSID and any other client SSIDs are only available from the root AP on 2.4GHz and/or 5GHz as I require them.  My abode is not very big and a single AP placed roughly in the middle easily covers the entire area.

The mesh AP publishes no WLAN services except the mesh SSID.  This makes the network topology and connectivity a bit simpler, ensuring that clients are always directly connected to the root AP. It also means clients cannot connect to the mesh AP and then do an unnecessary double wireless hop to the Internet.

Ruckus Cloud does not support AP & WLAN groups just yet, so you can’t very easily choose to place specific SSIDs onto specific APs within a venue.  But, you can still accomplish the task above by disabling the 2.4GHz and 5GHz radios in the individual mesh APs’ Radio Settings.  The APs will maintain their full mesh functionality, acting both as mesh clients, and beaconing the hidden mesh SSID, allowing other APs to connect into the mesh if needs be.

Why Mesh?

First off, to test mesh features and stability with the Ruckus Cloud controller, but that is obvious I guess.

The other reason is that connectivity options on the Raspberry Pi Model 3 that I am using as my Home Media Server are truly basic.  The Raspberry Pi 3 only has a 2.4GHz, 802.11n, single stream radio, and a 10/100 Ethernet connection.  These connectivity options are SLOOOW for the 3TB NAS when you’re doing backups or big file dumps from a wireless client.  I also don’t want to muck around with an external USB Wi-Fi adapter (although that is an option).  The best option for me was the Ruckus R500 as a high throughput mesh client with a 1Gb/s Ethernet connection straight to the NAS and a 100Mb/s connection to the Pi.

I could, if I wanted to, replace the Raspberry Pi with something more capable like the ODROID C2 which would eliminate the need for having the mesh AP, but there would be no performance improvement and quite possibly a performance degradation over the current setup.

That about sums up the home network; we will discuss more in later posts!


Building a Ruckus Home/Office Laboratory on a Budget – Part 2

In the previous post I covered the basic motivation for needing to build a laboratory for working with the growing range of Ruckus Wireless products and touched lightly on the use of NFV in the laboratory as a method of saving money.  In this post I will give a basic overview of the hardware requirements of the laboratory.

Laboratory Components

The laboratory will consist of the following key entities:

  • Physical network connectivity
  • Virtual Machines
  • x86 Server/s
  • Hypervisor
  • Virtual network connectivity

Physical Network Connectivity

NAT Router

The first and most obvious item you will need is a NAT router to get your Lab network out to the Internet and to get remote access back to your lab.  I am not really going to be too prescriptive here, but I trust you can choose something that meets your needs!  If you are looking for tips, keep reading my blog, as I will be documenting how I built my network!

L2 / L3 Switch

You’re going to need something to connect your x86 servers to the physical network and you’re also going to need something to connect / power your APs.   Physical network connectivity also gives you quick and easy access to the hypervisor host without having to worry about the state of the virtual environment.  Enter the Ruckus ICX 7150-C12P.

The Ruckus ICX 7150-C12P is a 12 port switch that packs a punch.  It can provide PoE+ (802.3at / 30 Watts) on any of its 12 ports (total power budget of 124 Watts) using a fan-less design for silent operation – a useful feature in a home lab!

It has two dedicated 10/100/1000 RJ-45 Ethernet ports for uplink connections and two additional 1/10 GbE  SFP/SFP+ ports for uplinks or stacking.   The 1/10 GbE ports allow you to stack up to 12 of the 7150 family switches together over distances of up to 10km.  You can also stack different switch variants from the same family.  Stacking two switches together can be accomplished using only a single 10 Gb/s link in a linear stack topology, leaving two 10 Gb/s links in the stack free for other connections.

If using the L3 firmware image you can use features like static routes and Routing Information Protocol (RIP).  An additional license provides more advanced features including OSPF routing, Virtual Router Redundancy Protocol (VRRP), Equal Cost Multi-Path (ECMP) routing, Policy Based Routes (PBR) and Protocol Independent Multicast (PIM).
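
For example, with the L3 image loaded, pointing a default static route at an upstream NAT router is a one-liner (the next-hop address below is just a placeholder):

ip route 0.0.0.0 0.0.0.0 192.168.88.1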

If you are planning on working with Ruckus ICX switches more often, this will be the right switch to start getting used to their basic features!  Check out the Ruckus ICX 7150 product family brochure here.

High-End 802.11ac AP Power Requirements

Some of the latest generation 802.11ac APs available in the market today like the Ruckus R720 boast multiple Ethernet ports, onboard USB ports and support NBASE-T compatible 2.5Gb Ethernet connectivity.  These APs and others like them can require more than even 802.3at PoE+ can deliver in order to use all of their peripheral hardware.  Most AP products like the R720 are capable of operating at 802.3at or even 802.3af power levels, but disable some of their peripherals when operating in this mode.

If you are desperate to use the peripheral interfaces/features on your AP in the laboratory, hate power injectors, and/or absolutely need 2.5 Gb Ethernet connectivity, then it would be a good idea to look at using a more capable switch such as the Ruckus ICX 7150-48ZP which provides 16 x 2.5Gb Ethernet ports with full PoH power (100 Watts) per port.  This in my opinion is overkill, but hey, some of you out there may need it!

Virtual Machines

The first thing we need to determine is the amount of resources we need!  The laboratory should be able to host the following Ruckus products:

  • Two virtual SmartZone (vSZ) Controllers
  • Two virtual SmartZone Data Planes (vSZ-D)
  • One Smart Cell Insight instance
  • One Cloudpath Enrollment Server.
  • One SPoT Location Based Services instance

The laboratory should also have sufficient processing power and memory to host additional servers and virtual machines including:

  • One / Two virtual routers
  • LDAP / RADIUS
  • TACACS/TACACS+ server
  • NMS / Analytics
  • Other virtual instances of products you wish to test

Hardware Requirements

In order to avoid any copyright / confidentiality infringements, I can only provide the overall hardware requirements that I have calculated for the laboratory.  Specific data about the hardware requirements of each virtual appliance are available on the Ruckus Support site which you can access using your own support credentials.  Adding up the minimum requirements for each Ruckus product, and factoring in a ±30% creep in processing requirements,  memory footprint and storage over time (we want this to last!), gives the following totals:

  • CPU: 24 vCPUs / 12 Cores
  • RAM: 96 GB
  • Storage: 800 GB

We should also add in some extra resources for any other VMs and products we may want to play with, err, I mean, “test”. The final hardware requirements that I have defined for my laboratory are:

  • CPU: 32 vCPUs / 16 Cores
  • RAM: 128 GB
  • Storage: 1 TB

CPU Performance

Selecting the right CPUs for your virtual environment is crucially important. I strongly recommend using the Intel Core i7 or the Intel Xeon E5-2XXX series chipsets or newer. According to what I have found in the Ruckus deployment guides, CPUs must be Intel Xeon E55XX series or above, which I believe is part of a requirement to support the Intel DPDK and Intel Virtualization Technology for Direct I/O. Here is a list of Intel CPUs that support Intel Virtualization Technology, Virtualization Technology for Direct I/O, and Hyper-Threading Technology. When I installed VMware ESXi 6.5.0 I received a prompt warning me that the Intel Xeon X5650 CPUs may not be supported in future versions of ESXi. So try to get yourself something with the E5-26XX chipsets or newer!

Networking Interfaces

It is useful to note that you don't need a huge number of network interfaces for this laboratory. Each physical machine should obviously have at least one 1 Gb/s Ethernet interface. Testing out a 10Gb/s link on the vSZ-D may sound cool, but really it just makes the whole lab more expensive and doesn't actually do anything except let you go "Ooooh!". That said, if you can pick something up with 10 Gb/s interfaces and want to use them in anger, the 7150 switch will handle it just fine!

x86 Server/s

Intel NUC – Skull Canyon

The Intel NUC Skull Canyon is a brilliant small machine and highly suited to this kind of work.  It contains an Intel i7-6770 quad-core processor with up to 32GB of RAM and can use very quick SSD storage.  A unit with 32GB RAM and 500GB of SSD storage would set you back about $1000 on Amazon.  If you simply want a lab that hosts a single virtual SmartZone Controller, virtual SmartZone Data Plane and a virtual router with some other peripheral software, then this would be a great bet!  It also makes for a fantastic option if you are a road warrior and find yourself needing to take your lab equipment with you on your customer visits.  However, to provide the CPU and memory resources for all of the products described above, you would need about three or four of these machines and it quickly becomes expensive and unwieldy to manage in comparison to other options.

Home Servers

I found a list of ten good home servers for 2017 on techradar.  But after investigation (feel free to do your own) I found that the sufficiently powerful contenders cost about $4000.00 or more for a server that contains the required CPU Cores and RAM.  If you’re happy to spend the money, this could be a path for you, but if you’re going to spend that much… why not just buy the Intel NUCs and repurpose them for a fun LAN gaming weekend every now and then?

Quiet Servers

If silence is a key requirement of your lab, then the right place to look is at the products available on EndPCNoise.com.  They have a good selection of servers that are specifically built to be quiet, but like the home servers above, are quite expensive for our use case.

Refurbished Servers

If you are not worried about buying refurbished servers (this is a home lab), don’t mind a little noise, don’t care about moving your lab around with you and can accommodate a rack mount solution, then a refurbished server could be the way to go!  (EDIT: – July 25th 2017 – As the folks over at CBT Nuggets have noted the Dell R710 is a great server to grow into!)

Deep Discount Servers (DDS)

DDS purchases decommissioned servers in large volumes, allowing them to sell the refurbished equipment at surprisingly low cost. All equipment comes from qualified sources and is thoroughly tested. Most importantly: DDS equipment ships globally and has a solid returns policy. If you buy from the DDS website directly, you can customize your order by adding CPUs, RAM, storage and other peripherals. If you opt to purchase via the Deep Discount Servers' eBay store, you could get a much better deal on a pre-built server that you can customize later. Having had a look around I am confident you will find a server with the necessary resources for approximately $1500 to $2000.

Aventis Systems

Aventis Systems is another company that sells refurbished systems and should be worth a comparative look when shopping for refurbished products.  Aventis offer a wide array of customizations and additions enabling you to basically build your server from the ground up as if it was a new system.

My Lab Hardware (Updated – July 25th 2017)

In my own lab I am using a refurbished Dell R610 from Deep Discount Servers (DDS) with the following specs:

Make: Dell
Model: PowerEdge R610
CPU: 2 x Intel Xeon X5650, 6 cores, 2.67 GHz (12 cores / 24 vCPUs total)
Memory: 128 GB DDR3-1333 MHz (8 x 16 GB)
Storage: 2 x Seagate 2 TB, 2.5″, 7200 RPM, 12 Gb/s SAS HDD (ST2000NX0273)
Networking: 4 x 1 Gb/s RJ-45 interfaces
Power: Dual redundant power supplies

Hypervisor

Since we are building a Ruckus Laboratory, the hypervisor can be either VMWare ESXi 5.5 or later or KVM on CentOS 7.0 64-bit.  Both of these hypervisors are available free of charge.  I am running the free version of ESXi 6.5 (6.5.0a available here) on my hardware as I know my way around this product better.

Containers…

I have had a look at things like Docker, Canonical's LXD and the like, and it makes a lot of sense to learn about this technology. If you do decide to use containers in this environment, it will most likely be Docker inside a Linux VM on top of the ESXi hypervisor. This will give you the ability to spin up a multitude of small containers quickly and easily inside a single Linux OS, at a much higher density than ESXi can. That could be a real game changer when you're trying to replicate scenarios or spin up VMs quickly and easily in a constrained environment. It will also simplify a lot of your networking efforts inside the hypervisor layer, as containers are hidden behind a NAT inside their Linux host.

Virtual Network Connectivity

The Virtual Router

You may be asking yourself: “Why do I need a virtual router if the Ruckus ICX 7150 is already performing layer 3 services?”

Here are 5 good reasons why having the virtual router (or even more than one) is a good idea.

Flexible Network Topologies

Many of the entities you are testing and learning about inside the hypervisor can, in the real world, be placed into multiple different Layer 2 or Layer 3 network configurations. For instance, the vSZ can have a single interface and IP address for AP Control, Clustering and Management, or it can have 3 separate interfaces on 3 separate subnets. The vSZ-D can communicate with the vSZ controller via a NAT or directly over a Layer 3 network. The APs too can be placed behind a NAT, or not. The other entities in the network may also need to be placed into separate network segments. Running a flat network will be simple, but isn't going to give you the flexibility you need to implement, experiment with or learn about supported network topologies.

Minimizing Physical Link Utilization

Some of the virtual appliances require Layer 3 connectivity.  You may only have a single physical NIC entering your server.  Do you really want all that ethernet traffic between layer 3 entities going back and forth to the L3 switch on the ONLY ethernet link?

Compatibility with Other Network Environments

You may need to NAT out of the LAB to get onto your home network, or into a customer’s network for the purpose of a trial or proof of concept.  You can’t always change the network you have to plug into, NAT is a good way of taking all that pain away.

Network Separation

Separating the virtual and physical network environments.  You can use the 7150 switch to provide layer 3 services to locally connected APs and to the hypervisor hosts on separate subnets.  The virtual routers in the virtual environment can manage connectivity to the virtual entities.  This configuration allows you to keep the hypervisor host network out of scope in testing and customer trials.

Remote Access

The Layer 3 switch is not going to give you the ability to gain remote access to your lab environment. You also want to be able to completely limit remote access to the virtual environment only. This is useful when allowing someone else to work and play in the virtual lab. Something gone wrong? No worries, they can't have touched the hypervisor or physical network – simply reset everything to the snapshot you took and carry on.

vRouter Firmware Options

Here are a couple of the options I am presently aware of for implementing a virtual router.  If you have other options feel free to investigate those!

Mikrotik RouterOS

Mikrotik RouterOS is a well proven platform that can run inside an x86 server environment on top of the VMWare ESXi hypervisor. It comes with a plethora of features and the ability to scale to very large networks. It also has a decent web UI that you can use via a browser if that's your thing. The RouterOS software also supports an API that closely mimics the CLI commands. For the purposes of this laboratory, the Level 4 license will prove more than sufficient. Each license costs only $45.00, which is great value considering the capabilities it gives you.

VyOS

If any of you out there have used the older Vyatta Core routers and were sad to see them meet their demise, the VyOS router is for you!  VyOS is an open source project that continued from where the Vyatta Core project ended.  This is a very capable router without the frills.  If you haven’t had a peek at this, you really should.

That’s all for now!


Building a Ruckus Home/Office Laboratory on a Budget – Part 1

A Growing Product Portfolio

If you work with Ruckus Wireless equipment and have been keeping up with the growth of the company’s product portfolio, you may have realized that they don’t just sell Wi-Fi Access Points and WLAN Controllers anymore.   Several short years ago, you would have been forgiven for having just a ZoneDirector and some Access Points lying around that allowed you to test the latest code release in your environment for any specific bugs pertinent to your customers.  Today, however,  you may require a somewhat more capable environment to fully embrace what Ruckus has to offer.   The Ruckus product portfolio has now grown to include:

If you want to work with the comprehensive set of products that Ruckus has to offer, it makes sense to be able to work with them all in a laboratory environment.

Network Function Virtualization to the Rescue

Thankfully, building such a laboratory today (circa 2017) is actually relatively cheap due to the benefits of NFV and the ability to place some network functions in the cloud if needs be.  If this was 2012 or 2013, you’d probably catch yourself buying a whole bunch of second hand networking appliances and stringing them together whilst cajoling your friends with close ties to networking vendors into giving you free code upgrades.  Shh, it doesn’t need to be that way.

Nowadays, you can simply get yourself some capable x86 hardware or an Amazon / Google Cloud / Microsoft Azure account along with some virtual appliances and you are good to go!  The lower cost of x86 hardware, availability of free hypervisors like ESXi / KVM and cheap cloud services means that you can now run an extremely capable laboratory environment for a fraction of what it would cost for a pure hardware appliance based environment.

NFV & Cloud

Ruckus have developed ALL of their new software products to run entirely in virtualized or cloud environments.  The SmartZone controller has been designed to use a single platform whether deployed as a physical or virtual appliance.  This means that running a virtual instance of the SmartZone controller will allow you to test and understand just about every feature and facet of a SmartZone deployment the same as if you were using the hardware appliance.  All other Ruckus software products are available as virtual machines capable of running on either VMWare or KVM and are also available on several cloud providers (check the specific product release notes and installation guides).

Ready to begin?


RUCKUS WIRELESS: Zone Director vs. SmartZone WLAN Architecture Review


Disclaimer:

I work for Ruckus Wireless Inc. (Now a Part of Brocade!).  This article is intended to assist any Ruckus Wireless channel partner or customer in selecting the right product.  It is neither intended as a “plug” nor a critique of the products.  It is simply a description of how the products work and what the relative advantages and disadvantages are for each platform.  I hope you find this useful!


Introduction

Ruckus Wireless’ venerable line of Zone Director WLAN controllers is quite arguably one of the main products that helped build Ruckus from a small 100-and-something employee start-up in 2008 to the world’s third largest Enterprise WLAN vendor today.  It has been used by tens of thousands of customers for deployments in multiple verticals and provided an easy interface to configure and control wireless networks.  I certainly remember it as a revelation when I started my first one up in 2010.  Better yet was when my sales colleague did the same and did not have to phone me for help configuring it!

But with all things, sometimes it becomes time to move on.  Approximately a year ago Ruckus Wireless introduced the world to its new SmartZone WLAN control platform, a new generation of controllers aimed at helping Ruckus Wireless address shifting market needs/trends and correcting many of the shortcomings of the older Zone Director platform.

In this article I intend to present a comparison of the architecture between the Zone Director and the SmartZone control platforms and look at how that affects the kinds of networks we can design.  I’ll start by looking at WLAN MAC architectures in general to build a framework of our understanding.

WLAN MAC Architectures

Generally speaking there are three distinct MAC architectures available to 802.11 Wireless LAN and Radio Access Network vendors: Remote MAC, Split MAC and Local MAC:

Remote MAC:

  • APs are PHY radio only.  Centralized control function performs ALL MAC layer processing.
  • All real time processing and non-real time processing is performed at the controller.
  • PHY layer data connection between the controller and AP
  • Least common architecture for WLANs.
  • Good examples of remote MAC architecture would be modern LTE active DAS systems, or these guys who implement a proprietary Software Defined Radio solution for LTE in a datacenter that implements everything from the LTE protocol upwards. From what I can see, their radios are simply PHY layer transceivers.

Split MAC:

  • MAC Layer processing is split between the AP and the Control Function.
  • Real Time MAC Functions are performed at the AP
  • Control is performed via LWAPP or CAPWAP to a centralized controller.
  • Non-real time / Near Real Time Processing performed by the Controller (the actual details of this depend largely on the vendor!)
  • Integration Service (802.11 WLAN to 802.3 Ethernet Frame Conversion) performed at either the AP or the Controller.
  • Layer 2 / Layer 3 Connection from the AP to the Controller.
  • Most Common WLAN architecture.
  • Implemented by the Ruckus Wireless Zone Director

Local MAC:

  • All MAC layer processing performed at AP
  • Real-time, near real-time and non-real-time MAC processing performed at the AP
  • ALL essential functions are performed at the AP – resiliency provided by a lack of dependence on a centralized controller.
  • Additional services can be implemented by the control function (Config Management, etc)
  • Architecture of choice for distributed deployments with a “cloud” controller.
  • Implemented by the SmartZone control platform

Reading what the LWAPP RFC has to say is interesting too.  According to RFC 5412 these are the definitions of SPLIT vs LOCAL MAC:

Function | SPLIT MAC | LOCAL MAC
Distribution System | Controller | Access Point
Integration Service | Controller | Access Point
Beacon Generation | Access Point | Access Point
Probe Responses | Access Point | Access Point
Pwr Mgmt / Packet Buffering | Access Point | Access Point
Fragmentation / De-Fragmentation | Access Point | Access Point
Association / Disassociation / Reassociation | Controller | Access Point
802.11e: Classifying | Controller | Access Point
802.11e: Scheduling | Controller / Access Point | Access Point
802.11e: Queuing | Access Point | Access Point
802.11i: 802.1X / EAP Authenticator | Controller | Controller
802.11i: Key Management | Controller | Controller
802.11i: Encryption / Decryption | Controller / Access Point | Access Point

Zone Director MAC Architecture:

The Ruckus Wireless Zone Director platform was developed as a platform targeting small to medium sized enterprise customers.  It implements a customized version of the Split MAC architecture defined in RFC 5412.  Here is a breakdown of services implemented at the AP vs the Zone Director:

Zone Director Split MAC Roles & Responsibilities

Function | Access Point | Zone Director
Beacons | ✓ |
Probe Requests/Responses | ✓ |
Control Frames (ACK + RTS/CTS) | ✓ |
Encryption / Decryption of 802.11 Frames | ✓ |
Distribution & Integration Services | ✓ |
Fragmentation / De-Fragmentation | ✓ |
Packet Buffering / Scheduling / Queuing (802.11e) | ✓ |
WMM-Admission Control | ✓ |
Background Scanning | ✓ |
802.11h DFS Processing | ✓ |
Wireless Distribution Services (MESH) Control | | ✓
802.11 Authentication & Association | | ✓
802.1X EAP Authenticator | | ✓
Encryption Key Management | | ✓
WPA/WPA2 – 4 Way Handshake | | ✓
RADIUS Authentication & Accounting | | ✓
Rogue AP – Wireless Intrusion Detection / Prevention | | ✓
Additional Services (Hotspot / Captive Portal / Guest Access etc.) | | ✓
Configuration Management | | ✓
Management / OSS/BSS Integration (SNMP etc.) | | ✓

 

It is important to understand the implications of the way the roles and responsibilities are separated between the AP and Zone Director when designing a customer’s network.  Here are some of the most important points that will affect your design:

Resiliency to Failure

From reading the table above you can easily understand which services you will lose when a Zone Director controller fails and you don’t have Smart Redundancy enabled.  Your existing connected clients will remain connected to the network, but no new clients will be able to associate if the controller is not available.  You will also not be able to roam between APs as all association / re-association requests must be sent to the controller.

The Ruckus SmartMesh algorithm runs on the APs enabling each AP to calculate a path cost associated with each uplink candidate.  However, SmartMesh topology changes between APs cannot occur without a controller to allow Mesh APs to re-associate to a new uplink.  The Zone Director also controls which mesh uplinks can be selected by APs.

In addition to preventing any new associations and re-associations, losing connectivity with the Zone Director will also affect encryption key management, RADIUS authentication, Rogue AP Detection / Prevention and any other additional WLAN services that require the controller to be present.

The AP is designed to work in a local breakout mode by default and implements the integration service onto the LAN.  Anything like L3/L4 Access Control Lists and client isolation settings that are stored on the AP will continue to work for associated clients.  L2 Access Control (MAC Address based Access Control) is implemented on the Zone Director at association.

Scale & Redundancy

The Zone Director platform allows for only 1+1 Active/Standby redundancy (called Smart Redundancy) with no options of clustering controllers to increase scale.  This can cause problems when implementing networks of more than 1000 APs.  N+1 redundancy can be achieved using Primary and Secondary Zone Director failover settings in place of the Smart Redundancy option, but this does not provide for stateful failover.  It is also not possible to allow for failover between HA pairs of Zone Directors.

Authentication / Association

The MAJOR limitation of a Zone Director AP is that all 802.11 Authentication and Association requests must go via the controller. This is the case for all WLAN types except for the WLANs using the “Autonomous WLAN” feature introduced in ZoneFlex version 9.7.

The Zone Director also acts as the RADIUS client for all RADIUS authentication and accounting messages and fulfills the role of the 802.1X authenticator in the 802.1X framework.  The Zone Director is responsible for encryption key management and derives and distributes the Master Session Key (MSK), Pairwise Master Key (PMK), and other temporal keys used for encrypting the air interface to the APs in the network.

Integration with other authentication types and their services, including LDAP / Active Directory, Captive Portal Access, Guest Access Services, Zero-IT client provisioning, Dynamic Pre-Shared Keys, Hotspot Access, etc., is managed by the control function and resides on the Zone Director.

The requirement that all 802.11 authentications / associations must traverse the Zone Director places some limits on the way you can design large networks with respect to:

  • 802.11 Authentication/Association request latency.
  • 802.1X Authentication latency / packet loss.
  • WPA / WPA2 4-Way handshake
  • AP roaming delays
  • Distributed and Branch Office Environments

 

802.11 Authentication/Association latency:

Some 802.11 client devices place a limitation on the acceptable delay between an 802.11 authentication or association request and the expected response.  A known issue with specific barcode scanners exists in which the scanner will fail to join a WLAN unless it receives an association response within 100ms of its request.  Testing conducted in 2013 by Ruckus field engineers showed that most modern enterprise clients with updated drivers did not have any problem with latencies of several hundred milliseconds. The longest latency tested between an AP and the controller was > 400ms (from Sunnyvale to South Africa / Beijing) with no adverse effects on WLAN association or other services.  However if you ask a Ruckus employee for the official number here you will receive an answer of “150ms”, mostly because we aren’t sure of the clients you are using and for other reasons, which will become clear as you read on.

The only exception to this is the Autonomous WLAN feature introduced in ZoneFlex version 9.7 that will allow the AP to directly respond to a client’s authentication and association request.

802.1X Authentication with latency/packet loss:

In addition to the 802.11 Authentication and Association messages, EAPOL messages sent between the Supplicant (client device) and the Authenticator (Zone Director) can also run into trouble when being transmitted across a high latency WAN link with unpredictable packet loss.  Remember that LWAPP tunnels use UDP.  In testing, Ruckus engineers observed that it became difficult for 802.1X clients to successfully complete the EAPOL key exchange over a high latency link due to out of order frames and increasing EAPOL replay counts.

WPA / WPA2: 4-Way Handshake

In any Robust Security Network (RSN) the WPA/WPA2 4-way handshake to establish the Pairwise Transient Key (PTK) and Group Temporal Key (GTK) for encrypting the air interface is conducted between the client STA and the Zone Director.  Once the Zone Director has established the PTK for encrypting client station unicast traffic and sent the GTK to the client for multicast traffic, it informs the AP of the key values, allowing the AP to perform encryption/decryption of the air interface at the edge of the network.  Key caching is handled centrally at the Zone Director; APs are not typically made aware of the PMK values or required to derive any transient keys.
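If you want to see what the controller is actually computing during that handshake, below is a minimal Python sketch of the standard IEEE 802.11i key expansion that produces a PTK from a PMK, the two MAC addresses and the two nonces.  The function names are my own; the point is simply that whichever box holds the PMK (the Zone Director in this architecture) has to take part in every 4-way handshake.

```python
import hmac, hashlib

def prf_80211i(key: bytes, label: bytes, data: bytes, length: int) -> bytes:
    """IEEE 802.11i PRF: iterated HMAC-SHA1 expansion of (label, data) under key."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, label + b"\x00" + data + bytes([counter]), hashlib.sha1).digest()
        counter += 1
    return out[:length]

def derive_ptk(pmk: bytes, aa: bytes, spa: bytes, anonce: bytes, snonce: bytes) -> bytes:
    """PTK = PRF-384(PMK, "Pairwise key expansion", min/max of the MACs and nonces) for CCMP."""
    data = min(aa, spa) + max(aa, spa) + min(anonce, snonce) + max(anonce, snonce)
    return prf_80211i(pmk, b"Pairwise key expansion", data, 48)
```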

In ZoneFlex version 9.7 Ruckus Wireless released a new feature called “Autonomous WLAN” that allows a WLAN to use open authentication with the option of WPA/WPA2-Personal for encryption of the air interface.  This is the only WLAN type in which the AP will derive the PTK from the PMK and store the PMK on the AP.

AP Roaming Delays:

Another aspect of your design that must be considered is the issue of roaming delay when moving between access points.  Every time you re-associate to a new AP, the re-association request must be passed to the Zone Director for approval.  If encryption is being used then you will also be required to wait for the 4-way handshake between the Client STA and Zone Director to take place.  This will introduce considerable latency if the control function is placed far away from the client device.

Even with the use of fast roaming techniques like PMK Caching or Opportunistic PMK Caching, you may find that roaming times are too long for certain applications, purely because of the time taken to complete the association and the necessary 4-way handshake with the Zone Director.

As an example, if a Zone Director is placed only 50ms RTT away from an AP, it will take more than 200ms to perform an AP roam: the 802.11 authentication, the re-association and the two exchanges of the 4-way handshake each require a round trip to the controller (roughly 4 × 50ms), excluding any processing time for generating responses or RADIUS authentication messages between the Zone Director and the Authentication Server (AS).
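To put numbers on that, here is a back-of-the-envelope Python sketch.  I am assuming four controller round trips per roam (802.11 authentication, re-association and the two exchanges of the 4-way handshake) and ignoring processing time and RADIUS traffic, so treat it as a lower bound only.

```python
def min_roam_time_ms(controller_rtt_ms: float, round_trips: int = 4) -> float:
    """Lower bound on roam time: controller round trips only, no processing or RADIUS time."""
    return controller_rtt_ms * round_trips

print(min_roam_time_ms(50))   # 200.0 ms when the Zone Director is 50ms RTT away
print(50 / 4)                 # 12.5 ms -- the max controller RTT for a 50ms VoWiFi roam budget
```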

When using 802.11r Fast BSS Transition, it is important to realize that the Zone Director acts as the Pairwise Master Key R0 Key Holder (R0KH) as well as the Pairwise Master Key R1 Key Holder (R1KH) and is involved in each and every roam event!

Distributed and Branch Office Deployments:

Imagine a large enterprise customer with multiple large office buildings in geographically dispersed locations.  Each large office would most likely have all of its own IT infrastructure including Active Directory, LDAP, RADIUS, DHCP, DNS and some local servers.  Smaller branch office locations would be connected back to the main offices via lower throughput and potentially high latency WAN links.  Some enterprises make use of MPLS from an Internet service provider, others might simply have a network of VPNs connecting their offices over the open internet.

Each main office will require its own Zone Director controller to integrate with the local IT infrastructure.  Sure, you can place a single Zone Director in a data center and build connectivity from there to all the main offices.  But all of your association and AAA authentication/authorization requests will be forced to hairpin through the controller! Now don’t get me wrong here.  I am not saying it won’t work.  But think on this: in order to achieve a roam time of less than 50ms in a Vo-Wi-Fi project, RTT to the Zone Director would not be able to exceed 12.5ms!

Branch offices will have exactly the same problem, but it becomes unwieldy to place a Zone Director at every branch office.  So you’re stuck with a conundrum.

Layer 3 Networks:

The Ruckus implementation of LWAPP has Layer 2 LWAPP tunnels disabled by default from ZoneFlex 9.4 onwards.  Ruckus supports Layer 3 LWAPP tunnels, making it possible to place the Zone Director and APs in different subnets.

NAT Traversal:

The Access Point is the source of all LWAPP communication messages to the controller, and it is therefore possible to have APs placed behind a NAT with no issue.  Deployments running ZoneFlex 9.2 or later also support Zone Directors behind a NAT, provided that the APs are pointed to the public address and the necessary port forwarding is set up using inbound NAT rules.  Smart Redundancy will also work provided that each Zone Director is given a separate public IP address or located behind a separate firewall.

Centralized Data Forwarding

The Zone Director’s Split MAC architecture allows for centralized data forwarding using LWAPP tunnels (UDP port 12222).   Similarly to CAPWAP, LWAPP does not support differentiated handling of the control and data planes.  Both LWAPP Control and LWAPP Data tunnels must terminate on the same interface of the Zone Director. This is not really a problem in most enterprise environments.  Most designs would typically just place the controller somewhere in the core and tunnel all of the data to it and break traffic out from there.  I mean that is where everything goes anyway, right?

This is not always the case with service providers though.  Most large operators and service providers actually prefer to keep network control and subscriber data separate and forward them to parts of the network optimized for dealing with the specific traffic types.

The other major pain point here is that because data and control planes are inherently entwined, losing a control function will result in an interruption of the flow of subscriber data through the system.

The final straw comes when you realize that the Zone Director, or control function, will also be performing large amounts of MAC layer processing of subscriber data for centralized forwarding.  It is no wonder that enterprise solutions implementing a Split MAC architecture typically max out at around several thousand APs per controller.

Summary

We’ve gone through the implementation of Split MAC architecture on the Ruckus Wireless Zone Director in fair detail.  We have also covered some of the design constraints and considerations when implementing a Zone Director based network.  The 802.11 Client State Machine is implemented on the Zone Director, and all 802.11 authentication, association and key derivation is done at the controller.  This can introduce long AP roam times and create problems in large, geographically distributed deployments where integration with multiple AAA servers may be necessary.  Placing Zone Directors locally at each site is the recommended solution to the problem, but it makes the deployment more expensive and harder to manage.

SmartZone MAC Architecture

From the start of its development in early 2011, the SmartZone platform represented a shift away from the widely accepted Split MAC architecture of enterprise wireless LANs.  The SmartZone platform implements a customized Local MAC architecture using separate control and data planes, with lightweight protocols best suited to each task.  Most importantly, the 802.11 Client State Machine and other services are implemented directly on the AP.  Below is a breakdown of where each service is implemented (a ✓ marks the element responsible for the function):

| Function | Access Point | SmartZone Controller |
|----------|--------------|----------------------|
| Beacons | ✓ | |
| Probe Requests/Responses | ✓ | |
| Control Frames (ACK + RTS/CTS) | ✓ | |
| Encryption / Decryption of 802.11 Frames | ✓ | |
| Distribution & Integration Services | ✓ | |
| Fragmentation / De-Fragmentation | ✓ | |
| Packet Buffering / Scheduling / Queuing (802.11e) | ✓ | |
| WMM Admission Control | ✓ | |
| Background Scanning | ✓ | |
| 802.11h DFS Processing | ✓ | |
| Wireless Distribution Services (MESH) Control | ✓ | |
| 802.11 Authentication & Association | ✓ | |
| 802.1X EAP Authenticator | ✓ | |
| Encryption Key Management | ✓ (PMK Caching) | ✓ (Opp. Key Caching) |
| WPA/WPA2 4-Way Handshake | ✓ | |
| RADIUS Authentication & Accounting | ✓ | ✓ (Proxy Function) |
| Wireless Intrusion Detection / Prevention | ✓ | |
| Configuration Management | | ✓ |
| Management / OSS/BSS Integration (SNMP etc.) | | ✓ |

Additional value added services implemented on the SmartZone controller and the AP are shown in the table below (again, a ✓ marks the element responsible):

| Function | Access Point | SmartZone Controller |
|----------|--------------|----------------------|
| Active Directory / LDAP Integration | ✓ | ✓ |
| Captive Portal Authentication | ✓ | |
| Guest Access Portal Redirect | ✓ | |
| Guest Access Pass Creation / Storage / Authentication | | ✓ |
| Social Media Login – Signup | | ✓ |
| Hotspot 2.0 / Passpoint Online Sign-Up Server | | ✓ |
| WISPr Hotspot | ✓ (walled garden) | ✓ (portal redirect, proxy, RADIUS) |

Resilience to Failure

As you can see from the tables above, the SmartZone Platform is highly resilient in the event of a controller failure.  All essential WLAN services and some additional services are implemented at the AP.  Loss of connectivity with the controller will have minimal impact on the operation of normal WLAN services.

Value added services like Social Media Login or Guest Access require the use of the local user database on the SmartZone controller to authenticate subscribers.  However, APs keep a cache of users who have already connected and their state, ensuring that already-connected users are not affected.

Authentication via the AP’s internal captive portal to any external database is unaffected by a loss of connectivity to the controller.

The WISPr Hotspot function requires the SmartZone controller for HTTP/HTTPS redirect support, client proxy handling, integration with the landing portal, and RADIUS authentication. The walled garden of the hotspot function is implemented on the AP via a DNS cache, allowing subscriber traffic to pass directly from the AP to the internet without passing through the controller.
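As a rough illustration of the walled garden mechanism, the sketch below resolves a list of whitelisted FQDNs, caches the resulting IP addresses and lets traffic to those destinations break out locally at the AP.  The FQDNs and function names are made up for the example; the AP’s real implementation builds its cache from the DNS answers it sees.

```python
import socket

WALLED_GARDEN_FQDNS = ["portal.example.com", "payments.example.net"]  # illustrative entries

def build_walled_garden_cache(fqdns):
    """Resolve each walled-garden FQDN and cache the permitted destination IPs."""
    allowed = set()
    for name in fqdns:
        try:
            for info in socket.getaddrinfo(name, None):
                allowed.add(info[4][0])
        except socket.gaierror:
            pass  # unresolvable names are simply skipped
    return allowed

def forward_locally(dst_ip: str, client_authorized: bool, allowed_ips: set) -> bool:
    """Unauthenticated clients may only reach walled-garden IPs; authorized clients reach anything."""
    return client_authorized or dst_ip in allowed_ips
```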

Scale & Redundancy

The SmartZone platform supports N+1 active/active clustering, allowing up to 4 nodes to be clustered together for scale and redundancy.  APs are automatically load balanced across the cluster in randomized order, ensuring no single node is overloaded.
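A trivial sketch of that behaviour, randomly spreading newly joining APs across the active cluster nodes (purely illustrative, not the controller’s actual algorithm):

```python
import random
from collections import Counter

cluster_nodes = ["node-1", "node-2", "node-3", "node-4"]

def assign_ap_to_node(ap_name: str, nodes=cluster_nodes) -> str:
    """Pick a cluster node at random for a newly joining AP."""
    return random.choice(nodes)

# Distributing 1000 APs this way keeps the per-node load roughly even.
assignments = Counter(assign_ap_to_node(f"ap-{i}") for i in range(1000))
print(assignments)
```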

All client state information is shared and replicated in each node’s memcache.  All configuration information and other permanent records are striped and mirrored across the cluster database, allowing one node to fail at any given time with no service impact.

A single SmartZone controller can support as many as 10,000 APs, while each cluster can scale to support as many as 30,000 APs.  Should an entire cluster fail, the SmartZone platform also supports failover between SmartZone clusters.  Configurations between clusters must be manually synchronized in the current release.

802.11 Association

The SmartZone architecture places the 802.11 client state machine at the Access Point.  All 802.11 authentication and association tasks are handled by the Access Point directly, and the results are simply reported to the SmartZone controller (in real time) for storage in the event logs.

There is no latency limitation placed between the SmartZone Controller and the Access Point for successful 802.11 authentication and association.

In controlled access networks using a captive portal method to authenticate a subscriber, the Access Point allows the subscriber to associate and simultaneously performs a memcache lookup on the SmartZone controller to establish whether the subscriber state is set to “authorized”.  If authorized, the AP knows immediately not to apply the captive portal and simply allows client traffic through.
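A minimal sketch of that AP-side decision, using a plain dictionary to stand in for the controller’s memcache (the names and data layout are illustrative, not the real API):

```python
# Stand-in for the SmartZone cluster memcache: client MAC -> session state.
controller_memcache = {"aa:bb:cc:dd:ee:ff": {"state": "authorized"}}

def on_association(client_mac: str) -> str:
    """Associate the client immediately, then decide whether the captive portal applies."""
    entry = controller_memcache.get(client_mac)   # single lookup towards the controller
    if entry and entry.get("state") == "authorized":
        return "forward traffic"                  # portal bypassed for a known subscriber
    return "redirect to captive portal"           # unknown or unauthorized client

print(on_association("aa:bb:cc:dd:ee:ff"))   # forward traffic
print(on_association("11:22:33:44:55:66"))   # redirect to captive portal
```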

In WPA2-Personal networks the Pre-Shared Key is stored on every AP as part of the network configuration allowing immediate interaction with the client.  Similarly L2 Access Control Lists for MAC Address based access control are stored on the AP.

The SmartZone platform’s Local MAC architecture eliminates many of the problems Split MAC architectures face with regard to:

  • 802.11 Authentication/Association request latency issues.
  • 802.1X Authentication latency / packet loss.
  • AP roaming delays
  • Distributed and Branch Office Environments

802.1X Authentication & Encryption Key Management

In the SmartZone architecture the AP acts as the RADIUS client for all RADIUS Authentication and Accounting messages, with the option of allowing the SmartZone to act as a proxy.

The AP fulfills the role of 802.1X Authenticator in the 802.1X framework and is responsible for all encryption key management, deriving the PMK from the MSK along with the other temporal keys used for encrypting the air interface.  Once a PMK is derived it is cached at the AP to enable PMK Caching, enhancing the user experience for 802.1X roaming.  The PMK is also sent by the AP to the SmartZone controller, where it is stored in the memcache of each control node to enable Opportunistic PMK Caching.  If a subscriber roams to a new AP and supplies the PMK-ID in its re-association request, the AP will look up the PMK-ID in the SmartZone memcache, avoiding a full 802.1X re-authentication.  The WPA 4-way handshake is completed directly between the client and the AP, eliminating any latency requirement to the controller.
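The roam-time benefit is easier to see as pseudologic.  The sketch below mirrors the sequence described above, with one dictionary standing in for the AP’s local PMK cache and another for the controller memcache; the names and structures are illustrative only.

```python
def handle_reassociation(pmkid: str, local_pmk_cache: dict, controller_memcache: dict) -> str:
    """Roam handling at the AP: PMK Caching first, then Opportunistic PMK Caching, else full 802.1X."""
    if pmkid in local_pmk_cache:
        return "4-way handshake using locally cached PMK"      # no controller involvement at all
    if pmkid in controller_memcache:
        local_pmk_cache[pmkid] = controller_memcache[pmkid]    # one lookup across the WAN link
        return "4-way handshake after a single memcache lookup"
    return "full 802.1X re-authentication"                     # worst case: back to the AAA server

ap_cache = {}
sz_memcache = {"pmkid-1234": b"\x00" * 32}
print(handle_reassociation("pmkid-1234", ap_cache, sz_memcache))
print(handle_reassociation("pmkid-9999", ap_cache, sz_memcache))
```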

There is still the possibility that a high-latency link between the SmartZone and the AP could cause an unacceptable roaming delay during the memcache lookup of the necessary PMK, but the potential impact is dramatically lower: the link is only used for a single memcache lookup straight after association, after which all further interactions occur directly between the Access Point and the client device.

The SmartZone platform does not currently support 802.11r Fast BSS Transition (current release: SmartZone version 3.2), but in any future implementation it is reasonable to expect that the APs would fulfill the role of the Pairwise Master Key R0 Key Holder (R0KH) and be responsible for deriving the PMK-R1 and distributing it to other APs in the defined Mobility Domain.  Fast roaming messaging would then take place directly between the client and the AP, providing a great improvement in performance over techniques like Opportunistic Key Caching.  I guess we will have to wait for a future release to see if my deduction is correct.

Distributed and Branch Office Deployments:

The SmartZone platform enables branch office deployments and direct integration between APs and local IT infrastructure (Active Directory, LDAP, RADIUS, DHCP, DNS, firewalls etc.).  This means a single SmartZone controller can manage multiple sites without forcing authentication requests or client data to hairpin through the SmartZone controller. Of course, you can still set the SmartZone to be a proxy if that is what you want to do.

Layer 3 Networks:

Ruckus SmartZone APs use a proprietary SSH-based control protocol to communicate with the SmartZone controller, allowing all control traffic to traverse Layer 3 networks.

NAT Traversal:

The SmartZone AP is the source of all communication messages to the SmartZone controller, allowing APs to be placed behind a NAT with no issue[1].  The virtual SmartZone (vSZ-E, vSZ-H) and SmartZone 100 controllers all support being placed behind a NAT.  In the case of a cluster, it will be necessary to use a separate public IP for each node.

Centralized Data Forwarding

The SmartZone platform makes use of a proprietary SSH-based control protocol and a GRE-based data protocol to separate the control and data planes.  The SmartZone platform allows for centralized forwarding to a SmartZone controller appliance or to a virtual data plane using RuckusGRE.

RuckusGRE

RuckusGRE is a customized version of L2oGRE (also known as Ethernet over GRE or SoftGRE) with an added transport layer (UDP) header in the outer IP packet.  This UDP header allows the AP to originate GRE tunnels from behind a NAT router.  Recall that in standard GRE there is no UDP/TCP header in the outer packet, and therefore no port number a NAT router can use to track sessions.

RuckusGRE uses TLS certificates for authentication and optional AES 128-bit encryption of the tunnel payload.
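The actual RuckusGRE header format is proprietary, but the NAT point is easy to illustrate with Scapy: a standard L2oGRE/SoftGRE packet carries GRE directly inside the outer IP header (protocol 47) with no ports for a NAT router to latch onto, whereas inserting an outer UDP header gives the router a port pair to build its translation entry from.  The addresses and port numbers below are purely illustrative.

```python
from scapy.all import IP, UDP, GRE, Ether, Raw

client_frame = Ether(src="02:00:00:00:00:01") / Raw(b"client payload")

# Standard L2oGRE / SoftGRE: outer IP carries GRE directly (IP protocol 47),
# so a NAT router has no TCP/UDP port numbers to track the session with.
soft_gre = IP(src="192.168.1.10", dst="203.0.113.5") / GRE() / client_frame

# RuckusGRE-style encapsulation (conceptually): an outer UDP header is inserted,
# giving the NAT router a source/destination port pair to key its translation on.
ruckus_gre_like = (IP(src="192.168.1.10", dst="203.0.113.5")
                   / UDP(sport=50000, dport=23233)
                   / GRE()
                   / client_frame)

print(soft_gre.summary())
print(ruckus_gre_like.summary())
```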

Differentiated Handling

The SmartZone implements separate control and data planes allowing differentiated handling of control traffic and subscriber data.  The data plane is designed to take its configuration from the control plane, but to maintain a separate engine for processing subscriber data tunnels.

The SmartZone 100 and SCG 200 physical appliances of the SmartZone platform both implement a data plane on the appliance allowing traffic to be tunneled to the same physical location as the SmartZone controller.

The data plane on the SmartZone 100 can use its own IP address and subnet or it can use the same IP address as the SmartZone 100 control plane.  The SCG 200 requires that each of the data planes (it has two) maintain their own IP addresses.

The Virtual SmartZone Data Plane (vSZ-D) is available as a separate virtual appliance and is designed for use with the virtual SmartZone controllers.  Each virtual control plane can manage 2 vSZ-D instances, and a cluster can manage up to 8 vSZ-D instances.

Data Plane NAT Traversal

The SmartZone 100 supports placing the data plane IP behind a NAT.  The SCG 200 data plane does not support being placed behind a NAT in the current release (SmartZone 3.2).

The vSZ-D supports NAT traversal between itself, the vSZ control plane and the APs, allowing it to be placed behind a NAT.  Since all incoming connections use the same port numbers, each vSZ-D instance requires its own public IP.

Latency Requirement

The vSZ-D has a recommended maximum RTT of 150ms to the vSZ control plane.  This is not a concern in deployments using the SmartZone 100 or SCG 200 appliances, as the data plane is co-located with the controller in the appliance.

802.11 Frame Processing

APs remain responsible for all integration services, converting 802.11 frames to 802.3 before they are encapsulated in the GRE tunnel.  This reduces processing requirements, improves performance and increases scalability on the SmartZone data plane.

Summary

The SmartZone platform implements a Local MAC architecture – placing the 802.11 client state machine on the AP.  We have reviewed how the SmartZone platform enables simpler deployment of WLAN networks spread across multiple sites and improved user experience without the need for additional controllers at each site.  This is central to enabling business models like cloud control, managed services and branch office deployments.

We have also seen how the SmartZone architecture provides differentiated handling of control traffic and tunneled subscriber data.  The ability to place the vSZ-D in its own subnet allows for separation of control and subscriber traffic in carrier networks.  The ability to place the vSZ-D in a custom physical location (or several locations) increases flexibility when forwarding client data traffic.  The added ability to support AES encrypted tunnels opens the potential for use as a concentrator for remote AP deployments.

[1] There is a limitation here when integrating directly with 3rd Party WLAN gateways using L2oGRE / Soft-GRE.  SoftGRE/L2oGRE does not implement a transport layer header in the outer packet.  This is a limitation of SoftGRE, not the SmartZone Architecture.