Categories
WLAN general

Is LTE-U Really Wi-Fi’s Great Challenger?

So I will admit I am writing this in a fit of pique.  Today Bloomberg published this preposterous piece of marketing flimflam claiming that LTE-U has the potential to replace or drown out Wi-Fi.  I am not sure if the consultants quoted in that article are drinking the kool-aid together, but I certainly feel they have missed some key points.

So I am going to ask one very simple question:

What does LTE-U enable that Wi-Fi doesn’t?

Answer: The ability to seamlessly charge a customer for its use, without any knowledge or intervention required from the customer.

OK, so that’s a pretty big carrot for the mobile operators!  I mean, imagine: they could give a customer an unlimited data plan, and the subscriber could move anywhere around the mobile network, indoors and outdoors.  Mobile operators finally get to remove the need to deploy Wi-Fi, and the motivation for subscribers to use it in the first place.  LTE-U keeps them on the mobile network, and they can do it cheaply with unlicensed 5GHz spectrum!

Holy crap! Wi-Fi is dead yo! We comin for you Wi-Fi!  Cry ‘havoc!’ and let slip the dogs of war!

Ahhh. Indeed. This is exactly the kind of blinkered thinking demonstrated by mobile operators, the 3GPP and anyone involved in mobile telecommunications that causes me to sneer.

Let me ask another question:

What does Wi-Fi provide that LTE-U doesn’t?

I hope you’re ready.  School is commencing.

Local Area Networking

Contrary to the opinion of those who move exclusively in mobile operator circles, Wi-Fi networks are not built solely to handle Internet-bound traffic from hotspot users and subscribers.  In fact, that is likely an incidental service that grew out of what they were actually developed to provide.  The primary purpose of a Wireless LAN is to allow mobility over a venue’s own local area network.  Services enabled by Wi-Fi or WLANs include:

  • Corporate / Operational Communications
  • Security Systems / Services
  • Video Surveillance
  • Building Automation
  • Internet of Things Applications
  • Voice over IP
  • Touch to Talk services
  • Real Time Location Services (Security Personnel, Doctors, Nurses, Asset Tracking…)
  • Digital Advertising Boards
  • Point of Sale Terminals
  • Ticketing Machines
  • Medical Devices
  • Back Office Connectivity for Stores / Shop Fronts
  • Location Based Services / Location Tracking (location analytics, more asset tracking etc)
  • Public Internet Access (The ONE thing LTE-U actually currently enables)
  • Targeted Digital Advertising
  • Customer Engagement
  • Marketing Campaigns
  • … and any other service you recently used on or integrated with a Wireless LAN.

LAN connectivity is a fundamental and critical function that LTE-U simply cannot provide in its current form.  Wi-Fi allows you to set up a radio and plug it directly into your Enterprise LAN.  LTE-U can’t do that.  LTE-U traffic must go via the mobile operators’ Packet Gateways, which means all traffic gets hoovered up, sent into some operator’s core network and then popped out on a public IP in some APN based on who your SIM card says you are.  Good luck getting back to the LAN with a reasonable latency.  Also, please explain the complicated architecture, SLAs, agreements, firewall rules / VPN tunnels and identity management that a mobile operator would have to implement to get a heterogeneous group of SIM-authenticated users back into a venue’s LAN from multiple APNs.

LTE-U is coming and it is indifferent to your Enterprise LAN, distinctly unfriendly to your Wireless LAN and it could arguably interfere with the very wireless networks that most venues depend upon to operate on a day to day basis.  Which brings me to my next point.

Value to the Venue/Business Owner

Time for another question…

If you were a venue owner, with a WLAN that you used for a mixture of corporate, operational and public access use cases, would you be happy about LTE-U being installed in your building?

What value does LTE-U actually add to a venue?  Sure, people will be walking around with smiles on their faces as they stare obliviously at their phones.  But what do I get out of allowing LTE-U in my Office, School, University, Warehouse, Logistics Center, Hospital, Shipping Port, Airport, Care Facility, Residence, Stadium, Convention Center, Mall, Coffee Shop or Train Station?  I’ll probably get a ticked-off IT engineer and a slew of complaints from all my tenants who are currently using Wi-Fi for business-related functions.  I have no doubt LTE-U will find some use in public venues.  But in my office?  Where it effectively DoS attacks my WLAN with radio interference on a duty cycle determined by the mobile operator?

When LTE-U is allowed into a venue, the venue owner will ultimately have to accept some form of performance degradation on their Local Area Network, a cost for which they could quite reasonably charge a sizeable rental fee.

LTE-U deployments are likely to be hobbled by high rental costs and restrictions on the density of their deployments, in an effort to mitigate interference with existing WLANs in the building.  It also means that operators will likely have to share LTE-U installations using Neutral Host architectures.  Limitations on deployment density and spectrum usage enforced by the venue owner and tenants will cause LTE-U deployments to suffer congestion just like the outdoor macro network does.  Don’t like it?  The venue owner has every right to show you the door.  You’re not hobnobbing it at MWC anymore Dorothy.  Site acquisition is hard when you’re pissing people off.

Ubiquitous Device Support

Many of the top end smartphones like the iPhone 7, Galaxy S7, Google Pixel and others already contain LTE modems with support for LTE-U.

Here are some useful links:
https://www.qualcomm.com/products/snapdragon/modems/4g-lte/x12
https://www.qualcomm.com/products/snapdragon/modems/4g-lte/x16

But these are the latest, greatest phones.  And it is just the phones; there is no significant LTE-U support in any other device category.  If you want your new technology to wipe out Wi-Fi, you need to be in every phone, every tablet, every laptop, every mini PC, every gosh darned thermostat, camera, doorbell, pet cam, smart plug, TV and a bazillion other things that didn’t come up on Google’s suggested search items.

Connect Devices without Sim Cards

At this point, if you don’t have a SIM card, you can’t connect to LTE-U.  Market researcher IDC expects that cellular-connected tablets will still account for less than half of all tablets in 2019.  Granted, many consumers will simply tether their devices, but that would ultimately load the LTE-U cell to the point where consumers will want to cut back over to a faster Wi-Fi network on a different channel.  MulteFire is one technology which could remove the need for a SIM card, but nobody seems to be rolling that out just yet and device support is still a problem.

Low Cost Wireless Access

In the article above there is a claim about the cost of LTE-U small cells.  The estimate is that deploying approximately 24 LTE-U radios is comparable to the cost of deploying 80 Wi-Fi access points.  Which access points?  High-end Enterprise APs that are worth around $1,500 each?  Or entry-level SMB APs that sell for around $150 each?  There is a big range in price points for Wi-Fi Access Points and that is a great thing!  It means that just about anyone can find something that will work and fit their budget.  There is no such range of pricing and features on LTE-U today.  I am also certain nobody would be investing in this technology if the starting price of the first LTE-U AP to market was only $450 (wink).

Cheap International Roaming

This is hardly a technical constraint, but it is still a valid one when considering the use cases of LTE-U.  If you want to hop onto an operator’s LTE-U network overseas, sure, go right ahead, so long as your home operator has a roaming agreement that doesn’t utterly annihilate your bank balance with fees as high as $10.00 per Megabyte.  Quite seriously, where I come from, if you don’t activate a special “travel saver” option for about $3 per day, they hit your bank account with the Hammer of Thor.  Who on earth connects to a mobile operator overseas with data roaming enabled when you know there is free Wi-Fi somewhere?

Obviously, the solution to this particular problem is a simple business decision (har har), just make cheaper roaming agreements.  Some of you reading this may not have this problem.  But really if the operators wanted to do this internationally, they would have done it already.

Mass Customization

One of the most overlooked advantages of using a Wi-Fi network anywhere is that Venue / Business Owners are free to build an almost infinitely customizable network for all of their internal IT needs and public access services.  Business owners can choose from a plethora of architectures, vendors and solutions providers to build something that meets their exact requirements.

The only initiative that could enable this is the MulteFire Alliance, which has only just recently released its 1.0 specification.  It has a reasonably impressive member list, but I’ve seen groups with impressive member lists before.  Importantly, there are other technologies out there like Ruckus Wireless’ OpenG, which uses the CBRS band for Neutral Host Small Cells and opens up new spectrum!  Either way, LTE-U initiatives have a lot of ground to make up and a big ecosystem to develop within two years before 802.11ax comes wandering round the corner.

Thus far, with only Ericsson and Nokia having approved equipment in this space, I cannot see how LTE-U will deliver a remotely attractive enterprise use case to snuff out the venerable Fi.

In Conclusion

To think that LTE-U could somehow match Wi-Fi’s depth and breadth of applications for the enterprise in only a few short years is a pipe dream.  Mobile Operators generally have no business interests in common with business / venue owners and typically want as little to do with their enterprise business needs as possible.  You’re never going to be happy with the one size fits all approach that a mobile operator will take to solving what they see as their biggest problem.

The most interesting technology in the LTE-U space right now is actually MulteFire, which really could enable something like LTE-U or LAA (which doesn’t affect Wi-Fi as badly) for enterprise use cases.  But there is little evidence right now that this technology will truly get off the ground before the marginal performance gains it delivers over Wi-Fi are matched by newer generations of Wi-Fi equipment.  Until that point, LTE-U and LAA are going to be relegated to the Service Provider segment, which by all accounts is only a fraction of the overall WLAN landscape, and operators trying to install it will face an uphill battle with venues that already have a WLAN delivering business value.

That’s it, Rant Over.

It is also worth mentioning that Dean Bubley did a great job of breaking the same topic down here.


SETTING MINIMUM DATA RATES? – READ THIS FIRST.

 

Nowadays when you speak with a WLAN professional, you will often hear the suggestion of setting or restricting minimum PHY rates to optimise your WLAN’s performance.  Many professionals consider this to be one of the basic tasks that must be completed when configuring and optimising a WLAN.

Configuring the minimum rates in a WLAN can have many benefits for your network’s performance, including reduction of management overhead, removal of unnecessary RTS/CTS frames, better airtime utilisation, and enhanced throughput in the Extended Service Set.  It is an especially useful tool in High Density scenarios like big convention halls, sports stadiums, large lecture theatres and any other environment with many clients in a relatively small space.  Personally, I set the minimum rates on all HD designs, especially in the 2.4GHz band!

(Yes, I have used 2.4GHz in High Density deployments.  No, I am not a magician.)

If done incorrectly however, or without a fundamental understanding of what you are actually changing, you may find that your optimisation does not always have the desired effects.  For instance, setting minimum data rates and then expecting this to somehow magically limit the coverage area of your AP… well that’s just a recipe for disappointment.

So let’s go through some of the basics of the various PHY Specifications, what minimum rates are, why we can and should set them to different levels and what EXACTLY we are changing with different settings.

PHY Specifications

WLANs work using the IEEE 802.11 standard and its amendments.  Some of these amendments are known as PHY specifications and define the modulation and coding of Wi-Fi signals that WLAN stations can use to communicate with each other.  The table below summarises the available data rates of each 802.11 PHY specification:

PHY Specification       802.11 Amendment     Frequency of Operation   Supported Data Rates
DSSS                    802.11 (original)    2.4GHz                   1, 2 Mbps
HR-DSSS                 802.11b              2.4GHz                   1, 2, 5.5, 11 Mbps
ERP-OFDM                802.11g              2.4GHz                   6, 9, 12, 18, 24, 36, 48, 54 Mbps
OFDM                    802.11a              5GHz                     6, 9, 12, 18, 24, 36, 48, 54 Mbps
HT-OFDM (Greenfield)    802.11n              2.4GHz / 5GHz            6.5 to 600 Mbps
VHT-OFDM                802.11ac             5GHz                     6 to 6933.3 Mbps

If you want a full breakdown of 802.11n/802.11ac MCS rates you can see them here
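For illustration, the lowest rate defined by each PHY specification in the table above can be captured in a small lookup.  The names and helper below are my own sketch, not anything standard:

```python
# Lowest PHY rate (Mbps) defined by each 802.11 PHY specification.
# Illustrative lookup only; values taken from the table above.
MIN_PHY_RATE_MBPS = {
    "DSSS": 1.0,       # 802.11 original, 2.4GHz
    "HR-DSSS": 1.0,    # 802.11b, 2.4GHz
    "ERP-OFDM": 6.0,   # 802.11g, 2.4GHz
    "OFDM": 6.0,       # 802.11a, 5GHz
    "HT-OFDM": 6.5,    # 802.11n (Greenfield), 2.4GHz / 5GHz
    "VHT-OFDM": 6.0,   # 802.11ac, 5GHz
}

def lowest_common_rate(phy_specs):
    """Lowest rate a radio may fall back to when supporting all the given PHYs."""
    return min(MIN_PHY_RATE_MBPS[p] for p in phy_specs)

# A 2.4GHz 802.11n AP with full backward compatibility inherits 1 Mbps:
print(lowest_common_rate(["HT-OFDM", "ERP-OFDM", "HR-DSSS", "DSSS"]))  # 1.0
```

This is exactly why backward compatibility matters so much in the next section: the minimum rate in practice is set by the oldest PHY the radio still supports.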

Backward Compatibility

802.11g (ERP-OFDM) has a minimum PHY rate of 6 Mbps, but is also required to be backward compatible with 802.11b and 802.11, which both use the same 2.4GHz spectrum.  Even though the specified minimum data rate for 802.11g is 6 Mbps, in practice an 802.11g radio will often use a minimum rate of 1 Mbps for the sake of backward compatibility with older clients that may need to associate to the BSS.  Even a 2.4GHz 802.11n Access Point must be compatible with previous radio generations and client types and will often exhibit a minimum rate of 1 Mbps.

Thankfully, a 5GHz 802.11n AP must only be backward compatible with 802.11a, and so the minimum PHY rate is 6 Mbps.  802.11ac APs only support 5GHz and so also have a minimum rate of 6 Mbps to maintain backward compatibility with 802.11a.

Preamble & PHY Header

Every single 802.11 frame (regardless of the PHY Specification) carries the same basic format.  The first thing to be transmitted is the preamble.  This is just a sequence of scrambled 1’s (DSSS / HR-DSSS) or simple waveforms (OFDM based PHY Specifications) that allows the listening station to synchronise with the incoming transmission.  It’s like having a code word or a sentence that makes someone aware that you want to talk to them.  AHEM! HEY YOU! LISTEN HERE! I’M TALKING!

The second thing that comes along is the PLCP Header.  Once the receiving stations have perked up and are now listening for the incoming message, the PLCP header gives the receiving stations some more information about the incoming transmission including:

  • The PHY Rate of the transmission of the 802.11 frame (MPDU)
  • How long the transmission will take (DSSS, HR-DSSS Only)
  • How much data is in the transmission (OFDM, ERP-OFDM, HT-OFDM, VHT-OFDM)

The PLCP Headers of 802.11n and 802.11ac carry a lot more information than the above, but it is out of scope for this discussion.

But wait, if the PLCP header defines the rate of transmission for the 802.11 frame, then…  Which rate does the PLCP Header use? Well, it depends on which PHY Specification you’re using.  The Preamble and PLCP Header are ALWAYS sent at the lowest rate defined for the relevant PHY Specification!

A table summarising the Modulation and Coding of the Preamble and PHY Header for different PHY Specs is shown below:

PHY Specification             Preamble Modulation   PLCP Header Modulation   PLCP Header PHY Rate
DSSS                          DBPSK                 DBPSK                    1 Mbps
HR-DSSS (Long PPDU format)    DBPSK                 DBPSK                    1 Mbps
HR-DSSS (Short PPDU format)   DBPSK                 DQPSK                    2 Mbps
ERP-OFDM                      NA                    BPSK R=1/2               6 Mbps
OFDM                          NA                    BPSK R=1/2               6 Mbps
HT-OFDM (HT-Greenfield)       NA                    BPSK R=1/2               6.5 Mbps
VHT-OFDM                      NA                    BPSK R=1/2               6 Mbps

The DSSS / HR-DSSS Preamble is actually made up of a Sync Field and a Start of Frame Delimiter (SFD).  The Sync Field and SFD are both constructed of randomised 1’s as modulated bits and so have Modulation / Coding information associated with them.

In comparison, the training sequences sent with the ERP-OFDM / OFDM / HT-OFDM / VHT-OFDM preambles are not actually modulated bits.  They are simply a sequence of specific waveforms or symbols that must be correctly interpreted by the receiver to synchronise with the transmitter.  This is why they don’t have modulation / coding associated with them.

Observation #1:

The Preamble and PLCP Header are ALWAYS sent at the lowest rate defined for the relevant PHY Specification!  It doesn’t matter if you set your minimum rate to 48 Mbps.  The first part of every transmission, the Preamble and the PLCP Header, will be sent using the MOST ROBUST modulation scheme defined by the PHY Specification you are using.  That means if you are using a 2.4GHz 802.11n AP with full backward compatibility, the MOST ROBUST modulation and coding will be DBPSK with a PHY data rate of 1 Mbps.

The MAC Header

Immediately after the PLCP Header comes the 802.11 frame or MAC Protocol Data Unit (MPDU).  The 802.11 frame or MPDU is sent at the data rate specified in the PLCP Header.  Every MPDU starts with a MAC Header that contains the MAC layer addressing information (where the frame is from and where it is headed) and a Duration/ID field.  The Duration/ID field warns any stations that can decode the MAC Header to update their NAV and remain quiet for any future frame transactions that are required after the current frame.

What does Setting the Minimum Rate Change?

The MAC Protocol Data Unit and the MAC Header are sent at the rate specified by the transmitting station in the PLCP Header.  When you set the Minimum Rate for a BSS, you are effectively only setting the minimum rate that may be used for transmitting the MPDU, NOT for the Preamble or PLCP Header.

  1. Setting minimum rates does NOTHING to the size of the coverage area of the BSS.
  2. The PLCP Header will still be sent out at the lowest possible rate defined by the PHY Specification.
  3. You will still cover just as great an area as if you had not touched the minimum rates for the BSS.
  4. Every time a station transmits, every other station that can decode the PLCP Header will be silenced by the transmission.
  5. The number of stations affected by contention from other stations HAS NOT CHANGED.
  6. The only thing that has changed is the speed at which we send the MPDU.  Put another way, the only thing we have changed is the maximum amount of Airtime we use sending a specific amount of data.  We’ve improved our efficiency and thereby increased the capacity of the channel, but we have not reduced the CONTENTION AREA.

Ok so I gave minimum rates a bit of a bad rap there.  They might sound quite useless after that list, but hold on a second.

Setting the minimum rate can have a great effect on your WLAN’s performance when trying to limit the amount of air time used up by Management Frames sent on the Wireless LAN and by stations sending data.  Remember that in a BSS, Beacons, Probe Requests and Probe Responses are all sent at the lowest common rate supported by the AP and Client.  This means that if we set our minimum rate to 6Mbps instead of 1 Mbps we will reduce the amount of airtime used for management overhead.  This leaves us with more free airtime to send actual user data!  The biggest effect of changing this setting will be seen in High Density environments with multiple SSIDs that are all beaconing on the same channel.
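To put rough numbers on this, here is a back-of-the-envelope beacon airtime calculation.  The 250-byte beacon size, 102.4 ms beacon interval and preamble/PLCP durations are assumed illustrative values, not measurements from any specific network:

```python
# Rough beacon airtime: 1 Mbps (DSSS, long preamble) vs 6 Mbps (OFDM).
BEACON_BYTES = 250                  # assumed typical beacon MPDU size
BEACONS_PER_SEC = 1000 / 102.4      # ~9.77 beacons per second per SSID

def beacon_airtime_us(rate_mbps, plcp_overhead_us):
    """Airtime of one beacon: preamble/PLCP header plus MPDU at the data rate."""
    return plcp_overhead_us + (BEACON_BYTES * 8) / rate_mbps

t_1mbps = beacon_airtime_us(1, 192)  # long DSSS preamble + PLCP header: 192 us
t_6mbps = beacon_airtime_us(6, 20)   # OFDM preamble + PLCP header: 20 us

for ssids in (1, 4, 8):
    pct_1 = t_1mbps * BEACONS_PER_SEC * ssids / 1e6 * 100
    pct_6 = t_6mbps * BEACONS_PER_SEC * ssids / 1e6 * 100
    print(f"{ssids} SSID(s): {pct_1:.1f}% airtime at 1 Mbps vs {pct_6:.1f}% at 6 Mbps")
```

With eight co-channel SSIDs this works out to roughly 17% of all airtime burned on beacons at 1 Mbps versus under 3% at 6 Mbps, which is why the setting matters most in dense, multi-SSID environments.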

Setting the Minimum Rate Too High

I have heard some WLAN engineers talk about pushing the minimum rate of a BSS to 24Mbps or 48Mbps.  In my view this can be a bad idea for the following reasons:

  1. The useable coverage area of the AP can become much smaller prompting APs to be placed closer together.  However the area of contention covered by the AP’s PLCP Header remains the same.  This actually increases contention and interference between APs!
  2. Management overhead is typically sufficiently minimised in Very HD environments using a 12Mbps minimum rate, even on 2.4GHz Radios. (I’m talking about a Stadium here)
  3. Management Frames can become hard to decode in some locations, causing clients to drop their connections or miss notifications for queued/buffered traffic etc.
  4. There is also little evidence in my experience that setting high minimum rates prompts sticky clients to roam between APs,  it simply helps them drop more packets.
  5. Wireless is a dynamic medium with many stochastic events and effects on the channel.  Limiting the rate selection algorithm to only higher rates can cause an AP to suffer higher retransmission rates and drop more frames to clients, which is exactly what we DON’T want.  Remember, a frame sent at 12 Mbps once is WAY, WAY better than a frame sent at 24 Mbps twice, or worse, one sent at 48 Mbps four times or more!
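The airtime arithmetic behind point 5 can be sketched quickly.  The per-attempt overhead figure below (preamble/PLCP plus DIFS and an average backoff) is an assumed round number for illustration:

```python
# Airtime cost of retries vs a slower but successful rate (illustrative numbers).
FRAME_BYTES = 1500
PER_ATTEMPT_OVERHEAD_US = 120  # assumed: preamble/PLCP + DIFS + average backoff

def total_airtime_us(rate_mbps, attempts):
    """Total airtime for `attempts` transmissions of one frame at `rate_mbps`."""
    payload_us = (FRAME_BYTES * 8) / rate_mbps
    return attempts * (PER_ATTEMPT_OVERHEAD_US + payload_us)

print(total_airtime_us(12, 1))  # once at 12 Mbps:   1120 us
print(total_airtime_us(24, 2))  # twice at 24 Mbps:  1240 us
print(total_airtime_us(48, 4))  # 4x at 48 Mbps:     1480 us
```

The payload time alone would break even (24 Mbps twice equals 12 Mbps once), but every retry also pays the fixed per-attempt overhead again, so retries always lose.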

How to Limit the Range of the Preamble and PLCP Header.

Up until now I have only addressed the effects of setting Minimum Rates.  In this section, we will see how we can control the modulation of the Preamble and PLCP Header.

So here is the first thing: you can’t really limit it all that much.  The only way to limit the preamble and PLCP Header is to force the AP not to respect any kind of backward compatibility with older standards.  This may offer you quite considerable shrinkage in the area covered by a preamble and PLCP Header, provided you are in a free space environment with no obstacles.  If you are indoors, however, it may offer less benefit.  Even with any shrinkage, you will still be using BPSK R=1/2 modulation and coding, and the range of the PLCP Header will STILL be much greater than the useful range of the cell.

Let’s look at an example.  Assume that we designed our network to deliver -65dBm to clients throughout the coverage area.  We want to see where our furthest preambles will be heard by our own APs.  Assume we use a high-end 4×4:4 802.11ac Wave 2 AP in our design.  This AP has a receive sensitivity of -101dBm at 1 Mbps and -95dBm at 6 Mbps.  Sure, I have gained about 6dB, which in an outdoor environment equates to halving the coverage range of the PLCP Header.  But indoors, that difference in coverage area may not be as great due to the absorptive effect of physical obstacles like walls, cabinets, furniture etc.
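The halving claim follows from the free space path loss relationship, where loss grows with 20·log10(distance); a quick sketch:

```python
def range_ratio(delta_db):
    """Relative range after giving up delta_db of link budget in free space.
    Path loss ~ 20*log10(d), so distance scales as 10^(-delta_db/20)."""
    return 10 ** (-delta_db / 20)

# 1 Mbps @ -101 dBm vs 6 Mbps @ -95 dBm: a 6 dB sensitivity difference
print(f"{range_ratio(6):.2f}")  # ~0.50, i.e. roughly half the free-space range
```

Indoors, attenuation from walls and furniture is much steeper than free space, so the same 6 dB buys a smaller fractional change in range, as noted above.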

Enabling OFDM-Only Mode

This is by far THE MOST powerful tool in your arsenal.  By simply setting a 2.4GHz 802.11g/n radio to “OFDM-Only” or “802.11g-Only” mode (the naming of this setting differs between vendors) you will immediately accomplish the following:

  1. Force the preambles / PLCP Headers of ALL Management and Data frames to use the OFDM format only.
  2. Banish all 802.11 / 802.11b clients from connecting to your WLAN.
  3. Reduce the use and need of RTS/CTS frames for protecting legacy clients.
  4. Set the minimum rate of the WLAN to 6 Mbps, preventing associated STAs from negotiating DSSS/CCK rates.
  5. All management frames will also be sent at 6Mbps by default, reducing management overhead.

This setting will give you both the ability to set the Preamble and PLCP Header modulation type (OFDM, BPSK, R=1/2) and will also ensure that MPDUs are sent at 6Mbps saving you airtime.


EDIT: Primoz Marinsek, Jim Vajda, Andrew von Nagy, and Keith Parsons got me to think about the above very carefully.  This is a revised list after their input.  I would like to thank them for their contributions.  Secondly, if you’re wondering when RTS/CTS modes can be activated by older clients, check out my article about 802.11 PHY Compatibility.


If you are a Ruckus Wireless customer, you can set this for each WLAN from the GUI of the SmartZone Controller.  Just tick the box that says “Enable OFDM-only”.  You should do this in EVERY 2.4GHz network you deploy except when a caveman in a forklift waves an old PSION scanner at you.

Note: I do not intend to insult men driving forklifts; their technique with a wooden club is generally unmatched.

Enabling N-Only / HT-Greenfields Mode

This used to be a setting I liked using on the 5GHz radios of my WLANs unless I had to support a specific client device.  I would force the 5GHz radios to use only the HT-Greenfield PPDU format, preventing 802.11a-only stations from associating to the 5GHz WLAN.  My logic here was that very few 802.11a stations are in circulation, and those that do support 802.11a are usually dual band, so I’d force old clients to 2.4GHz.  The value of operating in HT-Greenfield mode is marginal in comparison to HT-Mixed mode and doesn’t offer as great a leap as OFDM-Only mode above.

802.11ac:

All the new APs I deploy today are 802.11ac, which defines a single PPDU format that is backward compatible with 802.11n and 802.11a.  I don’t use N-Only mode on my 5GHz WLANs anymore, simply because the option does not exist for an 802.11ac AP.

Management Tx rate

Before I sign off this blog, I feel I should mention my least favourite option for configuring minimum rates.  Setting the Management Tx Rate is similar to setting the minimum rate for the BSS, except it does not place any limitation on the minimum BSS rate for client devices.  The Management Tx Rate simply sets the minimum rate used for the MPDU in management frames.  It can be useful for reducing management overhead without limiting the available rates for clients.  For example, you could set the management rate to 2 Mbps, which would roughly halve your management overhead from beacons.  Personally, I dislike this feature as it can introduce a disparity between the useful range of management frames and the useful range for clients if you make the difference large enough.
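As a rough sanity check on the “halving” figure: only the MPDU speeds up, while the 1 Mbps preamble/PLCP Header stays put, so the real saving is slightly less than half.  The 250-byte beacon and long-preamble duration below are assumed illustrative values:

```python
# Beacon airtime when only the management (MPDU) rate changes from 1 to 2 Mbps.
BEACON_BYTES = 250   # assumed beacon MPDU size
PLCP_US = 192        # long DSSS preamble + PLCP header, still sent at 1 Mbps

def beacon_us(mgmt_rate_mbps):
    """One beacon's airtime: fixed PLCP overhead plus MPDU at the mgmt rate."""
    return PLCP_US + (BEACON_BYTES * 8) / mgmt_rate_mbps

t1, t2 = beacon_us(1), beacon_us(2)
print(f"{t1:.0f} us -> {t2:.0f} us ({t2 / t1:.0%} of original)")
```

So the beacon shrinks to roughly 54% of its original airtime rather than exactly 50%, because the preamble and PLCP Header are untouched.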

I generally just stick to using OFDM-Only and then setting the minimum BSS rate to a value of 6 or 12 Mbps depending on remaining management overhead.

Anyway, that’s all for now folks, hope this helped!

Rob


802.11 PHY Compatibility – Basic Overview

802.11g <–> 802.11 / 802.11b:

When 802.11g (ERP-OFDM) was released in 2003, it needed a way to co-exist with the large installed base of older 802.11 (DSSS) and 802.11b (HR-DSSS) networks and equipment out there.  The solution used RTS/CTS control frames to silence the channel before an 802.11g station used any of the higher modulations.  Before transmitting a frame using OFDM modulation, the transmitting station would initiate an RTS/CTS exchange with the receiving station.  The RTS/CTS exchange is sent at a PHY rate of 1 Mbps using DBPSK modulation as specified by DSSS, silencing all legacy stations in the coverage area.  This is an awful co-existence mechanism and it chews up valuable airtime silencing the channel before any 802.11g station tries to talk!

As per CWNP.com’s CWAP text and the IEEE 802.11 2012 standard:

RTS/CTS protection is activated by all the STAs in a BSS when the “Use Protection” bit of the ERP Information Element is set to 1 in Beacons and Probe Responses.  The following MUST trigger the “Use Protection” bit to be set:

  1. A legacy client that does not support ERP-OFDM associates to the BSS.
  2. A beacon from a neighbouring legacy BSS (that does not support ERP-OFDM rates) is detected.
  3. Any management frame (excluding a probe request) is detected coming from a neighbouring legacy BSS.

The above list defines when the Use Protection bit MUST be set to 1.  The IEEE 802.11 standard left it open to vendors to choose other scenarios where it is suitable to activate protection mode.  For example, some vendors will set the Use Protection bit as soon as they receive a probe request from a legacy non-ERP station.
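For illustration only, the mandatory triggers (plus an optional vendor-specific one) can be sketched as a simple predicate; this is my own sketch, not any vendor's actual logic:

```python
def must_set_use_protection(
    non_erp_sta_associated: bool,       # condition 1: legacy client joins the BSS
    non_erp_beacon_detected: bool,      # condition 2: beacon from a legacy BSS
    non_erp_mgmt_frame_detected: bool,  # condition 3: other legacy mgmt frame heard
    vendor_probe_request_trigger: bool = False,  # optional, vendor-specific extra
) -> bool:
    """Return True when the ERP 'Use Protection' bit should be set to 1."""
    return (
        non_erp_sta_associated
        or non_erp_beacon_detected
        or non_erp_mgmt_frame_detected
        or vendor_probe_request_trigger
    )

print(must_set_use_protection(False, True, False))   # legacy beacon heard -> True
print(must_set_use_protection(False, False, False))  # pure ERP environment -> False
```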

802.11g / OFDM-Only Mode

802.11g radios can be configured to use only ERP-OFDM rates, making them incompatible with DSSS / HR-DSSS radios.  This is named differently per vendor but is usually something like “802.11g-only” or “OFDM-only” mode.  This basically configures the radio to ignore probe requests from legacy clients that only support DSSS/CCK rates.  The legacy clients will be prevented from associating to the BSS, removing the need for RTS/CTS for older client stations.  It does not, however, absolve the OFDM-Only BSS from RTS/CTS completely; we still have to be polite to any legacy DSSS/HR-DSSS cells nearby as per points 2 and 3 above.


Thanks again to Primoz Marinsek, Jim Vajda, Andrew von Nagy, and Keith Parsons for making me look deeper.


802.11n <–> 802.11a/g:

Later on, with the introduction of 802.11n, the IEEE learned from the horror of RTS/CTS and implemented THREE new PPDU formats to enable more seamless compatibility with either 2.4GHz 802.11g (ERP-OFDM) or 5GHz 802.11a (OFDM) stations.  The PPDU formats are summarised as follows:

  • non-HT Format: Used when communicating with either an 802.11g or 802.11a station.  The Preamble and PLCP Header are exactly the same as the legacy communications.  802.11n stations can understand the older PPDU formats and keep quiet during transmission.
  • HT-Mixed Format: Used when communicating with an 802.11n (HT-OFDM) station on either band in the presence of 802.11g or 802.11a stations.  This PPDU format starts with the same preamble and PLCP Header as defined for 802.11a/g and then adds a second PLCP Header afterwards that enables 802.11n transmission of the MPDU.  This allows 802.11a/g clients to recognise the impending frame transmission’s properties and keep quiet.
  • HT-GreenField Format: Only has the HT-OFDM defined Preamble and PLCP Header.  Used in networks that do not support backward compatibility.  Does not allow older stations to recognise the impending frame transmission.  This should only be used where no 802.11 a/g stations exist.

2.4GHz 802.11n radios also interoperate with 802.11/802.11b radios by using RTS/CTS in the same way as 802.11g radios do, and are capable of decoding DSSS modulated signals from legacy stations.

Greenfield Mode

802.11n radios have the option of working only in “Greenfield” mode.  This mode only allows 802.11n-capable stations to join the BSS and does not use the non-HT or HT-Mixed PPDU formats.  It also does not perform any RTS/CTS for legacy clients using 802.11 / 802.11b.

 802.11ac <–> 802.11a/n

The 802.11ac standard actually defines only a single PPDU frame format, called the VHT PPDU format.  This frame format is compatible with 802.11a (OFDM), 802.11n (HT-OFDM) and 802.11ac (VHT-OFDM) stations all in one go.  802.11ac has no “11ac-only” or Greenfield mode like 802.11n does.  In my opinion, forcing compatibility with 802.11a-only stations was a little silly since they are so rare.

Categories
Ruckus, Ruckus Wi-Fi

RUCKUS WIRELESS: Zone Director vs. SmartZone WLAN Architecture Review


Disclaimer:

I work for Ruckus Wireless Inc. (Now a Part of Brocade!).  This article is intended to assist any Ruckus Wireless channel partner or customer in selecting the right product.  It is neither intended as a “plug” nor a critique of the products.  It is simply a description of how the products work and what the relative advantages and disadvantages are for each platform.  I hope you find this useful!


Introduction

Ruckus Wireless’ venerable line of Zone Director WLAN controllers is arguably one of the main products that helped build Ruckus from a 100-and-something employee start-up in 2008 into the world’s third largest Enterprise WLAN vendor today.  It has been used by tens of thousands of customers for deployments in multiple verticals and provided an easy interface to configure and control wireless networks.  I certainly remember it as a revelation when I started my first one up in 2010.  Better yet was when my sales colleague did the same and did not have to phone me for help configuring it!

But as with all things, sometimes it is time to move on.  Approximately a year ago Ruckus Wireless introduced the world to its new SmartZone WLAN control platform, a new generation of controllers aimed at helping Ruckus Wireless address shifting market needs and trends and correcting many of the shortcomings of the older Zone Director platform.

In this article I intend to present a comparison of the architecture between the Zone Director and the SmartZone control platforms and look at how that affects the kinds of networks we can design.  I’ll start by looking at WLAN MAC architectures in general to build a framework of our understanding.

WLAN MAC Architectures

Generally speaking there are three distinct MAC architectures available to 802.11 Wireless LAN and Radio Access Network vendors: Remote MAC, Split MAC and Local MAC:

Remote MAC:

  • APs are PHY radio only.  Centralized control function performs ALL MAC layer processing.
  • All real time processing and non-real time processing is performed at the controller.
  • PHY layer data connection between the controller and AP
  • Least common architecture for WLANs.
  • A good example of a Remote MAC architecture would be modern LTE active DAS systems, or these guys, who implement a proprietary software-defined radio solution for LTE in a datacenter, implementing everything from the LTE protocol upwards.  From what I can see, their radios are simply PHY-layer transceivers.

Split MAC:

  • MAC Layer processing is split between the AP and the Control Function.
  • Real Time MAC Functions are performed at the AP
  • Control is performed via LWAPP or CAPWAP to a centralized controller.
  • Non-real time / Near Real Time Processing performed by the Controller (the actual details of this depend largely on the vendor!)
  • Integration Service (802.11 WLAN to 802.3 Ethernet Frame Conversion) performed at either the AP or the Controller.
  • Layer 2 / Layer 3 Connection from the AP to the Controller.
  • Most Common WLAN architecture.
  • Implemented by the Ruckus Wireless Zone Director

Local MAC:

  • All MAC layer processing performed at AP
  • Real-time, near real-time and non-real-time MAC processing performed at the AP
  • ALL essential functions are performed at the AP – resiliency provided by a lack of dependence on a centralized controller.
  • Additional services can be implemented by the control function (Config Management, etc)
  • Architecture of choice for distributed deployments with a “cloud” controller.
  • Implemented by the SmartZone control platform

Reading what the LWAPP RFC has to say is interesting too.  According to RFC 5412 these are the definitions of SPLIT vs LOCAL MAC:

Function                                      | Split MAC                 | Local MAC
Distribution System                           | Controller                | Access Point
Integration Service                           | Controller                | Access Point
Beacon Generation                             | Access Point              | Access Point
Probe Responses                               | Access Point              | Access Point
Pwr Mgmt / Packet Buffering                   | Access Point              | Access Point
Fragmentation / De-Fragmentation              | Access Point              | Access Point
Association / Disassociation / Reassociation  | Controller                | Access Point
802.11e:
  Classifying                                 | Controller                | Access Point
  Scheduling                                  | Controller / Access Point | Access Point
  Queuing                                     | Access Point              | Access Point
802.11i:
  802.1X / EAP Authenticator                  | Controller                | Controller
  Key Management                              | Controller                | Controller
  Encryption / Decryption                     | Controller / Access Point | Access Point

Zone Director MAC Architecture:

The Ruckus Wireless Zone Director platform was developed as a platform targeting small to medium sized enterprise customers.  It implements a customized version of the Split MAC architecture defined in RFC 5412.  Here is a breakdown of services implemented at the AP vs the Zone Director:

Zone Director Split MAC Roles & Responsibilities

Function Access Point Zone Director
Beacons
Probe Requests/Responses
Control Frames: (ACK + RTS/CTS)
Encryption / Decryption of 802.11 Frames
Distribution & Integration Services
Fragmentation / De-Fragmentation
Packet Buffering / Scheduling / Queuing (802.11e)
WMM-Admission Control
Background Scanning
802.11h DFS Processing
Wireless Distribution Services (MESH) Control
802.11 Authentication & Association
802.1X EAP Authenticator
Encryption Key Management
WPA/WPA2 – 4 Way handshake
RADIUS Authentication & Accounting
Rogue AP – Wireless Intrusion Detection / Prevention
Additional Services (Hotspot / Captive portal / Guest Access etc.)
Configuration Management
Management / OSS/BSS Integration (SNMP etc.)

 

It is important to understand the implications of the way the roles and responsibilities are separated between the AP and Zone Director when designing a customer’s network.  Here are some of the most important points that will affect your design:

Resiliency to Failure

From reading the table above you can easily understand which services you will lose when a Zone Director controller fails and you don’t have Smart Redundancy enabled.  Your existing connected clients will remain connected to the network, but no new clients will be able to associate if the controller is not available.  You will also not be able to roam between APs as all association / re-association requests must be sent to the controller.

The Ruckus SmartMesh algorithm runs on the APs enabling each AP to calculate a path cost associated with each uplink candidate.  However, SmartMesh topology changes between APs cannot occur without a controller to allow Mesh APs to re-associate to a new uplink.  The Zone Director also controls which mesh uplinks can be selected by APs.

In addition to preventing any new associations and re-associations, losing connectivity with the Zone Director will also affect encryption key management, RADIUS authentication, Rogue AP Detection / Prevention and any other additional WLAN services that require the controller to be present.

The AP is designed to work in a local breakout mode by default and implements the integration service onto the LAN.  Anything like L3/L4 Access Control Lists and client isolation settings that are stored on the AP will continue to work for associated clients.  L2 Access Control (MAC Address based Access Control) is implemented on the Zone Director at association.

Scale & Redundancy

The Zone Director platform allows for only 1+1 Active/Standby redundancy (called Smart Redundancy), with no option of clustering controllers to increase scale.  This can cause problems when implementing networks of more than 1000 APs.  N+1 redundancy can be achieved using Primary and Secondary Zone Director failover settings in place of the Smart Redundancy option, but this does not provide stateful failover.  It is also not possible to fail over between HA pairs of Zone Directors.

Authentication / Association

The MAJOR limitation of a Zone Director AP is that all 802.11 Authentication and Association requests must go via the controller. This is the case for all WLAN types except for the WLANs using the “Autonomous WLAN” feature introduced in ZoneFlex version 9.7.

The Zone Director also acts as the RADIUS client for all RADIUS authentication and accounting messages and fulfills the role of the 802.1X authenticator in the 802.1X framework.  The Zone Director is responsible for encryption key management and derives and distributes the Master Session Key (MSK), Pairwise Master Key (PMK), and other temporal keys used for encrypting the air interface to the APs in the network.

Integration with other authentication types and their services including LDAP / Active Directory, Captive Portal Access, Guest Access Services, Zero-IT client provisioning, Dynamic Pre-Shared Keys and Hotspot Access etc are managed by the control function and reside on the Zone Director.

The requirement that all 802.11 authentications / associations must traverse the Zone Director places some limits on the way you can design large networks with respect to:

  • 802.11 Authentication/Association request latency.
  • 802.1X Authentication latency / packet loss.
  • WPA / WPA2 4-Way handshake
  • AP roaming delays
  • Distributed and Branch Office Environments

 

802.11 Authentication/Association latency:

Some 802.11 client devices place a limitation on the acceptable delay between an 802.11 authentication or association request and the expected response.  A known issue with specific barcode scanners exists in which the scanner will fail to join a WLAN unless it receives an association response within 100ms of its request.  Testing conducted in 2013 by Ruckus field engineers showed that most modern enterprise clients with updated drivers did not have any problem with latencies of several hundred milliseconds. The longest latency tested between an AP and the controller was > 400ms (from Sunnyvale to South Africa / Beijing) with no adverse effects on WLAN association or other services.  However if you ask a Ruckus employee for the official number here you will receive an answer of “150ms”, mostly because we aren’t sure of the clients you are using and for other reasons, which will become clear as you read on.

The only exception to this is the Autonomous WLAN feature introduced in ZoneFlex version 9.7 that will allow the AP to directly respond to a client’s authentication and association request.

802.1X Authentication with latency/packet loss:

In addition to the 802.11 Authentication and Association messages, EAPOL messages sent between the Supplicant (client device) and the Authenticator (Zone Director) can also run into trouble when being transmitted across a high latency WAN link with unpredictable packet loss.  Remember that LWAPP tunnels use UDP.  In testing, Ruckus engineers observed that it became difficult for 802.1X clients to successfully complete the EAPOL key exchange over a high latency link due to out of order frames and increasing EAPOL replay counts.

WPA / WPA2: 4-Way Handshake

In any Robust Security Network (RSN), the WPA/WPA2 4-way handshake to establish the Pairwise Transient Key (PTK) and Group Temporal Key (GTK) for encrypting the air interface is conducted between the client STA and the Zone Director.  Once the Zone Director has established the PTK for encrypting client station unicast traffic and sent the GTK to the client for multicast traffic, it informs the AP of the key values, allowing the AP to perform encryption/decryption of the air interface at the edge of the network.  Key caching is handled centrally at the Zone Director; APs are not typically made aware of the PMK values or required to derive any transient keys.

In ZoneFlex version 9.7 Ruckus Wireless released a new feature called “Autonomous WLAN” that allowed a WLAN to use open authentication with the option of WPA/WPA2-Personal for encryption of the air interface.  This is the only WLAN type in which the AP will derive the PTK from the PMK and store the PMK on the AP.

AP Roaming Delays:

Another aspect of your design that must be considered is the issue of roaming delay when moving between access points.  Every time you re-associate to a new AP, the re-association request must be passed to the Zone Director for approval.  If encryption is being used then you will also be required to wait for the 4-way handshake between the Client STA and Zone Director to take place.  This will introduce considerable latency if the control function is placed far away from the client device.

Even with the use of fast roaming techniques like PMK Caching or Opportunistic PMK Caching, you may find that roaming times are too long for certain applications purely because of the time taken to complete the association and the necessary 4-way handshake with the Zone Director.

As an example: if a Zone Director is placed only 50ms RTT away from an AP, it will take >200ms to perform an AP roam, excluding any processing time for generating a response or RADIUS authentication messages between the Zone Director and the Authentication Server (AS).
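The arithmetic here can be sketched in a few lines.  Note the assumption of four controller round trips per roam (the re-association exchange plus the 4-way handshake) is mine, inferred from the figures quoted in this article, not an official Ruckus number:

```python
# Hedged sketch: 4 controller round trips per roam is an assumption
# inferred from the figures in the text (50 ms RTT -> >200 ms roam).
ROUND_TRIPS_PER_ROAM = 4

def roam_time_ms(controller_rtt_ms, round_trips=ROUND_TRIPS_PER_ROAM):
    """Minimum roam time, excluding processing and RADIUS exchanges."""
    return controller_rtt_ms * round_trips

def max_controller_rtt_ms(roam_budget_ms, round_trips=ROUND_TRIPS_PER_ROAM):
    """Largest controller RTT that still meets a roam-time budget."""
    return roam_budget_ms / round_trips

print(roam_time_ms(50))           # 50 ms RTT -> 200 ms per roam
print(max_controller_rtt_ms(50))  # 50 ms roam budget -> 12.5 ms max RTT
```

The same arithmetic reproduces the Vo-Wi-Fi figure used later in this article: a 50ms roam budget caps the controller RTT at 12.5ms.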

When using 802.11r Fast BSS Transition, it is important to realize that the Zone Director acts as both the Pairwise Master Key R0 Key Holder (R0KH) and the R1 Key Holder (R1KH), and is involved in each and every roam event!

Distributed and Branch Office Deployments.

Imagine a large enterprise customer with multiple large office buildings in geographically dispersed locations.  Each large office would most likely have all of its own IT infrastructure including Active Directory, LDAP, RADIUS, DHCP, DNS and some local servers.  Smaller branch office locations would be connected back to the main offices via lower throughput and potentially high latency WAN links.  Some enterprises make use of MPLS from an Internet service provider, others might simply have a network of VPNs connecting their offices over the open internet.

Each main office will require its own Zone Director controller to integrate with the local IT infrastructure.  Sure, you can place a single Zone Director in a data center and build connectivity from there to all the main offices.  But all of your association and AAA authentication/authorization requests will be forced to hairpin through the controller! Now don’t get me wrong here.  I am not saying it won’t work.  But think on this: in order to achieve a roam time of less than 50ms in a Vo-Wi-Fi project, RTT to the Zone Director would not be able to exceed 12.5ms!

Branch offices will have exactly the same problem, but it becomes unwieldy to place a Zone Director at every branch office.  So you’re stuck with a conundrum.

Layer 3 Networks:

The Ruckus implementation of LWAPP disabled Layer 2 LWAPP tunnels by default in ZoneFlex 9.4 onwards.  Ruckus supports L3 LWAPP tunnels making it possible to place the Zone Director and APs in different subnets.

NAT Traversal:

The Access Point is the source of all LWAPP communication messages to the controller, and it is therefore possible to place APs behind a NAT with no issue.  Deployments running ZoneFlex 9.2 or later also support Zone Directors behind a NAT, provided that the APs are pointed to the public address and the necessary port forwarding is set up using inbound NAT rules.  Smart Redundancy will also work provided that each Zone Director is given a separate public IP address or located behind a separate firewall.

Centralized Data Forwarding

The Zone Director’s Split MAC architecture allows for centralized data forwarding using LWAPP tunnels (UDP port 12222).   Similarly to CAPWAP, LWAPP does not support differentiated handling of the control and data planes.  Both LWAPP Control and LWAPP Data tunnels must terminate on the same interface of the Zone Director. This is not really a problem in most enterprise environments.  Most designs would typically just place the controller somewhere in the core and tunnel all of the data to it and break traffic out from there.  I mean that is where everything goes anyway, right?

This is not always the case with service providers though.  Most large operators and service providers actually prefer to keep network control and subscriber data separate and forward them to parts of the network optimized for dealing with the specific traffic types.

The other major pain point here is that because data and control planes are inherently entwined, losing a control function will result in an interruption of the flow of subscriber data through the system.

The final straw comes when you realize that the Zone Director or control function will also be performing large amounts of MAC-layer processing of subscriber data for centralized forwarding.  It is no wonder that enterprise solutions implementing a Split MAC architecture typically max out at somewhere around several thousand APs per controller.

Summary

We’ve gone through the implementation of Split MAC architecture on the Ruckus Wireless Zone Director in fair detail.  We have also covered some of the design constraints and considerations when implementing a Zone Director based network.  The 802.11 Client State Machine is implemented on the Zone Director and all 802.11 Authentication, Association, and Key derivation is done at the controller.  This can introduce long AP roam times and create problems in large geographically distributed deployments where integration to multiple AAA servers may be necessary.  Placing Zone Directors locally at each site is the recommended solution to the problem but can make the solution expensive and harder to manage.

SmartZone MAC Architecture

Since its development began in early 2011, the SmartZone platform has represented a shift away from the widely accepted Split MAC architecture of enterprise Wireless LANs.  The SmartZone platform implements a customized Local MAC architecture using separate control and data planes with lightweight protocols best suited to each task.  Most importantly, the 802.11 client state machine and other services are implemented directly on the AP.  Below is a breakdown of the services implemented on the AP and the SmartZone platform:

Function Access Point SmartZone Controller
Beacons
Probe Requests/Responses
Control Frames: (ACK + RTS/CTS)
Encryption / Decryption of 802.11 Frames
Distribution & Integration Services
Fragmentation / De-Fragmentation
Packet Buffering / Scheduling / Queuing (802.11e)
WMM-Admission Control
Background Scanning
802.11h DFS Processing
Wireless Distribution Services (MESH) Control
802.11 Authentication & Association
802.1X EAP Authenticator
Encryption Key Management (AP: PMK Caching; Controller: Opportunistic Key Caching)
WPA/WPA2 – 4 Way handshake
RADIUS Authentication & Accounting (Controller: optional Proxy Function)

Wireless Intrusion Detection / Prevention
Configuration Management
Management / OSS/BSS Integration (SNMP etc)

Additional value added services implemented on the SmartZone Controller and the AP are shown on the table below:

Function Access Point SmartZone Controller
Active Directory / LDAP Integration
Captive Portal Authentication
Guest Access Portal redirect
Guest Access Pass Creation / Storage / Authentication
Social Media Login – Signup
Hotspot 2.0 / Passpoint Online Sign-Up Server
WISPr Hotspot

Resilience to Failure

As you can see from the tables above, the SmartZone Platform is highly resilient in the event of a controller failure.  All essential WLAN services and some additional services are implemented at the AP.  Loss of connectivity with the controller will have minimal impact on the operation of normal WLAN services.

Value added services like Social Media Login or Guest Access require the local user database on the SmartZone controller to authenticate subscribers.  However, APs keep a cache of users who have already connected, along with their state, ensuring that already-connected users are not affected.

Authentication via the AP internal captive portal to any external database is unaffected by connectivity to the controller.

The WISPr hotspot function requires the SmartZone controller for HTTP/HTTPS redirect support, client proxy handling, and integration with the landing portal and RADIUS authentication.  The walled garden of the hotspot function is implemented on the AP via a DNS cache, allowing subscriber traffic to pass directly from the AP to the internet without passing through the controller.

Scale & Redundancy

The SmartZone platform supports N+1 active/active clustering, allowing up to 4 nodes to be clustered together for scale and redundancy.  APs are automatically load balanced across the cluster in randomized order, helping to ensure that no single node is overloaded.

All client state information is shared and replicated in each node’s memcache.  All configuration information and other permanent records are striped and mirrored across the cluster database, allowing up to one node to fail at any given time with no service impact.

A single SmartZone controller can support as many as 10,000 APs, while each cluster can scale to support as many as 30,000 APs.  Should an entire cluster fail, the SmartZone platform also supports failover between SmartZone clusters.  Configurations between clusters must be manually synchronized in the current release.

802.11 Association

The SmartZone architecture places the 802.11 client state machine at the Access Point.  All 802.11 authentication & association tasks are handled by the Access Point directly and the results are simply reported to the SmartZone controller (in real time) for storage in the events logs.

There is no latency limitation placed between the SmartZone Controller and the Access Point for successful 802.11 authentication and association.

In controlled access networks using a captive portal method to authenticate a subscriber, the Access Point allows the subscriber to associate and simultaneously performs a memcache lookup on the SmartZone controller to establish whether the subscriber state is set to “authorized”.  If authorized, the AP knows immediately not to implement the captive portal and simply allows client traffic through.

In WPA2-Personal networks the Pre-Shared Key is stored on every AP as part of the network configuration allowing immediate interaction with the client.  Similarly L2 Access Control Lists for MAC Address based access control are stored on the AP.

The SmartZone platform’s Local MAC architecture eliminates many of the problems Split MAC architectures face with regard to:

  • 802.11 Authentication/Association request latency issues.
  • 802.1X Authentication latency / packet loss.
  • AP roaming delays
  • Distributed and Branch Office Environments

802.1X Authentication & Encryption Key Management

The AP also acts as the RADIUS client for all RADIUS Authentication and Accounting messages with the option of allowing the SmartZone to act as a proxy.

The AP fulfills the role of 802.1X Authenticator in the 802.1X framework and is responsible for all encryption key management, deriving the MSK, PMK and other temporal keys used for encrypting the air interface.  Once a PMK is derived, it is cached at the AP to enable PMK caching, enhancing the user experience for 802.1X roaming.  The PMK is also sent by the AP to the SmartZone controller, where it is stored in the memcache of each control node to enable Opportunistic PMK Caching.  If a subscriber roams to a new AP and supplies the PMK-ID in its re-association request, the AP will look up the PMK-ID in the SmartZone memcache, avoiding a full 802.1X re-authentication.  The WPA 4-way handshake is completed directly between the client and the AP, eliminating any latency requirement to the controller.

There is still the possibility that a high-latency link between the SmartZone and the AP could cause an unacceptable roaming delay during the memcache lookup of the necessary PMK, but the potential impact of such a link is dramatically lowered.  The high-latency link is only used for a single memcache lookup straight after association; after that, all further interactions occur directly between the Access Point and the client device.
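The caching flow described above can be illustrated with a toy sketch.  This is not Ruckus code; all names here are hypothetical, and a plain dict stands in for the cluster memcache:

```python
# Conceptual sketch of PMK caching + Opportunistic Key Caching (OKC):
# the AP caches PMKs locally and also pushes them to a controller-side
# cache shared by all APs in the cluster.
controller_memcache = {}   # stands in for the SmartZone cluster memcache

class AP:
    def __init__(self, name):
        self.name = name
        self.local_pmks = {}   # PMK cache held on the AP itself

    def full_8021x_auth(self, pmk_id):
        # Placeholder for a full EAP exchange; "derives" a new PMK.
        pmk = f"pmk-for-{pmk_id}"
        self.local_pmks[pmk_id] = pmk
        controller_memcache[pmk_id] = pmk   # push to controller for OKC
        return pmk, "full 802.1X"

    def associate(self, pmk_id):
        if pmk_id in self.local_pmks:        # PMK caching hit, no lookup
            return self.local_pmks[pmk_id], "PMK cache"
        if pmk_id in controller_memcache:    # OKC hit: one memcache lookup
            pmk = controller_memcache[pmk_id]
            self.local_pmks[pmk_id] = pmk
            return pmk, "opportunistic key cache"
        return self.full_8021x_auth(pmk_id)

ap1, ap2 = AP("ap1"), AP("ap2")
_, how1 = ap1.associate("client-42")   # first contact: full 802.1X
_, how2 = ap2.associate("client-42")   # roam: single memcache lookup
print(how1, "->", how2)
```

The point of the sketch is the roam path: only one controller round trip (the memcache lookup) sits on the high-latency link, and everything else stays between the client and the AP.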

The SmartZone platform does not currently support 802.11r Fast BSS Transition (current release: SmartZone version 3.2), but in any future implementation it is reasonable to expect that the APs would fulfill the role of the Pairwise Master Key R0 Key Holder (R0KH) and be responsible for deriving the PMK-R1 and distributing it to other APs in the defined Mobility Domain.  Fast-roaming messaging would then take place directly between the client and the AP, providing a great improvement in performance over techniques like Opportunistic Key Caching.  I guess we will have to wait for a future release to see if my deduction is correct.

Distributed and Branch Office Deployments.

The SmartZone platform enables Branch office deployments and direct integration between APs and local IT infrastructure (Active Directory, LDAP, RADIUS, DHCP, DNS, Firewalls etc.).  This means a single SmartZone controller can manage multiple sites without forcing authentication requests or client data to hairpin through the SmartZone controller. Of course, you can still set the SmartZone to be a proxy if that is what you want to do.

Layer 3 Networks:

Ruckus SmartZone APs use a proprietary SSH-based control protocol to communicate with the SmartZone controller, allowing all control traffic to traverse Layer 3 networks.

NAT Traversal:

The SmartZone AP is the source of all communication messages to the SmartZone controller, allowing APs to be placed behind a NAT with no issue[1].  The virtual SmartZone (vSZ-E, vSZ-H) and SmartZone 100 controllers all support being placed behind a NAT.  In the case of a cluster, it will be necessary to use a separate public IP for each node.

Centralized Data Forwarding

The SmartZone Platform makes use of proprietary SSH-based control and GRE-based data protocols to separate the control and data planes.  The SmartZone platform allows for centralized forwarding to a SmartZone controller appliance or to a virtual Data Plane using RuckusGRE.

RuckusGRE

RuckusGRE is a customized version of L2oGRE (also known as Ethernet over GRE or SoftGRE) with an added transport layer (UDP) header in the outer IP packet.  This added UDP header allows the AP to originate GRE tunnels from behind a NAT router.  Recall that in standard GRE there is typically no UDP/TCP header in the outer packet, and therefore no way for NAT routers to track NAT sessions based on TCP/UDP port numbers.
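To see why the extra header matters, here is a back-of-the-envelope comparison.  The base header sizes are the standard ones (IPv4 = 20 bytes, UDP = 8, minimal GRE = 4); the exact RuckusGRE header layout is proprietary, so treat this purely as an illustration:

```python
# Rough per-packet outer-header comparison.  Sizes are the standard
# minimums (IPv4 = 20 bytes, UDP = 8, base GRE = 4); the real RuckusGRE
# layout is proprietary, so this is only illustrative.
SIZES = {"IPv4": 20, "UDP": 8, "GRE": 4}

plain_l2ogre = ["IPv4", "GRE"]          # no L4 header: NAT has no ports to track
ruckus_gre   = ["IPv4", "UDP", "GRE"]   # UDP ports give the NAT a session key

def overhead(stack):
    return sum(SIZES[h] for h in stack)

print("L2oGRE outer headers:", overhead(plain_l2ogre), "bytes")    # 24
print("RuckusGRE outer headers:", overhead(ruckus_gre), "bytes")   # 32
print("NAT-trackable:", "UDP" in ruckus_gre)                       # True
```

A few extra bytes per packet buys a UDP port pair that NAT routers can build a session entry from, which is the whole trick.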

RuckusGRE uses TLS certificates for authentication and optional 128-bit AES encryption of the tunnel payload.

Differentiated Handling

The SmartZone implements separate control and data planes allowing differentiated handling of control traffic and subscriber data.  The data plane is designed to take its configuration from the control plane, but to maintain a separate engine for processing subscriber data tunnels.

The SmartZone 100 and SCG 200 physical appliances of the SmartZone platform both implement a data plane on the appliance allowing traffic to be tunneled to the same physical location as the SmartZone controller.

The data plane on the SmartZone 100 can use its own IP address and subnet or it can use the same IP address as the SmartZone 100 control plane.  The SCG 200 requires that each of the data planes (it has two) maintain their own IP addresses.

The Virtual SmartZone Data Plane (vSZ-D) is available as a separate virtual appliance and is designed for use with the virtual SmartZone controllers.  Each virtual control plane can manage 2 vSZ-D instances, up to a maximum of 8 vSZ-D instances per cluster.

Data Plane NAT Traversal

The SmartZone 100 supports placing the data plane IP behind a NAT.  The SCG 200 data plane does not support being placed behind a NAT in the current release (SmartZone 3.2).

The vSZ-D supports NAT traversal between itself, the vSZ control plane and the APs, allowing it to be placed behind a NAT.  Since all incoming connections use the same port numbers, each vSZ-D instance requires its own public IP.

Latency Requirement

The vSZ-D has a recommended maximum RTT of 150ms to the vSZ control plane.  This is not a concern in deployments using the SmartZone 100 or SCG 200 appliances, as the data plane is co-located with the controller in the appliance.

802.11 Frame Processing

APs are still responsible for all integration services, converting 802.11 frames to 802.3 before they are encapsulated in the GRE header.  This reduces the processing requirements on the SmartZone data plane, improving performance and increasing scalability.

Summary

The SmartZone platform implements a Local MAC architecture – placing the 802.11 client state machine on the AP.  We have reviewed how the SmartZone platform enables simpler deployment of WLAN networks spread across multiple sites and improved user experience without the need for additional controllers at each site.  This is central to enabling business models like cloud control, managed services and branch office deployments.

We have also seen how the SmartZone architecture provides differentiated handling of control traffic and tunneled subscriber data.  The ability to place the vSZ-D in its own subnet allows for separation of control and subscriber traffic in carrier networks.  The ability to place the vSZ-D in a custom physical location (or several locations) increases flexibility when forwarding client data traffic.  The added ability to support AES encrypted tunnels opens the potential for use as a concentrator for remote AP deployments.

[1] There is a limitation here when integrating directly with 3rd Party WLAN gateways using L2oGRE / Soft-GRE.  SoftGRE/L2oGRE does not implement a transport layer header in the outer packet.  This is a limitation of SoftGRE, not the SmartZone Architecture.

Categories
WLAN general

What does 802.11 Contention Look Like (Part 3) – Probabilities & Our First Model

In my previous blog posts in this series I covered the inherent problem with CSMA/CA and how it loses efficiency as more stations make use of a channel.  I also covered some of the basic rules of how CSMA/CA works as implemented by 802.11 Wireless LANs.

As I mentioned at the beginning of this blog series, my intention here is to build an argument and the logic for describing what WLAN contention actually looks like in the real world.  We’ve gone over some of the rules of how WLAN contention works.  But now it is time to start building some simple mathematical models of its effects on the medium and then to carefully evaluate how those models stack up against the real world operation of 802.11 WLANs.

First, some background on probabilities

When you are working with a system that has a random or stochastic component and forms a stochastic process, it becomes very hard to build a deterministic model of how that system works.  There is an inherent uncertainty built into the system and we cannot predict the outcome with 100% accuracy.  It becomes necessary to attach a probability to each of the possible outcomes.  Let’s start with an example.

If you have two people in a room and you ask both to choose a random number between 0 and 15, what is the chance that they will choose the same number?

The first person chooses a random number and the second person has a 1 in 16 chance of choosing the same number.  Therefore the chance of both people choosing the same number is 1/16 or 6.25%.  If we want to put it another way, the chances of the two people NOT choosing the same number are 15/16 or 93.75%.  We can also write 15/16 as (1-1/16).
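A quick Monte Carlo check of the 1/16 figure (a throwaway sketch, not part of the model itself):

```python
import random

# Two people each pick a random number from 0-15; count how often
# they pick the same one.  The estimate should hover around 1/16.
trials = 200_000
matches = sum(random.randrange(16) == random.randrange(16)
              for _ in range(trials))
print(matches / trials)  # ~0.0625, i.e. roughly 1 in 16
```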

What if we have three people?  Well, then the problem becomes slightly trickier and we have to ask ourselves what we want to know!

There are two things we can calculate:

  1. The probability that person 2 or person 3 will choose the SAME number as person 1. (A = B or A = C)
  2. The probability that any two people have chosen the same number. (A = B, A = C, or B = C)

A = B or A = C:

The chance of choosing the same number in this case becomes a little harder to calculate.  We know that as the number of people selecting numbers increases the chance of selecting the same number as the first person must increase, but how?

By looking only at the chances of choosing the same number, we struggle to find a single intuitive equation that can tell us how the chance increases.  Well, at least I do.

As it turns out, we can solve the problem by evaluating the chance of NOT choosing the same number.  The chance of A and B NOT choosing the same number is 93.75%.  If we allow C to choose a number, the chance of both B and C avoiding a collision with A becomes 93.75% of 93.75%.  So what are the chances that both B and C avoided choosing the same number as A?  We can say P* is the probability of NOT choosing the same number:

P* = (1 - 1/16) × (1 - 1/16) = (1 - 1/16)^2 ≈ 87.9%

Therefore P, the chance of choosing the same number, is:

P = 1 - P* = 1 - (1 - 1/16)^2 ≈ 12.1%

We now have an intuitive equation that gives us the probability of person B or C choosing the same number as person A.

Generalizing to N people

But what if we had more than 3 people? We can generalize this equation for N people as follows:

P(N) = 1 - (1 - 1/16)^(N-1)

Therefore if we had 4 people, the chance of person B, C, or D choosing the same number as A would be:

P(4) = 1 - (1 - 1/16)^3 ≈ 17.6%

Generalizing the number of possible choices

Up until this point, we have examined only the scenario where the random number chosen by N people exists within the range of 0 to 15, i.e. there are 16 different possible outcomes with each choice. What if the number of options is larger?

We can generalise our equation to reflect the number of possible outcomes with each selection by labeling the number of possible outcomes as x. The probability P(N) of a collision with Person 1 therefore becomes:

P(N) = 1 - (1 - 1/x)^(N-1)

In summary, this generalised equation gives us the probability that the random number chosen from x different options by a specific person in a group of N people will also be selected by at least one other member of the group.
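The generalised equation translates directly into code.  This little helper (an illustrative sketch; the function name is mine) computes the probability of a collision with person 1 for any N and x:

```python
def p_collision_with_person1(n, x=16):
    """Probability that at least one of the other n-1 people picks
    the same number (from x options) as person 1:
    P(n) = 1 - (1 - 1/x)**(n - 1)."""
    return 1 - (1 - 1 / x) ** (n - 1)
```

For two people it returns the familiar 1/16 = 6.25%, and for three people 1 - (15/16)^2 ≈ 12.1%.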

A = B, A = C, or B = C

In this case we are trying to find the scenario where any choice is the same as any other choice in the group of people.  Put another way, in a group of people of a certain size, what is the chance that any person chose the same number as any other person?

First we can start with the trivial example of two people in a group choosing a random number out of x different options.  The probability of them NOT choosing the same number is easily seen to be:

P* = 1 - 1/x

What about the case of three people choosing a number? Let’s go through it slowly.

In this case, the first person to choose a number has a 100% probability of not selecting the same number as anyone else, since nobody has selected a number.  The second person has a (1-1/x) chance of no collision with the first.  The third person has a (1-2/x) chance of not colliding with either of the other two.  So we can say that for three people the probability of NO collision is equal to:

P* = (1 - 1/x) × (1 - 2/x)

The probability of a collision occurring is therefore equal to:

P = 1 - (1 - 1/x)(1 - 2/x)

Generalizing to N people

What if we had more than three people? The equations from above can be generalised for N people.  The probability of NO collision is equal to:

P* = (1 - 1/x)(1 - 2/x)(1 - 3/x) × ... × (1 - (N-1)/x)

Simplifying the equation and multiplying everything out, we can see that the equation becomes:

P* = [(x-1)(x-2)(x-3) ... (x-(N-1))] / x^(N-1)

which can be written more compactly using factorials:

P* = x! / (x^N × (x-N)!)

The probability of a collision occurring is complementary to the probability of no collision, so the probability of a collision is given by:

P = 1 - x! / (x^N × (x-N)!)

where N is the number of people, and x is the number of possible choices.

In summary, this second generalized equation gives us the probability that the random number chosen from x different options by ANY person in a group of N people will also be selected by at least one other member of the group of N people.
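The second generalised equation can be sketched the same way.  Rather than computing the factorials directly (which overflow quickly for large x), multiplying out the terms of the product is numerically safer:

```python
def p_any_collision(n, x=16):
    """Probability that ANY two of n people choose the same number
    from x options: P = 1 - prod over i=1..n-1 of (1 - i/x)."""
    p_none = 1.0
    for i in range(1, n):
        p_none *= 1 - i / x
    return 1 - p_none
```

For n = 3 and x = 16 this gives 1 - (15/16)(14/16) ≈ 18%, noticeably higher than the ≈12.1% chance of colliding specifically with person 1.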

Our first model:

For our first model of 802.11 channel contention I have built a simple spreadsheet using the formulae above and the rules laid out in my Second Post in this series.  It shows the probability of a client experiencing a collision given a variable number of active 802.11 STAs.  I have included the different Contention Window values for the different QoS Access Categories and I have also included some of the effects of the 802.11 PHY Type on the Contention Window size.  Feel free to download it here.

For the purposes of keeping things simple the model has the following restrictions / limitations:

  1. The total number of STAs cannot exceed 101 – this is a limitation of Excel's mathematical capabilities – I'll work around that later!
  2. Assume DSSS and HR-DSSS PHY Types do not support QoS.
  3. All STAs are using the same PHY Type.
  4. All STAs are transmitting traffic in the same QoS Access Category
  5. All STAs are contending for medium access 100% of the time (100% Duty Cycle)
  6. All STAs are using the same Contention Window size.
  7. The model is turn based.  It assumes that once an STA has chosen a random back-off and transmitted a frame, the same STA cannot interfere with the remaining STAs until everyone has sent a frame.  So it is only applicable for low-traffic / low-duty-cycle environments.
  8. The BSS does not suffer from Near/Far problems (i.e. if two stations Tx at the same time, both will experience a collision, one cannot overpower the other and get through).

So as we can see, this model is pretty limited, but it does give your customers an idea of why you cannot support more than a specific number of VoIP clients on a WLAN.  It will also lend significant credence to the argument by Ben Miller about why many Wi-Fi Calling apps are better off using Video priority in HD environments with multiple active clients.  You can fiddle with the numbers here too and play around.
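For readers who would rather not open Excel, the core of the spreadsheet can be approximated in a few lines of Python.  This sketch carries the same restrictions listed above (all STAs share one CWmin, 100% duty cycle, turn based), and the names are mine, not the spreadsheet's:

```python
# CWmin per Access Category for OFDM-based PHYs
# (DSSS / HR-DSSS STAs would use CWmin = 31 and no QoS)
CWMIN = {"voice": 3, "video": 7, "best_effort": 15, "background": 15}

def collision_probability(n_stas, access_category="best_effort"):
    """Probability that at least one pair of n_stas contending STAs
    picks the same back-off slot, all drawing from CWmin + 1 slots."""
    x = CWMIN[access_category] + 1   # number of possible back-off values
    p_none = 1.0
    for i in range(1, n_stas):
        p_none *= max(0.0, 1 - i / x)
    return 1 - p_none
```

Note how quickly the Voice category saturates: with five STAs drawing from only four slots (CWmin = 3), a collision is a certainty on the first countdown.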

As for our modeling journey, well it is version 0.1 so we have a long, long way to go before we have something that even closely resembles the real world.  But we’ve cast the first stone.

In my next blog post I will discuss a method that I used to try and improve upon this model, and I’ll discuss its differences with this model and also its limitations.

Rob

Categories
WLAN general

What Does 802.11 Contention Look Like (Part 2) – How contention works:

802.11 Medium Access Control, implemented with the Distributed Co-ordination Function (DCF) and Enhanced DCF Channel Access (EDCA) methods, uses a random back-off counter to help ensure that clients do not transmit their data at the same time, but rather take turns to send their data one after the other.  This is the “Collision Avoidance” part of CSMA/CA.

When two (or more) 802.11 stations both have data to send on the same channel and both have established that the channel is clear, both stations will select a random number, wait a pre-defined period called a DCF Interframe Space (DIFS) and then start counting from the random number to zero. The first station to reach zero transmits its frame. The other station hears the PHY & MAC header of the frame transmission and returns to idle state until the transmission is completed. Once the transmission is over and it is time for the second station to contend for the medium again, it will go through the process again and simply start counting down from where it left off. The first station, if it has more data to transmit, will select another random number and contend for the medium again.

The first important observation

It is important to realize early on that there is a possibility the first station could choose a random number lower than the count the second station still has remaining. Think of the scenario where the first station chooses a random back-off value of 4 for its first transmission, and a back-off value of 7 for its second transmission. If the second station chose a random back-off of anything greater than 11 for its first transmission (a 25% probability), then the first station will send two frames before the second station has even sent one! Statistically however, this should happen an equal number of times to both stations, ensuring roughly fair access to the medium! This is why we say WLAN is a BEST EFFORT protocol. There is no guaranteed access to the medium as there is with “Telco” wireless access technologies (WiMAX, iBurst, 2G, 3G, LTE etc.)!
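The probability of this scenario is easy to verify with a short simulation (a sketch under the same assumptions: STA1 picks back-offs of 4 and then 7, and STA2 resumes its countdown from where it left off after STA1's first frame):

```python
import random

def p_station1_sends_twice_first(trials=100_000):
    """Estimate the chance that STA1 (back-offs 4, then 7) transmits
    two frames before STA2 transmits one, where STA2 picked a random
    back-off from 0-15 and resumes counting after STA1's first frame."""
    wins = 0
    for _ in range(trials):
        b2 = random.randrange(16)
        remaining = b2 - 4        # STA2 already counted down 4 slots
        if 7 < remaining:         # STA1's second back-off expires first
            wins += 1
    return wins / trials
```

Only the four values 12-15 satisfy the condition, so the estimate settles around 4/16 = 25%.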

An Analogy

You can think of 802.11 channel contention like randomly choosing a number to find your position in a queue in a bank. You pick a number and join the queue in that position and wait until it is your turn to be served by the teller.  If there is a person in front of you, and that person goes to the teller, you take a few steps forward then stop and hold your place in the queue until it is your turn to be served.  If the person ahead of you has any further business requests they join the queue in a new random position and wait their turn again.


EDIT: A point of clarity from Devin Akin here:

“When the CCA is returned as BUSY, then the remaining number of slot times are held during frame transmissions from other STAs/APs, which could each take a long time if they are large aggregate data frames.”

Let’s say there are two people in the queue, and you pick position 7 and another picks position 3.  Both of you will walk forward 3 steps toward the teller.  The other person will get to the counter first, you will see that the teller is busy and you will hold position 4 until that person is done.  This could take some time!  Imagine if the person talks very slowly?  Once the person is done, then they will rejoin the queue in a random place and you will start walking forwards again from the position you were in.


Collisions

Now the trick is, if the range of numbers you can choose is finite, let’s say between 0 to 15, then there is a chance of you choosing the same position in the queue as someone else. This chance obviously increases with the number of the people joining the queue.

If you choose the same random number as someone else, you would both arrive at the front of the queue at the same time. The bank clerk serving the queue would be unable to answer both of your requests simultaneously and would refuse to acknowledge either of you.

You have just had a “collision”.

According to the rules of 802.11 WLAN channel access, both of you would have to pick a NEW random number and rejoin the queue, but with a slight twist. In order to try and reduce the chance of people choosing the same random number and resultant position in the queue, those that suffer a collision would choose a number between 0-31. That is, you would double the range of random numbers you could choose from.

With each consecutive collision you experience, you will double the range of the random numbers you select from. So if you experience a collision and select a random number between 0 – 31 and you reach the counter at the same time as someone else again, you go back (again), select a random number between 0-63 and rejoin the queue and try again, and so on and so forth.

If you succeed in getting to the front of the queue alone, then your request will be served. For your next request you will return to using the original number range (0 – 15).

A second important observation

Just because a specific station has experienced a collision (or several) does not mean that another station has had the same experience. This means that while one station is choosing numbers from 0-31 or 0-63 or even higher after several collisions, there are other stations choosing from 0-15 at the same time!

The Contention Window

In 802.11 WLANs, we call the random number range that stations can choose from the “Contention Window” (CW). The contention window can be calculated by the equation CW = 2^N - 1. In OFDM-based WLANs with traffic in the Best Effort or Background Access Categories, the starting value of N is 4.

802.11 client devices and APs (both referred to as Stations or STAs for short) will typically attempt to re-transmit their data up to 7 times before giving up and dropping the frame. With each successive collision the value of N is incremented until a successful transmission occurs. This means that the Contention Window will typically follow the pattern below:

Attempt #   N Value   Contention Window (2^N - 1)
1           4         0 - 15
2           5         0 - 31
3           6         0 - 63
4           7         0 - 127
5           8         0 - 255
6           9         0 - 511
7           10        0 - 1023
8           N/A       Give up (dropped frame)

If the transmitting STA succeeds in sending the frame and receives an ACK from the receiving station, or, to follow our analogy above, gets to the front of the queue alone, then N resets to 4 and it returns to using the smallest contention window size (0-15) on the next attempt.
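The table above amounts to a one-line rule, which we can capture in a small helper (illustrative only; parameter names are mine):

```python
def contention_window(attempt, n_start=4, n_max=10, max_attempts=8):
    """Upper bound of the Contention Window (2**n - 1) for a given
    1-based transmission attempt.  n increments (doubling the window)
    with each retry, capped at n_max.  Returns None once the frame
    is dropped."""
    if attempt >= max_attempts:
        return None                     # give up, frame dropped
    n = min(n_start + attempt - 1, n_max)
    return 2 ** n - 1
```

contention_window(1) returns 15, contention_window(7) returns 1023, and the eighth attempt returns None to signal a dropped frame.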

CWmin and CWmax

According to the IEEE 802.11 standard and its amendments, the Contention Window has a minimum and a maximum value.   The example above shows a CWmin of 15 (N = 4) and a CWmax of 1023 (N = 10).   This is true for any 802.11a/g/n/ac WLANs using the Best Effort or Background QoS Access Categories.

802.11 and 802.11b

Any WLAN that uses DSSS (802.11) or HR-DSSS (802.11b) technologies uses a slightly larger CWmin, with the N value set to 5. This means that the CWmin value for 802.11 and 802.11b WLANs is 31. The CWmax value is still 1023 (N = 10).

802.11e QoS:

The 802.11e amendment introduced the ability to prioritize traffic based on 4 different Access Categories. The goal in this case was to ensure that higher priority traffic gained access to the medium faster and ahead of traffic from lower priority Access Categories. The CWmin and CWmax values for the different access categories are shown below.

Access Category                               CWmin   CWmax
VOICE                                         3       7
VIDEO                                         7       15
BEST EFFORT (default for most WLAN traffic)   15      1023
BACKGROUND                                    15      1023

Slot Times

Before we get too far ahead of ourselves, it is sensible to cover how long it takes each STA to count down its randomly selected back-off. Each unit in the random back-off countdown lasts for a specific Slot Time, as shown in the table below:

PHY Type          Slot Time
DSSS / HR-DSSS    20μs
ERP-OFDM          9μs
OFDM              9μs
HT-OFDM           9μs
VHT-OFDM          9μs

It is also important to note that 2.4GHz 802.11g/n networks may only use the 9μs slot time when there are no DSSS / HR-DSSS STAs associated to the Basic Service Set. This information enables us to see the actual time delay introduced by the random back-off countdown.
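Putting the slot times together with the contention window sizes gives a feel for the actual delay cost of each countdown.  A rough sketch (the mean of a uniform draw from the window; names are mine):

```python
SLOT_TIME_US = {"DSSS": 20, "HR-DSSS": 20, "OFDM": 9,
                "ERP-OFDM": 9, "HT-OFDM": 9, "VHT-OFDM": 9}

def mean_backoff_delay_us(cw, phy="OFDM"):
    """Average random back-off delay: the mean of a uniform draw
    from 0..cw slots, multiplied by the PHY's slot time (us)."""
    return (cw / 2) * SLOT_TIME_US[phy]
```

On an OFDM PHY with CWmin = 15 the average back-off costs just 67.5μs, but after repeated collisions the worst case of 1023 slots is 1023 × 9μs ≈ 9.2ms.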

Summary

In this blog post, I really wanted to put a spotlight on the random back off timer and its specifications. I know I haven’t looked at the Interframe Spaces very carefully and I also have not looked at other aspects of traffic flow like TXOP etc. For now I am focused purely on looking at how the random back off affects WLAN contention. We have made some interesting observations and drawn some interesting conclusions:

  1. There is a chance that one STA can successfully transmit two frames or gain access to the medium twice before another STA using the same Contention Window value can gain access to the medium once.
  2. There is a chance that two STAs can pick the same random back off number and collide.
  3. When STAs suffer a collision and don’t receive an ACK they double the size of the contention window and re-contend for the medium.
  4. STAs in the same BSS can all be using different Contention Window values at the same time, depending on the result of their most recent attempted frame transmission.
  5. STAs will re-transmit up to 7 times, using a contention window of up to 1023 slots (up to 9.2 milliseconds!) before giving up and dropping the packet.
  6. Once an STA has successfully transmitted a frame and received an ACK, it goes back to the original Contention Window (CWmin) and commences the process again for the next packet.
  7. In QoS-enabled networks, Voice and Video traffic use smaller contention windows than Best Effort and Background traffic to gain faster access to the medium, and they do not increase the N value by more than 1 increment on re-transmission.

 

Now that we know the above things, we can start to build some simple models that show the likelihood of a collision under different circumstances. That will come in the next blog post though!

Rob

Categories
WLAN general

What Does 802.11 Contention Look Like? (Part 1)

The IEEE 802.11 Wireless LAN protocol uses Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) with a fairly robust arbitration mechanism to allow WLAN Stations (STAs) to gain access to the wireless medium based on their traffic priority.

The central problem with CSMA/CA and the 802.11 arbitration mechanisms is that the chances of a collision occurring between two stations increases seemingly exponentially with the number of active stations contending for the wireless medium.  This means that as more and more stations use a given channel, the efficiency of the channel decreases quite dramatically.

Many engineers I have met have questioned this and have wondered why the IEEE Standards Body and 802.11 Working Group did not consider more “efficient” methods of medium access.  Why not TDMA, CDMA or OFDMA?  Now, I am not here to get into a discussion about which medium access method the IEEE or vendors should have chosen to develop way back when in 1990-whatever.  Maybe we can discuss channel access methods and their properties on another day after I have read more on the topic!   The truth is though, the IEEE 802.11 Working Group defined several methods of medium access. As it turns out, vendors only ever really developed their products to use the Distributed Co-ordination Function (DCF) and its 802.11e-defined successor, the Enhanced DCF Channel Access (EDCA) method, which forms part of the Hybrid Co-ordination Function (HCF).

I have often wondered as a WLAN engineer what CSMA/CA actually LOOKS LIKE in the real world.  As designers of Wireless LANs we are often asked to perform capacity calculations and predict what performance level a given design will achieve for a specified application type.  We have the tools to perform a coverage analysis very well, but sadly the tools for capacity prediction fall short of expectation in my opinion.

I agree that there are some good tools out there that get us into the right ballpark and remove a fair amount of guesswork.  But I have yet to come across any tool that can accurately model the effects of WLAN contention on capacity from first principles.  Most capacity calculators and models I have used account for the effects of contention with a simple “RF environment” setting that reduces the airtime efficiency by a certain factor to allow for higher collision rates in noisier and busier environments.  While this gets us closer to an estimated value of capacity, it still does not give us a definitive answer.  It also IGNORES the underlying mechanics of what is truly going on, making our calculations susceptible to unforeseen errors.  Don’t get me wrong – errors are present in all calculations and can be worked around, but we need to know what they are, or at least see where they may come from!  If we are pushed to reveal the source of some of the values used in our current capacity calculators, we often have to admit that they are based largely on empirical evidence, or past experience.  Sometimes that will work, other times it might not.

There are also newer technologies like 802.11ax with OFDMA channel access methods that promise to introduce a new “High Efficiency” mode of operation to the 802.11 Standard.  It would be useful then to understand exactly why and under what circumstances the current standard and its amendments are inefficient!  We also need to build a model that can show the potential improvements brought forth by new amendments.

This is the source of  my motivations for starting this blog series.  And before you say it…

Yes, I know The Birthday Problem

When you mention the term 802.11 Contention or CSMA/CA most WLAN engineers will talk about THE BIRTHDAY PROBLEM, and we all understand that this calculates the overall chance of a collision given a certain number of clients.  But it still doesn’t really give us a clear picture of how it works.  It’s kind of like saying “There is a 15% chance of rain today”, which is useful to tell whether to walk outside with an umbrella, but it doesn’t give us any insight into how the clouds form, or when it will rain, or for how long, or if there will be a monkey’s wedding, or if it will involve a romantic couple kissing in a park.  Ok, maybe I took it too far, but you get my gist.  We want details! Details are important.  Details bring insight and understanding.

So what am I going to attempt?

In this blog series I intend to try and model the 802.11 arbitration process to show how certain factors can affect contention overhead in a Wireless LAN.

I will spend some time outlining the mechanics of what I am trying to model and some of the various approaches I have taken to visualize it.  I will also show how those methods are limited and why they are only an approximation of reality.

Let’s start (and end Part 1) by defining exactly what I am trying to visualize.

I want to explore what 802.11 contention LOOKS LIKE in the real world.  Given N active stations using a given WLAN channel:

  1. What is the likelihood overall of one (or more) collisions occurring given N active clients? (This is the answer given by The Birthday Problem).
  2. What is the likelihood of a specific station experiencing a collision at any given time?
  3. How many collisions should we expect a client to have before successfully transmitting a packet?
  4. Does the number of collisions fluctuate, or settle into a relatively stable dynamic equilibrium?
  5. Can the number of collisions “snowball” into a state where the medium becomes unusable?
  6. What (really) is the maximum number of active clients that can be supported on a single WLAN channel before the system becomes unstable and the protocol breaks down?
  7. How do different 802.11 PHY technologies (e.g. DSSS, HR-DSSS) affect the number of collisions?
  8. How does traffic volume affect the likelihood of a collision?
  9. What is the effect of an unequal split of upstream vs. downstream traffic on the number of collisions?
  10. How does setting QoS values alter the number of collisions?
  11. How can we characterize and model application traffic streams?
  12. How do airtime efficiency and the negotiated PHY Data Rate affect the number of collisions?
  13. How does contention affect our WLAN capacity calculations?
  14. Most importantly, given this insight how can we minimize the number of collisions to get the best possible network capacity?

Right, well that’s a pretty solid list of items to try and answer. I am not sure we will be able to get to all of them at once, but now that we’ve written down what we want to see, we can move on to Part 2!

Rob

Categories
WLAN general

Wi-Fi CSMA/CA – Going Deep

In my career as a Wireless Engineer I have read a whole bunch of articles on Wireless Arbitration and how it works, and, to be perfectly honest, I have never really looked deep enough to fully understand how the carrier medium is actually marked as busy by the PHY and MAC layers.

I learnt pretty early on that at the beginning of the transmission of a frame, there is a field somewhere that tells all the other stations to stop transmitting for a given period of time.  But I glossed over where exactly it was placed because, well, it just did not seem that important.  The medium gets marked as busy, I send my frame at my chosen rate, and then everyone waits politely until I get my ACK, and we all start again after a nice communal DIFS. Right? Simple!

Hah!  As they say: the devil is in the details.

I must admit that I never fully understood Andrew Von Nagy’s post Understanding Wi-Fi Carrier Sense until now.  It all finally came together for me whilst studying for my CWAP exam and reading the IEEE 802.11-2012 standard alongside his post. So kudos to Andrew!

HERE IS THE THING.

The implementation of Wi-Fi carrier sense and how it works is actually very important. This is, in my view, one of the MOST critical things to understand when designing a WLAN. It will affect your network in many ways, including the size of the contention domain, the presence and effect of hidden nodes in your network, and the effects of minimum rate selection when you commence with your attempts to optimise your design.

I don’t want to re-hash what other people have written so much about (and I feel many have done a great job before me) but I do want to focus on the parts that I failed to fully grasp early on, in the hope that it will help someone else get the penny to drop.

So if you have done any reading on this topic (and you can read the articles above for free) you will know that a wireless station is not allowed to transmit until it has determined the Physical Medium to be Idle (Clear Channel Assessment is determined to be True) AND the Network Allocation Vector must be zero.  But how does all this actually work?!

Ok we are going to get into this, but first!

CCA – Energy Detect

802.11 is a polite protocol.  Stations in an 802.11-based WLAN are required to wait for other stations to finish their transmissions before sending their own. They are supposed to share.  But what about scenarios where someone else, who is not using Wi-Fi, is using the same medium?  What about a Bluetooth radio? Or something using Zigbee? Or one of those dreaded 5GHz DECT phones?  Brace yourselves friends, the IoT is coming…

(*cough* Please note my subtle joke about LTE-U and LAA/LWA *cough*) 

In the event that there is some signal that cannot be decoded by the station, the 802.11 standard provides for polite interoperation with the unknown entity in the form of Clear Channel Assessment  using Energy Detect.

If there is any signal above the specified Energy Detect (ED) threshold, 802.11 based or otherwise, the  affected station detecting the energy in the channel must mark the channel as busy and wait for it to stop.

The Energy Detect threshold is typically calculated as a number ABOVE the minimum required receive sensitivity of the radio.  In 802.11, the original requirement for receive sensitivity was to be able to receive 2Mbps (using DQPSK) at an RSSI of -80dBm with a given error rate (something tiny).  In 802.11a and beyond, the ED threshold was set to 20dB ABOVE the minimum receive sensitivity laid out in the standard.

In the original 802.11 standard the ED threshold was defined as:

  • -80dBm for stations using a transmit power of 100mW or more.
  • -76dBm for stations using a transmit power of more than 50mW (up to 100mW)
  • -70dBm for stations using a transmit power of less than or equal to 50mW

In later amendments the threshold was changed as follows:

  • 802.11b (HR-DSSS): -76dBm, -73dBm and -70dBm respectively, following the same pattern as defined for DSSS above
  • 802.11a/g/n/ac: -62dBm (using a 20MHz channel)

Vendors will typically implement an ED threshold of just less than -62dBm to be compliant with the standard.  Using a metric of -65dBm in your designs for an ED threshold is quite reasonable.

So, in this scenario, any received signals above the ED threshold will cause the channel to be marked as busy and stations will defer access.   This is called CCA – Energy Detect.
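As pseudocode the ED check is almost trivially simple, which is rather the point: it needs no knowledge of the interferer at all.  A sketch (the -62dBm default assumes a 20MHz OFDM channel as above):

```python
def cca_energy_detect_busy(rssi_dbm, ed_threshold_dbm=-62):
    """CCA - Energy Detect: any energy at or above the threshold,
    802.11 or otherwise, marks the channel busy."""
    return rssi_dbm >= ed_threshold_dbm
```

A Bluetooth burst arriving at -55dBm defers the station; the same burst at -70dBm is simply ignored by the ED check.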

CCA – Carrier Sense (CS)

Now that we know when to defer to very loud and/or non-802.11 radio activity, let’s see how 802.11 stations interact with each other.

If you were a receiving station, with nothing to say, you would be in IDLE mode and you would be listening to the communication medium on your chosen channel.

Let’s assume another station on your channel starts transmitting.

NOTE: This can be ANY station, not just another in your Basic Service Set or an STA sending something addressed to you.  It can literally be any station using the same channel as you and whose transmissions you can successfully decode.

The transmitting station starts by sending the appropriate PLCP preamble for the chosen PHY type (these are just waveforms and training sequences that help receivers sync up with the transmission and let them know of the imminent arrival of an 802.11 transmission).

The next field to follow straight after the preamble is the PLCP Header.  In the PLCP Header there is a SIGNAL field that contains two pieces of information:

  1. The length (in octets) of the coming 802.11 frame
  2. The data rate or modulation and coding scheme of the data to follow.

NOTE: For legacy DSSS (802.11) and HR-DSSS (802.11b) radios, the PLCP Header is preceded by a Start of Frame Delimiter of 16 bits.  The DSSS / HR-DSSS PLCP Header contains a Length Field defining the period of time in microseconds that the channel will be busy for.  The newer PHY types (802.11a/g/n/ac) simply provide the frame length in bytes and the PHY Rate and let the receiving stations figure out how long to be quiet for.

Any receiving station that can decode the PLCP Header and the SIGNAL field must maintain Physical Medium Busy status for the duration specified by the two parameters above.  This is called Physical Carrier Sense.  The station heard a PLCP Header it could decode, so it establishes that the channel is busy and shuts up.
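For the newer PHY types, the busy time implied by those two SIGNAL-field values is a straightforward division.  This sketch ignores preamble time and OFDM symbol padding, so treat it as a first approximation (the function name is mine):

```python
def plcp_busy_time_us(length_octets, rate_mbps):
    """Approximate time (us) the medium stays busy for the frame body,
    derived from the SIGNAL field's length and rate values.
    bits divided by Mbit/s gives microseconds."""
    return (length_octets * 8) / rate_mbps
```

A 1500-byte frame at 6Mbps occupies the channel for roughly 2000μs; the same frame at 54Mbps takes only about 222μs.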

It is important to know:  The PLCP Preamble and the PLCP Header are sent AT THE LOWEST RATE using THE MOST ROBUST MODULATION SCHEME.  This basically ensures that ANY other station within range of the transmitting station will be silenced by the PLCP Preamble and Header.

A Common Misconception:

The IEEE 802.11 standard and its amendments (a/g/n/ac) define the minimum required signal level at which PLCP Preambles and PLCP Headers must be decoded for 802.11a/g/n/ac networks as -82dBm.

Many people misinterpret this to mean that below -82dBm, the CCA stops having any effect and I can ignore incoming transmissions even if I can decode them…  THIS IS A MISTAKE!

6 Mbps at a signal level of -82dBm is the minimum receive sensitivity required to comply with the 802.11 standard.  Most, if not ALL, enterprise WLAN equipment has receive sensitivities that are vastly better than this required value.  Some even go lower than -95dBm!

In practical terms, this could mean that the signals from your APs are travelling four times further than you originally anticipated, and your contention domain is a whole LOT larger than you originally thought…

The point here is that if a station can detect a preamble and decode it at ANY signal level, the station is obligated to indicate that CCA is false and the medium is busy.
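The “four times further” claim follows from the free-space path loss model, where distance contributes 20·log10(d) to the loss.  A quick sketch of the arithmetic (assuming free-space conditions, which is generous indoors):

```python
def free_space_range_ratio(extra_link_budget_db):
    """Distance multiplier gained from extra link budget under
    free-space path loss (20 dB per decade of distance)."""
    return 10 ** (extra_link_budget_db / 20)
```

The gap between the -82dBm minimum and a real-world -95dBm sensitivity is 13dB, which works out to roughly 4.5x the decode range.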

Duration / ID and the Network Allocation Vector

After the PLCP Header is received, the 802.11 frame itself actually starts arriving at the receiving stations.  The 802.11 frame is transmitted at the PHY rate specified in the PLCP Header.

The 802.11 frame contains the MAC Header, in which we have a Duration/ID field.  This Duration/ID field is used to tell the wireless stations on the same channel about *future frames* that may follow as part of the MAC protocol in use with this frame transmission.   For example, we know that if we are using the good old DCF, any frame transmission will be followed by a SIFS and an ACK.  The Duration field in this case provides a time value in microseconds equal to the time taken to wait for a SIFS and receive the ACK, keeping the medium reserved for the required frame exchange to be successfully completed.

ALL stations who heard the MAC Header in the transmission will now update something called the Network Allocation Vector to add in the extra time needed to remain silent for future frame exchanges. This is called Virtual Carrier Sense.  
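The NAV itself behaves like a simple reservation timer.  This toy class (my own sketch, not an implementation from the standard) captures the two rules that matter: the NAV only ever extends, and the medium is virtually busy until it expires:

```python
class Nav:
    """Virtual Carrier Sense sketch: the NAV holds the time (us)
    until which the medium is reserved for an ongoing exchange."""

    def __init__(self):
        self.expires_at_us = 0

    def update(self, now_us, duration_us):
        # Only ever extend the reservation, never shorten it.
        self.expires_at_us = max(self.expires_at_us, now_us + duration_us)

    def is_busy(self, now_us):
        return now_us < self.expires_at_us
```

A station that hears a Duration value of 100μs at time 0 treats the medium as virtually busy until 100μs, even if a later, shorter Duration value arrives in between.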

 

In Summary:

Accessing the medium is made up of three parts:

  1. CCA – Energy Detect: Is anything currently occupying the channel above the specified Energy Detect threshold?
  2. CCA – Carrier Sense: Have I detected / decoded any 802.11 frames on my channel that are currently being transmitted?
  3. NAV:  Am I waiting for other frames to be sent as part of an exchange between other STAs on my channel?

STAs can only enter into contention for the channel once all three checks indicate that the medium is idle.

Some important notes to remember here:

  1. The PLCP Header is always sent at the lowest PHY rate defined in the IEEE 802.11 standard for that PHY Type regardless of what you set your minimum rate to (more on this later).
  2. The MAC Header containing the Duration Field is sent at the PHY Rate / MCS Index defined for the 802.11 frame in the PLCP header.
  3. The information contained in the PLCP Header and the MAC Header is processed by ALL stations on the same channel that are near enough to successfully decode it.
  4. Channel overlap and contention due to CCA – Carrier Sense may be occurring over a MUCH bigger domain than you originally anticipated.  6Mbps at -82dBm receive sensitivity is THE MINIMUM requirement for 802.11a/g/n/ac.  Check your APs’ receive sensitivity tables to see just how far away a PLCP Header might be heard!
  5. Acknowledgement to Andrew Von Nagy over at Revolution Wi-Fi for the remark in his blog about the NAV being used for Future Frames. That was when the penny dropped!

Finally, I would like to acknowledge Keith R Parsons and Jared Griffith and the larger twitter Wi-Fi community for providing me with the discussions necessary to wrap my head around these concepts. Thanks all!

Hope this helped,

Rob