RouterOS L3HW Caution

Introduction

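In December 2021, RouterOS underwent a major change with the release of ROS7 (RouterOS v7). The first ROS7 release introduced a number of new, heavyweight features and also rebuilt several existing ROS6 features from scratch. For me, the most significant change is "L3 Hardware Offloading", or L3HW.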

Cloud Router Switch

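MikroTik switches fall into roughly two groups: the CSS and RB260 series, which run only SwOS (SwitchOS), and the CRS series, whose models either run both ROS (RouterOS) and SwOS or run ROS only.
If you have used or are using a CRS switch, you already know that the "Router" in this thing's name is pure false advertising. Before ROS7, these so-called "Cloud Router Switch" devices were only good for Layer 2 traffic. That is not to say they could not handle Layer 3 traffic at all. Strictly speaking they could, but having a feature and having a feature that is usable are two entirely different things.
To illustrate, I use CRS300-series devices as examples. In the CRS200/CRS100 series every model except the CRS112-8P-4S-IN has been discontinued, so I do not cover them. The rest of this article discusses only the CRS300/CRS500 series, which are still on sale.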

Model              CPU architecture   CPU model   Switch chip model   Switch chip vendor   Port configuration
CRS326-24G-2S+RM   ARM32              98DX3236    98DX3236            Marvell              24 x RJ45 + 2 x SFP+
CRS326-24S+2Q+RM   MIPSBE             QCA9531     98DX8332            Marvell              24 x SFP+ + 2 x QSFP+

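Before ROS7, ROS had no hardware offloading for Layer 3 traffic at all, so Layer 3 traffic could only be processed by the switch's CPU. ROS does have Fastpath, which reduces the per-packet work the CPU has to do, but Fastpath comes with many prerequisites and restrictions, and a typical deployment can rarely take advantage of it. The Layer 3 performance of a CRS switch therefore depends on its processor. Take the CRS326-24G-2S+RM: its processor is the dual-core 800 MHz ARMv7 built into the Marvell chip, and the clock speed is passable. With Fastpath enabled, its large-packet routing performance is a dazzling 1266.6 Mbps, or 104.3 kpps. And that is with 1518-byte packets. With 64-byte packets it is even more dazzling: a full 175.9 Mbps.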
Now let's see what the CRS326-24S+2Q+RM, billed as "Our fastest switch for the most demanding setups", can do: with Fastpath enabled, it routes 451.8 Mbps with 1518-byte packets and 74.2 Mbps with 64-byte packets.
In short, Layer 3 performance on the CRS series is dismal.
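The Mbps and kpps figures are two views of the same measurement, so one can be checked against the other. Below is a minimal Python sketch of the conversion. It assumes the Mbps value counts the full Ethernet frame and excludes preamble and inter-frame gap, which is an assumption about how the vendor reports results; `packets_per_second` is a helper named here purely for illustration.

```python
def packets_per_second(mbps: float, frame_bytes: int) -> float:
    """Convert throughput in Mbps to packets per second for one frame size.

    Assumption: the Mbps figure counts frame bytes only
    (no preamble, no inter-frame gap).
    """
    bits_per_frame = frame_bytes * 8
    return mbps * 1_000_000 / bits_per_frame


# Fastpath routing figures quoted in the text: (model, frame size, Mbps)
figures = [
    ("CRS326-24G-2S+RM", 1518, 1266.6),
    ("CRS326-24G-2S+RM", 64, 175.9),
    ("CRS326-24S+2Q+RM", 1518, 451.8),
    ("CRS326-24S+2Q+RM", 64, 74.2),
]

for model, size, mbps in figures:
    kpps = packets_per_second(mbps, size) / 1000
    print(f"{model} {size:>4} B: {mbps:>7.1f} Mbps = {kpps:.1f} kpps")
```

Under that assumption, 1266.6 Mbps at 1518 bytes works out to roughly 104.3 kpps, and 175.9 Mbps at 64 bytes to roughly 343.6 kpps. Small packets expose the CPU limit far more harshly than large ones.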

RouterOS v7

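ROS7 introduced hardware offloading for Layer 3 traffic. As long as a ROS device contains a chip that can process Layer 3 traffic, ROS can use it to forward Layer 3 traffic at wire speed. Sounds great, right?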

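Reality is not as rosy as MikroTik makes it sound. Since the very first ROS7 release, L3HW bug fixes have shown up in almost every update, and in early versions L3HW could even cause catastrophic failures. Here are the L3HW-related changelog entries from the first release, ROS 7.1, up to 7.15.2, the latest stable version at the time of writing.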

Version   Changelog entries (original text)
7.1 support for Layer 3 hardware acceleration on all CRS3xx devices
7.1.1 fixed HW offloaded routing when using 7 or more VLAN interfaces; fixed bonding source MAC address; improved system stability when using 7 or more VLAN interfaces
7.2 added HW offloaded FastTrack support for inter-VLAN routing; fixed HW offloaded NAT; fixed HW offloaded routing when using 7 or more VLAN interfaces; fixed ICMP message when routed packet exceeds MTU and DF flag is set; fixed bonding source MAC address; fixed default route offloading for CRS305, CRS326-24G-2S+, CRS328, netPower, netFiber devices; improved routing table offloading for CRS305, CRS326-24G-2S+, CRS328, netPower, netFiber devices; improved system stability when using 7 or more VLAN interfaces
7.2.2 improved offloading for directly connected hosts on CRS305, CRS326-24G-2S+, CRS328, CRS318, CRS310; improved route table offloading for CRS317, CRS309, CRS312, CRS326-24S+2Q+, CRS354, CRS5xx, CCR2x16 devices
7.3 greatly improved route offloading speed; improved offloading for directly connected hosts on CRS305, CRS326-24G-2S+, CRS328, CRS318, CRS310; improved offloading in cases of HW table overflow for CRS305, CRS326-24G-2S+, CRS328, CRS318, CRS310; improved route table offloading for CRS317, CRS309, CRS312, CRS326-24S+2Q+, CRS354, CRS5xx, CCR2x16 devices; log HW routes count and the shortest offloaded subnet prefix if the HW memory gets full; offload only main routing table; optimized offloading when dealing with large volume of directly connected hosts; partial routing table offload for Marvell Prestera DX4000/DX8000 switch chip series
7.5 fixed HW offloaded NAT
7.6 added “l3hw-settings” sub menu under the switch menu; added support for IPv6 route offloading (disabled by default); fixed “H” flag presence for accelerated connection tracking entries; fixed possible packet loss when using HW offloaded NAT; improved connected host offloading on startup; improved connected IPv6 host offloading when routing table is nearly full for 98DX224S, 98DX226S, and 98DX3236 switch chips
7.7 fixed host offloading in a case of MAC address change; fixed offloaded NAT for CRS309 switch; improved system stability when disabling or enabling L3HW offloading
7.8 added destination MAC address check for offloaded FastTrack connections
7.9 improved route offloading for 98DX224S, 98DX226S, and 98DX3236 switch chips
7.10 added “autorestart” option to L3HW settings; added advanced configuration options for fine-tuning the L3HW offload (l3hw-settings are cleared after upgrade or downgrade) (CLI only); added monitoring options for L3HW utilization (CLI only); fixed /32 route deletion; fixed IPv6 ECMP route offloading; fixed offloading of /32 IPv4 and /128 IPv6 routes; fixed route table offloading during large volume of route updates; improved host and nexthop offloading; improved offloading of IPv6 hosts after L3HW driver restart; improved performance of partial offloading; improved route offloading after gateway change; improved system stability for partial routing table offload
7.11 changed minimal supported values for “neigh-discovery-interval” and “neigh-keepalive-interval” properties; fixed /32 and /128 route offloading after nexthop change; fixed incorrect source MAC usage for offloaded bonding interface; improved system responsiveness during partial offloading; improved system stability during IPv6 route offloading; improved system stability
7.12 fixed IPv6 route suppression; improved system stability during IPv6 route offloading; prioritize local IP addresses over the respective /32 and /128 routes
7.13 fixed routing for IPsec encapsulated packets
7.14 fixed IPv6 host offloading in certain cases; fixed neighbor offloading after link flap; preserve offloading for VLANs when bridge ports are down

L3HW

Here are the key points to keep in mind when enabling L3HW.

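The most important point first: ROS offloads only the "main" routing table, that is, VRF "main". Routing for any other VRF cannot coexist with L3HW ports, meaning ports configured with "l3-hw-offloading=yes" in the switch menu; traffic that arrives on such a port is routed incorrectly. If L3HW is active for another VRF, routes in that VRF fail in unpredictable ways.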

L3HW supports only two bonding modes: "802.3ad" and "balance-xor".
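This restriction is easy to audit before enabling L3HW. The following is a minimal Python sketch, assuming the bonding interfaces have already been exported into a plain mapping of interface name to mode; the function `unsupported_bonds` and the sample interface names are hypothetical and not part of any MikroTik tooling.

```python
from __future__ import annotations

# Bonding modes that L3HW can offload, as stated in the text.
L3HW_BONDING_MODES = {"802.3ad", "balance-xor"}


def unsupported_bonds(bonds: dict[str, str]) -> list[str]:
    """Return the names of bonding interfaces whose mode L3HW cannot offload."""
    return sorted(
        name for name, mode in bonds.items() if mode not in L3HW_BONDING_MODES
    )


# Hypothetical inventory: interface name -> configured bonding mode
bonds = {
    "bond-uplink": "802.3ad",
    "bond-storage": "balance-xor",
    "bond-mgmt": "active-backup",
}

for name in unsupported_bonds(bonds):
    print(f"{name}: mode {bonds[name]!r} is not offloaded by L3HW")
```

Any interface the sketch reports would need its mode changed, or its traffic would have to be left to the CPU.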

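MikroTik does not recommend offloading IPv6 routes with L3HW, because IPv6 routes consume a large share of the available hardware forwarding resources. The official recommendation is to enable L3HW for IPv4 and leave IPv6 traffic to the CPU.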

As with Fastpath, take extra care when configuring the firewall, and check whether each firewall feature you use is supported by L3HW.

On Marvell Prestera DX2000 and DX3000 series chips, only the last two hex digits (the final octet) of a port's MAC address can be changed.
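In practice this means a custom port MAC address on these chips has to keep the first five octets unchanged. Below is a minimal Python sketch of that check; the helper `differs_only_in_last_octet` is hypothetical, and the addresses are made-up, locally administered examples.

```python
def differs_only_in_last_octet(base_mac: str, port_mac: str) -> bool:
    """Return True if both MAC addresses share their first five octets.

    Accepts ':' or '-' as separators and ignores letter case.
    """
    base = base_mac.lower().replace("-", ":").split(":")
    port = port_mac.lower().replace("-", ":").split(":")
    if len(base) != 6 or len(port) != 6:
        raise ValueError("expected a MAC address with six octets")
    return base[:5] == port[:5]


# Made-up example addresses (locally administered range)
base = "02:00:00:AA:BB:01"
print(differs_only_in_last_octet(base, "02:00:00:aa:bb:7f"))  # same first five octets
print(differs_only_in_last_octet(base, "02:00:00:aa:cc:01"))  # fifth octet differs
```

The first call prints `True` and the second prints `False`, since the second address changes the fifth octet.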

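MikroTik CRS switches and CCR routers use many different built-in switch chips, and TCAM size varies from chip to chip. As a result, the supported number of NAT entries, IPv4/IPv6 route entries, and QoS policy entries varies as well. When configuring a device, check that you stay within the limits of its chip.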