Introduction
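In December 2021, RouterOS went through a major change with the release of ROS7. The first ROS7 version introduced a number of new, heavyweight features, and also tore down and rebuilt several features inherited from ROS6. For me, the most notable change is "L3 Hardware Offloading", or L3HW.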
Cloud Router Switch
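MikroTik switches fall roughly into two categories: the CSS and RB260 series, which can only run SwOS (SwitchOS), and the CRS series, whose models either run both ROS (RouterOS) and SwOS or run ROS only.
If you have used a CRS switch, or are using one now, you surely know that the "Router" in this junk's name is pure false advertising. Before ROS7 existed, these switches branded "Cloud Router Switch" were only good for Layer 2 traffic. That is not to say they could not handle Layer 3 traffic at all. Strictly speaking they could, but having a feature and having a feature that is actually usable are two entirely different things.
To illustrate, I will use CRS300 series devices as examples. In the CRS200/CRS100 series, every model except the CRS112-8P-4S-IN (which is still sold) has been discontinued, so I will not cover them. The rest of this article is likewise limited to the CRS300/CRS500 series models that are still on sale.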
Model | CPU architecture | CPU model | Switch chip model | Switch chip vendor | Port configuration |
---|---|---|---|---|---|
CRS326-24G-2S+RM | ARM32 | 98DX3236 | 98DX3236 | Marvell | 24*RJ45 + 2*SFP+ |
CRS326-24S+2Q+RM | MIPSBE | QCA9531 | 98DX8332 | Marvell | 24*SFP+ + 2*QSFP+ |
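Before ROS7 was released, ROS had no hardware offloading for Layer 3 traffic at all, so Layer 3 traffic could only be processed by the switch's CPU. ROS does have Fastpath, which reduces the work the CPU does per packet, but Fastpath comes with many preconditions and restrictions, and in a typical deployment it is hard to get traffic onto the Fastpath. As a result, how well a CRS switch handles Layer 3 traffic depends on how fast its processor is. Take the CRS326-24G-2S+RM: its processor is the 800MHz ARMv7 CPU built into the Marvell chip, with two cores and a passable clock speed. With Fastpath enabled, its large-packet routing performance is a dazzling 1266.6 Mbps, or 104.3 kpps. And that is the routing performance for large 1518-byte packets. With 64-byte packets it is even more dazzling: a whopping 175.9 Mbps.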
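Now let's see what dazzling results come from the CRS326-24S+2Q+RM, billed as "Our fastest switch for the most demanding setups": with Fastpath enabled, routing performance is 451.8 Mbps with 1518-byte packets and 74.2 Mbps with 64-byte packets.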
In short, the Layer 3 performance of the CRS series is painful to look at.
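As a quick sanity check on the figures quoted above, the throughput numbers can be converted to packets per second. This is only a minimal sketch: it assumes the Mbps values count just the bytes of each Ethernet frame (no preamble or inter-frame gap), the function names are my own, and the 20 bytes of per-frame wire overhead used for the line-rate comparison is the standard Ethernet figure rather than anything taken from MikroTik's test sheets.

```python
def mbps_to_kpps(mbps: float, frame_bytes: int) -> float:
    """Convert throughput in Mbps (frame bytes only) to kilo-packets per second."""
    bits_per_frame = frame_bytes * 8
    return mbps * 1_000_000 / bits_per_frame / 1_000


def line_rate_kpps(link_gbps: float, frame_bytes: int) -> float:
    """Theoretical Ethernet line rate in kpps for a given link speed.

    Adds 20 bytes per frame for preamble, start delimiter and inter-frame gap.
    """
    wire_bits = (frame_bytes + 20) * 8
    return link_gbps * 1_000_000_000 / wire_bits / 1_000


# Fastpath routing figures quoted in the text: (model, frame size, Mbps)
figures = [
    ("CRS326-24G-2S+RM", 1518, 1266.6),
    ("CRS326-24G-2S+RM", 64, 175.9),
    ("CRS326-24S+2Q+RM", 1518, 451.8),
    ("CRS326-24S+2Q+RM", 64, 74.2),
]

for model, size, mbps in figures:
    print(f"{model} {size}B: {mbps_to_kpps(mbps, size):.1f} kpps")

# For comparison: what a single 1 Gbps port carries at line rate.
for size in (1518, 64):
    print(f"1 Gbps line rate {size}B: {line_rate_kpps(1, size):.1f} kpps")
```

Under those assumptions, the CRS326-24G-2S+RM routes roughly 104 kpps with 1518-byte packets and roughly 344 kpps with 64-byte packets, while a single gigabit port at line rate carries roughly 1488 kpps of 64-byte frames. In other words, the CPU cannot keep even one of the 24 gigabit ports busy with small packets.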
RouterOS v7
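ROS7 introduced hardware offloading for Layer 3 traffic. This means that as long as a ROS device contains a chip capable of processing Layer 3 traffic, ROS can use it to forward Layer 3 traffic at wire speed. Sounds lovely, doesn't it?
Reality is not as rosy as MikroTik makes it sound. Ever since the first ROS7 release, L3HW bug fixes have shown up in almost every update, and in early versions L3HW could even have catastrophic consequences. Let's look at the L3HW fixes from the first release, ROS 7.1, up to 7.15.2, the latest stable version at the time of writing.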
Version | Changelog entries (verbatim) |
---|---|
7.1 | support for Layer 3 hardware acceleration on all CRS3xx devices |
7.1.1 | fixed HW offloaded routing when using 7 or more VLAN interfaces; fixed bonding source MAC address; improved system stability when using 7 or more VLAN interfaces |
7.2 | added HW offloaded FastTrack support for inter-VLAN routing; fixed HW offloaded NAT; fixed HW offloaded routing when using 7 or more VLAN interfaces; fixed ICMP message when routed packet exceeds MTU and DF flag is set; fixed bonding source MAC address; fixed default route offloading for CRS305, CRS326-24G-2S+, CRS328, netPower, netFiber devices; improved routing table offloading for CRS305, CRS326-24G-2S+, CRS328, netPower, netFiber devices; improved system stability when using 7 or more VLAN interfaces |
7.2.2 | improved offloading for directly connected hosts on CRS305, CRS326-24G-2S+, CRS328, CRS318, CRS310; improved route table offloading for CRS317, CRS309, CRS312, CRS326-24S+2Q+, CRS354, CRS5xx, CCR2x16 devices |
7.3 | greatly improved route offloading speed; improved offloading for directly connected hosts on CRS305, CRS326-24G-2S+, CRS328, CRS318, CRS310; improved offloading in cases of HW table overflow for CRS305, CRS326-24G-2S+, CRS328, CRS318, CRS310; improved route table offloading for CRS317, CRS309, CRS312, CRS326-24S+2Q+, CRS354, CRS5xx, CCR2x16 devices; log HW routes count and the shortest offloaded subnet prefix if the HW memory gets full; offload only main routing table; optimized offloading when dealing with large volume of directly connected hosts; partial routing table offload for Marvell Prestera DX4000/DX8000 switch chip series |
7.5 | fixed HW offloaded NAT |
7.6 | added “l3hw-settings” sub menu under the switch menu; added support for IPv6 route offloading (disabled by default); fixed “H” flag presence for accelerated connection tracking entries; fixed possible packet loss when using HW offloaded NAT; improved connected host offloading on startup; improved connected IPv6 host offloading when routing table is nearly full for 98DX224S, 98DX226S, and 98DX3236 switch chips |
7.7 | fixed host offloading in a case of MAC address change; fixed offloaded NAT for CRS309 switch; improved system stability when disabling or enabling L3HW offloading |
7.8 | added destination MAC address check for offloaded FastTrack connections |
7.9 | improved route offloading for 98DX224S, 98DX226S, and 98DX3236 switch chips |
7.10 | added “autorestart” option to L3HW settings; added advanced configuration options for fine-tuning the L3HW offload (l3hw-settings are cleared after upgrade or downgrade) (CLI only); added monitoring options for L3HW utilization (CLI only); fixed /32 route deletion; fixed IPv6 ECMP route offloading; fixed offloading of /32 IPv4 and /128 IPv6 routes; fixed route table offloading during large volume of route updates; improved host and nexthop offloading; improved offloading of IPv6 hosts after L3HW driver restart; improved performance of partial offloading; improved route offloading after gateway change; improved system stability for partial routing table offload |
7.11 | changed minimal supported values for “neigh-discovery-interval” and “neigh-keepalive-interval” properties; fixed /32 and /128 route offloading after nexthop change; fixed incorrect source MAC usage for offloaded bonding interface; improved system responsiveness during partial offloading; improved system stability during IPv6 route offloading; improved system stability |
7.12 | fixed IPv6 route suppression; improved system stability during IPv6 route offloading; prioritize local IP addresses over the respective /32 and /128 routes |
7.13 | fixed routing for IPsec encapsulated packets |
7.14 | fixed IPv6 host offloading in certain cases; fixed neighbor offloading after link flap; preserve offloading for VLANs when bridge ports are down |
L3HW
Here are the key points to keep in mind when enabling L3HW.
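The most important point first: ROS only supports L3HW for VRF "main", that is, the main routing table. Routes in other VRFs cannot coexist with L3HW ports, meaning ports configured with "l3-hw-offloading=yes" under the switch menu. If L3HW is in play for traffic that belongs to another VRF, that traffic gets routed incorrectly in unpredictable ways.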
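L3HW only supports two bonding modes: "802.3ad" and "balance-xor".
MikroTik does not recommend offloading IPv6 routes with L3HW, on the grounds that IPv6 routes consume a large share of the available hardware forwarding resources. Its recommendation is to enable L3HW for IPv4 and leave IPv6 traffic to the CPU.
As with Fastpath, be extra careful when configuring the firewall, and check whether the firewall features you use are supported by L3HW.
On Marvell Prestera DX2000 and DX3000 series chips, only the last two digits of a port's MAC address can be changed.
MikroTik's CRS switches and CCR routers ship with many different chip models. TCAM size varies from chip to chip, and so do the supported numbers of NAT entries, IPv4/IPv6 route entries, and QoS policy entries. When configuring a device, take extra care not to exceed the limits of its chip.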
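The points above can be turned into a small pre-flight checklist. What follows is a hypothetical sketch, not a MikroTik tool: the `L3hwPlan` class, its field names, and `check_plan` are all invented for illustration, and the hardware route capacity has to be filled in by hand from the documentation for your specific switch chip.

```python
from dataclasses import dataclass, field

# Bonding modes that L3HW supports, as listed in the notes above.
SUPPORTED_BONDING_MODES = {"802.3ad", "balance-xor"}


@dataclass
class L3hwPlan:
    """Hypothetical description of a planned L3HW deployment."""
    routing_table: str = "main"
    bonding_modes: list[str] = field(default_factory=list)
    offload_ipv6: bool = False
    ipv4_routes: int = 0
    hw_route_capacity: int = 0  # assumption: look this up for your chip


def check_plan(plan: L3hwPlan) -> list[str]:
    """Return a list of warnings; an empty list means no known issue was found."""
    warnings = []
    if plan.routing_table != "main":
        warnings.append(f"routing table '{plan.routing_table}' is not 'main'")
    for mode in plan.bonding_modes:
        if mode not in SUPPORTED_BONDING_MODES:
            warnings.append(f"bonding mode '{mode}' is not supported by L3HW")
    if plan.offload_ipv6:
        warnings.append("IPv6 offloading eats into hardware forwarding resources")
    if plan.ipv4_routes > plan.hw_route_capacity:
        warnings.append("IPv4 route count exceeds the hardware capacity")
    return warnings


# Example with made-up numbers: every check fires once.
plan = L3hwPlan(
    routing_table="vrf1",
    bonding_modes=["802.3ad", "active-backup"],
    offload_ipv6=True,
    ipv4_routes=20_000,
    hw_route_capacity=10_000,
)
for warning in check_plan(plan):
    print("WARNING:", warning)
```

The firewall and MAC address points are left out on purpose, since checking them depends on details of the specific configuration and chip that a sketch like this cannot capture.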