Kagura iYoRoy
Building a Cross-Region K3s Cluster from Scratch - Ep.1 Calico No-Encapsulation CNI
# Preface

I've wanted to play with a K8s cluster for a long time, but always felt that without sufficient knowledge it would be too difficult to attempt. Recently I spent some time studying DN42 and routing protocols like BGP and OSPF, and realized that it no longer feels so difficult. So I decided to start with K3s.

The main reason for choosing K3s over K8s is its lightweight nature: low resource requirements, no need to pull a bunch of images for deployment, availability of domestic mirrors… In short, K3s suits my needs better.

I'm a beginner just starting to explore K3s, so please go easy on me if I make any mistakes~

# Analysis

## Choosing the CNI Component

My current network architecture looks like this:

```mermaid
graph TD
    subgraph ZeroTier Domestic
        subgraph WDS
            Gateway <--> VM1
            Gateway <--> VM2
        end
        NGB <--> Gateway
        HFE-NAS <--> Gateway
        NGB <--> HFE-NAS
    end
    subgraph IEPL
        Global-NIC <==OSPF==> CN-NIC
    end
    subgraph ZeroTier Global
        HKG02 <--> HKG04
        TYO <--> HKG04
        TYO <--> HKG02
    end
    CN-NIC <--> NGB
    CN-NIC <--> HFE-NAS
    CN-NIC <--OSPF--> Gateway
    Global-NIC <--OSPF--> TYO
    Global-NIC <--OSPF--> HKG02
    Global-NIC <--OSPF--> HKG04
    %% Style definition: orange background, bold border to represent routers
    classDef router fill:#f96,stroke:#333,stroke-width:2px,font-weight:bold;
    class Global-NIC,CN-NIC,Gateway router;
```

Here, the WDS node is a Proxmox VE host with multiple VMs underneath. It advertises its VMs' IPv4 prefixes via OSPF. When the Hong Kong nodes need to access a VM under the WDS node, they reach it over multiple hops through the OSPF internal network. This keeps the encapsulation layer count at one, so there's no need to worry about the MTU "disappearing act".

I plan to create two new VMs under WDS to serve as the master and a node (temporarily called KubeMaster and KubeNode-WDS1). Then HKG04 (temporarily called KubeNode-HKG04) will also join the K3s cluster as a node.
The simplest approach would be to use K3s's default Flannel as the CNI. However, Flannel is based on VXLAN, and stacking it on top of my existing internal network would lead to the following MTU "disappearing act":

```
Data packet -> Flannel VXLAN encapsulation -> ZeroTier encapsulation -> Physical link
```

The actual usable MTU for inter-container communication would likely be compressed to 1350 or even lower. Therefore, I looked for a CNI that can work directly on top of this internal network, and found Calico. As I understand it, Calico uses BGP as its underlying routing protocol, supports running in no-encapsulation (No-Encap) mode, and hands packets directly to the upstream routers for routing. Thus, I chose Calico as the CNI component.

## Routing Design

KubeMaster and KubeNode-WDS1 sit under the Proxmox VE host and need to exchange BGP routes with HKG04 across the entire internal network. For the intermediate routers to know how to route Pod IPs, every router at each intermediate level must learn the full set of BGP routes, so that the following routing path can be established:

```mermaid
graph LR
    subgraph WDS
        KubeMaster
        KubeNode-WDS1
        Gateway
    end
    subgraph IEPL
        CN-Namespace
        Global-Namespace
    end
    KubeNode-WDS1 <--> Gateway
    KubeMaster <--> Gateway <--> CN-Namespace <--> Global-Namespace <--> HKG04
    %% Style definition: highlight nodes with routing capability
    classDef router fill:#f96,stroke:#333,stroke-width:2px,font-weight:bold;
    class Gateway,CN-Namespace,Global-Namespace router;
```

Otherwise, any intermediate hop would drop packets because it doesn't recognize the source/destination IP.

Also, because iBGP does not re-advertise a route learned from one iBGP neighbor to another iBGP neighbor, all BGP sessions between the routers (Gateway, CN-Namespace, Global-Namespace) and the nodes need Route Reflector enabled; otherwise, the nodes cannot correctly learn routes from each other.
That said, this architecture would be more suitable for BGP Confederation, but my existing network is already quite complex, and adding BGP confederations would make later maintenance more troublesome. Moreover, my number of nodes is small, so the overhead of iBGP Full Mesh is acceptable. It's definitely not because I'm lazy.

Thus, the final network routing structure is as follows:

```mermaid
graph TD
    subgraph WDS
        VM1
        VM2
        Gateway
    end
    subgraph IEPL
        CN-Namespace
        Global-Namespace
    end
    VM1 <-.Calico iBGP Full Mesh.-> VM2
    VM1 <--iBGP Route Reflector--> Gateway
    VM2 <--iBGP Route Reflector--> Gateway <--iBGP--> CN-Namespace <--iBGP--> Global-Namespace <--iBGP Route Reflector--> HKG04
    Gateway <--iBGP--> Global-Namespace
    HKG04 <-.Calico iBGP Full Mesh.-> VM1
    VM2 <-.Calico iBGP Full Mesh.-> HKG04
    %% Style definition
    classDef router fill:#f96,stroke:#333,stroke-width:2px,font-weight:bold;
    %% Mark nodes with routing/forwarding or RR functions as Router
    class Gateway,CN-Namespace,Global-Namespace router;
```

The dashed-line BGP sessions are created automatically by Calico, while the solid-line ones need to be created manually. I keep Calico's own iBGP Full Mesh for future scalability, so that nodes can preferentially establish direct P2P connections via ZeroTier instead of taking a detour through the Route Reflector aggregation routers.

# Deployment

With the structure settled, deployment is straightforward.

## Enable Kernel Forwarding and Disable rp_filter

Standard practice:
```bash
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
echo "net.ipv6.conf.default.forwarding=1" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.forwarding=1" >> /etc/sysctl.conf
echo "net.ipv4.conf.default.rp_filter=0" >> /etc/sysctl.conf
echo "net.ipv4.conf.all.rp_filter=0" >> /etc/sysctl.conf
sysctl -p
```

## Install K3s

### Master

Because the KubeMaster control plane node is located inside China, it's best to configure image acceleration:

```bash
mkdir -p /etc/rancher/k3s
cat <<EOF > /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "https://docker.m.daocloud.io"
  quay.io:
    endpoint:
      - "https://quay.m.daocloud.io"
EOF
```

Install using the mirror:

```bash
curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | \
  INSTALL_K3S_MIRROR=cn INSTALL_K3S_EXEC=" \
  --flannel-backend=none \
  --disable-network-policy \
  --cluster-cidr=10.42.0.0/16" sh -
```

Note that `--flannel-backend=none` and `--disable-network-policy` must be specified to disable the default CNI component.

Use `cat /var/lib/rancher/k3s/server/node-token` to view the token and record it.

### Worker Nodes

For nodes inside China, configure image acceleration:

```bash
mkdir -p /etc/rancher/k3s
cat <<EOF > /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "https://docker.m.daocloud.io"
  quay.io:
    endpoint:
      - "https://quay.m.daocloud.io"
EOF
```

Then install K3s using the mirror and join the cluster:

```bash
export INSTALL_K3S_MIRROR=cn
export K3S_URL=https://<master node IP>:6443 # Replace with your master node's actual IP
export K3S_TOKEN=K10...your token...::server:xxx # Replace with the full token obtained in the first step
curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | sh -
```

At this point, the status of each node should be NotReady because the CNI component is missing.
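Before moving on to the CNI, it may be worth confirming that the forwarding and rp_filter lines from the start of this section actually landed in the sysctl file on every node. The following is a minimal sketch, not part of the original setup: `missing_sysctl_keys` is a hypothetical helper, and it assumes the settings were written exactly as `key=value` with no surrounding spaces, as the `echo` commands above do.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): print every required
# key=value line that is absent from the given sysctl-style file.
# Assumes exact "key=value" lines without spaces around "=".
missing_sysctl_keys() {
    file="$1"
    for kv in \
        "net.ipv4.ip_forward=1" \
        "net.ipv6.conf.default.forwarding=1" \
        "net.ipv6.conf.all.forwarding=1" \
        "net.ipv4.conf.default.rp_filter=0" \
        "net.ipv4.conf.all.rp_filter=0"
    do
        # -q quiet, -x whole-line match, -F fixed string (no regex)
        grep -qxF "$kv" "$file" || echo "$kv"
    done
}
```

Running `missing_sysctl_keys /etc/sysctl.conf` prints nothing when all five lines are present. It only checks the file, not the live kernel values, so `sysctl -p` is still needed after any change.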
## Install Calico and Configure No-Encap Mode

On the master, manually download https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/tigera-operator.yaml and install the Calico operator:

```bash
kubectl create -f tigera-operator.yaml
```

Configure a custom resource by creating a `custom-resource.yaml` file:

```yaml
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Add image registry configuration
  registry: quay.m.daocloud.io
  calicoNetwork:
    ipPools:
      - blockSize: 26
        cidr: 10.42.0.0/16
        encapsulation: None
        natOutgoing: Enabled
        nodeSelector: all()
```

Here, `encapsulation: None` enables No-Encap mode. You can also modify the IPv4 CIDR here if needed. Then perform the installation:

```bash
kubectl apply -f custom-resource.yaml
```

Check Pod status and wait for each node to finish pulling images:

```bash
kubectl get pods -A -o wide
```

## Configure BGP Topology

### Label Nodes

Label the nodes so that nodes under WDS peer with the BGP instance on the WDS Gateway, and nodes outside China peer with the BGP instance in the Global Namespace:

```bash
kubectl label nodes kubemaster region=WDS
kubectl label nodes kubenode-wds-1 region=WDS
kubectl label nodes kubenode-hkg04 region=Global
```

### Calico Configuration

Create a YAML configuration file:

```yaml
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
  name: route-reflector-domestic
spec:
  nodeSelector: region == 'Domestic' # This part is not actually used; I originally designed a general aggregation router in the Domestic area
  peerIP: 100.64.0.108
  asNumber: 64512
---
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
  name: route-reflector-wds
spec:
  nodeSelector: region == 'WDS'
  peerIP: 192.168.100.1
  asNumber: 64512
---
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
  name: route-reflector-global
spec:
  nodeSelector: region == 'Global'
  peerIP: 100.64.1.106
  asNumber: 64512
```

This means:

- All nodes with label `region` equal to `Domestic` will have a BGP session to 100.64.0.108 (the domestic aggregation router) using AS 64512
- All nodes with label `region` equal to `WDS` will have a BGP session to 192.168.100.1 (the Gateway for all VMs under the WDS node) using AS 64512
- All nodes with label `region` equal to `Global` will have a BGP session to 100.64.1.106 (the overseas aggregation router) using AS 64512

This achieves what is shown in the diagram: all VMs under the WDS node, including the master and KubeNode-WDS1, connect to the Gateway aggregation router of the WDS node, and all nodes in overseas areas connect to the overseas aggregation router.

### Configure Aggregation Router iBGP

This part is simply a matter of writing Bird configuration files (easy). Here are a few examples:

`k3s/ibgp.conf`:

```
function is_insider_as() {
    if bgp_path.len > 0 && !(bgp_path ~ [= 64512 =]) then {
        return false;
    }
    if net ~ [ 10.42.0.0/16{16,32} ] then {
        return true;
    }
    return false;
}

template bgp k3sbackbone {
    local as K3S_AS;
    router id INTRA_ROUTER_ID;
    neighbor as K3S_AS;
    ipv4 {
        table intra_table_v4;
        import filter { if is_insider_as() then accept; reject; };
        export filter { if is_insider_as() then accept; reject; };
        next hop self;
        extended next hop;
    };
    ipv6 {
        table intra_table_v6;
        import filter { if is_insider_as() then accept; reject; };
        export filter { if is_insider_as() then accept; reject; };
        next hop self;
    };
};

template bgp k3speers {
    local as K3S_AS;
    neighbor as K3S_AS;
    router id INTRA_ROUTER_ID;
    rr client;
    rr cluster id INTRA_ROUTER_ID;
    ipv4 {
        table intra_table_v4;
        import filter { if is_insider_as() then accept; reject; };
        export filter { if is_insider_as() then accept; reject; };
        next hop self;
    };
    ipv6 {
        table intra_table_v6;
        import filter { if is_insider_as() then accept; reject; };
        export filter { if is_insider_as() then accept; reject; };
        next hop self;
    };
};

include "ibgpeers/*";
```

`ibgpeers/backbone-cn.conf`:

```
protocol bgp 'k3s_backbone_cn_v4' from k3sbackbone {
    neighbor fd18:3e15:61d0:cafe:f001::1;
};
```

`ibgpeers/master.conf`:

```
protocol bgp 'k3s_master_v4' from k3speers {
    neighbor 192.168.100.251;
};
```

Main points: it's best not to enable Route Reflector between the aggregation routers, and remember to enable `next hop self`.

After everything is done, `kubectl get nodes` should show all nodes as Ready:

```
NAME             STATUS   ROLES           AGE     VERSION
kubemaster       Ready    control-plane   2d23h   v1.34.5+k3s1
kubenode-hkg04   Ready    <none>          11h     v1.34.6+k3s1
kubenode-wds-1   Ready    <none>          2d7h    v1.34.5+k3s1
```

Use `kubectl get pods -A -o wide` to view Pods:

```
NAMESPACE         NAME                                       READY   STATUS      RESTARTS        AGE     IP                NODE             NOMINATED NODE   READINESS GATES
calico-system     calico-kube-controllers-64fc874957-6bdlz   1/1     Running     0               5h38m   10.42.253.136     kubenode-hkg04   <none>           <none>
calico-system     calico-node-2qz82                          1/1     Running     0               4h24m   10.2.5.7          kubenode-hkg04   <none>           <none>
calico-system     calico-node-dhl2c                          1/1     Running     0               4h24m   192.168.100.251   kubemaster       <none>           <none>
calico-system     calico-node-nbpkj                          1/1     Running     0               4h23m   192.168.100.252   kubenode-wds-1   <none>           <none>
calico-system     calico-typha-7bb5db4bdc-rfpwg              1/1     Running     0               5h38m   10.2.5.7          kubenode-hkg04   <none>           <none>
calico-system     calico-typha-7bb5db4bdc-rwwr5              1/1     Running     0               5h38m   192.168.100.251   kubemaster       <none>           <none>
calico-system     csi-node-driver-jglwp                      2/2     Running     0               5h38m   10.42.64.68       kubenode-wds-1   <none>           <none>
calico-system     csi-node-driver-jqjsc                      2/2     Running     0               5h38m   10.42.253.137     kubenode-hkg04   <none>           <none>
calico-system     csi-node-driver-vk26s                      2/2     Running     0               5h38m   10.42.141.16      kubemaster       <none>           <none>
kube-system       coredns-695cbbfcb9-8fx4p                   1/1     Running     1 (7h27m ago)   2d23h   10.42.141.14      kubemaster       <none>           <none>
kube-system       helm-install-traefik-crd-5bkwx             0/1     Completed   0               2d23h   <none>            kubemaster       <none>           <none>
kube-system       helm-install-traefik-m9fgj                 0/1     Completed   1               2d23h   <none>            kubemaster       <none>           <none>
kube-system       local-path-provisioner-546dfc6456-dmn4g    1/1     Running     1 (7h27m ago)   2d23h   10.42.141.15      kubemaster       <none>           <none>
kube-system       metrics-server-c8774f4f4-2wkwh             1/1     Running     1 (7h27m ago)   2d23h   10.42.141.12      kubemaster       <none>           <none>
kube-system       svclb-traefik-999cddce-hpmcm               2/2     Running     6 (7h26m ago)   11h     10.42.253.134     kubenode-hkg04   <none>           <none>
kube-system       svclb-traefik-999cddce-q4225               2/2     Running     2 (7h27m ago)   2d22h   10.42.141.9       kubemaster       <none>           <none>
kube-system       svclb-traefik-999cddce-xmd64               2/2     Running     2 (7h26m ago)   2d6h    10.42.64.66       kubenode-wds-1   <none>           <none>
kube-system       traefik-788bc4688c-vbbhj                   1/1     Running     1 (7h27m ago)   2d22h   10.42.141.13      kubemaster       <none>           <none>
tigera-operator   tigera-operator-6b95bbf4db-vl46l           1/1     Running     1 (7h27m ago)   2d23h   192.168.100.251   kubemaster       <none>           <none>
```

Use `kubectl exec -it -n calico-system <calico-node-xxxx> -- birdcl s p` to check the status of Bird:

```
root@KubeMaster:~/kube/calico# kubectl exec -it -n calico-system calico-node-2qz82 -- birdcl s p
Defaulted container "calico-node" out of: calico-node, flexvol-driver (init), install-cni (init)
BIRD v0.3.3+birdv1.6.8 ready.
name                  proto    table    state  since     info
static1               Static   master   up     08:58:17
kernel1               Kernel   master   up     08:58:17
device1               Device   master   up     08:58:17
direct1               Direct   master   up     08:58:17
Mesh_192_168_100_251  BGP      master   up     08:58:33  Established
Mesh_192_168_100_252  BGP      master   up     08:59:00  Established
Node_100_64_1_106     BGP      master   up     12:57:44  Established
```

`ip r` shows the system routing table:

```
root@KubeMaster:~/kube/calico# ip r
default via 192.168.100.1 dev eth0 proto static
10.42.64.64/26 proto bird
        nexthop via 192.168.100.1 dev eth0 weight 1
        nexthop via 192.168.100.252 dev eth0 weight 1
blackhole 10.42.141.0/26 proto bird
10.42.141.9 dev caliac6501d3794 scope link
10.42.141.12 dev calib07c23291bb scope link
10.42.141.13 dev caliab16e60bd19 scope link
10.42.141.14 dev calid5959219080 scope link
10.42.141.15 dev cali026d8f1ddb7 scope link
10.42.141.16 dev califa657ba417a scope link
10.42.253.128/26 via 192.168.100.1 dev eth0 proto bird
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.251
```

Ping a Pod's IP; if everything is fine, it should work directly:

```
root@KubeMaster:~/kube/calico# ping 10.42.253.137
PING 10.42.253.137 (10.42.253.137) 56(84) bytes of data.
64 bytes from 10.42.253.137: icmp_seq=1 ttl=60 time=33.7 ms
64 bytes from 10.42.253.137: icmp_seq=2 ttl=60 time=33.5 ms
^C
--- 10.42.253.137 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 33.546/33.632/33.718/0.086 ms
```

## Tune MTU

This step is actually for stability…? Tests have shown that although my ZeroTier MTU is 1420, packets start to fragment around 1392 bytes (test with `ping -M do -s <packet size> <Pod_IP>`). Therefore, force the Pod MTU to 1370:

```
root@KubeMaster:~/kube/calico# cat patch-mtu.yaml
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    mtu: 1370
    nodeAddressAutodetectionV4:
      firstFound: true
root@KubeMaster:~/kube/calico# kubectl apply -f patch-mtu.yaml
installation.operator.tigera.io/default configured
```
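The fragmentation threshold mentioned in the MTU step can be located with a binary search instead of trying sizes by hand. The sketch below is only an illustration under stated assumptions: `max_payload`, `path_mtu`, and `probe` are hypothetical names, the search assumes the probe succeeds at the lower bound and that success is monotonic in payload size, and the real probe shown in the comment assumes Linux iputils `ping` with a `TARGET` variable you set yourself.

```shell
#!/bin/sh
# probe PAYLOAD must exit 0 when a packet with that ICMP payload size
# passes with the don't-fragment bit set. A possible real-world probe
# (assumption: iputils ping, TARGET set to a Pod IP):
#   probe() { ping -M do -c 1 -W 1 -s "$1" "$TARGET" >/dev/null 2>&1; }

# max_payload LO HI: largest payload in [LO, HI] for which probe succeeds.
# Assumes probe succeeds at LO and that larger sizes never succeed once
# a smaller size has failed.
max_payload() {
    lo="$1"; hi="$2"
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))   # round up so the loop always shrinks
        if probe "$mid"; then
            lo="$mid"
        else
            hi=$(( mid - 1 ))
        fi
    done
    echo "$lo"
}

# path_mtu PAYLOAD: ping's -s counts only ICMP payload, so add the
# 20-byte IPv4 header and the 8-byte ICMP header to get the packet size.
path_mtu() { echo $(( $1 + 28 )); }
```

With a probe defined, `path_mtu "$(max_payload 1000 1472)"` gives the largest IPv4 packet that crossed the path unfragmented. Choosing a Pod MTU somewhat below that value, as the 1370 above does, leaves headroom for paths that behave slightly differently.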
05/04/2026
[Fun Experiment] A LAN Spanning 20km: Seamlessly Merging Remote Networks on OpenWrt Using ZeroTier + OSPF
# Background

I was originally setting up my own ZeroTier "big internal network". Because the network structure is relatively complex, I decided to use OSPF instead of static routes for internal routing. I had tried to configure ZeroTier on my home OpenWrt before but never succeeded. Recently I picked it up again and discovered it was a configuration issue with OpenWrt. After fixing it, I was chatting with a good friend and had an idea:

> Kagura iYoRoy: 02-10 14:49:05
> Hey...
>
> Kagura iYoRoy: 02-10 14:49:06
> Then...
>
> Kagura iYoRoy: 02-10 14:49:20
> If you also set up OSPF on your router...
>
> Kagura iYoRoy: 02-10 14:49:27
> Our two home networks would be directly interconnected, huh?

Let's do it!

# Basic Information

## Local Side

- Router OS: OpenWrt, X-WRT 26.04_b202601250827
- LAN IPv4 Prefix: 192.168.3.0/24
- ISP: Hefei China Unicom
- NAT Environment: NAT1

## Remote Side

- Router OS: OpenWrt, X-WRT 25.04_b202510240128
- LAN IPv4 Prefix: 192.168.1.0/24
- ISP: Hefei China Mobile
- NAT Environment: NAT1

# Installing ZeroTier and Using a Self-Hosted Planet

I used ZTNet as the self-hosted Controller. The setup process won't be elaborated here, as you can find it online.

The OpenWrt version I'm using has switched from opkg to apk as the package manager. Use apk to install zerotier-one directly:

```bash
apk add zerotier
```

After completion, open `/etc/config/zerotier` to find the default configuration file:

```
config zerotier 'global'
	# Sets whether ZeroTier is enabled or not
	option enabled 0
	# Sets the ZeroTier listening port (default 9993; set to 0 for random)
	#option port '9993'
	# Client secret (leave blank to generate a secret on first run)
	option secret ''
	# Path of the optional file local.conf (see documentation at
	# https://docs.zerotier.com/config#local-configuration-options)
	#option local_conf_path '/etc/zerotier.conf'
	# Persistent configuration directory (to perform other configurations such
	# as controller mode or moons, etc.)
	#option config_path '/etc/zerotier'
	# Copy the contents of the persistent configuration directory to memory
	# instead of linking it, this avoids writing to flash
	#option copy_config_path '1'

# Network configuration, you can have as many configurations as networks you
# want to join (the network name is optional)
config network 'earth'
	# Identifier of the network you wish to join
	option id '8056c2e21c000001'
	# Network configuration parameters (all are optional, if not indicated the
	# default values are set, see documentation at
	# https://docs.zerotier.com/config/#network-specific-configuration)
	option allow_managed '1'
	option allow_global '0'
	option allow_default '0'
	option allow_dns '0'

# Example of a second network (unnamed as it is optional)
#config network
#	option id '1234567890123456'
#	option allow_managed '1'
#	option allow_global '0'
#	option allow_default '0'
#	option allow_dns '0'
```

Modify it according to your needs:

```
config zerotier 'global'
	option enabled '1' # Enable ZeroTier client service
	option config_path '/etc/zerotier' # Persistent directory: for storing identity secret, Moon node definitions, and network settings
	option secret '' # Leave secret blank: identity will be auto-generated on first run and saved to identity.secret
	option copy_config_path '1' # Flash protection policy: copy config to memory on startup. If set to 0, read/write directly to Flash

config network 'earth'
	option id '<network ID>' # 16-digit ZeroTier Network ID
	option allow_managed '1' # Allow receiving controller-assigned IPs, routes, and tags
	option allow_global '1' # Allow receiving globally routable IPv6 unicast addresses (GUA) via ZeroTier
	option allow_default '0' # Allow ZeroTier to take over the default gateway (similar to a global proxy)
	option allow_dns '1' # Allow receiving and using DNS servers configured in the ZeroTier control panel
```

## Regarding copy_config_path '1'

Because the ZeroTier working directory `/var/lib/zerotier-one` is on tmpfs in OpenWrt, its contents are cleared on reboot. Therefore, configurations like the planet, identity, and network files need to be stored in the router's Flash storage, i.e., the path set in `config_path`.

The default logic is to create a soft link from the configured `config_path` to `/var/lib/zerotier-one` on startup to achieve persistence. All read/write operations in `/var/lib/zerotier-one` are then written to Flash. However, frequent ZeroTier reads and writes can significantly reduce Flash lifespan.

Enabling `copy_config_path '1'` means that on ZeroTier startup, the configurations from `config_path` are copied into `/var/lib/zerotier-one` instead. This greatly extends the internal Flash lifespan, but the downside is that modifications made via `zerotier-cli` are not automatically synced back to Flash, making this option less suitable for scenarios requiring frequent configuration adjustments.

After making changes, start ZeroTier and enable auto-start on boot:

```bash
/etc/init.d/zerotier start
/etc/init.d/zerotier enable
```

On first startup, if the `secret` field was left empty, it will be auto-generated. After startup, copy all files from `/var/lib/zerotier-one` to `/etc/zerotier`.

Download the Planet file to the `config_path` set above, i.e., `/etc/zerotier`. After completion, restart ZeroTier:

```bash
/etc/init.d/zerotier restart
```

That's it.
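The "copy everything from `/var/lib/zerotier-one` to `/etc/zerotier`" step above can be sketched as a small helper. This is an illustrative assumption rather than the author's exact command: `persist_zt_state` is a hypothetical name, and it relies on `cp -a` being available (BusyBox and GNU `cp` both provide it) so that file modes on files such as `identity.secret` are preserved.

```shell
#!/bin/sh
# Hypothetical helper: copy the full contents of SRC (including
# subdirectories and dotfiles) into DST, creating DST if needed.
persist_zt_state() {
    src="$1"; dst="$2"
    mkdir -p "$dst" || return 1
    # "SRC/." copies the directory's contents rather than the directory itself
    cp -a "$src/." "$dst/"
}
```

On the router this would be called as `persist_zt_state /var/lib/zerotier-one /etc/zerotier`. With `copy_config_path '1'`, the same copy has to be repeated whenever a change made through `zerotier-cli` should survive a reboot.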
Then, go to your ZeroTier Controller console, and you should see that the new device has joined.

Next, you may need to allow ZeroTier traffic through the firewall. Other online tutorials cover this step. I chose to allow all traffic; it shouldn't be a big issue under NAT1.

# Installing and Configuring Bird2

I didn't expect the Bird2 version in the apk repository to be so recent. As of this writing on 2026-02-10, the Bird2 version in apk is 2.18.

Use the following commands to install:

```bash
apk add bird2  # bird daemon itself
apk add bird2c # birdc command
```

OpenWrt's default bird configuration file is located at `/etc/bird.conf`, but I prefer a modular layout with different configurations placed in separate folders by function. So I chose to move the default config file to `/etc/bird/bird.conf` and store the various config files within that folder. Open `/etc/init.d/bird`:

```sh
#!/bin/sh /etc/rc.common

# Copyright (C) 2010-2017 OpenWrt.org

USE_PROCD=1
START=70
STOP=10

BIRD_BIN="/usr/sbin/bird"
BIRD_CONF="/etc/bird.conf"
BIRD_PID_FILE="/var/run/bird.pid"

start_service() {
    mkdir -p /var/run
    procd_open_instance
    procd_set_param command $BIRD_BIN -f -c $BIRD_CONF -P $BIRD_PID_FILE
    procd_set_param file "$BIRD_CONF"
    procd_set_param stdout 1
    procd_set_param stderr 1
    procd_set_param respawn
    procd_close_instance
}

reload_service() {
    procd_send_signal bird
}
```

Change the `BIRD_CONF` value to `/etc/bird/bird.conf`:

```diff
- BIRD_CONF="/etc/bird.conf"
+ BIRD_CONF="/etc/bird/bird.conf"
```

Then create the `/etc/bird` folder. All subsequent OSPF configuration files will be placed here.

# Configuring OSPF

My configuration file structure follows these rules:

- `/etc/bird/bird.conf` serves as the sole entry point, defining basic configurations like Router ID and filter prefixes, and then including other sub-configurations.
- Configurations for different networks are placed in separate folders, e.g., public internet parts in `/etc/bird/inet/`, DN42 parts in `/etc/bird/dn42/`, and my own internal network parts in `/etc/bird/intra/`.
- Each network has a `defs.conf` handling common functions (similar to utils in Golang development?).

Thus, the final configuration file structure is:

`/etc/bird/bird.conf`: Configuration entry point

```
define INTRA_ROUTER_ID = 100.64.0.100;
define INTRA_PREFIX_V4 = [ 100.64.0.0/16+, 192.168.0.0/16+ ]; # IPv4 prefixes allowed to be advertised via OSPF
define INTRA_PREFIX_V6 = [ fd18:3e15:61d0::/48+ ]; # IPv6 prefixes allowed to be advertised via OSPF

protocol device {
    scan time 10;
};

ipv4 table intra_table_v4; # Define internal routing IPv4 table
ipv6 table intra_table_v6; # Define internal routing IPv6 table

include "intra/defs.conf";
include "intra/kernel.conf";
include "intra/ospf.conf";
```

The RouterID here is taken directly from the node's IPv4 address within the ZeroTier internal network. Separate tables are used for future safety, e.g., if this node is later connected to DN42.
`/etc/bird/intra/defs.conf`: Functions for filters

```
function is_intra_net4() {
    return net ~ INTRA_PREFIX_V4;
}

function is_intra_net6() {
    return net ~ INTRA_PREFIX_V6;
}

function is_intra_dn42_net4() {
    return net ~ [ 172.20.0.0/14+ ];
}

function is_intra_dn42_net6() {
    return net ~ [ fd00::/8+ ];
}
```

`/etc/bird/intra/kernel.conf`: Write routes learned by OSPF into the system routing table

```
protocol kernel intra_kernel_v4 {
    kernel table 254;
    scan time 20;
    ipv4 {
        table intra_table_v4;
        import none;
        export filter {
            if source = RTS_STATIC then reject;
            accept;
        };
    };
};

protocol kernel intra_kernel_v6 {
    kernel table 254;
    scan time 20;
    ipv6 {
        table intra_table_v6;
        import none;
        export filter {
            if source = RTS_STATIC then reject;
            accept;
        };
    };
};
```

`/etc/bird/intra/ospf.conf`: OSPF module

```
protocol ospf v3 intra_ospf_v4 {
    router id INTRA_ROUTER_ID; # Specify RouterID
    ipv4 {
        table intra_table_v4; # Specify routing table
        import where is_intra_dn42_net4() || is_intra_net4() && source != RTS_BGP;
        export where is_intra_dn42_net4() || is_intra_net4() && source != RTS_BGP;
    };
    include "ospf/*";
};

protocol ospf v3 intra_ospf_v6 {
    router id INTRA_ROUTER_ID; # Specify RouterID
    ipv6 {
        table intra_table_v6; # Specify routing table
        import where is_intra_dn42_net6() || is_intra_net6() && source != RTS_BGP;
        export where is_intra_dn42_net6() || is_intra_net6() && source != RTS_BGP;
    };
    include "ospf/*";
};
```

`/etc/bird/intra/ospf/backbone.conf`: OSPF Area Configuration

```
area 0.0.0.0 {
    interface "br-lan" { stub; }; # Local LAN interface
    interface "zta7oqfzy6" { # ZeroTier interface
        type broadcast;
        cost 100;
        hello 20;
    };
};
```

After completion, start Bird and enable auto-start on boot:

```bash
/etc/init.d/bird start
/etc/init.d/bird enable
```

You can then use `birdc s p` to check Bird's status. If all goes well, after the other side is configured, you should see the OSPF state as Running:

```
root@X-WRT:/etc/bird# birdc s p
BIRD 2.18 ready.
Name             Proto   Table            State  Since         Info
device1          Device  ---              up     14:28:02.410
intra_kernel_v4  Kernel  intra_table_v4   up     14:28:02.410
intra_kernel_v6  Kernel  intra_table_v6   up     14:28:02.410
intra_ospf_v4    OSPF    intra_table_v4   up     14:28:02.410  Running
intra_ospf_v6    OSPF    intra_table_v6   up     14:31:38.389  Running
```

Have your friend follow the same process. Once both sides show Running status, you can use `birdc s r protocol intra_ospf_v4` to view the routes learned by OSPF. You'll find that routes to the other side via ZeroTier are being learned normally:

```
root@X-WRT:/etc/bird# birdc s r protocol intra_ospf_v4
BIRD 2.18 ready.
Table intra_table_v4:
...
192.168.1.0/24       unicast [intra_ospf_v4 23:20:21.398] * I (150/110) [100.64.0.163]
        via 100.64.0.163 on zta7oqfzy6
...
192.168.3.0/24       unicast [intra_ospf_v4 14:28:02.511] * I (150/10) [100.64.0.100]
        dev br-lan
```

You can also ping your friend's server from your PC:

```
iyoroy@iYoRoy-PC:~$ ping 192.168.1.103
PING 192.168.1.103 (192.168.1.103) 56(84) bytes of data.
64 bytes from 192.168.1.103: icmp_seq=1 ttl=63 time=54.3 ms
64 bytes from 192.168.1.103: icmp_seq=2 ttl=63 time=10.7 ms
64 bytes from 192.168.1.103: icmp_seq=3 ttl=63 time=15.2 ms
^C
--- 192.168.1.103 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 10.678/26.717/54.279/19.576 ms
iyoroy@iYoRoy-PC:~$ traceroute 192.168.1.103
traceroute to 192.168.1.103 (192.168.1.103), 30 hops max, 60 byte packets
 1  100.64.0.163 (100.64.0.163)  10.445 ms  9.981 ms  9.892 ms
 2  192.168.1.103 (192.168.1.103)  11.621 ms  10.994 ms  10.948 ms
```

Web browsing and speed tests work normally.

# Summary

This series of operations essentially implements the following network structure:

```mermaid
flowchart TB
    %% === Style Definitions ===
    classDef phyNet fill:#e3f2fd,stroke:#1565c0,stroke-width:2px
    classDef virNet fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,stroke-dasharray: 5 5
    classDef router fill:#333,stroke:#000,stroke-width:2px,color:#fff
    classDef ztCard fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff,shape:rect
    classDef bird fill:#a5d6a7,stroke:#2e7d32,stroke-width:1px,color:#000
    classDef invisibleContainer fill:none,stroke:none,color:none

    %% === Physical Layer Containers ===
    subgraph Top_Physical_Layer [" "]
        direction LR
        subgraph Left_Side ["My Home (Node A)"]
            direction TB
            L_Router[X-WRT Router A]:::router
            L_LAN[LAN: 192.168.3.0/24]
            L_LAN <--> L_Router
        end
        subgraph Right_Side ["Friend's Home (Node B)"]
            direction TB
            R_Router[X-WRT Router B]:::router
            R_LAN[LAN: 192.168.1.0/24]
            R_LAN <--> R_Router
        end
    end

    %% === Virtual Layer Container ===
    subgraph Middle_Side [ZeroTier Virtual L2 Network]
        direction LR
        subgraph ZT_Stack_A [My Home ZT Access]
            direction TB
            L_NIC(zt0: 100.64.0.x):::ztCard
            L_Bird(Bird OSPF):::bird
            L_NIC <-.- L_Bird
        end
        subgraph ZT_Stack_B [Friend's Home ZT Access]
            direction TB
            R_NIC(zt0: 100.64.0.y):::ztCard
            R_Bird(Bird OSPF):::bird
            R_NIC <-.- R_Bird
        end
        L_NIC <==P2P Tunnel==> R_NIC
    end

    %% === Cross-Layer Connections ===
    L_Router === L_NIC
    R_Router === R_NIC

    %% === Style Application ===
    class Left_Side,Right_Side phyNet
    class Middle_Side virNet
    class Top_Physical_Layer invisibleContainer
```

The underlying P2P network is still powered by ZeroTier. However, using OSPF for internal routing allows both sides to route directly to devices on each other's network segments. Since both sides fully learn each other's routes, no NAT is required, and both sides can directly see each other's source addresses.

Check out the other side of this story! From my friend's side: Linux Operations - OSPF Networking Implementation Based on Bird for New OpenWrt » NanamiのTechLaunchTower
10/02/2026
An Experience of Manually Installing Proxmox VE, Configuring Multipath iSCSI, and NAT Forwarding
This started because I rented a physical server, but the IDC did not provide Proxmox VE or Debian system images, only the Ubuntu, CentOS, and Windows series. Additionally, the data disk was provided via multipath iSCSI. I wanted to use PVE to isolate different usage scenarios, so I attempted to reinstall the system and migrate the configurations mentioned above.

# Backup Configuration

First, perform a general check of the system, which reveals:

- The system has two network interfaces: `enp24s0f0` is connected to a public IP address for external access; `enp24s0f1` is connected to the private network address 192.168.128.153.
- The data disk is mapped to `/dev/mapper/mpatha`.
- Under `/etc/iscsi`, there are configurations for two iSCSI nodes: 192.168.128.250:3260 and 192.168.128.252:3260, both corresponding to the same target `iqn.2024-12.com.ceph:iscsi`.

It can be inferred that the data disk is mounted by configuring two iSCSI nodes and then merging them into a single device using multipath.

Check the system's network configuration:

```yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    enp24s0f0:
      addresses: [211.154.[REDACTED]/24]
      routes:
        - to: default
          via: [REDACTED]
      match:
        macaddress: ac:1f:6b:0b:e2:d4
      set-name: enp24s0f0
      nameservers:
        addresses:
          - 114.114.114.114
          - 8.8.8.8
    enp24s0f1:
      addresses:
        - 192.168.128.153/17
      match:
        macaddress: ac:1f:6b:0b:e2:d5
      set-name: enp24s0f1
```

It turns out to be a very simple static configuration. The internal network interface doesn't even have a default route; just binding the IP is sufficient.

Then, save the iSCSI configuration files from `/etc/iscsi`, which include account and password information.

# Reinstall Debian

I used the bin456789/reinstall script for this reinstallation. Download the script:

```bash
curl -O https://cnb.cool/bin456789/reinstall/-/git/raw/main/reinstall.sh || wget -O ${_##*/} $_
```

Reinstall as Debian 13 (Trixie):

```bash
bash reinstall.sh debian 13
```

Then, enter the password you want to set as prompted.
If all goes well, the script finishes in about 10 minutes and boots into a clean Debian 13. During the process you can connect over SSH with the password you set to check the installation progress. After reinstalling, switch the APT mirrors and run apt upgrade as usual. For switching mirrors, refer directly to the USTC Mirror Site tutorial.

Install Proxmox VE

This step mainly follows the official Proxmox tutorial.

Note: The Debian installed by the script above sets the hostname to localhost. If you want to change it, do so before the Configure Hostname step, and use your new hostname instead of localhost in the hosts file.

Configure Hostname

Proxmox VE requires the current hostname to be resolvable to a non-loopback IP address. The official guide says: "The hostname of your machine must be resolvable to an IP address. This IP address must not be a loopback one like 127.0.0.1 but one that you and other hosts can connect to."

For example, my server IP is 211.154.[CENSORED], so I need to add the following record (the line marked with +) to /etc/hosts:

    127.0.0.1 localhost
    +211.154.[CENSORED] localhost
    ::1 localhost ip6-localhost ip6-loopback
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters

After saving, use hostname --ip-address to check that the output includes the non-loopback address you set: ::1 127.0.0.1 211.154.[CENSORED].

Add Proxmox VE Software Repository

Debian 13 uses the Deb822 format by default (sources.list still works if you prefer it), so just follow the USTC Proxmox Mirror Site:

    cat > /etc/apt/sources.list.d/pve-no-subscription.sources <<EOF
    Types: deb
    URIs: https://mirrors.ustc.edu.cn/proxmox/debian/pve
    Suites: trixie
    Components: pve-no-subscription
    Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
    EOF

The repository also needs its keyring. I couldn't find a download for it after searching online, so I chose to pull a copy from an existing Proxmox VE server.
It's available here: proxmox-keyrings.zip Extract the public key file and place it in /usr/share/keyrings/, then run: apt update apt upgrade -y This will sync the Proxmox VE software repository. Install Proxmox VE Kernel Use the following command to install the PVE kernel and reboot to apply the new kernel: apt install proxmox-default-kernel reboot Afterwards, uname -r should show a kernel version ending with pve, like 6.17.2-2-pve, indicating the new kernel is successfully applied. Install Proxmox VE Related Packages Use apt to install the corresponding packages: apt install proxmox-ve postfix open-iscsi chrony During configuration, you will need to set up the postfix mail server. Official explanation: If you have a mail server in your network, you should configure postfix as a satellite system. Your existing mail server will then be the relay host which will route the emails sent by Proxmox VE to their final recipient. If you don't know what to enter here, choose local only and leave the system name as is. After this, you should be able to access the Web console at https://<your server address>:8006. The account is root, and the password is your root password, i.e., the password configured during the Debian reinstallation. Remove Old Debian Kernel and os-prober Use the following commands: apt remove linux-image-amd64 'linux-image-6.1*' update-grub apt remove os-prober to remove the old Debian kernel, update grub, and remove os-prober. Removing os-prober is not mandatory, but it is recommended by the official guide because it might mistakenly identify VM boot files as multi-boot files, adding incorrect entries to the boot list. At this point, the installation of Proxmox VE is complete and ready for normal use! Configuring Internal Network Interface Because the iSCSI network interface and the public network interface are different, and the reinstallation lost this configuration, the internal network interface needs to be manually configured. 
Open the Proxmox VE Web interface, go to Datacenter - localhost (hostname) - Network, edit the internal network interface (e.g., ens6f1 here), enter the backed-up IPv4 in CIDR format: 192.168.128.153/17, and check Autostart, then save. Then use the command to set the interface state to UP: ip link set ens6f1 up Now you should be able to ping the internal iSCSI server's IP. Configure Data Disk iSCSI In the previous step, we should have installed the open-iscsi package required for iscsiadm. We just need to reset the nodes according to the backed-up configuration. First, discover the iSCSI storage: iscsiadm -m discovery -t st -p 192.168.128.250:3260 This should yield the two original LUN Targets: 192.168.128.250:3260,1 iqn.2024-12.com.ceph:iscsi 192.168.128.252:3260,2 iqn.2024-12.com.ceph:iscsi Transfer the backed-up configuration files to the server, overwriting the existing configuration in /etc/iscsi. Also, in my backed-up config, I found the authentication configuration: # /etc/iscsi/nodes/iqn.2024-12.com.ceph:iscsi/192.168.128.250,3260,1/default # BEGIN RECORD 2.1.5 node.name = iqn.2024-12.com.ceph:iscsi ... # Some unimportant configurations omitted node.session.auth.authmethod = CHAP node.session.auth.username = [CENSORED] node.session.auth.password = [CENSORED] node.session.auth.chap_algs = MD5 ... # Some unimportant configurations omitted # /etc/iscsi/nodes/iqn.2024-12.com.ceph:iscsi/192.168.128.252,3260,2/default # BEGIN RECORD 2.1.5 node.name = iqn.2024-12.com.ceph:iscsi ... # Some unimportant configurations omitted node.session.auth.authmethod = CHAP node.session.auth.username = [CENSORED] node.session.auth.password = [CENSORED] node.session.auth.chap_algs = MD5 ... 
# Some unimportant configurations omitted

Write these settings into the new system using:

    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.250:3260 -o update -n node.session.auth.authmethod -v CHAP
    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.250:3260 -o update -n node.session.auth.username -v [CENSORED]
    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.250:3260 -o update -n node.session.auth.password -v [CENSORED]
    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.250:3260 -o update -n node.session.auth.chap_algs -v MD5
    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.252:3260 -o update -n node.session.auth.authmethod -v CHAP
    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.252:3260 -o update -n node.session.auth.username -v [CENSORED]
    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.252:3260 -o update -n node.session.auth.password -v [CENSORED]
    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.252:3260 -o update -n node.session.auth.chap_algs -v MD5

(I don't know why the auth info has to be written separately, but in my testing the login fails without rewriting it.)

Then log in to the targets:

    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.250:3260 --login
    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.252:3260 --login

Then enable automatic login on boot:

    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.250:3260 -o update -n node.startup -v automatic
    iscsiadm -m node -T iqn.2024-12.com.ceph:iscsi -p 192.168.128.252:3260 -o update -n node.startup -v automatic

At this point, checking disks with a tool like lsblk should reveal two additional disks; in my case, sdb and sdc appeared.
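Since the same four settings have to be written for each portal, a loop avoids copy-paste slips between the two portal addresses. The following is only a sketch under assumptions: CHAP_USER and CHAP_PASS are hypothetical variable names standing in for the censored credentials, and the function only prints the iscsiadm commands (a dry run) instead of executing them.

```shell
#!/bin/bash
# Sketch: generate the CHAP update commands for every portal of one target.
# TARGET and PORTALS are taken from the article; CHAP_USER / CHAP_PASS are
# placeholders (hypothetical names) that must be filled in by the reader.
TARGET="iqn.2024-12.com.ceph:iscsi"
PORTALS="192.168.128.250:3260 192.168.128.252:3260"
CHAP_USER="${CHAP_USER:-changeme-user}"
CHAP_PASS="${CHAP_PASS:-changeme-pass}"

# Print one "iscsiadm ... -o update" command per auth setting for a portal.
# Nothing is executed here; values containing spaces would need extra quoting.
chap_update_cmds() {
    local portal="$1" pair
    for pair in "authmethod=CHAP" "username=$CHAP_USER" "password=$CHAP_PASS" "chap_algs=MD5"; do
        printf 'iscsiadm -m node -T %s -p %s -o update -n node.session.auth.%s -v %s\n' \
            "$TARGET" "$portal" "${pair%%=*}" "${pair#*=}"
    done
}

for portal in $PORTALS; do
    chap_update_cmds "$portal"
done
```

After reviewing the printed commands, they can be run as root, for example by piping the output into bash.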
Configure Multipath

To check whether this is a multipath device, I ran:

    /usr/lib/udev/scsi_id --whitelisted --device=/dev/sdb
    /usr/lib/udev/scsi_id --whitelisted --device=/dev/sdc

The scsi_id of the two disk devices turned out to be identical, confirming that they are the same disk exposed over multiple paths for load balancing and failover.

Install multipath-tools using apt:

    apt install multipath-tools

Then create /etc/multipath.conf and add:

    defaults {
        user_friendly_names yes
        find_multipaths yes
    }

Start multipathd and enable it on boot:

    systemctl start multipathd
    systemctl enable multipathd

Then use the following command to check that the multipath device has been detected:

    multipath -ll

It should output:

    mpatha (360014056229953ef442476e85501bfd7) dm-0 LIO-ORG,TCMU device
    size=500G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='service-time 0' prio=50 status=active
    | `- 14:0:0:152 sdb 8:16 active ready running
    `-+- policy='service-time 0' prio=50 status=active
      `- 14:0:0:152 sdc 8:16 active ready running

This shows the two disks have been recognized as a single multipath device. You can now find the multipath disk under /dev/mapper/:

    root@localhost:/dev/mapper# ls
    control mpatha

mpatha is the aggregated multipath disk. If it does not show up, rescan the SCSI bus and try again:

    rescan-scsi-bus.sh

If the command is not found, install it via apt install sg3-utils. If all else fails, just reboot.

Configure Proxmox VE to Use the Data Disk

Because we used multipath, we cannot directly add an iSCSI type storage. Use the following commands to create the PV and VG:

    pvcreate /dev/mapper/mpatha
    vgcreate <vg name> /dev/mapper/mpatha

Here I configured the entire disk as a PV. You could also create a separate partition for this. After that, open the Proxmox VE management interface, go to Datacenter - Storage, click Add - LVM, select the VG you just created under Volume group, give it an ID (name), and click Add.
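Before putting LVM on top of mpatha, it can be worth confirming that both paths are actually healthy. The sketch below counts the "active ready running" lines in multipath -ll output read from stdin; the function names are my own, and the expected path count of 2 is an assumption based on the two portals in this setup.

```shell
#!/bin/bash
# Sketch: count healthy paths in `multipath -ll` output supplied on stdin.
count_active_paths() {
    grep -c 'active ready running'
}

# Succeed only if at least $1 healthy paths are present in the stdin text.
require_active_paths() {
    local want="$1" have
    have=$(count_active_paths)
    [ "$have" -ge "$want" ]
}

# Example wiring (needs root and multipath-tools, so it is left commented out):
#   multipath -ll mpatha | require_active_paths 2 && pvcreate /dev/mapper/mpatha
```

Reading from stdin keeps the check independent of the multipath binary, so it can be tried against saved output first.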
At this point, all configurations from the original system should have been migrated.

Configure NAT and Port Forwarding

NAT

Because only one IPv4 address was purchased, NAT is needed so that all VMs can reach the internet. Open /etc/network/interfaces and add the following content:

    auto vmbr0
    iface vmbr0 inet static
        address 192.168.100.1
        netmask 255.255.255.0
        bridge_ports none
        bridge_stp off
        bridge_fd 0
        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up iptables -t nat -A POSTROUTING -s 192.168.100.0/24 -o ens6f0 -j MASQUERADE
        post-up iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
        post-up iptables -A FORWARD -i vmbr0 -j ACCEPT
        post-down iptables -t nat -D POSTROUTING -s 192.168.100.0/24 -o ens6f0 -j MASQUERADE
        post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
        post-down iptables -D FORWARD -i vmbr0 -j ACCEPT

Here, vmbr0 is the NAT bridge, with the subnet 192.168.100.0/24. Outgoing traffic from this subnet is source-translated to the IP of the external interface ens6f0, and replies are translated back to the original address, so all VMs share the single public IP.

Then reload the configuration:

    ifreload -a

Now the VMs should be able to access the internet. Just configure a static IP within the 192.168.100.0/24 range during installation, set the default gateway to 192.168.100.1, and configure a DNS server.

Port Forwarding

I got lazy here.
Had an AI write a configuration script /usr/local/bin/natmgr: #!/bin/bash # =================Configuration Area================= # Public network interface name (Please modify according to your actual situation) PUB_IF="ens6f0" # ==================================================== ACTION=$1 ARG1=$2 ARG2=$3 ARG3=$4 ARG4=$5 # Check if running as root if [ "$EUID" -ne 0 ]; then echo "Please run this script with root privileges" exit 1 fi # Generate random ID (6 characters) generate_id() { # Introduce nanoseconds and random salt to ensure ID uniqueness even if the script runs quickly echo "$RANDOM $(date +%s%N)" | md5sum | head -c 6 } # Show help information usage() { echo "Usage: $0 {add|del|list|save} [parameters]" echo "" echo "Commands:" echo " add <Public Port> <Internal IP> <Internal Port> [Protocol] Add forwarding rule" echo " [Protocol] optional: tcp, udp, both (default: both)" echo " del <ID> Delete forwarding rule by ID" echo " list View all current forwarding rules" echo " save Save current rules to persist after reboot (Must run!)" echo "" echo "Examples:" echo " $0 add 8080 192.168.100.101 80 both" echo " $0 save" echo "" } # Internal function: add single protocol rule _add_single_rule() { local PROTO=$1 local L_PORT=$2 local T_IP=$3 local T_PORT=$4 local RULE_ID=$(generate_id) local COMMENT="NAT_ID:${RULE_ID}" # 1. Add DNAT rule (PREROUTING chain) iptables -t nat -A PREROUTING -i $PUB_IF -p $PROTO --dport $L_PORT -j DNAT --to-destination $T_IP:$T_PORT -m comment --comment "$COMMENT" # 2. Add FORWARD rule (Allow packet passage) iptables -A FORWARD -p $PROTO -d $T_IP --dport $T_PORT -m comment --comment "$COMMENT" -j ACCEPT # Output result printf "%-10s %-10s %-10s %-20s %-10s\n" "$RULE_ID" "$PROTO" "$L_PORT" "$T_IP:$T_PORT" "Success" # Remind user to save echo "Please run '$0 save' to ensure rules persist after reboot." 
} # Main add function add_rule() { local L_PORT=$1 local T_IP=$2 local T_PORT=$3 local PROTO_REQ=${4:-both} # Default to both if [[ -z "$L_PORT" || -z "$T_IP" || -z "$T_PORT" ]]; then echo "Error: Missing parameters" usage exit 1 fi # Convert to lowercase PROTO_REQ=$(echo "$PROTO_REQ" | tr '[:upper:]' '[:lower:]') echo "Adding rule..." printf "%-10s %-10s %-10s %-20s %-10s\n" "ID" "Protocol" "Public Port" "Target Address" "Status" echo "------------------------------------------------------------------" if [[ "$PROTO_REQ" == "tcp" ]]; then _add_single_rule "tcp" "$L_PORT" "$T_IP" "$T_PORT" elif [[ "$PROTO_REQ" == "udp" ]]; then _add_single_rule "udp" "$L_PORT" "$T_IP" "$T_PORT" elif [[ "$PROTO_REQ" == "both" ]]; then _add_single_rule "tcp" "$L_PORT" "$T_IP" "$T_PORT" _add_single_rule "udp" "$L_PORT" "$T_IP" "$T_PORT" else echo "Error: Unsupported protocol '$PROTO_REQ'. Please use tcp, udp, or both." exit 1 fi echo "------------------------------------------------------------------" } # Delete rule (Delete in reverse line number order) del_rule() { local RULE_ID=$1 if [[ -z "$RULE_ID" ]]; then echo "Error: Please provide rule ID" usage exit 1 fi echo "Searching for rule with ID [${RULE_ID}]..." local FOUND=0 # --- Clean NAT table (PREROUTING) --- LINES=$(iptables -t nat -nL PREROUTING --line-numbers | grep "NAT_ID:${RULE_ID}" | awk '{print $1}' | sort -rn) if [[ ! -z "$LINES" ]]; then for line in $LINES; do iptables -t nat -D PREROUTING $line echo "Deleted NAT table PREROUTING chain line $line" FOUND=1 done fi # --- Clean Filter table (FORWARD) --- LINES=$(iptables -t filter -nL FORWARD --line-numbers | grep "NAT_ID:${RULE_ID}" | awk '{print $1}' | sort -rn) if [[ ! -z "$LINES" ]]; then for line in $LINES; do iptables -t filter -D FORWARD $line echo "Deleted Filter table FORWARD chain line $line" FOUND=1 done fi if [[ $FOUND -eq 0 ]]; then echo "No rule found with ID $RULE_ID." else echo "Delete operation completed." 
echo "Please run '$0 save' to update the persistent configuration file." fi } # Save rules to disk (New feature) save_rules() { echo "Saving current iptables rules..." # netfilter-persistent is the service managing iptables-persistent in Debian/Proxmox if command -v netfilter-persistent &> /dev/null; then netfilter-persistent save if [ $? -eq 0 ]; then echo "✅ Rules successfully saved to /etc/iptables/rules.v4, will be automatically restored after system reboot." else echo "❌ Failed to save rules. Please check the status of the 'netfilter-persistent' service." fi else echo "Warning: 'netfilter-persistent' command not found." echo "Please ensure the 'iptables-persistent' package is installed." echo "Install command: apt update && apt install iptables-persistent" fi } # List rules list_rules() { echo "Current Port Forwarding Rules List:" printf "%-10s %-10s %-10s %-20s %-10s\n" "ID" "Protocol" "Public Port" "Target Address" "Target Port" echo "------------------------------------------------------------------" # Parse iptables output iptables -t nat -nL PREROUTING -v | grep "NAT_ID:" | while read line; do id=$(echo "$line" | grep -oP '(?<=NAT_ID:)[^ ]*') # Extract protocol if echo "$line" | grep -q "tcp"; then proto="tcp"; else proto="udp"; fi # Extract port after dpt: l_port=$(echo "$line" | grep -oP '(?<=dpt:)[0-9]+') # Extract IP:Port after to: target=$(echo "$line" | grep -oP '(?<=to:).*') t_ip=${target%:*} t_port=${target#*:} printf "%-10s %-10s %-10s %-20s %-10s\n" "$id" "$proto" "$l_port" "$t_ip" "$t_port" done } # Main logic case "$ACTION" in add) add_rule "$ARG1" "$ARG2" "$ARG3" "$ARG4" ;; del) del_rule "$ARG1" ;; list) list_rules exit 0 ;; save) save_rules ;; *) usage exit 1 ;; esac save_rules This script automatically adds/deletes iptables rules for port forwarding. Remember to chmod +x. 
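One thing the generated script does not do is validate its arguments: a mistyped port or IP is passed straight to iptables. A minimal sketch of checks that could run before add_rule is shown here; is_valid_port and is_valid_ipv4 are hypothetical helper names, not part of the original script.

```shell
#!/bin/bash
# Sketch: argument checks that natmgr could run before touching iptables.

# Accept only decimal ports in the range 1..65535.
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -le 5 ] && [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Accept only dotted-quad IPv4 addresses with four octets in 0..255.
is_valid_ipv4() {
    local ip="$1" octet old_ifs
    case "$ip" in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=.
    set -- $ip          # split on dots into $1..$n
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}
```

In add_rule, a guard such as `is_valid_port "$L_PORT" && is_valid_ipv4 "$T_IP" && is_valid_port "$T_PORT" || { usage; exit 1; }` would be one way to wire these in.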
Use iptables-persistent to save the configuration and load it automatically on boot: apt install iptables-persistent During configuration, you will be asked whether to save the current rules; Yes or No is fine. When adding a forwarding rule, use natmgr add <host listen port> <VM internal IP> <VM port> [tcp/udp/both]. The script will automatically assign a unique ID. Use natmgr del <ID> to delete. Use natmgr list to view the existing forwarding list. Reference Articles: bin456789/reinstall: 一键DD/重装脚本 (One-click reinstall OS on VPS) - GitHub Install Proxmox VE on Debian 12 Bookworm - Proxmox VE PVE连接 TrueNAS iSCSI存储实现本地无盘化_pve iscsi-CSDN博客 ProxmoxVE (PVE) NAT 网络配置方法 - Oskyla 烹茶室
29/11/2025
DN42&OneManISP - Troubleshooting OSPF Source Address in a Coexistence Environment
Backstory

As mentioned in the previous post of this series, because the VRF solution was too isolating, the DNS service I deployed on the HKG node (172.20.234.225) became inaccessible from the DN42 network. Research indicated this could be solved by setting up veth or NAT forwarding, but due to the scarcity of available documentation, I ultimately abandoned the VRF approach.

Structure Analysis

This time, I planned to place both DN42 and clearnet BGP routes into the system's main routing table, then use filters to decide which routes get exported where. For clarity, I stored the configuration for the DN42 part and the clearnet part (hereinafter referred to as inet) separately, and then included them from the main configuration file. Also, since there should ideally be only one kernel protocol per routing table, I merged the DN42 and inet kernel parts, keeping only one instance.

After multiple optimizations and revisions, my final directory structure is as follows:

    /etc/bird/
    ├─envvars
    ├─bird.conf: Main Bird config file, defines basic info (ASN, IP, etc.), includes sub-configs below
    ├─kernel.conf: Kernel config, imports routes into the system routing table
    ├─dn42
    | ├─defs.conf: DN42 function definitions, e.g., is_self_dn42_net()
    | ├─ibgp.conf: DN42 iBGP template
    | ├─rpki.conf: DN42 RPKI route validation
    | ├─ospf.conf: DN42 OSPF internal network
    | ├─static.conf: DN42 static routes
    | ├─ebgp.conf: DN42 Peer template
    | ├─ibgp
    | | └<ibgp configs>: DN42 iBGP configs for each node
    | ├─ospf
    | | └backbone.conf: OSPF area
    | ├─peers
    | | └<peer configs>: DN42 Peer configs for each node
    ├─inet
    | ├─peer.conf: Clearnet Peer
    | ├─ixp.conf: Clearnet IXP connection
    | ├─defs.conf: Clearnet function definitions, e.g., is_self_inet_v6()
    | ├─upstream.conf: Clearnet upstream
    | └static.conf: Clearnet static routes

I split out the function definitions because the filters in kernel.conf reference them, so they need to be included early.
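For anyone who wants to reproduce this layout, the skeleton can be created in one go. This is just a sketch: scaffold_bird_layout is a hypothetical helper, it only creates empty files (the contents still have to be written), and it skips envvars, which I assume is provided by the bird package. Try it against a scratch directory before pointing it at /etc/bird.

```shell
#!/bin/bash
# Sketch: create the empty directory/file skeleton described above under a
# root directory given as $1 (for example a scratch directory).
scaffold_bird_layout() {
    local root="${1:?usage: scaffold_bird_layout <root-dir>}" f
    mkdir -p "$root/dn42/ibgp" "$root/dn42/ospf" "$root/dn42/peers" "$root/inet" || return 1
    for f in bird.conf kernel.conf \
             dn42/defs.conf dn42/ibgp.conf dn42/rpki.conf dn42/ospf.conf \
             dn42/static.conf dn42/ebgp.conf dn42/ospf/backbone.conf \
             inet/peer.conf inet/ixp.conf inet/defs.conf \
             inet/upstream.conf inet/static.conf; do
        # touch never truncates, so existing configs are left untouched.
        touch "$root/$f" || return 1
    done
}
```

For example, `scaffold_bird_layout /tmp/bird-test` builds the tree in a throwaway location for inspection.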
After filling in the respective configurations and setting up the include relationships, I ran birdc configure and it started successfully. So, case closed... right? Problems occurred After running for a while, I suddenly found that I couldn't ping the HKG node from my internal devices, nor could I ping my other internal nodes from the HKG node. Strangely, external ASes could ping my other nodes or other external ASes through my HKG node, and my internal nodes could also ping other non-directly connected nodes (e.g., 226(NKG)->225(HKG)->229(LAX)) via the HKG node. Using ip route get <other internal node address> revealed: root@iYoRoyNetworkHKG:/etc/bird# ip route get 172.20.234.226 172.20.234.226 via 172.20.234.226 dev dn42_nkg src 23.149.120.51 uid 0 cache See the problem? The src address should have been the HKG node's own DN42 address (configured on the OSPF stub interface), but here it showed the HKG node's clearnet address instead. Attempting to read the route learned by Bird using birdc s r for 172.20.234.226: root@iYoRoyNetworkHKGBGP:/etc/bird/dn42/ospf# birdc s r for 172.20.234.226 BIRD 2.17.1 ready. Table master4: 172.20.234.226/32 unicast [dn42_ospf_iyoroynet_v4 00:30:29.307] * I (150/50) [172.20.234.226] via 172.20.234.226 on dn42_nkg onlink Looks seemingly normal...? Theoretically, although the DN42 source IP is different from the usual, DN42 rewrites krt_prefsrc when exporting to the kernel to inform the kernel of the correct source address, so this issue shouldn't occur: protocol kernel kernel_v4{ ipv4 { import none; export filter { if source = RTS_STATIC then reject; + if is_valid_dn42_network() then krt_prefsrc = DN42_OWNIP; accept; }; }; } protocol kernel kernel_v6 { ipv6 { import none; export filter { if source = RTS_STATIC then reject; + if is_valid_dn42_network_v6() then krt_prefsrc = DN42_OWNIPv6; accept; }; }; } Regarding krt_prefsrc, it stands for Kernel Route Preferred Source. 
This attribute doesn't manipulate the route directly but instead attaches a piece of metadata to it. This metadata directly instructs the Linux kernel to prioritize the specified IP address as the source address for packets sent via this route. I was stuck on this for a long time. The Solution Finally, during an unintentional attempt, I added the krt_prefsrc rewrite to the OSPF import configuration as well: protocol ospf v3 dn42_ospf_iyoroynet_v4 { router id DN42_OWNIP; ipv4 { - import where is_self_dn42_net() && source != RTS_BGP; + import filter { + if is_self_dn42_net() && source != RTS_BGP then { + krt_prefsrc=DN42_OWNIP; + accept; + } + reject; + }; export where is_self_dn42_net() && source != RTS_BGP; }; include "ospf/*"; }; protocol ospf v3 dn42_ospf_iyoroynet_v6 { router id DN42_OWNIP; ipv6 { - import where is_self_dn42_net_v6() && source != RTS_BGP; + import filter { + if is_self_dn42_net_v6() && source != RTS_BGP then { + krt_prefsrc=DN42_OWNIPv6; + accept; + } + reject; + }; export where is_self_dn42_net_v6() && source != RTS_BGP; }; include "ospf/*"; }; After running this, the src address became correct, and mutual pinging worked. Configuration files for reference: KaguraiYoRoy/Bird2-Configuration
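To catch this kind of regression early, the manual ip route get check from the troubleshooting step can be scripted. The sketch below parses ip route get output from stdin and compares the src field with the address you expect; route_src and src_matches are hypothetical helper names, and the addresses in the usage line are the ones from this post.

```shell
#!/bin/bash
# Sketch: extract the source address from `ip route get` output on stdin.
route_src() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Succeed only if the src address in the stdin text equals $1.
src_matches() {
    local expected="$1" actual
    actual=$(route_src)
    if [ "$actual" = "$expected" ]; then
        return 0
    fi
    echo "unexpected src: got '${actual:-none}', want '$expected'" >&2
    return 1
}

# Example wiring on the router itself (left commented out):
#   ip route get 172.20.234.226 | src_matches 172.20.234.225
```

Run against the broken state shown earlier, the check fails because the kernel picked the clearnet address; after the OSPF import fix it should pass.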
29/10/2025
DN42&OneManISP - Using VRF to Run Clearnet BGP and DN42 on the Same Machine
Background Currently, clearnet BGP and DN42 each use a separate VPS in the same region, meaning two machines are required per region. After learning about VRF from a group member, I explored using VRF to enable a single machine to handle both clearnet BGP and DN42 simultaneously. Note: Due to its isolation nature, the VRF solution will prevent DN42 from accessing services on the host. If you need to run services (like DNS) on the server for DN42, you might need additional port forwarding or veth configuration, which is beyond the scope of this article. (This is also the reason why I ultimately did not adopt VRF in my production environment). Advantages of VRF Although DN42 uses private IP ranges and internal ASNs, which theoretically shouldn't interfere with clearnet BGP, sharing the same routing table can lead to issues like route pollution and management complexity. VRF (Virtual Routing and Forwarding) allows creating multiple routing tables on a single machine. This means we can isolate DN42 routes into a separate routing table, keeping them apart from the clearnet routing table. The advantages include: Absolute Security and Policy Isolation: The DN42 routing table is isolated from the clearnet routing table, fundamentally preventing route leaks. Clear Operation and Management: Use commands like birdc show route table t_dn42 and birdc show route table t_inet to view and debug two completely independent routing tables, making things clear at a glance. Fault Domain Isolation: If a DN42 peer flaps, the impact is confined to the dn42 routing table. It won't consume routing computation resources for the clearnet instance nor affect clearnet forwarding performance. Alignment with Modern Network Design Principles: Using VRF for different routing domains (production, testing, customer, partner) is standard practice in modern network engineering. It logically divides your device into multiple virtual routers. 
Configuration

System Part

Creating the VRF Interface

Use the following commands to create a VRF device named dn42-vrf and associate it with the system's routing table number 1042:

    ip link add dn42-vrf type vrf table 1042
    ip link set dev dn42-vrf up # Enable it

You can change the routing table number according to your preference, but avoid the following reserved routing table IDs:

| Name | ID | Description |
| --- | --- | --- |
| unspec | 0 | Unspecified, rarely used |
| main | 254 | Main routing table, where most ordinary routes reside |
| default | 253 | Generally unused, reserved |
| local | 255 | Local routing table, contains 127.0.0.1/8, local IPs, broadcast addresses, etc. Cannot be modified |

Associating Existing Network Interfaces with VRF

In my current DN42 setup, several WireGuard interfaces and a dummy interface are used for DN42. Therefore, associate these interfaces with the VRF:

    ip link set dev <interface_name> master dn42-vrf

Note: After associating an interface with a VRF, it might lose its IP addresses. Therefore, you need to re-add the addresses, for example:

    ip addr add 172.20.234.225 dev dn42

After completion, ip a should show the corresponding interface's master as dn42-vrf:

    156: dn42: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master dn42-vrf state UNKNOWN group default qlen 1000
        link/ether b6:f5:28:ed:23:04 brd ff:ff:ff:ff:ff:ff
        inet 172.20.234.225/32 scope global dn42
           valid_lft forever preferred_lft forever
        inet6 fd18:3e15:61d0::1/128 scope global
           valid_lft forever preferred_lft forever
        inet6 fe80::b4f5:28ff:feed:2304/64 scope link
           valid_lft forever preferred_lft forever

Persistence

I use ifupdown2 to automatically load the dummy interface and VRF device on boot.

    auto dn42-vrf
    iface dn42-vrf inet manual
        vrf-table 1042

    auto dn42
    iface dn42 inet static
        pre-up ip link add $IFACE type dummy || true
        vrf dn42-vrf
        address <IPv4 Address>/32
        address <IPv6 Address>/128
        post-down ip link del $IFACE

My dummy interface is named dn42; modify accordingly if yours is different.
After creating the files, use ifup dn42-vrf && ifup dn42 to bring up the VRF device and the dummy interface. Note: The numeric prefix of the VRF device's file should be smaller than that of the dummy interface's file, ensuring the VRF device starts first.

WireGuard Tunnels

Add PostUp commands to associate the tunnels with the VRF and re-add their addresses. Example:

    [Interface]
    PrivateKey = [Data Redacted]
    ListenPort = [Data Redacted]
    Table = off
    Address = fe80::2024/64
    + PostUp = ip link set dev %i master dn42-vrf
    + PostUp = ip addr add fe80::2024/64 dev %i
    PostUp = sysctl -w net.ipv6.conf.%i.autoconf=0

    [Peer]
    PublicKey = [Data Redacted]
    Endpoint = [Data Redacted]
    AllowedIPs = 10.0.0.0/8, 172.20.0.0/14, 172.31.0.0/16, fd00::/8, fe00::/8

Then restart the tunnel.

Bird2 Part

First, define two routing tables for DN42's IPv4 and IPv6:

    ipv4 table dn42_table_v4;
    ipv6 table dn42_table_v6;

Then, specify the VRF and system routing table number in the kernel protocol, and specify the previously created v4/v6 routing tables in the IPv4/IPv6 sections:

    protocol kernel dn42_kernel_v6{
    +   vrf "dn42-vrf";
    +   kernel table 1042;
        scan time 20;
        ipv6 {
    +       table dn42_table_v6;
            import none;
            export filter {
                if source = RTS_STATIC then reject;
                krt_prefsrc = DN42_OWNIPv6;
                accept;
            };
        };
    };

    protocol kernel dn42_kernel_v4{
    +   vrf "dn42-vrf";
    +   kernel table 1042;
        scan time 20;
        ipv4 {
    +       table dn42_table_v4;
            import none;
            export filter {
                if source = RTS_STATIC then reject;
                krt_prefsrc = DN42_OWNIP;
                accept;
            };
        };
    }

For protocols other than kernel, add the VRF and the independent IPv4/IPv6 tables, but do not specify the system routing table number:

    protocol static dn42_static_v4{
    +   vrf "dn42-vrf";
        route DN42_OWNNET reject;
        ipv4 {
    +       table dn42_table_v4;
            import all;
            export none;
        };
    }

    protocol static dn42_static_v6{
    +   vrf "dn42-vrf";
        route DN42_OWNNETv6 reject;
        ipv6 {
    +       table dn42_table_v6;
            import all;
            export none;
        };
    }

In summary: Configure a VRF and the previously defined routing tables for everything related to DN42.
Only the kernel protocol needs the system routing table number specified; others do not. Apply the same method to BGP, OSPF, etc. However, I chose to use separate Router IDs for the clearnet and DN42, so a separate Router ID needs to be configured: # /etc/bird/dn42/ospf.conf protocol ospf v3 dn42_ospf_iyoroynet_v4 { + vrf "dn42-vrf"; + router id DN42_OWNIP; ipv4 { + table dn42_table_v4; import where is_self_dn42_net() && source != RTS_BGP; export where is_self_dn42_net() && source != RTS_BGP; }; include "ospf/*"; }; protocol ospf v3 dn42_ospf_iyoroynet_v6 { + vrf "dn42-vrf"; + router id DN42_OWNIP; ipv6 { + table dn42_table_v6; import where is_self_dn42_net_v6() && source != RTS_BGP; export where is_self_dn42_net_v6() && source != RTS_BGP; }; include "ospf/*"; }; # /etc/bird/dn42/ebgp.conf ... template bgp dnpeers { + vrf "dn42-vrf"; + router id DN42_OWNIP; local as DN42_OWNAS; path metric 1; ipv4 { + table dn42_table_v4; ... }; ipv6 { + table dn42_table_v6; ... }; } include "peers/*"; After completion, reload the configuration with birdc c. Now, we can view the DN42 routing table separately using ip route show vrf dn42-vrf: root@iYoRoyNetworkHKGBGP:~# ip route show vrf dn42-vrf 10.26.0.0/16 via inet6 fe80::ade0 dev dn42_4242423914 proto bird src 172.20.234.225 metric 32 10.29.0.0/16 via inet6 fe80::ade0 dev dn42_4242423914 proto bird src 172.20.234.225 metric 32 10.37.0.0/16 via inet6 fe80::ade0 dev dn42_4242423914 proto bird src 172.20.234.225 metric 32 ... You can also ping through the VRF using the -I dn42-vrf parameter: root@iYoRoyNetworkHKGBGP:~# ping 172.20.0.53 -I dn42-vrf ping: Warning: source address might be selected on device other than: dn42-vrf PING 172.20.0.53 (172.20.0.53) from 172.20.234.225 dn42-vrf: 56(84) bytes of data. 
    64 bytes from 172.20.0.53: icmp_seq=1 ttl=64 time=3.18 ms
    64 bytes from 172.20.0.53: icmp_seq=2 ttl=64 time=3.57 ms
    64 bytes from 172.20.0.53: icmp_seq=3 ttl=64 time=3.74 ms
    64 bytes from 172.20.0.53: icmp_seq=4 ttl=64 time=2.86 ms
    ^C
    --- 172.20.0.53 ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3006ms
    rtt min/avg/max/mdev = 2.863/3.337/3.740/0.341 ms

Important Notes

- If the VRF device is reloaded, all devices originally associated with the VRF need to be reloaded as well, otherwise they won't function correctly.
- Currently, DN42 cannot access services on the host when VRF is configured. A future article might explain how to allow traffic within the VRF to reach host services (added to the TODO list). I learned from a friend that setting net.ipv4.tcp_l3mdev_accept=1 and net.ipv4.udp_l3mdev_accept=1 allows listening sockets in the global space to accept connections from the VRF domain, which makes cross-VRF listening services possible.

Reference Articles: Run your MPLS network with BIRD
16/09/2025