iYoRoy DN42 Network
Kagura iYoRoy
Building a Cross-Region K3s Cluster from Scratch - Ep.1 Calico No-Encapsulation CNI
# Preface

I've actually wanted to play with a K8s cluster for a long time, but always felt I lacked the knowledge to attempt it. Recently I spent some time studying DN42 and routing protocols like BGP and OSPF, and it no longer feels so daunting, so I decisively started with K3s. (The main reason for choosing K3s over K8s is its lightweight nature: low resource requirements, no need to pull a pile of images just to deploy, and the availability of domestic mirrors. In short, K3s suits my needs better.)

I'm a beginner just starting to explore K3s, so please go easy on me if I make any mistakes~

# Analysis

## Choosing the CNI Component

My current network architecture looks like this:

```mermaid
graph TD
    subgraph ZeroTier Domestic
        subgraph WDS
            Gateway <--> VM1
            Gateway <--> VM2
        end
        NGB <--> Gateway
        HFE-NAS <--> Gateway
        NGB <--> HFE-NAS
    end
    subgraph IEPL
        Global-NIC <==OSPF==> CN-NIC
    end
    subgraph ZeroTier Global
        HKG02 <--> HKG04
        TYO <--> HKG04
        TYO <--> HKG02
    end
    CN-NIC <--> NGB
    CN-NIC <--> HFE-NAS
    CN-NIC <--OSPF--> Gateway
    Global-NIC <--OSPF--> TYO
    Global-NIC <--OSPF--> HKG02
    Global-NIC <--OSPF--> HKG04
    %% Style definition: orange background, bold border to represent routers
    classDef router fill:#f96,stroke:#333,stroke-width:2px,font-weight:bold;
    class Global-NIC,CN-NIC,Gateway router;
```

Here, WDS is a Proxmox VE host with multiple VMs underneath; it advertises its VMs' IPv4 prefixes via OSPF. When the Hong Kong nodes need to reach a VM under WDS, they join the OSPF internal network to achieve multi-hop reachability. This keeps the encapsulation to a single layer, so there is no MTU "disappearing act" to worry about.

I plan to create two new VMs under WDS to serve as the master and a worker node (tentatively called KubeMaster and KubeNode-WDS1). HKG04 (tentatively KubeNode-HKG04) will also join the K3s cluster as a worker node.
The simplest approach would be to use K3s's default Flannel as the CNI. However, Flannel is based on VXLAN, and stacking it on top of my existing internal network would produce the following MTU "disappearing act":

Data packet -> Flannel VXLAN encapsulation -> ZeroTier encapsulation -> Physical link

The usable MTU for inter-container communication would likely be squeezed to 1350 or even lower. So I looked for a CNI that can run directly on top of this internal network, and found Calico. As I understand it, Calico uses BGP as its underlying routing protocol, supports a no-encapsulation (No-Encap) mode, and hands packets directly to the upstream routers for forwarding. Thus, I chose Calico as the CNI component.

## Routing Design

To ensure that intermediate routers know how to route Pod IPs, KubeMaster and KubeNode-WDS1 (which sit under the Proxmox VE host) need to establish BGP with HKG04 across the entire internal network. This means every router along the way must learn the full BGP routes, so that the following routing path can be established:

```mermaid
graph LR
    subgraph WDS
        KubeMaster
        KubeNode-WDS1
        Gateway
    end
    subgraph IEPL
        CN-Namespace
        Global-Namespace
    end
    KubeNode-WDS1 <--> Gateway
    KubeMaster <--> Gateway <--> CN-Namespace <--> Global-Namespace <--> HKG04
    %% Style definition: highlight nodes with routing capability
    classDef router fill:#f96,stroke:#333,stroke-width:2px,font-weight:bold;
    class Gateway,CN-Namespace,Global-Namespace router;
```

Otherwise, any intermediate hop would drop packets because it doesn't recognize the source/destination IP. Also, because routes learned from one iBGP neighbor are not propagated to other iBGP neighbors, all the BGP sessions between Gateway, CN-Namespace, Global-Namespace and the nodes need Route Reflector enabled; otherwise, the nodes cannot correctly learn each other's routes.
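As a rough sanity check on scale (my own arithmetic, not from the original setup): an iBGP full mesh needs n(n-1)/2 sessions, while a single route reflector only needs n-1 client sessions.

```shell
# iBGP session count: a full mesh needs n*(n-1)/2 sessions, a single
# route reflector needs only n-1 client sessions. With the 3 nodes
# planned here (KubeMaster, KubeNode-WDS1, KubeNode-HKG04):
n=3
echo "full mesh:       $(( n * (n - 1) / 2 )) sessions"   # 3
echo "route reflector: $(( n - 1 )) sessions"             # 2
```

At three nodes the difference is negligible; the gap only starts to matter as the cluster grows.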
That said, this architecture would arguably suit a BGP Confederation better, but my existing network is already quite complex, and adding confederations would make later maintenance more troublesome. Moreover, my node count is small, so the overhead of an iBGP full mesh is acceptable. It's definitely not because I'm lazy.

Thus, the final network routing structure is as follows:

```mermaid
graph TD
    subgraph WDS
        VM1
        VM2
        Gateway
    end
    subgraph IEPL
        CN-Namespace
        Global-Namespace
    end
    VM1 <-.Calico iBGP Full Mesh.-> VM2
    VM1 <--iBGP Route Reflector--> Gateway
    VM2 <--iBGP Route Reflector--> Gateway <--iBGP--> CN-Namespace <--iBGP--> Global-Namespace <--iBGP Route Reflector--> HKG04
    Gateway <--iBGP--> Global-Namespace
    HKG04 <-.Calico iBGP Full Mesh.-> VM1
    VM2 <-.Calico iBGP Full Mesh.-> HKG04
    %% Style definition
    classDef router fill:#f96,stroke:#333,stroke-width:2px,font-weight:bold;
    %% Mark nodes with routing/forwarding or RR functions as Router
    class Gateway,CN-Namespace,Global-Namespace router;
```

The dashed BGP sessions are created automatically by Calico; the solid ones we need to create manually. Keeping Calico's own iBGP full mesh is for future scalability, so that nodes can preferentially establish direct P2P connections via ZeroTier instead of detouring through the Route Reflector aggregation router.

# Deployment

With the structure clarified, deployment is simple.

## Enable Kernel Forwarding and Disable rp_filter

Standard practice.
```shell
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
echo "net.ipv6.conf.default.forwarding=1" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.forwarding=1" >> /etc/sysctl.conf
echo "net.ipv4.conf.default.rp_filter=0" >> /etc/sysctl.conf
echo "net.ipv4.conf.all.rp_filter=0" >> /etc/sysctl.conf
sysctl -p
```

## Install the K3s Master

Because the KubeMaster control-plane node is located inside China, it's best to configure image acceleration first:

```shell
mkdir -p /etc/rancher/k3s
cat <<EOF > /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "https://docker.m.daocloud.io"
  quay.io:
    endpoint:
      - "https://quay.m.daocloud.io"
EOF
```

Install using the mirror:

```shell
curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | \
  INSTALL_K3S_MIRROR=cn INSTALL_K3S_EXEC=" \
    --flannel-backend=none \
    --disable-network-policy \
    --cluster-cidr=10.42.0.0/16" sh -
```

Note that `--flannel-backend=none` and `--disable-network-policy` are required to disable the default CNI component.

Use `cat /var/lib/rancher/k3s/server/node-token` to view the token and record it.

## Worker Nodes

For nodes inside China, configure image acceleration the same way:

```shell
mkdir -p /etc/rancher/k3s
cat <<EOF > /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "https://docker.m.daocloud.io"
  quay.io:
    endpoint:
      - "https://quay.m.daocloud.io"
EOF
```

Then install K3s from the mirror and join the cluster:

```shell
export INSTALL_K3S_MIRROR=cn
export K3S_URL=https://<master node IP>:6443   # Replace with your master node's actual IP
export K3S_TOKEN=K10...your token...::server:xxx   # Replace with the full token obtained in the first step
curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | sh -
```

At this point each node's status should be NotReady, because the CNI component is missing.
## Install Calico and Configure No-Encap Mode

On the master, manually download https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/tigera-operator.yaml and install the Calico operator:

```shell
kubectl create -f tigera-operator.yaml
```

Configure a custom resource by creating a custom-resource.yaml file:

```yaml
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Add image registry configuration
  registry: quay.m.daocloud.io
  calicoNetwork:
    ipPools:
      - blockSize: 26
        cidr: 10.42.0.0/16
        encapsulation: None
        natOutgoing: Enabled
        nodeSelector: all()
```

Here, `encapsulation: None` enables No-Encap mode. You can also modify the IPv4 CIDR if needed. Then run:

```shell
kubectl apply -f custom-resource.yaml
```

to perform the installation. Use `kubectl get pods -A -o wide` to check Pod status, waiting for each node to finish pulling images.

## Configure the BGP Topology

### Label Nodes

Label the nodes so that nodes under WDS connect to the Gateway's BGP on the WDS node, and nodes outside China connect to the BGP of the Global namespace:

```shell
kubectl label nodes kubemaster region=WDS
kubectl label nodes kubenode-wds-1 region=WDS
kubectl label nodes kubenode-hkg04 region=Global
```

### Calico Configuration

Create a YAML configuration file:

```yaml
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
  name: route-reflector-domestic
spec:
  nodeSelector: region == 'Domestic'   # Not actually used; I originally planned a general aggregation router in the Domestic area
  peerIP: 100.64.0.108
  asNumber: 64512
---
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
  name: route-reflector-wds
spec:
  nodeSelector: region == 'WDS'
  peerIP: 192.168.100.1
  asNumber: 64512
---
apiVersion: crd.projectcalico.org/v1
kind: BGPPeer
metadata:
  name: route-reflector-global
spec:
  nodeSelector: region == 'Global'
  peerIP: 100.64.1.106
  asNumber: 64512
```

This means:

- All nodes labeled region=Domestic establish a BGP session to 100.64.0.108 (the domestic aggregation router) using AS 64512
- All nodes labeled region=WDS establish a BGP session to 192.168.100.1 (the Gateway for all VMs under the WDS node) using AS 64512
- All nodes labeled region=Global establish a BGP session to 100.64.1.106 (the overseas aggregation router) using AS 64512

This achieves what the diagram shows: all VMs under the WDS node, including KubeMaster and KubeNode-WDS1, connect to WDS's Gateway aggregation router, and all overseas nodes connect to the overseas aggregation router.

## Configure Aggregation Router iBGP

This part is simply a matter of writing Bird configuration files (easy). A few examples:

k3s/ibgp.conf:

```
function is_insider_as(){
    if bgp_path.len > 0 && !(bgp_path ~ [= 64512 =]) then {
        return false;
    }
    if net ~ [ 10.42.0.0/16{16,32} ] then {
        return true;
    }
    return false;
}

template bgp k3sbackbone {
    local as K3S_AS;
    router id INTRA_ROUTER_ID;
    neighbor as K3S_AS;
    ipv4 {
        table intra_table_v4;
        import filter { if is_insider_as() then accept; reject; };
        export filter { if is_insider_as() then accept; reject; };
        next hop self;
        extended next hop;
    };
    ipv6 {
        table intra_table_v6;
        import filter { if is_insider_as() then accept; reject; };
        export filter { if is_insider_as() then accept; reject; };
        next hop self;
    };
};

template bgp k3speers {
    local as K3S_AS;
    neighbor as K3S_AS;
    router id INTRA_ROUTER_ID;
    rr client;
    rr cluster id INTRA_ROUTER_ID;
    ipv4 {
        table intra_table_v4;
        import filter { if is_insider_as() then accept; reject; };
        export filter { if is_insider_as() then accept; reject; };
        next hop self;
    };
    ipv6 {
        table intra_table_v6;
        import filter { if is_insider_as() then accept; reject; };
        export filter { if is_insider_as() then accept; reject; };
        next hop self;
    };
};

include "ibgpeers/*";
```

ibgpeers/backbone-cn.conf:

```
protocol bgp 'k3s_backbone_cn_v4' from k3sbackbone {
    neighbor fd18:3e15:61d0:cafe:f001::1;
};
```

ibgpeers/master.conf:

```
protocol bgp 'k3s_master_v4' from k3speers {
    neighbor 192.168.100.251;
};
```

Main points: it's best not to enable Route Reflector between the aggregation routers, and remember to enable `next hop self`.

After everything is done, `kubectl get nodes` should show all nodes as Ready:

```
NAME             STATUS   ROLES           AGE     VERSION
kubemaster       Ready    control-plane   2d23h   v1.34.5+k3s1
kubenode-hkg04   Ready    <none>          11h     v1.34.6+k3s1
kubenode-wds-1   Ready    <none>          2d7h    v1.34.5+k3s1
```

Use `kubectl get pods -A -o wide` to view Pods:

```
NAMESPACE         NAME                                       READY   STATUS      RESTARTS        AGE     IP                NODE             NOMINATED NODE   READINESS GATES
calico-system     calico-kube-controllers-64fc874957-6bdlz   1/1     Running     0               5h38m   10.42.253.136     kubenode-hkg04   <none>           <none>
calico-system     calico-node-2qz82                          1/1     Running     0               4h24m   10.2.5.7          kubenode-hkg04   <none>           <none>
calico-system     calico-node-dhl2c                          1/1     Running     0               4h24m   192.168.100.251   kubemaster       <none>           <none>
calico-system     calico-node-nbpkj                          1/1     Running     0               4h23m   192.168.100.252   kubenode-wds-1   <none>           <none>
calico-system     calico-typha-7bb5db4bdc-rfpwg              1/1     Running     0               5h38m   10.2.5.7          kubenode-hkg04   <none>           <none>
calico-system     calico-typha-7bb5db4bdc-rwwr5              1/1     Running     0               5h38m   192.168.100.251   kubemaster       <none>           <none>
calico-system     csi-node-driver-jglwp                      2/2     Running     0               5h38m   10.42.64.68       kubenode-wds-1   <none>           <none>
calico-system     csi-node-driver-jqjsc                      2/2     Running     0               5h38m   10.42.253.137     kubenode-hkg04   <none>           <none>
calico-system     csi-node-driver-vk26s                      2/2     Running     0               5h38m   10.42.141.16      kubemaster       <none>           <none>
kube-system       coredns-695cbbfcb9-8fx4p                   1/1     Running     1 (7h27m ago)   2d23h   10.42.141.14      kubemaster       <none>           <none>
kube-system       helm-install-traefik-crd-5bkwx             0/1     Completed   0               2d23h   <none>            kubemaster       <none>           <none>
kube-system       helm-install-traefik-m9fgj                 0/1     Completed   1               2d23h   <none>            kubemaster       <none>           <none>
kube-system       local-path-provisioner-546dfc6456-dmn4g    1/1     Running     1 (7h27m ago)   2d23h   10.42.141.15      kubemaster       <none>           <none>
kube-system       metrics-server-c8774f4f4-2wkwh             1/1     Running     1 (7h27m ago)   2d23h   10.42.141.12      kubemaster       <none>           <none>
kube-system       svclb-traefik-999cddce-hpmcm               2/2     Running     6 (7h26m ago)   11h     10.42.253.134     kubenode-hkg04   <none>           <none>
kube-system       svclb-traefik-999cddce-q4225               2/2     Running     2 (7h27m ago)   2d22h   10.42.141.9       kubemaster       <none>           <none>
kube-system       svclb-traefik-999cddce-xmd64               2/2     Running     2 (7h26m ago)   2d6h    10.42.64.66       kubenode-wds-1   <none>           <none>
kube-system       traefik-788bc4688c-vbbhj                   1/1     Running     1 (7h27m ago)   2d22h   10.42.141.13      kubemaster       <none>           <none>
tigera-operator   tigera-operator-6b95bbf4db-vl46l           1/1     Running     1 (7h27m ago)   2d23h   192.168.100.251   kubemaster       <none>           <none>
```

Use `kubectl exec -it -n calico-system <calico-node-xxxx> -- birdcl s p` to check the status of Bird:

```
root@KubeMaster:~/kube/calico# kubectl exec -it -n calico-system calico-node-2qz82 -- birdcl s p
Defaulted container "calico-node" out of: calico-node, flexvol-driver (init), install-cni (init)
BIRD v0.3.3+birdv1.6.8 ready.
name                 proto    table    state  since     info
static1              Static   master   up     08:58:17
kernel1              Kernel   master   up     08:58:17
device1              Device   master   up     08:58:17
direct1              Direct   master   up     08:58:17
Mesh_192_168_100_251 BGP      master   up     08:58:33  Established
Mesh_192_168_100_252 BGP      master   up     08:59:00  Established
Node_100_64_1_106    BGP      master   up     12:57:44  Established
```

`ip r` shows the system routing table:

```
root@KubeMaster:~/kube/calico# ip r
default via 192.168.100.1 dev eth0 proto static
10.42.64.64/26 proto bird
        nexthop via 192.168.100.1 dev eth0 weight 1
        nexthop via 192.168.100.252 dev eth0 weight 1
blackhole 10.42.141.0/26 proto bird
10.42.141.9 dev caliac6501d3794 scope link
10.42.141.12 dev calib07c23291bb scope link
10.42.141.13 dev caliab16e60bd19 scope link
10.42.141.14 dev calid5959219080 scope link
10.42.141.15 dev cali026d8f1ddb7 scope link
10.42.141.16 dev califa657ba417a scope link
10.42.253.128/26 via 192.168.100.1 dev eth0 proto bird
192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.251
```

Ping a Pod's IP; if everything is fine, it should work directly:

```
root@KubeMaster:~/kube/calico# ping 10.42.253.137
PING 10.42.253.137 (10.42.253.137) 56(84) bytes of data.
64 bytes from 10.42.253.137: icmp_seq=1 ttl=60 time=33.7 ms
64 bytes from 10.42.253.137: icmp_seq=2 ttl=60 time=33.5 ms
^C
--- 10.42.253.137 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 33.546/33.632/33.718/0.086 ms
```

## Tune the MTU

This step is mostly for stability...? Tests showed that although my ZeroTier MTU is 1420, packets start to fragment around a 1392-byte payload (test with `ping -M do -s <packet size> <Pod_IP>`). Therefore, force the Pod MTU down to 1370:

```
root@KubeMaster:~/kube/calico# cat patch-mtu.yaml
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    mtu: 1370
    nodeAddressAutodetectionV4:
      firstFound: true
root@KubeMaster:~/kube/calico# kubectl apply -f patch-mtu.yaml
installation.operator.tigera.io/default configured
```
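Incidentally, the 1392-byte figure is exactly consistent with the 1420-byte ZeroTier MTU once ICMP and IPv4 header overhead is counted; a quick sanity check (my own arithmetic, not from the original post):

```shell
# ping -s sets the ICMP payload size; the packet on the wire adds
# 8 bytes of ICMP header and 20 bytes of IPv4 header on top of it.
payload=1392
echo $(( payload + 8 + 20 ))   # 1420, matching the ZeroTier link MTU
```

Choosing a Pod MTU of 1370 rather than the theoretical maximum simply leaves some headroom.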
05/04/2026
DN42&OneManISP - Troubleshooting OSPF Source Address in a Coexistence Environment
# Backstory

As mentioned in the previous post of this series, because the VRF solution was too isolating, the DNS service I deployed on the HKG node (172.20.234.225) became inaccessible from the DN42 network. Research indicated this could be worked around with veth pairs or NAT forwarding, but given the scarcity of available documentation, I ultimately abandoned the VRF approach.

# Structure Analysis

This time, I planned to place both DN42 and clearnet BGP routes into the system's main routing table, then separate them on export using filters to distinguish which routes go where. For clarity, I stored the configuration for the DN42 part and the clearnet part (hereinafter "inet") separately, and included them from the main configuration file. Also, since there should ideally be only one kernel protocol per routing table, I merged the DN42 and inet kernel sections, keeping a single instance.

After multiple optimizations and revisions, my final directory structure is as follows:

```
/etc/bird/
├─envvars
├─bird.conf: Main Bird config file, defines basic info (ASN, IP, etc.), includes the sub-configs below
├─kernel.conf: Kernel config, imports routes into the system routing table
├─dn42
│ ├─defs.conf: DN42 function definitions, e.g., is_self_dn42_net()
│ ├─ibgp.conf: DN42 iBGP template
│ ├─rpki.conf: DN42 RPKI route validation
│ ├─ospf.conf: DN42 OSPF internal network
│ ├─static.conf: DN42 static routes
│ ├─ebgp.conf: DN42 peer template
│ ├─ibgp
│ │ └─<ibgp configs>: DN42 iBGP configs for each node
│ ├─ospf
│ │ └─backbone.conf: OSPF area
│ ├─peers
│ │ └─<peer configs>: DN42 peer configs for each node
├─inet
│ ├─peer.conf: Clearnet peers
│ ├─ixp.conf: Clearnet IXP connection
│ ├─defs.conf: Clearnet function definitions, e.g., is_self_inet_v6()
│ ├─upstream.conf: Clearnet upstream
│ └─static.conf: Clearnet static routes
```

I separated out the function definitions because I needed to reference them in the filters inside kernel.conf, so they had to be included early.
After filling in the configurations and setting up the include relationships, I ran `birdc configure` and it started successfully. So, case closed... right?

# Problems Occurred

After running for a while, I suddenly found that I couldn't ping the HKG node from my internal devices, nor my other internal nodes from the HKG node. Strangely, external ASes could still ping my other nodes (or other external ASes) through the HKG node, and my internal nodes could also ping non-directly-connected nodes via the HKG node (e.g., 226 (NKG) -> 225 (HKG) -> 229 (LAX)).

Using `ip route get <other internal node address>` revealed:

```
root@iYoRoyNetworkHKG:/etc/bird# ip route get 172.20.234.226
172.20.234.226 via 172.20.234.226 dev dn42_nkg src 23.149.120.51 uid 0
    cache
```

See the problem? The src address should have been the HKG node's own DN42 address (configured on the OSPF stub interface), but it shows the node's clearnet address instead.

Reading the route Bird learned, with `birdc s r for 172.20.234.226`:

```
root@iYoRoyNetworkHKGBGP:/etc/bird/dn42/ospf# birdc s r for 172.20.234.226
BIRD 2.17.1 ready.
Table master4:
172.20.234.226/32    unicast [dn42_ospf_iyoroynet_v4 00:30:29.307] * I (150/50) [172.20.234.226]
        via 172.20.234.226 on dn42_nkg onlink
```

Looks normal...? In theory, although the DN42 source IP differs from the usual one, my configuration rewrites krt_prefsrc when exporting to the kernel to inform it of the correct source address, so this issue shouldn't occur:

```
protocol kernel kernel_v4 {
    ipv4 {
        import none;
        export filter {
            if source = RTS_STATIC then reject;
+           if is_valid_dn42_network() then krt_prefsrc = DN42_OWNIP;
            accept;
        };
    };
}

protocol kernel kernel_v6 {
    ipv6 {
        import none;
        export filter {
            if source = RTS_STATIC then reject;
+           if is_valid_dn42_network_v6() then krt_prefsrc = DN42_OWNIPv6;
            accept;
        };
    };
}
```

Regarding krt_prefsrc: it stands for Kernel Route Preferred Source. This attribute doesn't manipulate the route itself; it attaches metadata that instructs the Linux kernel to prefer the specified IP address as the source for packets sent via this route.

I was stuck on this for a long time.

# The Solution

Finally, during a casual attempt, I added the krt_prefsrc rewrite to the OSPF import filter as well:

```
protocol ospf v3 dn42_ospf_iyoroynet_v4 {
    router id DN42_OWNIP;
    ipv4 {
-       import where is_self_dn42_net() && source != RTS_BGP;
+       import filter {
+           if is_self_dn42_net() && source != RTS_BGP then {
+               krt_prefsrc = DN42_OWNIP;
+               accept;
+           }
+           reject;
+       };
        export where is_self_dn42_net() && source != RTS_BGP;
    };
    include "ospf/*";
};

protocol ospf v3 dn42_ospf_iyoroynet_v6 {
    router id DN42_OWNIP;
    ipv6 {
-       import where is_self_dn42_net_v6() && source != RTS_BGP;
+       import filter {
+           if is_self_dn42_net_v6() && source != RTS_BGP then {
+               krt_prefsrc = DN42_OWNIPv6;
+               accept;
+           }
+           reject;
+       };
        export where is_self_dn42_net_v6() && source != RTS_BGP;
    };
    include "ospf/*";
};
```

After this change, the src address became correct and mutual pinging worked.

Configuration files for reference: KaguraiYoRoy/Bird2-Configuration
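For intuition, what krt_prefsrc ultimately controls is the same `src` hint you could set by hand with iproute2. A hand-rolled equivalent of the exported route, illustrative only (Bird manages these routes itself; addresses and the dn42_nkg interface are the ones from this post):

```shell
# The 'src' attribute on a kernel route tells Linux which local address
# to prefer for locally originated traffic using that route.
ip route replace 172.20.234.226/32 via 172.20.234.226 dev dn42_nkg onlink src 172.20.234.225
ip route get 172.20.234.226   # the 'src' field should now show the DN42 address
```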
29/10/2025
DN42&OneManISP - Using VRF to Run Clearnet BGP and DN42 on the Same Machine
# Background

Currently, clearnet BGP and DN42 each use a separate VPS in the same region, meaning two machines are required per region. After learning about VRF from a group member, I explored using VRF to let a single machine handle both clearnet BGP and DN42 simultaneously.

Note: Because of its isolation, the VRF solution prevents DN42 from accessing services on the host. If you need to run services (like DNS) on the server for DN42, you may need additional port forwarding or veth configuration, which is beyond the scope of this article. (This is also why I ultimately did not adopt VRF in my production environment.)

# Advantages of VRF

Although DN42 uses private IP ranges and internal ASNs, which in theory shouldn't interfere with clearnet BGP, sharing one routing table can still lead to route pollution and management complexity. VRF (Virtual Routing and Forwarding) allows creating multiple routing tables on a single machine, so DN42 routes can be isolated into their own table, kept apart from the clearnet table. The advantages include:

- Security and policy isolation: the DN42 routing table is isolated from the clearnet routing table, fundamentally preventing route leaks.
- Clear operation and management: use commands like `birdc show route table t_dn42` and `birdc show route table t_inet` to view and debug two completely independent routing tables, making things clear at a glance.
- Fault-domain isolation: if a DN42 peer flaps, the impact is confined to the DN42 routing table; it consumes no route-computation resources of the clearnet instance and doesn't affect clearnet forwarding performance.
- Alignment with modern network design: using VRFs for different routing domains (production, testing, customer, partner) is standard practice in network engineering; it logically divides one device into multiple virtual routers.
# Configuration

## System Part

### Creating the VRF Interface

Use the following commands to create a VRF device named dn42-vrf and associate it with system routing table 1042:

```shell
ip link add dn42-vrf type vrf table 1042
ip link set dev dn42-vrf up   # Enable it
```

You can pick a different routing table number, but avoid the following reserved table IDs:

| Name | ID | Description |
| --- | --- | --- |
| unspec | 0 | Unspecified, rarely used |
| main | 254 | Main routing table, where most ordinary routes reside |
| default | 253 | Generally unused, reserved |
| local | 255 | Local routing table: 127.0.0.1/8, local IPs, broadcast addresses, etc. Cannot be modified |

### Associating Existing Network Interfaces with the VRF

In my current DN42 setup, several WireGuard interfaces and a dummy interface are used for DN42, so associate these interfaces with the VRF:

```shell
ip link set dev <interface_name> master dn42-vrf
```

Note: after associating an interface with a VRF, it may lose its IP addresses, so you need to re-add them, for example:

```shell
ip addr add 172.20.234.225 dev dn42
```

Afterwards, `ip a` should show the interface's master as dn42-vrf:

```
156: dn42: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master dn42-vrf state UNKNOWN group default qlen 1000
    link/ether b6:f5:28:ed:23:04 brd ff:ff:ff:ff:ff:ff
    inet 172.20.234.225/32 scope global dn42
       valid_lft forever preferred_lft forever
    inet6 fd18:3e15:61d0::1/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::b4f5:28ff:feed:2304/64 scope link
       valid_lft forever preferred_lft forever
```

### Persistence

I use ifupdown2 to load the dummy interface and VRF device automatically on boot:

```
auto dn42-vrf
iface dn42-vrf inet manual
    vrf-table 1042

auto dn42
iface dn42 inet static
    pre-up ip link add $IFACE type dummy || true
    vrf dn42-vrf
    address <IPv4 Address>/32
    address <IPv6 Address>/128
    post-down ip link del $IFACE
```

My dummy interface is named dn42; modify accordingly if yours is different.
After creating the files, use `ifup dn42-vrf && ifup dn42` to bring up the interfaces. Note: the number prefix of the VRF device file should be smaller than that of the dummy interface file, ensuring the VRF device starts first.

### WireGuard Tunnels

Add PostUp commands to associate each tunnel with the VRF and re-add its addresses. Example:

```
[Interface]
PrivateKey = [Data Redacted]
ListenPort = [Data Redacted]
Table = off
Address = fe80::2024/64
+ PostUp = ip link set dev %i master dn42-vrf
+ PostUp = ip addr add fe80::2024/64 dev %i
PostUp = sysctl -w net.ipv6.conf.%i.autoconf=0

[Peer]
PublicKey = [Data Redacted]
Endpoint = [Data Redacted]
AllowedIPs = 10.0.0.0/8, 172.20.0.0/14, 172.31.0.0/16, fd00::/8, fe00::/8
```

Then restart the tunnel.

## Bird2 Part

First, define two routing tables for DN42's IPv4 and IPv6:

```
ipv4 table dn42_table_v4;
ipv6 table dn42_table_v6;
```

Then, specify the VRF and the system routing table number in the kernel protocols, and point their IPv4/IPv6 sections at the tables just created:

```
protocol kernel dn42_kernel_v6 {
+   vrf "dn42-vrf";
+   kernel table 1042;
    scan time 20;
    ipv6 {
+       table dn42_table_v6;
        import none;
        export filter {
            if source = RTS_STATIC then reject;
            krt_prefsrc = DN42_OWNIPv6;
            accept;
        };
    };
};

protocol kernel dn42_kernel_v4 {
+   vrf "dn42-vrf";
+   kernel table 1042;
    scan time 20;
    ipv4 {
+       table dn42_table_v4;
        import none;
        export filter {
            if source = RTS_STATIC then reject;
            krt_prefsrc = DN42_OWNIP;
            accept;
        };
    };
}
```

For protocols other than kernel, add the VRF and the independent IPv4/IPv6 tables, but do not specify the system routing table number:

```
protocol static dn42_static_v4 {
+   vrf "dn42-vrf";
    route DN42_OWNNET reject;
    ipv4 {
+       table dn42_table_v4;
        import all;
        export none;
    };
}

protocol static dn42_static_v6 {
+   vrf "dn42-vrf";
    route DN42_OWNNETv6 reject;
    ipv6 {
+       table dn42_table_v6;
        import all;
        export none;
    };
}
```

In summary: configure the VRF and the previously defined routing tables for everything related to DN42; only the kernel protocol additionally needs the system routing table number.

Apply the same method to BGP, OSPF, etc. However, I chose separate Router IDs for clearnet and DN42, so a DN42 Router ID needs to be configured explicitly:

```
# /etc/bird/dn42/ospf.conf
protocol ospf v3 dn42_ospf_iyoroynet_v4 {
+   vrf "dn42-vrf";
+   router id DN42_OWNIP;
    ipv4 {
+       table dn42_table_v4;
        import where is_self_dn42_net() && source != RTS_BGP;
        export where is_self_dn42_net() && source != RTS_BGP;
    };
    include "ospf/*";
};

protocol ospf v3 dn42_ospf_iyoroynet_v6 {
+   vrf "dn42-vrf";
+   router id DN42_OWNIP;
    ipv6 {
+       table dn42_table_v6;
        import where is_self_dn42_net_v6() && source != RTS_BGP;
        export where is_self_dn42_net_v6() && source != RTS_BGP;
    };
    include "ospf/*";
};

# /etc/bird/dn42/ebgp.conf
...
template bgp dnpeers {
+   vrf "dn42-vrf";
+   router id DN42_OWNIP;
    local as DN42_OWNAS;
    path metric 1;
    ipv4 {
+       table dn42_table_v4;
        ...
    };
    ipv6 {
+       table dn42_table_v6;
        ...
    };
}

include "peers/*";
```

After completion, reload the configuration with `birdc c`.

Now we can view the DN42 routing table separately using `ip route show vrf dn42-vrf`:

```
root@iYoRoyNetworkHKGBGP:~# ip route show vrf dn42-vrf
10.26.0.0/16 via inet6 fe80::ade0 dev dn42_4242423914 proto bird src 172.20.234.225 metric 32
10.29.0.0/16 via inet6 fe80::ade0 dev dn42_4242423914 proto bird src 172.20.234.225 metric 32
10.37.0.0/16 via inet6 fe80::ade0 dev dn42_4242423914 proto bird src 172.20.234.225 metric 32
...
```

You can also ping through the VRF using the `-I dn42-vrf` parameter:

```
root@iYoRoyNetworkHKGBGP:~# ping 172.20.0.53 -I dn42-vrf
ping: Warning: source address might be selected on device other than: dn42-vrf
PING 172.20.0.53 (172.20.0.53) from 172.20.234.225 dn42-vrf: 56(84) bytes of data.
64 bytes from 172.20.0.53: icmp_seq=1 ttl=64 time=3.18 ms
64 bytes from 172.20.0.53: icmp_seq=2 ttl=64 time=3.57 ms
64 bytes from 172.20.0.53: icmp_seq=3 ttl=64 time=3.74 ms
64 bytes from 172.20.0.53: icmp_seq=4 ttl=64 time=2.86 ms
^C
--- 172.20.0.53 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3006ms
rtt min/avg/max/mdev = 2.863/3.337/3.740/0.341 ms
```

# Important Notes

- If the VRF device is reloaded, all devices originally associated with it must be reloaded as well, otherwise they won't function correctly.
- Currently, DN42 cannot access services inside the host configured with VRF. A future article may cover allowing traffic within the VRF to reach host services (added to the TODO list). I learned from a friend that setting `net.ipv4.tcp_l3mdev_accept=1` and `net.ipv4.udp_l3mdev_accept=1` allows listening sockets in the global namespace to accept connections from the VRF domain, enabling cross-VRF listening services.

Reference article: Run your MPLS network with BIRD
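If you want to try the l3mdev knobs mentioned above, they can be persisted like any other sysctl. A minimal sketch (the file name 90-vrf.conf is my own choice, not from the original post):

```shell
# Allow global-namespace listening sockets to accept TCP/UDP
# connections arriving through a VRF (l3mdev) interface.
cat > /etc/sysctl.d/90-vrf.conf <<'EOF'
net.ipv4.tcp_l3mdev_accept = 1
net.ipv4.udp_l3mdev_accept = 1
EOF
sysctl --system   # reload all sysctl configuration files
```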
16/09/2025
OneManISP - Ep.2 Announcing Our Own IP Prefix to the World
# Preface

In the previous article, we successfully registered an ASN and obtained an IPv6 address block. Now, we will announce this block to the world.

# Setting Up the Subnet Object in the RIPE Database

It's important to note that the smallest IPv6 block (i.e., the longest prefix) generally accepted for announcement on the public internet is a /48. This means that if you only have a single /48 block, you cannot break it down into smaller announcements. I therefore later leased a separate /40 block, intending to split it into multiple /48s for announcement. The IPv6 block I obtained is 2a14:7583:f200::/40, and I plan to split out 2a14:7583:f203::/48 for use with Vultr. If you don't need to split your block, skip directly to the "Creating the Route Object" section.

## Splitting the Prefix

First, go to Create "inet6num" object - RIPE Database and fill in the following:

- inet6num: the IP block you want to split out, in CIDR notation
- netname: network name
- country: the country the block belongs to; must conform to ISO 3166 (selectable directly in the RIPE DB)
- admin-c: the primary key of the Role object created earlier
- tech-c: the primary key of the Role object created earlier
- status: keep ASSIGNED

This step splits a smaller /48 address block out of your allocation.

## Creating the Route Object

Go to Create "route6" object - RIPE Database and fill in the following:

- route6: the IPv6 address block you intend to announce, in CIDR notation
- origin: the ASN you applied for, including the "AS" prefix

This step declares that your ASN is permitted to originate BGP routes for this address block.

# Applying for a BGP Session with a VPS Provider

This time I'm using a machine from Vultr. Their BGP session setup is very beginner-friendly, with their own validation system. Furthermore, their upstream has good filters, so incorrect route advertisements generally won't affect the public internet.
(I forgot to take screenshots during my configuration, but you can refer to the section 申请 Vultr 的 BGP 广播功能 ("Applying for Vultr's BGP announcement feature") in Bao Shuo's article 年轻人的第一个 ASN.)

Go to BGP - Vultr.com, select Get Started, and fill in your ASN and IPv6 block information as required. For the LOA (Letter of Authorization), you can refer to this template: LOA-template.docx (I rewrote one for individuals, as most templates found online are for companies).

After submission, the system will automatically create a ticket, and you will see your ASN and IP block in a pending-verification state. Click Start, and the system will send a verification email to the abuse-mailbox address registered with your Role object.

In that email, the top link approves the authorization for Vultr to announce your IP block, and the bottom one denies it. Click the top link, which will take you to Vultr's website, then click Approve Announcement. Both the ASN and the IP block need to be verified once.

Next, wait for the Vultr staff to review and complete the process. Afterwards, in your VPS control panel, you will see the BGP tab, where you can find the upstream information.

I must commend Vultr's ticket efficiency here: from creating the ticket requesting authorization to completion took me only about 10 minutes on average. (In contrast, the average weekday ticket response time at iFog GmbH was around 1 day, which is much slower in comparison.)

The process with other VPS providers is generally similar: you inform their staff of the ASN and IP block you want to announce, and after verifying ownership, the staff will configure the corresponding BGP session for you.

# Advertisement!

You should have received the following information from your upstream:

- Upstream's ASN
- Upstream's IP address for the BGP session
- (Optional) Password

The operating system I use is Debian 12 Bookworm, with Bird2 as the routing software.
I updated Bird2 to the latest version following the section "Update Bird2 to v2.16 or above" in this article. The upstream ASN Vultr gave me is 64515, the upstream BGP session address is 2001:19f0:ffff::1, and the VPS's BGP session address is 2001:19f0:0006:0ff5:5400:05ff:fe96:881f. My Bird2 configuration file is modified from the one I use in DN42:

```
log syslog all;

define OWNAS = 205369;
define OWNIPv6 = 2a14:7583:f203::1;
define OWNNETv6 = 2a14:7583:f203::/48;
define OWNNETSETv6 = [ 2a14:7583:f203::/48+ ];

router id 45.77.x.x;

protocol device {
    scan time 10;
}

function is_self_net_v6() {
    return net ~ OWNNETSETv6;
}

protocol kernel {
    scan time 20;
    ipv6 {
        import none;
        export filter {
            if source = RTS_STATIC then reject;
            krt_prefsrc = OWNIPv6;
            accept;
        };
    };
}

protocol static {
    route OWNNETv6 reject;
    ipv6 {
        import all;
        export none;
    };
}

template bgp upstream {
    local as OWNAS;
    path metric 1;
    multihop;
    ipv6 {
        import filter {
            if net ~ [::/0] then reject;
            accept;
        };
        export filter {
            if is_self_net_v6() then accept;
            reject;
        };
        import limit 1000 action block;
    };
    graceful restart;
}

protocol bgp 'Vultr_v6' from upstream {
    local 2001:19f0:0006:0ff5:5400:05ff:fe96:881f as OWNAS;
    password "123456";
    neighbor 2001:19f0:ffff::1 as 64515;
}
```

A few noteworthy points:

- The import filter in the upstream template rejects the default route. This prevents the routing table sent by the upstream from overwriting the local default gateway and other routes; with multiple BGP neighbors, accepting it could cause detours or even routing loops.
- The upstream template specifies multihop (multihop;) because Vultr's BGP peer is not directly reachable. Without it, the BGP session would get stuck in the Idle state. If your BGP upstream is directly connected, you can omit this line or use direct; instead.

After filling in the configuration file, run birdc configure to load the configuration, then birdc show protocols to check the status.
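To make the filter logic concrete, here is a small Python sketch (standard `ipaddress` module only) that mirrors what the BIRD filters decide. It is purely an illustration of the matching rules, not a replacement for the BIRD configuration:

```python
import ipaddress

# Our own /48, matching OWNNETSETv6 = [ 2a14:7583:f203::/48+ ] in the BIRD config
OWNNET_V6 = ipaddress.ip_network("2a14:7583:f203::/48")

def is_self_net_v6(prefix: str) -> bool:
    """Mirror of BIRD's is_self_net_v6(): true for our /48 and any more-specific prefix."""
    net = ipaddress.ip_network(prefix)
    return net.version == 6 and net.subnet_of(OWNNET_V6)

def import_accepts(prefix: str) -> bool:
    """Mirror of the import filter: reject only the exact default route."""
    return prefix != "::/0"

print(is_self_net_v6("2a14:7583:f203::/48"))    # True  -> exported to the upstream
print(is_self_net_v6("2a14:7583:f203:1::/64"))  # True  -> more specifics also match
print(is_self_net_v6("2001:db8::/32"))          # False -> never exported
print(import_accepts("::/0"))                   # False -> default route is rejected
```

The `/48+` suffix in the BIRD prefix set is what makes more-specific prefixes match, which `subnet_of()` reproduces here.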
If all goes well, you should see the BGP session state as Established. At this point, you can take a break and wait for global routing convergence. After about half an hour, open bgp.tools and query your /48 block. You should see that it has been successfully received by the global internet, along with our upstream information.

Next, we create a dummy interface on the VPS and assign it a single IPv6 address from the block allocated for this machine. For example, I assigned 2a14:7583:f203::1 to my machine:

```shell
ip link add dummy0 type dummy
ip addr add 2a14:7583:f203::1/128 dev dummy0
```

Then, from your own PC, you should be able to ping this address, and traceroute will show the complete routing path.

Thanks to Mi Lu for the technical support!

Reference Articles:

- 自己在家开运营商 Part.2 - 向世界宣告 IP 段 (BGP Session & BIRD)
- 年轻人的第一个 ASN - 宝硕博客
- BGPlayer 从零开始速成指北 - 开通 Vultr 的 BGP 广播功能 - AceSheep
- BGP (2) 在 Vultr 和 HE 使用自己的 IPV6 地址 - 131's Blog
20/08/2025
198 Views
0 Comments
2 Stars
OneManISP - Ep.1 Registering an ASN
# Introduction

This article documents my complete process of applying for an ASN through the RIPE NCC. The content is suitable for beginners. If you find any errors, please feel free to contact me via email, and I will correct them promptly. Now that we've learned the basic BGP concepts on DN42, it seems a bit of a waste not to play with the public internet, right?

# Basic Concepts

Currently, the allocation of public ASN and IP resources is managed by five Regional Internet Registries (RIRs) worldwide:

- ARIN: Manages the North American region.
- RIPE NCC: Manages the European region.
- APNIC: Manages the Asia-Pacific region.
- LACNIC: Manages the Latin American region.
- AfriNIC: Manages the African region.

RIRs do not provide services directly to end users. Instead, they allocate resources to Local Internet Registries (LIRs), which then assign them to end users. Of course, individual users can also register as an LIR, but this is generally not cost-effective. If you're willing to pay thousands of dollars in annual fees, then forget I said that.

Among these, RIPE NCC is considered the most friendly towards individual applications, followed by ARIN and APNIC. Compared to RIPE NCC, APNIC's fees are generally about 30% higher. Furthermore, RIPE NCC provides an online management system allowing users to modify information and check progress themselves, whereas with APNIC you typically need to contact an LIR for changes. Overall, I chose to apply for an ASN through the RIPE NCC.

The resources obtained (both ASNs and IPs) generally fall into two categories:

- PA (Provider Aggregatable) resources: Belong to the LIR and are assigned for your use by the LIR.
- PI (Provider Independent) resources: Belong to you directly. These are generally more expensive.

# Preparation Stage

## Choosing an LIR

Search online for "LIR Service" to find many companies offering such services. Currently, RIPE NCC charges an annual administrative fee of 50 EUR for PI resources.
This means the cost from an LIR for registering an ASN generally won't be lower than 50 EUR per year (approximately 60 USD at the time of writing). Here, I chose NoPKT LLC, recommended by peers. Their pricing is quite reasonable and includes a /48 block of PA IPv6 addresses with the ASN. Activation was also very fast: it only took half a day from submitting the required documents to getting the ASN.

## Preparing Documents

### Proof of Identity

- Individual: Provide an ID card or passport (I submitted photos of the front and back of my national ID card).
- Company: Provide a valid business license.

If the applicant is a minor, written consent from their legal guardian is usually required, and the guardian must fulfill the corresponding responsibilities. All submitted documents must be authentic and valid, and should be originals or notarized copies.

### Contact Information

- Postal address: Used for registration in the RIPE Database.
- Technical contact email.
- Abuse contact email.

### Technical Justification

- A bill from a BGP-capable provider within the European region. Options include Vultr, BuyVM, iFog, V.PS, etc. Note: Vultr uses post-payment, generating invoices at the beginning of the month; if you need the documents ready quickly, consider other providers.
- The ASNs of two upstream providers you plan to connect to. (In practice, the reviewers won't strictly verify the specific upstream ASNs you list, so you can fill in common, publicly known ASNs to make it look reasonable. Don't overthink it. You can even put mine.)

# Registering a RIPE DB Account and Creating Objects

Go to the RIPE Database and register an account. For Chinese users, it's recommended to use the Pinyin of your real legal name. Enabling 2FA is mandatory, so please install a TOTP app on your phone beforehand.

## Creating a Role Object and Maintainer Object

Go to Create role and maintainer pair - RIPE Database to create a role object.
Here, a "role" is an abstract concept describing the contact information for a team, department, or functional role, such as NOC (Network Operations Center), Abuse Team, or Hostmaster. Fill in the following:

- mntner: The identifier for the maintainer object. It can contain uppercase/lowercase letters, numbers, and -_. For example, I used IYOROY-MNT.
- role: The name of the role object. It can contain uppercase/lowercase letters, numbers, and ][)(._"*@,&:!'+/-. For example, I used IYOROY-NETWORK-NOC.
- address: The office address for this role.
- e-mail: The email address for this role.

Click SUBMIT after filling out the form to create both the role object and the maintainer object. Please note the returned primary key, which usually ends with -RIPE. You will need it for future modifications and for submission to the LIR.

The maintainer object is conceptually different from the role object: the maintainer signifies who has the authority to maintain (create/modify/delete) objects in the database. The relationships between the different concepts in the RIPE Database are shown in the diagram later in the article.

## Adding an Abuse Contact Mailbox

Go to Query - RIPE Database and search for the primary key of the role you just created. You should find the entry you created; click "Update Object" on the right. Click the plus sign (+) next to the email field to add an abuse-mailbox attribute, fill in your abuse contact email address, and click SUBMIT to save.

Note: RIPE periodically checks whether the abuse-mailbox is functional. Please ensure you provide a real, active email address.

## Creating an Organization Object

The Organization object is an abstraction of a legal entity or organization (a company, university, ISP, individual user, etc.). It serves as the top-level ownership information for resource objects (like aut-num, inetnum, inet6num) in the RIPE Database.
This means subsequent ASN and IP resources will be assigned to this Organization object. Go to Create Organization - RIPE Database and fill in the following information:

- organisation: A unique ID. Keep the default AUTO-1 to let RIPE NCC assign one.
- org-name: The name of the organization. For Chinese individuals, use your full name in Pinyin.
- address: Postal address.
- country: Country code, per ISO 3166. For China, use CN.
- e-mail: The organization's email address.
- admin-c / tech-c: Administrative and technical contact objects (referencing the role handle).
- abuse-c: Specifies the abuse contact (must be a role object with the abuse-mailbox attribute set).
- mnt-ref: Specifies which maintainer(s) may create objects referencing this organisation.
- mnt-by: Specifies who can maintain this organisation object itself.

Click SUBMIT after filling out the form and note the returned object identifier, which follows a format like ORG-XXXX-RIPE. If you need to make changes after submission, go to Query - RIPE Database and search for the previously noted role primary key or the Organization object identifier to find the update option.

## Paying the LIR Fee and Submitting Documents

Submit the following documents to your chosen LIR:

- Proof of identity:
  - Full name
  - Address (recommended to match your ID document)
  - Photos of the front and back of your ID card
- RIPE Database information:
  - org: Organization object identifier
  - as-name: AS name
  - admin-c: Primary key of the role object created earlier
  - tech-c: Primary key of the role object created earlier
  - abuse-c: Primary key of the role object created earlier
  - nic-hdl: Primary key of the role object created earlier
  - mnt-by: Name of the maintainer object created earlier
- Technical justification:
  - VPS bill/invoice
  - Upstream ASNs

The LIR will likely ask you to add a mnt-ref attribute to your Organization object, pointing to the LIR's maintainer. This allows the LIR to assign the AS and IP resources to your Organization.
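For orientation, after the LIR's maintainer has been added via mnt-ref, the organisation object in the database ends up looking roughly like the RPSL sketch below. All values here are illustrative placeholders rather than real registry data, and attributes such as org-type and source are filled in by the RIPE web form rather than typed by hand:

```
organisation: AUTO-1
org-name:     Zhang San               # Pinyin full name (placeholder)
org-type:     OTHER
address:      Example Road 1, Example City
country:      CN
e-mail:       contact@example.com
admin-c:      EXAMPLE-RIPE            # role primary key (placeholder)
tech-c:       EXAMPLE-RIPE
abuse-c:      EXAMPLE-RIPE
mnt-ref:      EXAMPLE-MNT             # your own maintainer
mnt-ref:      LIR-EXAMPLE-MNT         # the LIR's maintainer, added on request
mnt-by:       EXAMPLE-MNT
source:       RIPE
```

The duplicated mnt-ref line is the key detail: without the LIR's maintainer listed there, the LIR cannot create resource objects that reference your organisation.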
Once the LIR reviews and approves your application, they will submit the request to RIPE. Then, it's a waiting game: it generally takes 3-5 working days to get your ASN. At this point, we have successfully registered our own ASN on the public internet.

# Supplement: Relationships Between Concepts in the RIPE Database

```mermaid
graph LR
    %% ========== ORG Layer ==========
    subgraph Org["Organisation"]
        ORG["organisation\n(ORG-XXX-RIPE)"]
    end

    %% ========== Resource Layer ==========
    subgraph Resource["Resources"]
        INETNUM["inetnum\n(IPv4 Block)"]
        INET6NUM["inet6num\n(IPv6 Block)"]
        AUTNUM["aut-num\n(ASN)"]
        ASSET["as-set\n(ASN Set)"]
    end

    %% ========== Routing Layer ==========
    subgraph Routing["Routing"]
        ROUTE["route\n(IPv4 Route Announcement)"]
        ROUTE6["route6\n(IPv6 Route Announcement)"]
    end

    %% ========== Contact Layer ==========
    subgraph Contact["Contacts"]
        ROLE["role\n(Team/Function)\nnic-hdl"]
        PERSON["person\n(Individual)\nnic-hdl"]
    end

    %% ========== Authorization Layer ==========
    subgraph Maintainer["Authorization"]
        MNT["mntner\n(Maintainer)"]
    end

    %% ========== Contact Links ==========
    INETNUM --> ROLE
    INET6NUM --> ROLE
    AUTNUM --> ROLE
    ASSET --> ROLE
    ROUTE --> ROLE
    ROUTE6 --> ROLE
    ROLE --> PERSON

    %% ========== Organization Assignment ==========
    ORG --> INETNUM
    ORG --> INET6NUM
    ORG --> AUTNUM
    ORG --> ASSET

    %% ========== Authorization ==========
    ORG --> MNT
    INETNUM --> MNT
    INET6NUM --> MNT
    AUTNUM --> MNT
    ASSET --> MNT
    ROUTE --> MNT
    ROUTE6 --> MNT
    ROLE --> MNT
    PERSON --> MNT

    %% ========== Route Binding ==========
    ROUTE -->|origin| AUTNUM
    ROUTE6 -->|origin| AUTNUM

    %% ========== Route Scope ==========
    ROUTE -->|belongs to| INETNUM
    ROUTE6 -->|belongs to| INET6NUM
```

Special thanks to Mi Lu for providing technical support and answering questions!

Reference Articles:

- 自己在家开运营商 Part.1 - 注册一个 ASN - LYC8503
- 从0开始注册一个ASN并广播IP | Pysio's Home
- 青年人的第一个运营商:注册一个 ASN | liuzhen932 的小窝
18/08/2025
460 Views
0 Comments
2 Stars