Ever since moving to Pop!_OS 22.04 LTS, wired networking has been very slow. I recalled seeing posts on this problem that pinned the issue on the included driver for the RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller.
Performance was not only poor but downloads were much slower than uploads as evidenced by the Speedtest CLI package:
I’m not sure if this is an issue with Lenovo’s updating of the BIOS resetting the default boot back to Windows or if it is Windows itself. Since this has happened a couple of times, this is a reminder to myself on how to fix this:
Repair the bootloader – since even manually selecting Pop!_OS leads to a hang in UEFI
Well, despite my misgivings over the price of Netgate’s rack mount for the 4100/6100 firewall series, I finally broke down and bought one. I still think it is far overpriced even with the nice blue anodised aluminium. I really wish that Netgate had created a proper chassis with integrated power supply for the 4100/6100 series. Not that I would be buying a replacement even if they did given the cost. It does look okay; but, in reality, a standard rack shelf is good enough.
Next, notice anything else new in the rack:
Hint: It isn’t HPE but from IBM…
My son has made fun of me for not having “one of those pull out screens” for the rack. I actually have been looking for some time on eBay but the prices for any that will ship to Canada are like cra-cra, even for those that are broken, unknown working, missing cables, missing rails, etc. However, I found one that was being thrown out (keyboard damaged, monitor unknown) but it has the rails and the cable management arm, and both the rails and arm were only slightly bent (i.e., fixable).
The keyboard was the “classic” ThinkPad keyboard with the TrackPoint and touch pad. But it has PS/2 connectors (oh-so-retro) that neither the DL360 G8 nor the DL380 Gen9 has, and I didn’t have a proper PS/2 to USB adaptor (most are just electrical pass-thru which does not work – for me at least).
The solution was to pick up a Lenovo ThinkPad Compact USB Keyboard with TrackPoint. Not having a track pad doesn’t bother me as my servers just run in text mode. I did have to modify the drawer as the compact USB keyboard is about 1 cm narrower and, if bumped, would drop through the drawer. The drawer is designed with the original keyboard hanging through the drawer and held down with a big piece of Velcro (as is the monitor’s power supply – simple and effective). I solved this with a piece of 3/8″ thick backer board and some countersunk screw holes. But no Velcro because I don’t have any and I don’t think it is needed.
I am now a “proper” nerd with a rack KVM console. No KVM switch because the two I still have are HDMI, not VGA – but my two servers (and old pfSense box) use VGA. There is some interference on the monitor (it is only 15″) and it uses some weird interface that both the power and VGA connect to. It is pretty thin and uses a standard VESA mount. I’ve looked around but these seem hard to get and I can live with the interference for the limited times I use it. I mostly use the iLO but there are cases when I need to be at the console.
The old DL360 G8 is only used for testing. I did manage to get the pfSense box working again after replacing the RAM, reseating the mSATA drive and re-installing pfSense. But, with a server, my backup UniFi Switch24 and a firewall, hmmm… Opportunity?
It has been just over 17 days with my new Netgate 6100. Shipping was sort-of a day late – I guess because couriers seem to have difficulty with delivery times to the easterly part of North America. I purchased the “base” model with 8GB RAM and 16GB of storage. I’m not worried about the storage since all the logs go to another syslog server anyway. And since I’m not running a branch office or anything, this is more than sufficient for my needs.
The migration from the old commodity-based router was fairly easy. I booted up the new router with my workstation connected to my planned “LAN” interface (not really, as my configuration has a few VLANs, etc.) and the planned WAN interface connected to, well, the WAN. I then allowed the 6100 to update to pfSense+ 23.01 (the most current). After looking up the network interface names in Netgate’s awesome documentation, you only have to edit the XML backup file to replace the old network interface names with the new ones. Then you restore the suitably modified backup file. Some additional time is needed for the additional services such as OpenVPN, pfBlockerNG, etc. to download and update.
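The interface-renaming step can be scripted rather than done by hand. What follows is a minimal Python sketch, not Netgate's procedure. It assumes the interface names sit inside <if> and <vlanif> elements of the backup, and the old names (igb0, igb1) and new names (ix3, igc0) are placeholders; check the real names in the documentation and in your own backup before trusting it.

```python
import re

# Placeholder mapping (an assumption): old NIC name -> new NIC name.
INTERFACE_MAP = {"igb0": "ix3", "igb1": "igc0"}

# Matches <if>igb0</if>, <if>igb1.20</if> and <vlanif>igb1.20</vlanif>.
IF_PATTERN = re.compile(r"<(if|vlanif)>([A-Za-z_]+\d+)(\.\d+)?</\1>")


def remap_interfaces(config_xml: str, mapping: dict) -> str:
    """Rename interfaces in one pass, keeping any VLAN suffix such as '.20'."""

    def swap(match: re.Match) -> str:
        tag, name, vlan = match.group(1), match.group(2), match.group(3) or ""
        new_name = mapping.get(name, name)  # unmapped names are left alone
        return f"<{tag}>{new_name}{vlan}</{tag}>"

    return IF_PATTERN.sub(swap, config_xml)


if __name__ == "__main__":
    sample = (
        "<wan><if>igb0</if></wan>"
        "<vlan><if>igb1</if><tag>20</tag><vlanif>igb1.20</vlanif></vlan>"
        "<opt1><if>igb1.20</if></opt1>"
    )
    print(remap_interfaces(sample, INTERFACE_MAP))
```

Doing all the renames in a single pass avoids the chain problem you can hit with repeated search-and-replace, where one old name is renamed to a value that a later replacement then renames again. Work on a copy of the backup and diff the result before restoring it.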
The only problem I had was that, despite adding the MAC address to my ISP’s router’s Advanced DMZ configuration, inbound access was not working. After checking – and double checking – my configurations (such as “Did I enter the correct MAC from the 6100?”), I fell back on the old, default IT help desk recommendation… I rebooted and all was working again.
What I like about the 6100:
Longer-term futureproofing: I now have 10GbE interfaces if I go above a 1GbE WAN connection and/or upgrade to 10GbE internally. The four “LAN” ports are actually 2.5GbE so more room there, too.
pfSense is fully supported by Netgate on known hardware: Fewer worries about upgrades going wrong.
Price: The price is essentially the same as for a generic router with two 10GbE SFP+ ports, two copper/SFP shared ports plus four 2.5GbE ports – assuming you can actually find this configuration.
What I do not like:
Having to buy a 1U adaptor. The price of US$107 is something I really do not like.
I guess buying the not-so-cheap Alibaba-ish 1U rack router was not a totally great idea. Something failed and I think that it is the network ports. All the network ports. All six of them.
Anyway, I have a Netgate 6100 ordered with one-day testing and express shipping by FedEx. One day shipping – I wonder how that will work out coming from the US – was only $7 more.
Once I get it in, I hope that simple editing of the backup file to line up the new interface names with the port assignments will put everything back to normal. The one thing I think I’m going to miss is the VGA output as it makes set up so easy. I wonder how the console connection will work out.
Not much to do now but wait. With pfSense down, there are no VLANs/subnets, DHCP, DNS, or access to any network resources like the NAS, etc.
Here’s hoping that everyone had a Merry Christmas (or whichever holiday you celebrate!) and wishing everyone a happy, healthy and prosperous 2023!
Since my last post back in mid-October, I had my thoughts on what my next upgrade(s) would be – and I changed my mind. The DL360 Gen8 is, frankly, too noisy. It is great for testing things out, but with hybrid work it is a distraction at best and a maddening irritation at worst. Thus, buying the UniFi Aggregation Switch would serve no purpose – for now.
The other driving factor is that my “work” laptop, an old Lenovo Y50-70, started getting far too flaky. I think that there may be some cold solder joints but I am not set up to fix them. And it is getting old. I first thought that the Ubuntu 22.10 upgrade had gone sideways. There had been a lot of upgrades from Ubuntu 16.04 and I don’t limit myself to the LTS releases. When I went to do a fresh 22.10 install, the install would fail with not being able to find the Samsung Evo 850 SSD. I put the SSD in the MiniG3 and the Evo worked fine. The Samsung SSD from the MiniG3 showed the same issues in the Y50-70. After valiant service, I decided that the Y50 had to be put out to pasture.
I decided to replace it with my IdeaPad L340 (my now-old gaming laptop) with a minimal Windows 11 install – jury’s still out on Windows 11, but it seems to be incrementally improving; maybe Windows 12 will fix it 🙂 – for some specific Windows things. Most of the time it will be booted into Ubuntu.
Of course, that meant the L340 needed a replacement. The Black Friday/Cyber Monday week sales were on and I decided on a Lenovo Legion 5 AMD. It is running an AMD Ryzen 5 6600H, RTX3060, 16GB DDR5 dual-channel RAM and a 512GB PCIe Gen4 SSD. This is my first AMD system in, what?, 30 years. My last one was an Am386DX-40. Besides, my son had decided he wanted to build his own gaming rig using the Ryzen 5 6600 🙂 It is a nice laptop – runs quick, and battery life for web browsing and YouTube is about 4-5 hours (with an 80% “full” charge using Lenovo’s battery conservation).
What’s next? Not sure yet. I’m still waiting for ArmA 4…
I picked up an HP EliteDesk 705 G3 Mini PC as the potential third node for a Proxmox cluster. While I wouldn’t be using HA, I did not want to cause potential problems with a tied quorum vote. For under CDN$100 I got an AMD Pro A10-8770E, 8GB RAM and a 128GB (Samsung OEM) SSD. It is really small and uses next to no power. And, it has no noisy fans.
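The tied-vote concern is easy to see with a little arithmetic. The sketch below is illustrative only, assuming one vote per node and a strict-majority rule; it does not call any Proxmox or Corosync API, and the function names are my own.

```python
def votes_needed(total_votes: int) -> int:
    """Strict majority: more than half of all configured votes."""
    return total_votes // 2 + 1


def has_quorum(total_votes: int, votes_online: int) -> bool:
    """True when the nodes still online hold a strict majority."""
    return votes_online >= votes_needed(total_votes)


if __name__ == "__main__":
    for nodes in (2, 3):
        survivors = nodes - 1  # one node goes offline
        print(
            f"{nodes} nodes: need {votes_needed(nodes)} votes, "
            f"{survivors} online -> quorum: {has_quorum(nodes, survivors)}"
        )
```

With two nodes, losing either one leaves a single vote out of two, which is not a majority. With three nodes, any two can still outvote the missing one, which is the whole job of the small third node.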
While the jury is still out on this approach because the DL360 G8’s fans are so annoying, I still installed Proxmox on the 705. Part of this was because I wanted to experiment with only having one NIC and using VLANs. Setting up the VLANs was no real issue – just a little more fooling around on the command line manually configuring the admin interface to work on a VLAN and allow the other VLAN bridges to be available. However, this really messed with my head as I would have the network interfaces drop offline when under heavy load. And sometimes for what seemed to be no reason at all! Of course, since this was my first time using a single NIC, I was placing the blame on me not setting up the VLANs correctly.
Maybe the issue was Proxmox (and Debian?). So, I put on Ubuntu 22.04 server and desktop as well as Linux Mint 21 to test that theory out. No VLANs, just a “normal” installation. Same issue: under load the NIC would go offline and the console (for server) would sow my field with salt – or rather a bunch of errors. Since this occurred with and without VLANs, the error had to be with something other than my VLAN configuration.
After much digging, there seems to be a (long-term?) nasty kernel bug with the tg3 driver and the Broadcom BCM5762 NIC.
The solution that worked for me was to add iommu=pt to the kernel command line in /etc/default/grub.
Then just run update-grub and reboot. Problem fixed.
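For anyone who would rather script the edit, here is a minimal Python sketch of the change. It assumes the parameter belongs on the GRUB_CMDLINE_LINUX_DEFAULT line and that the value is double-quoted, as it is in a stock Debian or Ubuntu file; the function name is my own. It only transforms text, so writing the file back and running update-grub are still up to you.

```python
import re

# Assumes a line such as: GRUB_CMDLINE_LINUX_DEFAULT="quiet"
CMDLINE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def add_kernel_param(grub_text: str, param: str = "iommu=pt") -> str:
    """Append param to the default kernel command line unless already there."""

    def patch(match: re.Match) -> str:
        prefix, args, quote = match.group(1), match.group(2), match.group(3)
        if param in args.split():
            return match.group(0)  # already present, leave the line untouched
        return f"{prefix}{(args + ' ' + param).strip()}{quote}"

    return CMDLINE.sub(patch, grub_text)


if __name__ == "__main__":
    before = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet"\n'
    print(add_kernel_param(before))
```

Running the function twice gives the same result as running it once, so it is safe to re-apply after a package update rewrites the file.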
I read other suggestions to blacklist tg3 in /etc/modprobe.d, since that driver was the culprit, but that did not work for me.
A couple of notes on where I am with my ESXi to Proxmox move.
What I have done:
I finally bit the bullet and deleted all of my iSCSI LUNs for ESXi – they were simply wasting space at this point.
I also moved from iSCSI to NFS for my Proxmox VMs. The main difference between iSCSI and NFS is that iSCSI shares data on the block level, and NFS shares data on the file level. Performance is almost the same, but, in some situations, iSCSI can provide better results. For my purposes NFS is fine.
And NFS has the benefit that I can access the VMs on the NAS in Synology File Station. There is a level of comfort in being able to “see” all of my VMs without having to set up an iSCSI initiator.
I haven’t worked on network bonding for the NFS 10 GbE connections. Frankly, I do not think I will see any benefits and “added complexity equals added problems.”
Both my “production” (PROD) Proxmox server (DL380 G9) and my “development” (DEV) Proxmox server (DL360 G8) can access the same NFS shares. That means that my ISOs are shared as well as my Proxmox backups. This also means that I can create a VM on the DEV box and, by backing up the VM to the shared backup location, restore it to the PROD box. While not the smoothest method, it is a rather neat workaround. I do have to be careful about starting any VMs that are visible to both servers, otherwise “bad things” would happen.
I have moved a number of full VMs to LXC containers. Things like my reverse proxy, a DNS server, Home Assistant (I don’t really have any devices for this yet but with Z-Wave it will give me something to do 🙂 ), and Pi-Hole. There is really no need to have a full blown VM for these simple services. Proxmox makes it easy.
Pi-Hole – not part of Proxmox (other than that it runs in an LXC container) but I finally got around to setting up Pi-Hole. A couple of additional configs were needed to get it working for my clients, as Pi-Hole sits on one subnet while my wired and wireless devices sit on other subnets. And, WOW!, the amount of ads, etc. that are blocked is crazy. Here are the last 24 hour stats – 51.4% blocked requests?!?!?!
The To-Do (or think-about) list:
Adding a UniFi Link Aggregation switch – while both Proxmox nodes are connected directly to the Synology NAS, they cannot connect directly to each other for storage. One node is using the 172.16.10.0/24 subnet and the other 18.104.22.168/24, with one of the two 10 GbE ports of the NAS assigned to each. With the switch I can have them both on the same subnet (yes, I know that I could have “broken” the 172.16.10.0/24 subnet in two using /25).
Adding a “one litre” PC (such as an HP EliteDesk 705 G3) with Proxmox installed and setting up the DL360 and DL380 as a cluster using the 705 to complete the quorum. Allowing the 705 (or whatever) access to storage may be beneficial, but the 705 does not have 10 GbE. I *think* that the main nodes need to have storage on the same subnet to work correctly (I haven’t read up on that, yet). The jury is still out on that given the cost of electricity, heat (not so bad in winter) and noise (the rack sits behind me in my office).
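The /25 split mentioned in the first to-do item is quick to check with Python's standard ipaddress module. This is only a sketch of the arithmetic: the helper name and the two sample host addresses are made up for illustration.

```python
import ipaddress


def split_in_two(cidr: str) -> list:
    """Split a network into its two equal halves (prefix length + 1)."""
    network = ipaddress.ip_network(cidr)
    return list(network.subnets(prefixlen_diff=1))


if __name__ == "__main__":
    lower, upper = split_in_two("172.16.10.0/24")
    for half in (lower, upper):
        usable = half.num_addresses - 2  # minus network and broadcast addresses
        print(f"{half}: {usable} usable hosts")

    # Hypothetical host addresses, one per direct link to the NAS.
    print(ipaddress.ip_address("172.16.10.5") in lower)
    print(ipaddress.ip_address("172.16.10.200") in upper)
```

Each half keeps 126 usable addresses, which is far more than two point-to-point storage links need.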
Sometimes when you get to the end of the dock, you want to jump into the water. After getting comfortable (enough) with Proxmox I took the plunge and deleted ESXi 7 from my DL380 Gen9 and moved all of my VMs from the DL360G8 test server. Everything seems to work just fine.
I did not change the Smart Array P440ar Controller into IT mode. I left it in HBA mode with one mirror for Proxmox and another mirror for local ISOs. Now, that is a little overkill for the ISOs given they are enterprise SSDs. The other reason why I am not worried about HBA mode is that my VMs are on one of my NASes.
I did leave the (unconverted) ESXi VMs on the NAS and have the ESXi configuration backed up in case I need to revert. I like to have a solid fallback plan…
There is one major last thing I have to sort: I cannot get the bonded 10 GbE connections between the DL380 and the RS1221+ to be stable. I am pretty sure that this is something with the bonding method I selected. There is no switch between the two – it is just NIC-to-NIC (times two). For now, I will let this sit for a while.
Overall, I am impressed with Proxmox. VMs seem snappier than under ESXi. There are some quirks: there is no file manager as in ESXi (really nice to have), the need to have different storage types for different purposes (head scratch) and the fact that if you use LVM for iSCSI you cannot take snapshots (what?!?!).
I took the plunge over the last few nights and migrated all of my VMs from ESXi to Proxmox. Outside of the previously mentioned updates to the NIC configurations, everything went fine. For Windows guests you will need to follow the migration instructions (e.g., make sure you manually modify the Windows VM’s Proxmox config file and change scsi0 to sata0). Once you do that, you will be able to boot successfully. You will also have to add the virtio drivers. This is a manual download outside of Proxmox.
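The scsi0 to sata0 change can be sketched in Python as below. This is an illustration under assumptions, not the official migration tooling: it treats the VM config as plain "key: value" lines, the storage and disk names in the sample are hypothetical, and the function name is my own. It also updates a boot or bootdisk line that refers to the old disk, since leaving that pointing at scsi0 would defeat the purpose.

```python
import re


def switch_disk_bus(conf_text: str, old: str = "scsi0", new: str = "sata0") -> str:
    """Rename one disk key and fix any boot entry that refers to it."""
    result = []
    for line in conf_text.splitlines():
        key, sep, value = line.partition(":")
        if key.strip() == old:
            line = f"{new}:{value}"  # same disk, different bus key
        elif key.strip() in ("boot", "bootdisk"):
            line = key + sep + re.sub(rf"\b{re.escape(old)}\b", new, value)
        result.append(line)
    trailing = "\n" if conf_text.endswith("\n") else ""
    return "\n".join(result) + trailing


if __name__ == "__main__":
    # Hypothetical config contents for a migrated Windows guest.
    sample = (
        "boot: order=scsi0;net0\n"
        "scsi0: local-lvm:vm-100-disk-0,size=32G\n"
        "scsihw: virtio-scsi-pci\n"
    )
    print(switch_disk_bus(sample))
```

Only the exact key scsi0 is renamed, so unrelated lines such as scsihw are left alone. Keep a copy of the original config before writing anything back.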
I have the ESXi server shut down so I can restart it if necessary. The only thing I will have to do if I need to go back to ESXi is to get the weather station data.
To do list:
Research how to get the 10GbE links working between the DL380G9 and the RS1221+. I’ll be taking my time with this one as I do not want to change the ESXi server to Proxmox until I have some more time with Proxmox and feel comfortable.
Get syslog sending to my DS216+II. I just started looking into this and it seems that, unlike ESXi, I need to do this in the underlying Debian 11 operating system that Proxmox runs on. Remember that ESXi (and XCP-NG) are bare-metal appliances, while Proxmox is KVM running on top of a full Debian install.