XenServer 5.5 / 5.6 – Poor Network Performance with Broadcom NICs
For some unknown reason, a standard install of XenServer 5.5 or 5.6 on servers with Broadcom-based NICs suffers from poor network performance within VMs. Various fixes have been suggested. For Windows guests, some people suggest simply adding the ‘DisableTaskOffload’ value, set to 1, under HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, but I have found this to be insufficient.
Through trial and error I have found that the following combination resolves the problem, certainly on Dell M610 blades:
- Windows VMs – set the DisableTaskOffload value to 1 (location as above)
- Disable TCP offload on all physical NICs in each XenServer host. From the command line run the following:
- xe pif-list host-name-label=<hostname> (make a note of the UUID for each physical interface)
- For each UUID run the following commands:
- xe pif-param-set other-config:ethtool-tx="off" uuid=<physical interface>
- xe pif-param-set other-config:ethtool-rx="off" uuid=<physical interface>
- Compile the Broadcom driver from the latest source.
- Download the latest Broadcom driver from www.broadcom.com and then the XenServer DDK (this is specific to the version of XenServer you are running) from MyCitrix. Note that you will need to log in to download the DDK.
- After downloading the DDK ISO, expand it using 7-Zip, then import the appliance using XenCenter. This will import the DDK VM; all you need to do is ensure that you have a DHCP server running on the network you map the VM to.
- Expand the ZIP file downloaded from Broadcom and upload the contents using WinSCP to /root
- Log on to the XenServer console, then expand the tar.gz file, for example: tar -xzvf <tar file name>
- Traverse the resulting directory structure, for example cd netxtreme2-5.2.55/bnx2-2.0.8e/src
- Whilst in the src directory run make clean then make build
- Using WinSCP copy the bnx2.ko file to your PC.
- Finally connect to each XenServer using WinSCP and switch to /lib/modules/<<<2.6xen>>>/kernel/drivers/net
- Rename the existing bnx2.ko to bnx2.ko.old
- Drag and drop the bnx2.ko file on your PC to /lib/modules/<<<2.6xen>>>/kernel/drivers/net
- From the XenServer console switch to /lib/modules/<<<2.6xen>>>/kernel/drivers/net and run chmod 0744 bnx2.ko
- Migrate any VMs to other hosts then reboot!
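The "for each UUID" step above can be scripted rather than typed by hand. What follows is only a minimal sketch, not a tested procedure: the function name disable_offload_for_host and the XE override variable are my own inventions for illustration, and it assumes that filtering with physical=true is enough to skip bond and VLAN interfaces on your hosts.

```shell
#!/bin/sh
# Sketch: disable TX/RX checksum offload on every physical PIF of one host.
# XE is a hypothetical override (not part of XenServer) so that the commands
# can be previewed with a stand-in program; it defaults to the real xe CLI.
XE="${XE:-xe}"

disable_offload_for_host() {
    host="$1"
    # --minimal returns the UUIDs as a single comma-separated line.
    # Assumption: physical=true leaves out bond and VLAN PIFs.
    uuids=$($XE pif-list host-name-label="$host" physical=true params=uuid --minimal)

    # Split the comma-separated list and apply both settings to each PIF.
    for uuid in $(echo "$uuids" | tr ',' ' '); do
        $XE pif-param-set uuid="$uuid" other-config:ethtool-tx="off"
        $XE pif-param-set uuid="$uuid" other-config:ethtool-rx="off"
    done
}
```

To use it, source the script on a pool member and run disable_offload_for_host <hostname>. As with the manual commands, the settings are stored in other-config, so expect them to take effect only after the reboot in the final step.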
Interestingly, I have encountered this issue with vanilla XenServer 5.5 and 5.6, and on each occasion the fix above worked. What is perhaps more interesting is that I grabbed the latest source from Broadcom on each occasion and the recompiled driver worked, implying it’s not a one-off version issue with the Citrix-supplied module. Comparing the module sizes also shows that the Citrix-supplied version is around 200 KB whilst the version I compiled weighed in at 500 KB. It makes you wonder what the difference is.
Follow these instructions at your own risk! They work for me but could kill your XenServer, you have been warned!!!
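For anyone wanting to repeat the size comparison, here is a small sketch that prints the size and reported version of a module file. The function name module_summary is made up for this example, and it assumes modinfo is available on the host; if modinfo cannot read the file, the version is shown as "unknown".

```shell
#!/bin/sh
# Sketch: print the size in bytes and the driver version of a kernel module file.
module_summary() {
    file="$1"
    # wc -c counts bytes; tr strips the padding some wc versions add.
    size=$(wc -c < "$file" | tr -d ' ')
    # modinfo -F version prints only the module's version field.
    version=$(modinfo -F version "$file" 2>/dev/null) || version="unknown"
    [ -n "$version" ] || version="unknown"
    echo "$file size=${size} version=${version}"
}
```

Run it once against bnx2.ko.old and once against the new bnx2.ko. One possible explanation for a larger self-built module (a guess on my part, nothing I have confirmed) is that it still contains debug symbols that a packaged build would have stripped.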
Hi there, excellent post. Just wondering how the reliability of the environment has been since you recompiled the driver and disabled TCP/IP offload?
Also, what version of the driver from Broadcom did you recompile?
Since using the recompiled driver it has worked without issue, and performance is exactly as we’d expect. The XenServer hosts have not been rebooted since, so we don’t seem to have any memory leaks or other issues. We used 2.0.8e, which was the latest version available from http://www.broadcom.com. I seem to recall it was only a couple of minor revisions ahead of the Dell-supplied version; without looking, I think the supplied version was 2.0.8b. However, given that we encountered exactly the same issue with 5.5 and 5.6, it would imply that the version of the Broadcom source is somewhat irrelevant and that Dell’s packaging/customisation process introduces a bug which is removed by compiling from the vanilla source.
Mate,
I have used some of your content to write my own blog post. Let me know if you like it, or if you would like me to remove it.
http://vikashkumarroy.blogspot.com/2010/06/update-nic-driver-on-xenserver50.html
Thanks,
Vikash
I came across your post in my many searches for a resolution on the internet. I’ve posted on the Citrix forums and thus far haven’t received a response that seems to fix my issue. Although they’ve all suggested what you’re suggesting, my post was originally for performance issues in my Win2K3 guests in regards to high disk read latency. Does this kind of fix to the NIC cards really have any bearing on something like disk read latency? I’m rather green with XenServer and was actually pondering paying for support when I was told by the majority not to.
Thanks in advance for your response.
Hi Mark,
In the first instance, recompiling the NIC driver will not improve disk I/O performance. If your SRs are local then I would suggest it will have no benefit. However, if your storage is NFS/iSCSI based then your disk is accessed via the NICs, and hence any performance issues with the NICs can/will impact disk I/O. So, if you use NFS or iSCSI (software initiator), I would suggest it is at least worth testing the recompilation of the NIC drivers.
Regards,
Nathan.
Hi, the command you have used for XenServer only disables checksum offloading. Shouldn’t you be using the ethtool tso parameter instead?
@James
You may have a point, as the Xen vs. Windows config does not tally, but this combination certainly works for me and others.