Options for the bonding driver are supplied as parameters to the bonding module at load time, or are specified via sysfs.

Module options may be given as command line arguments to the insmod or modprobe command, but are usually specified in either the /etc/modules.conf or /etc/modprobe.conf configuration file, or in a distro-specific configuration file (some of which are detailed in the next section).

Details on bonding support for sysfs are provided in the “Configuring Bonding Manually via Sysfs” section, below.

The available bonding driver parameters are listed below. If a parameter is not specified, the default value is used. When initially configuring a bond, it is recommended that “tail -f /var/log/messages” be run in a separate window to watch for bonding driver error messages.

It is critical that either the miimon or arp_interval and arp_ip_target parameters be specified, otherwise serious network degradation will occur during link failures. Very few devices do not support at least miimon, so there is really no reason not to use it.
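The rule above can be expressed as a small pre-flight check. The following is a minimal sketch in Python, not part of the bonding driver; the function name has_link_monitor and the idea of passing options as a dictionary are assumptions made for illustration. It relies only on the defaults stated in this document (miimon and arp_interval default to 0, arp_ip_target defaults to no addresses).

```python
def has_link_monitor(options):
    """Return True if the options enable MII or ARP link monitoring.

    `options` is a hypothetical dict of bonding option names to values,
    e.g. {"mode": "active-backup", "miimon": 100}.
    """
    miimon = int(options.get("miimon", 0))              # default is 0 (disabled)
    arp_interval = int(options.get("arp_interval", 0))  # default is 0 (disabled)
    arp_targets = options.get("arp_ip_target", "")      # default is no addresses

    # ARP monitoring needs both a non-zero interval and at least one target.
    arp_enabled = arp_interval > 0 and bool(arp_targets)
    return miimon > 0 or arp_enabled
```

A configuration tool could call this before loading the module and refuse, or at least warn, when it returns False.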

Options with textual values will accept either the text name or, for backwards compatibility, the option value. E.g., “mode=802.3ad” and “mode=4” set the same mode.
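To illustrate the equivalence of text names and numeric values, here is a minimal sketch in Python that normalizes a mode value to its text name. The table reproduces the mode names and numbers listed in this document; the names BONDING_MODES and normalize_mode are hypothetical and do not exist in the driver.

```python
# Mode names and their numeric equivalents, as listed in this document.
BONDING_MODES = {
    "balance-rr": 0,
    "active-backup": 1,
    "balance-xor": 2,
    "broadcast": 3,
    "802.3ad": 4,
    "balance-tlb": 5,
    "balance-alb": 6,
}


def normalize_mode(value):
    """Return the text name for a mode given as a name or as a number."""
    text = str(value).strip()
    if text in BONDING_MODES:
        return text
    for name, number in BONDING_MODES.items():
        if text == str(number):
            return name
    raise ValueError(f"unknown bonding mode: {value!r}")
```

With this helper, both "802.3ad" and "4" resolve to the same mode name, mirroring the example above.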

Linux Bonding Options:

arp_interval

Specifies the ARP link monitoring frequency in milliseconds.

The ARP monitor works by periodically checking the slave devices to determine whether they have sent or received traffic recently (the precise criteria depend upon the bonding mode, and the state of the slave). Regular traffic is generated via ARP probes issued for the addresses specified by the arp_ip_target option.

This behavior can be modified by the arp_validate option, below.

If ARP monitoring is used in an etherchannel compatible mode (modes 0 and 2), the switch should be configured in a mode that evenly distributes packets across all links. If the switch is configured to distribute the packets in an XOR fashion, all replies from the ARP targets will be received on the same link, which could cause the other team members to fail.

ARP monitoring should not be used in conjunction with miimon. A value of 0 disables ARP monitoring. The default value is 0.

arp_ip_target

Specifies the IP addresses to use as ARP monitoring peers when arp_interval is > 0. These are the targets of the ARP request sent to determine the health of the link to the targets.
Specify these values in ddd.ddd.ddd.ddd format. Multiple IP addresses must be separated by a comma. At least one IP address must be given for ARP monitoring to function. The
maximum number of targets that can be specified is 16. The default value is no IP addresses.
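The constraints on arp_ip_target (comma-separated dotted-quad addresses, at least one, at most 16) can be checked before the value is handed to the driver. The following is a minimal sketch in Python using the standard ipaddress module; the function name parse_arp_ip_target is hypothetical, and the driver's own parsing may accept or reject inputs differently.

```python
import ipaddress

MAX_ARP_TARGETS = 16  # limit stated in this document


def parse_arp_ip_target(value):
    """Split a comma-separated arp_ip_target string and validate each address."""
    targets = [item.strip() for item in value.split(",")]
    if targets == [""]:
        raise ValueError("at least one IP address is required")
    if len(targets) > MAX_ARP_TARGETS:
        raise ValueError(f"at most {MAX_ARP_TARGETS} targets may be specified")
    # IPv4Address raises a ValueError subclass on malformed addresses.
    return [str(ipaddress.IPv4Address(item)) for item in targets]
```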

arp_validate

Specifies whether or not ARP probes and replies should be validated in the active-backup mode. This causes the ARP monitor to examine the incoming ARP requests and replies, and only consider a slave to be up if it is receiving the appropriate ARP traffic.

Possible values are:

none or 0

No validation is performed. This is the default.

active or 1

Validation is performed only for the active slave.

backup or 2

Validation is performed only for backup slaves.

all or 3

Validation is performed for all slaves.

For the active slave, the validation checks ARP replies to confirm that they were generated by an arp_ip_target. Since backup slaves do not typically receive these replies, the validation performed for backup slaves is on the ARP request sent out via the active slave. It is possible that some switch or network configurations may result in situations
wherein the backup slaves do not receive the ARP requests; in such a situation, validation of backup slaves must be disabled.

This option is useful in network configurations in which multiple bonding hosts are concurrently issuing ARPs to one or more targets beyond a common switch. Should the link between the switch and target fail (but not the switch itself), the probe traffic generated by the multiple bonding instances will fool the standard ARP monitor into considering the links as still up. Use of the arp_validate option can resolve this, as
the ARP monitor will only consider ARP requests and replies associated with its own instance of bonding.

This option was added in bonding version 3.1.0.

downdelay

Specifies the time, in milliseconds, to wait before disabling a slave after a link failure has been detected. This option is only valid for the miimon link monitor. The downdelay
value should be a multiple of the miimon value; if not, it will be rounded down to the nearest multiple. The default
value is 0.
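The rounding rule can be sketched in a few lines of Python. This is an illustration of the arithmetic described above, not driver code; the function name effective_delay is hypothetical. The same rule is documented for updelay.

```python
def effective_delay(delay_ms, miimon_ms):
    """Round a downdelay or updelay value down to a multiple of miimon."""
    if miimon_ms <= 0:
        # These options are only valid when the miimon link monitor is in use.
        raise ValueError("miimon must be greater than 0")
    return (delay_ms // miimon_ms) * miimon_ms
```

For example, with miimon=100 a requested downdelay of 250 would take effect as 200, and a requested value below 100 would round down to 0.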

fail_over_mac

Specifies whether active-backup mode should set all slaves to the same MAC address (the traditional behavior), or, when enabled, change the bond’s MAC address when changing the active interface (i.e., fail over the MAC address itself).

Fail over MAC is useful for devices that cannot ever alter their MAC address, or for devices that refuse incoming broadcasts with their own source MAC (which interferes with
the ARP monitor).

The down side of fail over MAC is that every device on the network must be updated via gratuitous ARP, vs. just updating a switch or set of switches (which often takes place for any traffic, not just ARP traffic, if the switch snoops incoming traffic to update its tables) for the traditional method. If the gratuitous ARP is lost, communication may be disrupted.

When fail over MAC is used in conjunction with the mii monitor, devices which assert link up prior to being able to actually transmit and receive are particularly susceptible to loss of
the gratuitous ARP, and an appropriate updelay setting may be required.

A value of 0 disables fail over MAC, and is the default. A value of 1 enables fail over MAC. This option is enabled automatically if the first slave added cannot change its MAC address. This option may be modified via sysfs only when no slaves are present in the bond.

This option was added in bonding version 3.2.0.

lacp_rate

Specifies the rate at which the link partner is asked to transmit LACPDU packets in 802.3ad mode. Possible values
are:

slow or 0
Request partner to transmit LACPDUs every 30 seconds.

fast or 1
Request partner to transmit LACPDUs every 1 second.

The default is slow.

max_bonds

Specifies the number of bonding devices to create for this instance of the bonding driver. E.g., if max_bonds is 3, and the bonding driver is not already loaded, then bond0, bond1 and bond2 will be created. The default value is 1.

miimon

Specifies the MII link monitoring frequency in milliseconds. This determines how often the link state of each slave is inspected for link failures. A value of zero disables MII
link monitoring. A value of 100 is a good starting point. The use_carrier option, below, affects how the link state is determined. See the High Availability section for additional information. The default value is 0.

 

mode

Specifies one of the bonding policies. The default is
balance-rr (round robin). Possible values are:

balance-rr or 0

Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.

active-backup or 1

Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond’s MAC address is
externally visible on only one port (network adapter) to avoid confusing the switch.

In bonding version 2.6.2 or later, when a failover occurs in active-backup mode, bonding will issue one or more gratuitous ARPs on the newly active slave.
One gratuitous ARP is issued for the bonding master interface and each VLAN interface configured above it, provided that the interface has at least one IP
address configured. Gratuitous ARPs issued for VLAN interfaces are tagged with the appropriate VLAN id.

This mode provides fault tolerance. The primary option, documented below, affects the behavior of this
mode.

balance-xor or 2

XOR policy: Transmit based on the selected transmit hash policy. The default policy is a simple [(source MAC address XOR’d with destination MAC address) modulo slave count]. Alternate transmit policies may be selected via the xmit_hash_policy option, described below.

This mode provides load balancing and fault tolerance.

broadcast or 3

Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.

 

802.3ad or 4

IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification.

Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option, documented below. Note that not all transmit
policies may be 802.3ad compliant, particularly in regard to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard. Differing
peer implementations will have varying tolerances for noncompliance.

Prerequisites:

1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave.

2. A switch that supports IEEE 802.3ad Dynamic link aggregation.

Most switches will require some type of configuration to enable 802.3ad mode.

balance-tlb or 5

Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the
current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave
takes over the MAC address of the failed receiving slave.

Prerequisite:

Ethtool support in the base drivers for retrieving the speed of each slave.

balance-alb or 6

Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and does not require any special switch support. The receive load balancing is achieved by ARP negotiation.
The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware
address of one of the slaves in the bond such that different peers use different hardware addresses for the server.

Receive traffic from connections created by the server is also balanced. When the local system sends an ARP Request the bonding driver copies and saves the peer’s IP information from the ARP packet. When the ARP Reply arrives from the peer, its hardware address is retrieved and the bonding driver initiates an ARP
reply to this peer assigning it to one of the slaves in the bond. A problematic outcome of using ARP negotiation for balancing is that each time that an ARP request is broadcast it uses the hardware address of the bond. Hence, peers learn the hardware address of the bond and the balancing of receive traffic
collapses to the current slave. This is handled by sending updates (ARP Replies) to all the peers with their individually assigned hardware address such that
the traffic is redistributed. Receive traffic is also redistributed when a new slave is added to the bond and when an inactive slave is re-activated. The receive load is distributed sequentially (round robin) among the group of highest speed slaves in the bond.

When a link is reconnected or a new slave joins the bond the receive traffic is redistributed among all active slaves in the bond by initiating ARP Replies
with the selected MAC address to each of the clients. The updelay parameter (detailed below) must be set to a value equal to or greater than the switch’s
forwarding delay so that the ARP Replies sent to the peers will not be blocked by the switch.

Prerequisites:

1. Ethtool support in the base drivers for retrieving the speed of each slave.

2. Base driver support for setting the hardware address of a device while it is open. This is required so that there will always be one slave in the
team using the bond hardware address (the curr_active_slave) while having a unique hardware address for each slave in the bond. If the
curr_active_slave fails its hardware address is swapped with the new curr_active_slave that was chosen.

primary

A string (eth0, eth2, etc) specifying which slave is the primary device. The specified device will always be the active slave while it is available. Only when the primary is
off-line will alternate devices be used. This is useful when one slave is preferred over another, e.g., when one slave has higher throughput than another.

The primary option is only valid for active-backup mode.
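The interaction between the primary option and active-backup failover can be sketched as a selection function. This is a simplified model in Python, not the driver's logic; the function name select_active is hypothetical, and picking the first available slave when the active one fails is an assumption, since this document does not say which backup is chosen.

```python
def select_active(up_slaves, current=None, primary=None):
    """Pick the active slave in a simplified model of active-backup mode.

    up_slaves: list of slave names whose links are currently up.
    current:   name of the slave that is active now, if any.
    primary:   name given by the primary option, if any.
    """
    # The primary is always the active slave while it is available.
    if primary is not None and primary in up_slaves:
        return primary
    # Otherwise the active slave changes only if it has failed.
    if current is not None and current in up_slaves:
        return current
    # Assumption: fall back to the first slave that is up, if there is one.
    return up_slaves[0] if up_slaves else None
```

In this model, a primary that comes back on line takes over from the current active slave, while without a primary the current slave stays active until it fails.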

updelay

Specifies the time, in milliseconds, to wait before enabling a slave after a link recovery has been detected. This option is only valid for the miimon link monitor. The updelay value should be a multiple of the miimon value; if not, it will be rounded down to the nearest multiple. The default value is 0.

use_carrier

Specifies whether or not miimon should use MII or ETHTOOL ioctls vs. netif_carrier_ok() to determine the link status. The MII or ETHTOOL ioctls are less efficient and
utilize a deprecated calling sequence within the kernel. The netif_carrier_ok() relies on the device driver to maintain its state with netif_carrier_on/off; at this writing, most, but
not all, device drivers support this facility.

If bonding insists that the link is up when it should not be, it may be that your network device driver does not support netif_carrier_on/off. The default state for netif_carrier is “carrier on,” so if a driver does not support netif_carrier, it will appear as if the link is always up. In this case, setting use_carrier to 0 will cause bonding to revert to the
MII / ETHTOOL ioctl method to determine the link state.

A value of 1 enables the use of netif_carrier_ok(), a value of 0 will use the deprecated MII / ETHTOOL ioctls. The default value is 1.

xmit_hash_policy

Selects the transmit hash policy to use for slave selection in balance-xor and 802.3ad modes. Possible values are:

layer2

Uses XOR of hardware MAC addresses to generate the hash. The formula is

(source MAC XOR destination MAC) modulo slave count

This algorithm will place all traffic to a particular network peer on the same slave.

This algorithm is 802.3ad compliant.

layer2+3

This policy uses a combination of layer2 and layer3 protocol information to generate the hash.

Uses XOR of hardware MAC addresses and IP addresses to generate the hash. The formula is

(((source IP XOR dest IP) AND 0xffff) XOR ( source MAC XOR destination MAC ))
modulo slave count

This algorithm will place all traffic to a particular network peer on the same slave. For non-IP traffic, the formula is the same as for the layer2 transmit hash policy.

This policy is intended to provide a more balanced distribution of traffic than layer2 alone, especially in environments where a layer3 gateway device is
required to reach most destinations.

This algorithm is 802.3ad compliant.

layer3+4

This policy uses upper layer protocol information, when available, to generate the hash. This allows for traffic to a particular network peer to span multiple
slaves, although a single connection will not span multiple slaves.

The formula for unfragmented TCP and UDP packets is

((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xffff))
modulo slave count

For fragmented TCP or UDP packets and all other IP protocol traffic, the source and destination port information is omitted. For non-IP traffic, the
formula is the same as for the layer2 transmit hash policy.

This policy is intended to mimic the behavior of certain switches, notably Cisco switches with PFC2 as well as some Foundry and IBM products.

This algorithm is not fully 802.3ad compliant. A single TCP or UDP conversation containing both fragmented and unfragmented packets will see packets
striped across two interfaces. This may result in out of order delivery. Most traffic types will not meet these criteria, as TCP rarely fragments traffic, and
most UDP traffic is not involved in extended conversations. Other implementations of 802.3ad may or may not tolerate this noncompliance.

The default value is layer2. This option was added in bonding version 2.6.3. In earlier versions of bonding, this parameter does not exist, and the layer2 policy is the only policy. The layer2+3 value was added for bonding version 3.2.2.
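The three hash formulas can be written out as a minimal sketch in Python. This is a literal reading of the formulas in this section, not the driver's code: treating each MAC address as a 48-bit integer and each IPv4 address as a 32-bit integer is an assumption, the driver may operate on only part of each address, and all function names here are hypothetical.

```python
import ipaddress


def mac_to_int(mac):
    """Convert a MAC address such as '00:11:22:33:44:55' to an integer."""
    return int(mac.replace(":", ""), 16)


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 address to an integer."""
    return int(ipaddress.IPv4Address(ip))


def layer2_hash(src_mac, dst_mac, slave_count):
    # (source MAC XOR destination MAC) modulo slave count
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % slave_count


def layer2_3_hash(src_mac, dst_mac, src_ip, dst_ip, slave_count):
    # (((source IP XOR dest IP) AND 0xffff) XOR (source MAC XOR dest MAC))
    # modulo slave count
    ip_part = (ip_to_int(src_ip) ^ ip_to_int(dst_ip)) & 0xFFFF
    mac_part = mac_to_int(src_mac) ^ mac_to_int(dst_mac)
    return (ip_part ^ mac_part) % slave_count


def layer3_4_hash(src_ip, dst_ip, slave_count, src_port=None, dst_port=None):
    # ((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xffff))
    # modulo slave count. Ports are omitted (passed as None) for fragmented
    # packets and for IP protocols other than TCP and UDP.
    ip_part = (ip_to_int(src_ip) ^ ip_to_int(dst_ip)) & 0xFFFF
    if src_port is None or dst_port is None:
        port_part = 0
    else:
        port_part = src_port ^ dst_port
    return (port_part ^ ip_part) % slave_count
```

Calling layer3_4_hash for the same pair of addresses with and without ports can yield different slave indices, which is the fragmented versus unfragmented striping described for the layer3+4 policy.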
