FortiGate Firewall HA

Commands:

get system status | grep HA

get system ha status

diagnose sys ha status

diag sys ha dump-by group << (see the uptime)

diag sys ha reset-uptime << Triggers a failover when override is disabled (the default)

exe ha manage [0 | 1] admin << Connect to standby unit

Another way to trigger a manual failover:

exe ha failover status

exe ha failover unset 1

exe ha failover set 1

A cluster is formed from 2 to 4 FortiGate devices.

A cluster includes one device that acts as the primary FortiGate (also called the active FortiGate). The primary synchronizes its configuration, session information, FIB entries, FortiGuard definitions, and other operation-related information to the secondary devices, which are also known as standby devices.

The cluster shares one or more heartbeat interfaces among all devices (also known as members) for synchronizing data and monitoring the health of each member.

In either of A-P (active-passive) or A-A (active-active) HA operation modes, the operation information (configuration, sessions, FIB entries, and so on) of the primary FortiGate is synchronized with secondary devices.

In A-P mode, the primary FortiGate is the only FortiGate that actively processes traffic. Secondary FortiGate devices remain in passive mode, monitoring the status of the primary device.

If a problem is detected on the primary FortiGate, one of the secondary devices takes over the primary role.

Like A-P HA, in A-A HA, the operation-related data of the primary FortiGate is synchronized to the secondary FortiGate devices. Also, if a problem is detected on the primary device, one of the secondary devices takes over the role of the primary, to process the traffic.

However, one of the main differences from active-passive mode is that in active-active mode, all cluster members can process traffic. That is, based on the HA settings and traffic type, the primary FortiGate can distribute sessions to the secondary devices.

FortiGate HA offers several solutions for adding redundancy in the case where a failure occurs on the FortiGate, or is detected by the FortiGate through monitored links, routes, and other health checks.

FortiGate Clustering Protocol (FGCP)

All FortiGates in the cluster must be the same model, run the same firmware, and have the same hardware configuration (such as the same number of hard disks).

Critical cluster components

The following are critical components in an HA cluster:

Heartbeat connections: members use these to communicate with each other. A two-member cluster is the most common setup. We recommend two back-to-back heartbeat connections.

Identical connections for internal and external interfaces: as demonstrated in the topology, we recommend similar connections from each member to the switches for the cluster to function properly.

The following are best practices for general cluster operation:

Ensure that heartbeat communication is present.
Enable the session synchronization option in daily operation.
Monitor traffic flowing in and out of the interfaces.

FortiGate Session Life Support Protocol (FGSP)

External load balancers or routers can distribute sessions among the FortiGates, and FGSP performs session synchronization.

VRRP

FortiGates can function as primary or backup Virtual Router Redundancy Protocol (VRRP) routers. The FortiGates can quickly and easily integrate into a network that has already deployed VRRP.

Failover

FGCP provides failover protection in the following scenarios:

The active device loses power.
A monitored interface loses a connection.

Synchronizing the configuration

The following settings are not synchronized between cluster units:

The FortiGate host name
GUI Dashboard widgets
HA override
HA device priority
The virtual cluster priority
The HA priority setting for a ping server (or dead gateway detection) configuration
The system interface settings of the HA reserved management interface
The HA default route for the reserved management interface, set using the ha-mgmt-interface-gateway option of the config system ha command

Active/Passive HA

2-4 devices
1 HA link (2 preferred)

Summary:

A-P HA has only one IP for each interface pair, and it is active on the primary unit. When a failover occurs, the IP and virtual MAC address move to the new primary unit.
Primary is the active (master, work) firewall; Secondary is the standby firewall.

HB link is between the two units.

By default, the device with the higher HA uptime (by more than 5 minutes) becomes the primary.

To prefer a unit as the primary whenever possible, enable Override and set a higher priority on that unit.

To compare Device Priority before HA uptime, enable HA Override and set the Device Priority. This is not recommended (Override behaves like preempt). HA uptime is the difference in uptime between the units since they joined the cluster, so one unit always shows 0, and the other unit's value only counts when it is greater than 300 seconds. HA uptime is NOT system uptime.

When HA Override is disabled (the default) and two new units boot at almost the same time to build a cluster, Device Priority determines the primary unit. When one unit in a cluster fails and a replacement unit joins, the existing unit normally has the longer uptime, so the replacement unit joins as Secondary.

In A-A HA, only sessions that are subject to proxy inspection are distributed to secondary devices. If you want to force the distribution of sessions that are subject to flow inspection or no inspection at all, you must enable the load-balance-all setting under the HA configuration. This setting is disabled by default.
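The distribution rule above can be written as a small decision function. This is only a sketch of the rule as these notes state it, not FortiOS code; the function name and the three inspection labels ("proxy", "flow", "none") are made up for illustration.

```python
def is_distributed(inspection, load_balance_all=False):
    """Return True if, per the notes, an A-A primary may hand the session to a secondary.

    inspection is a hypothetical label: "proxy", "flow", or "none".
    load_balance_all mirrors the HA setting of the same name (disabled by default).
    """
    if inspection not in ("proxy", "flow", "none"):
        raise ValueError(f"unknown inspection type: {inspection}")
    # Proxy-inspected sessions are always eligible; everything else needs load-balance-all.
    return inspection == "proxy" or load_balance_all


for inspection in ("proxy", "flow", "none"):
    print(inspection, is_distributed(inspection), is_distributed(inspection, load_balance_all=True))
```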

Check HA uptime:

diag sys ha dump-by group

diag sys ha dump-by vcluster

'FGVMEVB0DJ6EXX57': ha_prio/o=0/0, link_failure=0, pingsvr_failure=0, flag=0x00000001, mem_failover=0, uptime/reset_cnt=0/0
'FGVMEVWNBN-LQQ11': ha_prio/o=1/1, link_failure=0, pingsvr_failure=0, flag=0x00000000, mem_failover=0, uptime/reset_cnt=7/0

0 marks the device with the lowest HA uptime. The other unit in this example has an HA uptime that is 7 seconds longer.

If a monitored interface fails, or a member reboots, the HA uptime for that member is reset to 0.
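When comparing several members, it can help to pull the uptime/reset_cnt field out of the dump programmatically. The following is a minimal sketch that parses the two sample lines shown above; the regular expression is an assumption based on that sample only, and the uptime value is returned raw, without interpreting its unit.

```python
import re

# Sample lines copied from the dump-by output shown above.
SAMPLE = """\
'FGVMEVB0DJ6EXX57': ha_prio/o=0/0, link_failure=0, pingsvr_failure=0, flag=0x00000001, mem_failover=0, uptime/reset_cnt=0/0
'FGVMEVWNBN-LQQ11': ha_prio/o=1/1, link_failure=0, pingsvr_failure=0, flag=0x00000000, mem_failover=0, uptime/reset_cnt=7/0
"""

# Assumed line shape: quoted serial number first, uptime/reset_cnt=<n>/<n> somewhere after it.
LINE_RE = re.compile(r"'(?P<serial>[^']+)':.*?uptime/reset_cnt=(?P<uptime>\d+)/(?P<resets>\d+)")


def parse_ha_uptime(text):
    """Return {serial: (uptime_value, reset_count)} for every member line found."""
    members = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            members[match["serial"]] = (int(match["uptime"]), int(match["resets"]))
    return members


members = parse_ha_uptime(SAMPLE)
# The member with the largest uptime value is the one that joined the cluster earlier.
longest = max(members, key=lambda serial: members[serial][0])
print(longest, members[longest])
```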

Resetting the HA uptime on the primary firewall triggers a manual failover:

diag sys ha reset-uptime

Check system uptime:

get system performance status

===CLI======

1. Configure the first unit

FortiGate-VM64-KVM # config system ha
FortiGate-VM64-KVM (ha) # set group-id 10
FortiGate-VM64-KVM (ha) # set group-name FORTI_HA
FortiGate-VM64-KVM (ha) # set mode a-p
FortiGate-VM64-KVM (ha) # set password 123456
FortiGate-VM64-KVM (ha) # set hbdev port4 0
FortiGate-VM64-KVM (ha) # set priority 200
FortiGate-VM64-KVM (ha) # set monitor port4
FortiGate-VM64-KVM (ha) # end

FortiGate-VM64-KVM # config system interface
FortiGate-VM64-KVM (interface) # edit port1
FortiGate-VM64-KVM (port1) # set mode static
FortiGate-VM64-KVM (port1) # set ip 192.168.2.101/24
FortiGate-VM64-KVM (port1) # set allowaccess http
FortiGate-VM64-KVM (port1) # end
FortiGate-VM64-KVM # config router static
FortiGate-VM64-KVM (static) # edit 1
FortiGate-VM64-KVM (1) # set dst 0.0.0.0/0
FortiGate-VM64-KVM (1) # set gateway 192.168.2.1
FortiGate-VM64-KVM (1) # set device port1
FortiGate-VM64-KVM (1) # end
FortiGate-VM64-KVM # config system global
FortiGate-VM64-KVM (global) # set hostname Forti-HA-1
FortiGate-VM64-KVM (global) # end
Forti-HA-1 #

2. Configure the 2nd unit

FortiGate-VM64-KVM # config system ha
FortiGate-VM64-KVM (ha) # set group-id 10
FortiGate-VM64-KVM (ha) # set group-name FORTI_HA
FortiGate-VM64-KVM (ha) # set mode a-p
FortiGate-VM64-KVM (ha) # set password 123456
FortiGate-VM64-KVM (ha) # set hbdev port4 0
FortiGate-VM64-KVM (ha) # set priority 100
FortiGate-VM64-KVM (ha) # set monitor port4
FortiGate-VM64-KVM (ha) # end

FortiGate-VM64-KVM # config system global
FortiGate-VM64-KVM (global) # set hostname Forti-HA-2
FortiGate-VM64-KVM (global) # end

Forti-HA-2 #

Wait about one minute!!

===============CLI + GUI==========

1. Config IP on an interface for GUI access

FortiGate-VM64-KVM # config system interface
FortiGate-VM64-KVM (interface) # edit port2
FortiGate-VM64-KVM (port2) # set mode static
FortiGate-VM64-KVM (port2) # set ip 10.10.0.1/24
FortiGate-VM64-KVM (port2) # set allowaccess http https ssh ping
FortiGate-VM64-KVM (port2) # end

2. Log in to the GUI and change the hostname.

3. Set the mode to Active-Passive, increase the device priority from the default 128 to 250, set the group name and password, enable session pickup, and specify the heartbeat interfaces and their priorities. Change the Group ID from the CLI if needed.

4. On unit-2, configure a temporary IP on an interface for GUI access

FortiGate-VM64-KVM # config system interface
FortiGate-VM64-KVM (interface) # edit port2
FortiGate-VM64-KVM (port2) # set mode static
FortiGate-VM64-KVM (port2) # set ip 10.10.0.2/24
FortiGate-VM64-KVM (port2) # set allowaccess http https ssh ping
FortiGate-VM64-KVM (port2) # end

5. Change the 2nd unit's hostname.

6. Repeat step 3, except change the device priority from the default 128 to 50. The connection to unit-2 will be lost because its IP is overwritten by unit-1's configuration.

7. Connect to the primary unit to verify the HA status. It can take a few minutes for the cluster to form.

Add the HA widget if it is not on the dashboard.

8. To connect to the secondary unit's CLI, run this on the primary unit:

Unit-1 # exe ha manage [0 | 1] admin

Warning: Permanently added '169.254.0.1' (ED25519) to the list of known hosts.

admin@169.254.0.1's password:

Unit-2 #

Unit-2 # exit

Connection to 169.254.0.1 closed.

Unit-1#

================

Verification

diagnose sys ha status

diag sys ha dump-by group << (see the uptime)

get system ha status

Forti-HA-1 # diagnose sys ha status

HA information

Statistics

traffic.local = s:0 p:572 b:79778

traffic.total = s:0 p:572 b:79778

activity.ha_id_changes = 2

activity.fdb = c:0 q:0

Model=80001, Mode=2 Group=10 Debug=0

nvcluster=1, ses_pickup=0, delay=0

[Debug_Zone HA information]

HA group member information: is_manage_primary=1.

FGVMEVX455AXV9CD: Primary, serialno_prio=0, usr_priority=200, hostname=Forti-HA-1

FGVMEVNWQAHVSW68: Secondary, serialno_prio=1, usr_priority=100, hostname=Forti-HA-2

[Kernel HA information]

vcluster 1, state=work, primary_ip=169.254.0.1, primary_id=0:

FGVMEVX455AXV9CD: Primary, ha_prio/o_ha_prio=0/0

FGVMEVNWQAHVSW68: Secondary, ha_prio/o_ha_prio=1/1

Note:

The IP address that is assigned to a heartbeat interface depends on the serial number priority of the member. The higher serial number has serialno_prio=0 and therefore gets the first IP, 169.254.0.1. When a FortiGate joins or leaves the cluster, the IP may change.
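The note above can be illustrated with a short sketch that ranks members by serial number and hands out heartbeat IPs in that order. It is only an approximation under stated assumptions: plain string comparison stands in for however FortiOS actually compares serial numbers, and the function name is made up.

```python
def heartbeat_ips(serials):
    """Map each serial to 169.254.0.x, giving .1 to the highest serial number.

    Assumption: ordinary string ordering approximates the serial number priority.
    """
    ranked = sorted(serials, reverse=True)
    return {serial: f"169.254.0.{index + 1}" for index, serial in enumerate(ranked)}


# Serial numbers taken from the diagnose output in this section.
print(heartbeat_ips(["FGVMEVNWQAHVSW68", "FGVMEVX455AXV9CD"]))
```

With these two serials the sketch gives 169.254.0.1 to FGVMEVX455AXV9CD, which matches serialno_prio=0 and primary_ip=169.254.0.1 in the output above.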

Forti-HA-2 # diagnose sys ha status

HA information

Statistics

traffic.local = s:0 p:472 b:52695

traffic.total = s:0 p:472 b:52695

activity.ha_id_changes = 1

activity.fdb = c:0 q:0

Model=80001, Mode=2 Group=10 Debug=0

nvcluster=1, ses_pickup=0, delay=0

[Debug_Zone HA information]

HA group member information: is_manage_primary=0.

FGVMEVNWQAHVSW68: Secondary, serialno_prio=1, usr_priority=100, hostname=Forti-HA-2

FGVMEVX455AXV9CD: Primary, serialno_prio=0, usr_priority=200, hostname=Forti-HA-1

[Kernel HA information]

vcluster 1, state=standby, primary_ip=169.254.0.1, primary_id=0:

FGVMEVNWQAHVSW68: Secondary, ha_prio/o_ha_prio=1/1

FGVMEVX455AXV9CD: Primary, ha_prio/o_ha_prio=0/0

diag sys ha checksum show

Unit-1# get system ha status

HA Health Status: OK

Model: FortiGate-VM64-KVM

Mode: HA A-P

Group: 0

Debug: 0

Cluster Uptime: 0 days 4:34:22

Cluster state change time: 2021-09-05 08:43:00

Primary selected using:

<2021/09/05 08:43:00> FGVMEVN03HJ2E443 is selected as the primary because it has the largest value of uptime.

**** In case of a manual failover:
<2021/09/05 13:33:39> FGVMEVN03HJ2E443 is selected as the primary because it has EXE_FAIL_OVER flag set.

****

<2021/09/05 08:32:31> FGVMEVN03HJ2E443 is selected as the primary because it's the only member in the cluster.

ses_pickup: disable

override: disable

Configuration Status:

FGVMEVN03HJ2E443(updated 1 seconds ago): in-sync

FGVMEVQDPJRLEY84(updated 3 seconds ago): in-sync

System Usage stats:

FGVMEVN03HJ2E443(updated 1 seconds ago):

sessions=22, average-cpu-user/nice/system/idle=1%/0%/0%/98%, memory=66%

FGVMEVQDPJRLEY84(updated 3 seconds ago):

sessions=1, average-cpu-user/nice/system/idle=0%/0%/1%/98%, memory=65%

HBDEV stats:

FGVMEVN03HJ2E443(updated 1 seconds ago):

port3: physical/10000full, up, rx-bytes/packets/dropped/errors=32938300/97349/0/0, tx=40363497/100154/0/0

port4: physical/10000full, up, rx-bytes/packets/dropped/errors=29966288/79069/0/0, tx=31342655/82309/0/0

FGVMEVQDPJRLEY84(updated 3 seconds ago):

port3: physical/10000full, up, rx-bytes/packets/dropped/errors=39829733/98733/0/0, tx=32933781/97337/0/0

port4: physical/10000full, up, rx-bytes/packets/dropped/errors=30808797/80887/0/0, tx=29962119/79058/0/0

Primary : Unit-1 , FGVMEVN03HJ2E443, HA cluster index = 1

Secondary : Unit-2 , FGVMEVQDPJRLEY84, HA cluster index = 0

number of vcluster: 1

vcluster 1: work 169.254.0.2

Primary: FGVMEVN03HJ2E443, HA operating index = 0

Secondary: FGVMEVQDPJRLEY84, HA operating index = 1
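For routine checks, the Configuration Status part of this output is usually the first thing to look at. The sketch below extracts each member's sync state from text shaped like the sample above; the regular expression is an assumption based on that sample, and the "out-of-sync" value used in the usage example is a hypothetical placeholder for any state other than in-sync.

```python
import re

# A member status line looks like: SERIAL(updated N seconds ago): STATE
# Lines in other sections end right after the colon, so they do not match.
STATUS_RE = re.compile(r"^(\S+?)\(updated \d+ seconds ago\):\s*(\S+)$")

SAMPLE = """\
Configuration Status:
FGVMEVN03HJ2E443(updated 1 seconds ago): in-sync
FGVMEVQDPJRLEY84(updated 3 seconds ago): in-sync
System Usage stats:
FGVMEVN03HJ2E443(updated 1 seconds ago):
sessions=22, average-cpu-user/nice/system/idle=1%/0%/0%/98%, memory=66%
"""


def config_status(text):
    """Return {serial: state} for every configuration status line."""
    statuses = {}
    for line in text.splitlines():
        match = STATUS_RE.match(line.strip())
        if match:
            statuses[match.group(1)] = match.group(2)
    return statuses


def out_of_sync(text):
    """Return the serial numbers whose state is anything other than in-sync."""
    return [serial for serial, state in config_status(text).items() if state != "in-sync"]


print(config_status(SAMPLE))
print(out_of_sync(SAMPLE))
```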

4. When 1st unit is down

The 2nd unit becomes the primary unit; this is also visible in the GUI.

Forti-HA-2 # diagnose sys ha status

HA information

Statistics

traffic.local = s:0 p:557 b:62150

traffic.total = s:0 p:557 b:62150

activity.ha_id_changes = 2

activity.fdb = c:0 q:0

Model=80001, Mode=2 Group=10 Debug=0

nvcluster=1, ses_pickup=0, delay=0

[Debug_Zone HA information]

HA group member information: is_manage_primary=1.

FGVMEVNWQAHVSW68: Primary, serialno_prio=0, usr_priority=100, hostname=Forti-HA-2

[Kernel HA information]

vcluster 1, state=work, primary_ip=169.254.0.1, primary_id=0:

FGVMEVNWQAHVSW68: Primary, ha_prio/o_ha_prio=0/0

Secondary # get sys ha status

HA Health Status:

ERROR: FGVMEVN03HJ2E443 is lost @ 2021/09/05 13:15:24

Model: FortiGate-VM64-KVM

Mode: HA A-P

Group: 0

Debug: 0

Cluster Uptime: 0 days 4:44:54

Cluster state change time: 2021-09-05 13:15:24

Primary selected using:

<2021/09/05 13:15:24> FGVMEVQDPJRLEY84 is selected as the primary because it's the only member in the cluster.

<2021/09/05 08:43:00> FGVMEVN03HJ2E443 is selected as the primary because it has the largest value of uptime.

ses_pickup: disable

override: disable

System Usage stats:

FGVMEVQDPJRLEY84(updated 0 seconds ago):

sessions=34, average-cpu-user/nice/system/idle=0%/0%/0%/96%, memory=66%

HBDEV stats:

FGVMEVQDPJRLEY84(updated 0 seconds ago):

port3: physical/10000full, up, rx-bytes/packets/dropped/errors=41279467/102130/0/0, tx=34251138/101291/0/0

port4: physical/10000full, up, rx-bytes/packets/dropped/errors=31817304/83534/0/0, tx=31167979/82233/0/0

Primary : Secondary , FGVMEVQDPJRLEY84, HA cluster index = 0

number of vcluster: 1

vcluster 1: work 169.254.0.1

Primary: FGVMEVQDPJRLEY84, HA operating index = 0

5. When the 1st unit comes back, the HA status doesn't change (no preemption).

6. Force (manual) failover

6.1 Check whether the failover flag is set on the current unit:
exe ha failover status

6.2 If the failover flag is set, use the unset command to trigger a failover:

exe ha failover unset 1

6.3 If the failover flag is not set, use the set command to trigger a failover:
exe ha failover set 1

When override is disabled, this command can also be used:

diag sys ha reset-uptime
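Steps 6.1 to 6.3 boil down to a single choice based on the flag state. As a rough sketch for anyone scripting this, the helper below returns the command string from the steps above; the function name is made up, and it does not talk to a FortiGate.

```python
def manual_failover_command(failover_flag_set):
    """Pick the command from steps 6.2/6.3 based on the result of step 6.1."""
    if failover_flag_set:
        return "exe ha failover unset 1"
    return "exe ha failover set 1"


print(manual_failover_command(failover_flag_set=False))
```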

7. HA dedicated management interface (out of band)

Example: port4 as the HA dedicated management port

7.1 Set the port4 alias to MGMT

7.2 Go to System > HA and reserve port4 as the management interface

7.3 Go to the port4 settings and configure the interface IP 10.1.2.101/24. This IP is not synchronized to the peer.

7.4 Go to the peer CLI to configure the peer's MGMT interface IP
exe ha manage 1 admin
config system interface

edit port4

set ip 10.1.2.102/24
set allowaccess http https ssh ping
end

8. Run command on another unit
exe ha manage

9. Upgrade
Upload the new firmware to the primary only.

10. Show failover history from HA Widget

11. Session failover (session-pickup)

config system ha

set session-pickup enable

end

12. In-band management

config system interface

edit <port name>
set management-ip <ip/mask>
next
end

==============

Primary Election

1. When Override is Enabled

config system ha

set override enable

end

Change HA priority to force a failover

Number of active monitored ports > Priority> HA Uptime > Serial Number

2. When Override is Disabled (Default)

Force a failover - diag sys ha reset-uptime

The uptime difference must be more than 5 minutes to be considered in the election.

Number of active monitored ports > HA Uptime > Priority > Serial Number
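The two election orders above can be compared side by side with a small simulation. This is a sketch of the orders as written in these notes, not FortiOS code: the Member fields and function names are made up, uptime is assumed to be in seconds, the higher serial number is assumed to win the final tie-break, and plain string comparison stands in for the real serial comparison. Members are compared pairwise, which is a simplification for clusters with more than two units.

```python
from dataclasses import dataclass
from functools import reduce

UPTIME_THRESHOLD = 300  # seconds; the 5-minute margin from the notes


@dataclass(frozen=True)
class Member:
    serial: str
    monitored_ports_up: int
    priority: int
    ha_uptime: int  # assumed to be seconds


def by_ports(a, b):
    return a.monitored_ports_up - b.monitored_ports_up


def by_priority(a, b):
    return a.priority - b.priority


def by_uptime(a, b):
    # Differences at or below the threshold are treated as a tie.
    diff = a.ha_uptime - b.ha_uptime
    return diff if abs(diff) > UPTIME_THRESHOLD else 0


def by_serial(a, b):
    return (a.serial > b.serial) - (a.serial < b.serial)


def elect_primary(members, override=False):
    """Return the member that wins under the order listed for the given override state."""
    if override:
        order = [by_ports, by_priority, by_uptime, by_serial]
    else:
        order = [by_ports, by_uptime, by_priority, by_serial]

    def prefer(a, b):
        # Walk the criteria in order; the first non-tie decides.
        for compare in order:
            result = compare(a, b)
            if result:
                return a if result > 0 else b
        return a

    return reduce(prefer, members)


unit1 = Member("FGVMEVX455AXV9CD", monitored_ports_up=1, priority=200, ha_uptime=0)
unit2 = Member("FGVMEVNWQAHVSW68", monitored_ports_up=1, priority=100, ha_uptime=600)
print(elect_primary([unit1, unit2], override=False).serial)
print(elect_primary([unit1, unit2], override=True).serial)
```

In this example unit2 wins with override disabled because its uptime lead exceeds the threshold, while unit1 wins with override enabled because priority is checked first.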

=================

HA Firmware Updates

1. Upload new firmware to the primary.

2. The cluster upgrades all secondary units.

3. A new primary is elected.
4. The cluster upgrades the former primary.

========Notes====

1. config system interface
edit MGMT

set type loopback
set ip x.x.x.x/y
set vdom root
next
end

config system ha
set ha-mgmt-status enable
config ha-mgmt-interfaces
edit 1
set interface MGMT
next
end
end

2. In the GUI interface configuration, enable [Dedicated Management Port].

The same IP shows up on both units.

=======

FW with dedicated MGMT port

config system dedicated-mgmt

show

get (by default it's disabled, which means MP and DP can access each other)

to enable it:

config system dedicated-mgmt

set status enable

set interface mgmt

end

The MGMT port is then removed from the interface list and moved to a new VDOM, "dmgmt-vdom".
