Restore failed active firewall with new RMA device

Deepseek: 

When the **active** firewall fails in an Active/Passive HA pair, the procedure for bringing the RMA replacement online is different. Because the passive firewall has already taken over as active, your goal is to rebuild the failed unit so it can rejoin the pair as the new passive member.


For this scenario, the **Device State** file exported from the **currently active (formerly passive) firewall** is the correct and recommended method.


### 🔄 The RMA Recovery Process (Active Unit Failed)


Follow these steps to safely restore the replacement firewall:


#### 1. Prepare the Replacement Firewall

Before restoring any configuration, physically set up the new unit. It is critical that its base software matches the existing environment to ensure a smooth join.

- **Match Versions**: Install the same PAN-OS version and Application/Threat content database versions that are running on the current active firewall.

- **Transfer Licenses**: Register the new device and transfer the licenses from the failed (old) unit's serial number to the new one.
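
The version match can be checked mechanically rather than by eye. The following is a minimal Python sketch, assuming you have captured the `show system info` output from both units as text; the function names and any sample values are hypothetical, and the fields compared (`sw-version`, `app-version`, `threat-version`) are the ones that command prints.

```python
# Fields from `show system info` that should be identical on both units.
FIELDS = ("sw-version", "app-version", "threat-version")


def parse_system_info(text):
    """Turn 'key: value' lines into a dict, ignoring lines without a colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def version_mismatches(active_text, replacement_text, fields=FIELDS):
    """Return {field: (active_value, replacement_value)} for differing fields."""
    active = parse_system_info(active_text)
    replacement = parse_system_info(replacement_text)
    return {
        field: (active.get(field), replacement.get(field))
        for field in fields
        if active.get(field) != replacement.get(field)
    }
```

An empty result means the three fields agree; any entry names the field to fix before importing the device state.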


#### 2. Obtain the Device State from the Current Active Unit

Since the old active firewall is failed and unavailable, you will use the **currently active** (formerly passive) firewall as the source for the configuration.

- Log into the web interface of the **active firewall**.

- Navigate to **Device > Setup > Operations** and click **Export Device State** to save the file to your computer.
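
If you prefer to script the export, the PAN-OS XML API offers an export request for the device state. The sketch below is an illustration only, assuming the `type=export&category=device-state` request and API-key authentication through the `X-PAN-KEY` header; the function names, host, and destination path are hypothetical, and a firewall with a self-signed certificate would additionally need a suitable TLS context.

```python
import urllib.parse
import urllib.request


def device_state_url(host):
    """Build the XML API URL for a device state export (assumed request shape)."""
    query = urllib.parse.urlencode({"type": "export", "category": "device-state"})
    return f"https://{host}/api/?{query}"


def export_device_state(host, api_key, dest_path):
    """Download the device state archive from the active firewall to dest_path."""
    request = urllib.request.Request(
        device_state_url(host),
        headers={"X-PAN-KEY": api_key},  # API key sent as a header, not in the URL
    )
    # Certificate verification uses Python's defaults; adjust for self-signed certs.
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        out.write(response.read())
```

Only `export_device_state` touches the network; `device_state_url` can be used on its own to check the request before running it.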


#### 3. Import and Stage the Replacement Unit

This step is crucial to prevent an IP conflict that would disrupt the network.

- On the **new (replacement) firewall**, import the device state file you just exported (**Device > Setup > Operations > Import Device State**).

- **Do not commit immediately.** First, go to **Device > High Availability** and **uncheck "Enable Config Sync"**. Also set the device priority to a numerically higher value than the current active unit's so that the replacement stays passive (the lower value wins the election); a value such as 100 only works if the active unit's priority is below 100. Then change the management interface IP and the HA peer IP address so that they match what was configured on the failed unit, because the imported file carries the surviving unit's values.

- Perform a **Commit** on the replacement unit.


#### 4. Update Panorama (If Managed)

If your firewalls are managed by Panorama, you must tell Panorama to forget the old, failed serial number and look for the new one.

- Connect to Panorama via CLI and run the following command:

    ```text
    replace device old <old_failed_serial_number> new <new_replacement_serial_number>
    ```

- Commit the changes on Panorama.
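
Because swapping or mistyping the two serial numbers is an easy mistake, it can help to generate the command string instead of typing it. This is a small sketch under that assumption; the helper name is hypothetical and it performs only basic sanity checks, not validation against Panorama.

```python
def replace_device_command(old_serial, new_serial):
    """Return the Panorama CLI command that swaps the failed serial for the new one."""
    old, new = old_serial.strip(), new_serial.strip()
    if not old or not new:
        raise ValueError("both serial numbers are required")
    if any(ch.isspace() for ch in old + new):
        raise ValueError("serial numbers must not contain whitespace")
    if old == new:
        raise ValueError("old and new serial numbers are identical")
    return f"replace device old {old} new {new}"
```

The returned string is what you paste into the Panorama CLI; the commit on Panorama remains a separate, manual step.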


#### 5. Rejoin the HA Pair

Now that the replacement unit has the correct config but is isolated, you can safely connect it back to the network and re-enable HA.

- Connect the network and HA cables to the replacement firewall.

- On the replacement unit, navigate back to **Device > High Availability** and **re-check "Enable Config Sync"**.

- Perform a final **Commit**. The replacement unit will now synchronize the remaining live session information from the active unit and join the HA pair as the **Passive** firewall.


### 💡 Key Considerations for This Scenario


- **Zero Downtime**: Because the passive unit already took over, traffic kept flowing after the failover. The goal of this procedure is to restore redundancy without disrupting the current active unit.

- **Device State vs. Named Config**: The Device State is critical here because it retains the HA peer runtime and sync info. Using a simple named configuration backup would require you to manually re-establish the HA pairing and sync relationships.

- **LSVPN Dynamic Config**: If you use Large Scale VPN (LSVPN), note that a device state generated by Panorama may not include dynamic satellite information. However, because this is an HA pair, the new unit will automatically synchronize this dynamic runtime data from its active peer after it rejoins.



==========

Kimi:

When the **active** firewall in a Palo Alto Networks HA pair fails and needs RMA replacement, the procedure requires careful handling to avoid traffic disruption and configuration conflicts. Here's the comprehensive restore procedure:


## Critical Pre-Work Before Starting


1. **Ensure the surviving peer is stable** - Verify the passive firewall has taken over active duties and is processing traffic normally 

2. **Document current settings** - Take screenshots of:

   - HA General Settings (Device Priority, Preemption settings)

   - Management Interface IP configuration

   - Hostname of the failed unit 


## RMA Restore Procedure for Failed Active Firewall


### Phase 1: Prepare the Replacement Unit


1. **Register and license the new firewall**

   - Transfer licenses from the old serial number to the new one via Palo Alto Support Portal 

   - Retrieve licenses on the new device: `Device > Licenses > Retrieve license keys from license server` 


2. **Configure basic network access**

   - Set management IP, DNS, and gateway (use a **temporary unique IP** initially to avoid conflicts) 

   - Verify internet connectivity to `updates.paloaltonetworks.com`


3. **Match software versions**

   - Upgrade PAN-OS to match the surviving peer exactly 

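   - Install matching content/threat database versions. `request content upgrade install version latest` installs the latest content release, so it produces a match only if the surviving peer is on that same release; otherwise install the specific version the peer is running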

   - Match any special settings (multi-vsys, jumbo frames, FIPS/CC mode) 


### Phase 2: Restore Configuration


**Option A: If you have a Device State backup from the failed unit**

1. Import the device state: `Device > Setup > Operations > Import device state` 

2. **Do NOT commit yet** - The import brings over the old management IP and hostname 


**Option B: If no backup available (use surviving peer's config)**

1. From the surviving active firewall, export device state

2. Import it to the replacement unit 


### Phase 3: Critical HA Safety Configuration


**Before connecting to the network, force the replacement to stay passive:**


1. **Disable Config Sync** on the replacement: `Device > High Availability > General > Setup` - uncheck "Enable Config Sync" 

2. **Disable Preemption**: `Device > High Availability > General > Election Settings` - uncheck "Preemptive" 

3. **Set the numerically highest Device Priority** (e.g., 255) so it stays passive: the lower priority value wins the election, so a higher number means lower precedence

4. **Change identity settings** back to the failed unit's original:

   - Management IP

   - Hostname  

   - HA settings (using screenshots taken earlier) 

5. **Commit force from CLI**:

   ```
   > configure
   # commit force
   ```
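
The three safety settings above lend themselves to a quick pre-commit checklist. The following Python sketch is illustrative only: the `HAStaging` class and `staging_problems` function are hypothetical names, and the values are assumed to be read off the replacement's HA configuration by hand.

```python
from dataclasses import dataclass


@dataclass
class HAStaging:
    """HA settings of the replacement unit before its first commit."""
    config_sync_enabled: bool
    preemptive: bool
    device_priority: int  # lower value wins the election


def staging_problems(replacement, active_priority):
    """List the reasons the replacement might not stay passive; empty means OK."""
    problems = []
    if replacement.config_sync_enabled:
        problems.append("Config Sync is still enabled")
    if replacement.preemptive:
        problems.append("Preemptive is still enabled")
    if replacement.device_priority <= active_priority:
        problems.append(
            "Device Priority must be numerically higher than the active unit's"
        )
    return problems
```

For example, `staging_problems(HAStaging(False, False, 255), active_priority=100)` returns an empty list, while leaving any of the three settings unchanged produces a message naming it.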


### Phase 4: Connect and Synchronize


1. **Connect only HA1 control link first** (do not connect data plane interfaces yet) 

2. **Verify configuration sync**:

   - Check Dashboard > High Availability widget shows "Running Configuration: synchronized" 

   - CLI: `show high-availability all | match "Running Configuration"` 

   - Verify no idmgr differences: `debug device-server dump idmgr high-availability state` 

3. **Enable Config Sync** on the replacement device 

4. **Synchronize from active**: From the surviving active unit, push configuration to peer if needed 

5. **Connect HA2 link** for session synchronization 

6. **Connect data plane interfaces** only after confirming sync is complete 
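
The sync check in step 2 can be automated by parsing the CLI output before moving on to the data plane cabling. This is a minimal sketch, assuming the output contains a line of the form `Running Configuration: synchronized` as described above; the function names are hypothetical and the exact output layout may differ between PAN-OS releases.

```python
def running_config_state(cli_output):
    """Return the value of the 'Running Configuration' line, or None if absent."""
    for line in cli_output.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip().lower() == "running configuration":
            return value.strip().lower()
    return None


def is_synchronized(cli_output):
    """True only when the state is exactly 'synchronized'."""
    # Exact comparison so that 'not synchronized' is never mistaken for success.
    return running_config_state(cli_output) == "synchronized"
```

Comparing the whole value, rather than searching for the substring, avoids treating a "not synchronized" state as a pass.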


### Phase 5: Finalize


1. Re-enable preemption if required by your design 

2. Adjust device priority to match your original HA design 

3. Verify all traffic flows and VPN tunnels are working 


## Important Considerations


- **If managed by Panorama**: Replace the old serial number with the new one in Panorama CLI before the replacement connects: `replace device old <old SN> new <new SN>` 

- **Master Key**: If the original firewall had a changed Master Key, you must configure it on the replacement before importing config, or commits will fail 

- **IPSec/LSVPN**: If using LSVPN and replacing a satellite, manual re-authentication may be required 

- **GARP required**: If using NAT IPs in the same subnet as interfaces (except the interface IP itself), send gratuitous ARP: `test arp gratuitous ip <ip> interface <interface>` 
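
When several NAT addresses are involved, the gratuitous ARP commands can be generated from a list. The sketch below applies the rule stated above (NAT IPs in the interface's subnet, excluding the interface IP itself); the function name, interface name, and addresses are hypothetical examples.

```python
import ipaddress


def garp_commands(interface_name, interface_cidr, nat_ips):
    """Build 'test arp gratuitous' commands for NAT IPs that need a GARP."""
    iface = ipaddress.ip_interface(interface_cidr)
    commands = []
    for raw in nat_ips:
        ip = ipaddress.ip_address(raw)
        # Only addresses inside the interface subnet, other than the interface IP.
        if ip in iface.network and ip != iface.ip:
            commands.append(
                f"test arp gratuitous ip {ip} interface {interface_name}"
            )
    return commands
```

Calling it with an interface such as `ethernet1/1` at `203.0.113.1/24` and a mixed list of NAT IPs yields one command per address that falls inside that subnet.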


The key to this procedure is **keeping the replacement passive until fully synchronized** to prevent split-brain scenarios or configuration conflicts.

