Magento 2: Delete Duplicate Product Attributes Options Programmatically

This article shows how you can delete duplicate product attribute options in Magento 2.

I am using a standalone script for this purpose. You can use this code in your custom module as well.

Initialize Object Manager and Set the Area Code

The following things are done in the below code:

– Initialize the object manager.
– Set the area code as adminhtml as we are editing the product.


<?php
error_reporting(1);
set_time_limit(0);
ini_set('memory_limit', '2048M');

use Magento\Framework\App\Bootstrap;
 
/**
 * If your external file is in root folder
 */
require __DIR__ . '/app/bootstrap.php';
 
/**
 * If your external file is NOT in root folder
 * Let's suppose, your file is inside a folder named 'xyz'
 *
 * And, let's suppose, your root directory path is
 * /var/www/html/magento2
 */
// $rootDirectoryPath = '/var/www/html/magento2';
// require $rootDirectoryPath . '/app/bootstrap.php';
 
$params = $_SERVER;
 
$bootstrap = Bootstrap::create(BP, $params);
 
$obj = $bootstrap->getObjectManager();
 
$state = $obj->get('Magento\Framework\App\State');
$state->setAreaCode('adminhtml');

Get all attributes


$allAttributes = $obj->get('Magento\Catalog\Model\ResourceModel\Product\Attribute\Collection')
                     ->getItems();

foreach ($allAttributes as $attribute) {
    var_dump($attribute->getData());
    //echo $attribute->getFrontendLabel() . ' - ' . $attribute->getAttributeCode() . '<br />';
}

Load product attribute by attribute code


$attributeCode = 'color';

$attribute = $obj->get('Magento\Eav\Model\Config')
                 ->getAttribute('catalog_product', $attributeCode);

var_dump($attribute->getData());

Output:


array (size=41)
  'attribute_id' => string '93' (length=2)
  'entity_type_id' => string '4' (length=1)
  'attribute_code' => string 'color' (length=5)
  'attribute_model' => null
  'backend_model' => null
  'backend_type' => string 'int' (length=3)
  'backend_table' => null
  'frontend_model' => null
  'frontend_input' => string 'select' (length=6)
  'frontend_label' => string 'Color' (length=5)
  'frontend_class' => null
  'source_model' => null
  'is_required' => string '0' (length=1)
  'is_user_defined' => string '1' (length=1)
  'default_value' => null
  'is_unique' => string '0' (length=1)
  'note' => null
  'frontend_input_renderer' => null
  'is_global' => string '1' (length=1)
  'is_visible' => string '1' (length=1)
  'is_searchable' => string '1' (length=1)
  'is_filterable' => string '0' (length=1)
  'is_comparable' => string '1' (length=1)
  'is_visible_on_front' => string '0' (length=1)
  'is_html_allowed_on_front' => string '0' (length=1)
  'is_used_for_price_rules' => string '0' (length=1)
  'is_filterable_in_search' => string '0' (length=1)
  'used_in_product_listing' => string '0' (length=1)
  'used_for_sort_by' => string '0' (length=1)
  'apply_to' => null
  'is_visible_in_advanced_search' => string '1' (length=1)
  'position' => string '0' (length=1)
  'is_wysiwyg_enabled' => string '0' (length=1)
  'is_used_for_promo_rules' => string '1' (length=1)
  'is_required_in_admin_store' => string '0' (length=1)
  'is_used_in_grid' => string '1' (length=1)
  'is_visible_in_grid' => string '0' (length=1)
  'is_filterable_in_grid' => string '1' (length=1)
  'search_weight' => string '1' (length=1)
  'additional_data' => null
  ...
  ...
  ...

Get all options of the loaded attribute


$options = $attribute->getSource()->getAllOptions();
var_dump($options); 

Output:


Array
(
    [0] => Array
        (
            [label] =>  
            [value] => 
        )

    [1] => Array
        (
            [value] => 557
            [label] => Berry
        )

    [2] => Array
        (
            [value] => 517
            [label] => Black
        )

    [3] => Array
        (
            [value] => 573
            [label] => Blue
        )
    ...
    ...
    ...

Get all products associated with each attribute option

Here, we loop through all the options of the attribute that we loaded in the above code. We then fetch the products associated with each attribute option.


foreach ($options as $option) {
    $products = $obj->get('Magento\Catalog\Model\Product')
                    ->getCollection()
                    ->addAttributeToFilter($attributeCode, $option['value']);

    $productIds = [];
    foreach ($products as $prod) {
        $productIds[] = $prod->getId();
    }

    echo 'Option ID: ' . $option['value'] . '<br />';
    echo 'Option Label: ' . $option['label'] . '<br />';
    echo 'Total Products: ' . $products->count() . '<br />';
    echo 'Product IDs: ' . implode(', ', $productIds) . '<br />';
    echo '<br />';
}

Output:


Option ID: 71
Option Label: Berry
Total Products: 1
Product IDs: 566

Option ID: 61
Option Label: Black
Total Products: 4
Product IDs: 269, 276, 864, 868

Option ID: 52
Option Label: Blue
Total Products: 8
Product IDs: 626, 637, 649, 659, 672, 690, 701, 861
    
...
...
...

Create an array to find out duplicate attribute options

Here, we loop through all the options of the attribute that we loaded in the above code. We then create a new array containing the attribute label as key and that key contains another array with the id of the attribute option and the total number of products associated with that attribute option id.

The array will be like this:


Array
(
    [AttributeLabel] => Array
                     (
                         [AttributeID] => ProductCount
                     )
)

Here’s the code:


$duplicateOptions = [];
foreach ($options as $option) {
    // because the first option can be blank
    if(!empty(trim($option['label']))) {
        $products = $obj->get('Magento\Catalog\Model\Product')
                        ->getCollection()
                        ->addAttributeToFilter($attributeCode, $option['value']);

        if ($products->count()) {
            $productsCount = $products->count();
        } else {
            $productsCount = 0;
        }

        $duplicateOptions[$option['label']][$option['value']] = $productsCount;
    }
}

echo '<pre>'; print_r($duplicateOptions); 

Output:


Array
(
    [Berry] => Array
        (
            [53] => 1
            [247] => 0
        )

    [Black] => Array
        (
            [13] => 42
        )

    [Blue] => Array
        (
            [61] => 4
        )

    [White] => Array
        (
            [16] => 0
            [246] => 0
        )

    ...
    ...
    ...

In the above output, you can see two items in the array key Berry. This indicates that the attribute option “Berry” has a duplicate entry, i.e. there are two entries with the same label as Berry.

Delete duplicate attribute option

Up to now, we have figured out the duplicate attribute options. Now, while deleting the duplicate attribute option, you need to be sure of the option that you need to delete.

It can be:

– newly added attribute options, i.e. the duplicate attribute options with the higher ID
– old attribute options, i.e. the duplicate attribute options with the lower ID
– duplicate attribute options with zero product count

For this example, I will try to delete the latest/newly added attribute options that are the duplicate one, i.e. I am assuming that some attribute options are recently added to the attribute but those attribute options were already present before in the attribute.

First of all,

– We will create a new key entry called delete inside the deleteOptions array.
– It will contain the values of the attribute options ID to be deleted.
– The attribute option ID containing zero product count is treated as the one to be deleted.


foreach ($duplicateOptions as $key => $value) {
    if (count($value) > 1) {
        foreach ($value as $k => $v) {
            if ($v == 0) {
                $duplicateOptions[$key]['delete'][] = $k;
            }
        }
    }
}
echo '<pre>'; print_r($duplicateOptions); 

Output:


Array
(
    [Berry] => Array
        (
            [53] => 1
            [247] => 0
            [delete] => Array
                (
                    [0] => 247
                )

        )

    [Black] => Array
        (
            [13] => 42
        )

    [White] => Array
        (
            [16] => 0
            [246] => 0
            [delete] => Array
                (
                    [0] => 16
                    [1] => 246
                )

        )

    ...
    ...
    ...

In the above output, for the attribute option White, both option ids are treated as the one to be deleted. That’s because both option ids had zero products associated with them.

In such a case, we will remove the attribute option id that we don’t want to delete from the “delete” key array.

As mentioned above, for this example, I have treated the newly/latest added option to be the duplicate one. So, I will be deleting the option with higher ID and NOT deleting the one with the lower ID value.


foreach ($duplicateOptions as $key => $value) {
    if (count($value) > 1) {
        foreach ($value as $k => $v) {
            if ($v == 0) {
                $duplicateOptions[$key]['delete'][] = $k;
            }
        }

        if (isset($duplicateOptions[$key]['delete'])) {
            /**
             * if multiple options are in delete array, 
             * i.e. multiple option ids have zero product count
             */ 
            if (count($duplicateOptions[$key]['delete']) > 1) { 
                sort($duplicateOptions[$key]['delete']); // sort the array
                array_shift($duplicateOptions[$key]['delete']); // remove the first item of the array

                /**
                 * if you assume that the older option id is the duplicate one
                 * then, you keep the older option id in the delete list 
                 * and remove the latest/newly-added attribute option from the delete list
                 */ 
                // rsort($duplicateOptions[$key]['delete']); // reverse sort the array
                // array_shift($duplicateOptions[$key]['delete']); // remove the first item of the array
            }
        }
    }
}
echo '<pre>'; print_r($duplicateOptions); 

Output:


Array
(
    [Berry] => Array
        (
            [53] => 1
            [247] => 0
            [delete] => Array
                (
                    [0] => 247
                )

        )

    [Black] => Array
        (
            [13] => 42
        )

    [White] => Array
        (
            [16] => 0
            [246] => 0
            [delete] => Array
                (
                    [0] => 246
                )

        )

    ...
    ...
    ...

Now, we actually delete the duplicate attribute options.


foreach ($duplicateOptions as $key => $value) {
    if (count($value) > 1) {
        foreach ($value as $k => $v) {
            if ($v == 0) {
                $duplicateOptions[$key]['delete'][] = $k;
            }
        }

        if (isset($duplicateOptions[$key]['delete'])) {
            /**
             * if multiple options are in delete array, 
             * i.e. multiple option ids have zero product count
             */ 
            if (count($duplicateOptions[$key]['delete']) > 1) { 
                sort($duplicateOptions[$key]['delete']); // sort the array
                array_shift($duplicateOptions[$key]['delete']); // remove the first item of the array

                /**
                 * if you assume that the older option id is the duplicate one
                 * then, you keep the older option id in the delete list 
                 * and remove the latest/newly-added attribute option from the delete list
                 */ 
                // rsort($duplicateOptions[$key]['delete']); // reverse sort the array
                // array_shift($duplicateOptions[$key]['delete']); // remove the first item of the array

                // DELETE DUPLICATE ATTRIBUTE OPTIONS
                foreach ($duplicateOptions[$key]['delete'] as $optionId) {
                    $optionModel = $obj->get('Magento\Eav\Model\Entity\Attribute\Option')->load($optionId);
                    try {
                        $optionModel->delete();
                        echo '<font color="green">"'.$key.' ('.$optionId.')" Option Deleted!</font><br />';
                    }
                    catch(Exception $e) {
                        echo '<font color="red">'. $e->getMessage() .'</font><br />';
                    }
                }
            }
        }
    }
}

Output:


"Berry (247)" duplicate option deleted!
"White (246)" duplicate option deleted!

Delete duplicate attribute options and save the actions in a log file

The delete code is same as above. We have just added the code to save the deleted message in a log file. The log file will be saved in var/log folder.


$file = 'var/log/delete_duplicate_attribute_options.log';
$handle = fopen($file, 'a') or die('Cannot open file:  '.$file); //implicitly creates file
$data = date('Y-m-d H:i:s');
$data .= "\n";
$data .= 'Attribute Code: ' . $attributeCode;
$data .= "\n\n";

foreach ($duplicateOptions as $key => $value) {
    if (count($value) > 1) {
        foreach ($value as $k => $v) {
            if ($v == 0) {
                $duplicateOptions[$key]['delete'][] = $k;
            }
        }

        if (isset($duplicateOptions[$key]['delete'])) {
            /**
             * if multiple options are in delete array, 
             * i.e. multiple option ids have zero product count
             */ 
            if (count($duplicateOptions[$key]['delete']) > 1) { 
                sort($duplicateOptions[$key]['delete']); // sort the array
                array_shift($duplicateOptions[$key]['delete']); // remove the first item of the array

                /**
                 * if you assume that the older option id is the duplicate one
                 * then, you keep the older option id in the delete list 
                 * and remove the latest/newly-added attribute option from the delete list
                 */ 
                // rsort($duplicateOptions[$key]['delete']); // reverse sort the array
                // array_shift($duplicateOptions[$key]['delete']); // remove the first item of the array
            }

            // DELETE DUPLICATE ATTRIBUTE OPTIONS
            foreach ($duplicateOptions[$key]['delete'] as $optionId) {
                $optionModel = $obj->get('Magento\Eav\Model\Entity\Attribute\Option')->load($optionId);
                try {
                    $optionModel->delete();
                    echo '<font color="green">"'.$key.' ('.$optionId.')" duplicate option deleted!</font><br />';
                    $data .= 'Option Value: ' . $key . ', ' . 'Option ID: ' . $optionId;
                    $data .= "\n";
                } catch(Exception $e) {
                    echo '<font color="red">'. $e->getMessage() .'</font><br />';
                    $data .= $e->getMessage();
                }
            }
        }
    }
}

$data .= "\n\n";
fwrite($handle, $data);
fclose($handle);

Output of the log file:


2018-11-08 07:54:49
Attribute Code: color

Option Value: Berry, Option ID: 247
Option Value: White, Option ID: 248

Hope this helps. Thanks.