Sorting multidimensional PHP arrays by object values with accented characters
William L'Archeveque (@wlarch)

About: Full-Stack Devops & Web Developer. Enthusiast Helper. Freelancer for 7 years, now having fun working at Centiva. Specialized in Laravel PHP, AWS Cloud and cartwheeling.

Publish Date: Jan 24 '21

On a multilingual website, we often have to deal with special and accented characters. In Québec (Canada), most websites are bilingual because we speak both French and English. Sorting arrays alphabetically then becomes a headache for developers because of “caractères spéciaux” (French for special characters).


I wrote two small methods that help overcome the difficulties of sorting multidimensional arrays.

The problem with sorting special characters

The best way to explain the problem is with an example. Let’s say an API returns an array of category objects, each holding its names per language, and we want it ordered alphabetically by category name for a specific language:

<?php

require 'StringHelper.php';

$helper = new StringHelper();
$categories = [];

$news = new stdClass();
$news->names = ["fr" => "Actualités", "en" => "News"];
$categories[] = $news;

$sports = new stdClass();
$sports->names = ["fr" => "Sports", "en" => "Sports"];
$categories[] = $sports;

$home = new stdClass();
$home->names = ["fr" => "Accueil", "en" => "Home"];
$categories[] = $home;

$events = new stdClass();
$events->names = ["fr" => "Événements", "en" => "Events"];
$categories[] = $events;

$special = new stdClass();
$special->names = ["fr" => "Spécial", "en" => "Special"];
$categories[] = $special;

// Expected alphabetical order in French: Accueil, Actualités, Événements, Spécial, Sports
// Expected alphabetical order in English: Events, Home, News, Special, Sports

// Original order
var_dump($categories);

$helper->alphabeticalCompareArrayByKey($categories, 'names', 'fr');

// Alphabetical order (french) by names
var_dump($categories);

Ordering this array correctly is a challenge for two reasons: the accented characters, and the structure of the data (an array of objects, each holding an array of names).
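To make the problem concrete, here is a minimal sketch of what a naive sort does with the French names. It assumes the file is saved as UTF-8 and works on plain strings rather than the category objects; the variable names are only illustrative.

```php
<?php
// French category names taken from the example above.
$names = ['Actualités', 'Sports', 'Accueil', 'Événements', 'Spécial'];

// Naive approach: strcasecmp() compares bytes and only folds ASCII letters.
$naive = $names;
usort($naive, 'strcasecmp');

// Every byte of an accented UTF-8 letter is above 0x7F, so it sorts after
// every unaccented letter.
echo implode(', ', $naive), PHP_EOL;
// Accueil, Actualités, Sports, Spécial, Événements
```

"Événements" ends up last and "Spécial" lands after "Sports", which is not the order a French reader expects.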

The solution for sorting an array of objects

I solved this problem with two static methods in a StringHelper class (StringHelper.php) that can be used anywhere in the application:

<?php

class StringHelper
{

 /**
  * Sort an array of objects alphabetically by a specific nested value
  *
  * @param array &$array     Reference of the array to sort
  * @param string $element   Object property that holds the array of values
  * @param string $key       Key of that array to sort by
  * @return void
  */
  public static function alphabeticalCompareArrayByKey(&$array, string $element, string $key){
      usort($array, function($a, $b) use ($element, $key) {
          // Compare the accent-free versions of both values.
          // Note: $a->{$element}[$key] reads the property, then the array key.
          return strcasecmp(self::transliterateString($a->{$element}[$key]), self::transliterateString($b->{$element}[$key]));
      });
  }

  /**
   * Replace accented characters in a string and lowercase the result
   *
   * Example :
   * echo StringHelper::transliterateString('Événement'); // evenement
   *
   * @param string $string  String with accented characters
   * @return string  Transliterated, lowercased and trimmed string
   */
  public static function transliterateString($string)
  {
      $transliterationTable = ['á' => 'a', 'Á' => 'A', 'à' => 'a', 'À' => 'A', 'ă' => 'a', 'Ă' => 'A', 'â' => 'a', 'Â' => 'A', 'å' => 'a', 'Å' => 'A', 'ã' => 'a', 'Ã' => 'A', 'ą' => 'a', 'Ą' => 'A', 'ā' => 'a', 'Ā' => 'A', 'ä' => 'ae', 'Ä' => 'AE', 'æ' => 'ae', 'Æ' => 'AE', 'ḃ' => 'b', 'Ḃ' => 'B', 'ć' => 'c', 'Ć' => 'C', 'ĉ' => 'c', 'Ĉ' => 'C', 'č' => 'c', 'Č' => 'C', 'ċ' => 'c', 'Ċ' => 'C', 'ç' => 'c', 'Ç' => 'C', 'ď' => 'd', 'Ď' => 'D', 'ḋ' => 'd', 'Ḋ' => 'D', 'đ' => 'd', 'Đ' => 'D', 'ð' => 'dh', 'Ð' => 'Dh', 'é' => 'e', 'É' => 'E', 'è' => 'e', 'È' => 'E', 'ĕ' => 'e', 'Ĕ' => 'E', 'ê' => 'e', 'Ê' => 'E', 'ě' => 'e', 'Ě' => 'E', 'ë' => 'e', 'Ë' => 'E', 'ė' => 'e', 'Ė' => 'E', 'ę' => 'e', 'Ę' => 'E', 'ē' => 'e', 'Ē' => 'E', 'ḟ' => 'f', 'Ḟ' => 'F', 'ƒ' => 'f', 'Ƒ' => 'F', 'ğ' => 'g', 'Ğ' => 'G', 'ĝ' => 'g', 'Ĝ' => 'G', 'ġ' => 'g', 'Ġ' => 'G', 'ģ' => 'g', 'Ģ' => 'G', 'ĥ' => 'h', 'Ĥ' => 'H', 'ħ' => 'h', 'Ħ' => 'H', 'í' => 'i', 'Í' => 'I', 'ì' => 'i', 'Ì' => 'I', 'î' => 'i', 'Î' => 'I', 'ï' => 'i', 'Ï' => 'I', 'ĩ' => 'i', 'Ĩ' => 'I', 'į' => 'i', 'Į' => 'I', 'ī' => 'i', 'Ī' => 'I', 'ĵ' => 'j', 'Ĵ' => 'J', 'ķ' => 'k', 'Ķ' => 'K', 'ĺ' => 'l', 'Ĺ' => 'L', 'ľ' => 'l', 'Ľ' => 'L', 'ļ' => 'l', 'Ļ' => 'L', 'ł' => 'l', 'Ł' => 'L', 'ṁ' => 'm', 'Ṁ' => 'M', 'ń' => 'n', 'Ń' => 'N', 'ň' => 'n', 'Ň' => 'N', 'ñ' => 'n', 'Ñ' => 'N', 'ņ' => 'n', 'Ņ' => 'N', 'ó' => 'o', 'Ó' => 'O', 'ò' => 'o', 'Ò' => 'O', 'ô' => 'o', 'Ô' => 'O', 'ő' => 'o', 'Ő' => 'O', 'õ' => 'o', 'Õ' => 'O', 'ø' => 'oe', 'Ø' => 'OE', 'ō' => 'o', 'Ō' => 'O', 'ơ' => 'o', 'Ơ' => 'O', 'ö' => 'oe', 'Ö' => 'OE', 'ṗ' => 'p', 'Ṗ' => 'P', 'ŕ' => 'r', 'Ŕ' => 'R', 'ř' => 'r', 'Ř' => 'R', 'ŗ' => 'r', 'Ŗ' => 'R', 'ś' => 's', 'Ś' => 'S', 'ŝ' => 's', 'Ŝ' => 'S', 'š' => 's', 'Š' => 'S', 'ṡ' => 's', 'Ṡ' => 'S', 'ş' => 's', 'Ş' => 'S', 'ș' => 's', 'Ș' => 'S', 'ß' => 'SS', 'ť' => 't', 'Ť' => 'T', 'ṫ' => 't', 'Ṫ' => 'T', 'ţ' => 't', 'Ţ' => 'T', 'ț' => 't', 'Ț' => 'T', 'ŧ' => 't', 'Ŧ' => 'T', 'ú' => 'u', 'Ú' => 'U', 'ù' => 'u', 'Ù' => 'U', 
'ŭ' => 'u', 'Ŭ' => 'U', 'û' => 'u', 'Û' => 'U', 'ů' => 'u', 'Ů' => 'U', 'ű' => 'u', 'Ű' => 'U', 'ũ' => 'u', 'Ũ' => 'U', 'ų' => 'u', 'Ų' => 'U', 'ū' => 'u', 'Ū' => 'U', 'ư' => 'u', 'Ư' => 'U', 'ü' => 'ue', 'Ü' => 'UE', 'ẃ' => 'w', 'Ẃ' => 'W', 'ẁ' => 'w', 'Ẁ' => 'W', 'ŵ' => 'w', 'Ŵ' => 'W', 'ẅ' => 'w', 'Ẅ' => 'W', 'ý' => 'y', 'Ý' => 'Y', 'ỳ' => 'y', 'Ỳ' => 'Y', 'ŷ' => 'y', 'Ŷ' => 'Y', 'ÿ' => 'y', 'Ÿ' => 'Y', 'ź' => 'z', 'Ź' => 'Z', 'ž' => 'z', 'Ž' => 'Z', 'ż' => 'z', 'Ż' => 'Z', 'þ' => 'th', 'Þ' => 'Th', 'µ' => 'u', 'а' => 'a', 'А' => 'a', 'б' => 'b', 'Б' => 'b', 'в' => 'v', 'В' => 'v', 'г' => 'g', 'Г' => 'g', 'д' => 'd', 'Д' => 'd', 'е' => 'e', 'Е' => 'e', 'ё' => 'e', 'Ё' => 'e', 'ж' => 'zh', 'Ж' => 'zh', 'з' => 'z', 'З' => 'z', 'и' => 'i', 'И' => 'i', 'й' => 'j', 'Й' => 'j', 'к' => 'k', 'К' => 'k', 'л' => 'l', 'Л' => 'l', 'м' => 'm', 'М' => 'm', 'н' => 'n', 'Н' => 'n', 'о' => 'o', 'О' => 'o', 'п' => 'p', 'П' => 'p', 'р' => 'r', 'Р' => 'r', 'с' => 's', 'С' => 's', 'т' => 't', 'Т' => 't', 'у' => 'u', 'У' => 'u', 'ф' => 'f', 'Ф' => 'f', 'х' => 'h', 'Х' => 'h', 'ц' => 'c', 'Ц' => 'c', 'ч' => 'ch', 'Ч' => 'ch', 'ш' => 'sh', 'Ш' => 'sh', 'щ' => 'sch', 'Щ' => 'sch', 'ъ' => '', 'Ъ' => '', 'ы' => 'y', 'Ы' => 'y', 'ь' => '', 'Ь' => '', 'э' => 'e', 'Э' => 'e', 'ю' => 'ju', 'Ю' => 'ju', 'я' => 'ja', 'Я' => 'ja'];

      $transliteratedString = str_replace(array_keys($transliterationTable), array_values($transliterationTable), $string);

      return trim(strtolower($transliteratedString));
  }
}

The two methods may look complicated at first sight, but they are quite simple.

Explanation and result (sorting the multidimensional array) 🧙‍♂️

The first method receives three parameters: a reference to the multidimensional array (the array of objects), the element (the object property that holds the values) and the key used to sort the array.

Let’s say we want to order the previous array by French names. The call would be:

StringHelper::alphabeticalCompareArrayByKey($categories, 'names', 'fr');

Because the array is passed by reference, there is no need to reassign the result to a variable. The usort() function sorts an array in place by its values, using a user-defined comparison function.
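As a small illustration of the by-reference behaviour, here is a sketch that sorts a few of the categories by their English names, where no accents are involved. The sortByName() helper is hypothetical and not part of StringHelper.

```php
<?php
// Hypothetical helper: sorts the caller's array in place by a given language.
function sortByName(array &$items, string $lang): void
{
    // The callback returns an integer below, equal to, or above zero.
    usort($items, function ($a, $b) use ($lang) {
        return strcasecmp($a->names[$lang], $b->names[$lang]);
    });
}

$categories = [
    (object) ['names' => ['fr' => 'Accueil', 'en' => 'Home']],
    (object) ['names' => ['fr' => 'Actualités', 'en' => 'News']],
    (object) ['names' => ['fr' => 'Événements', 'en' => 'Events']],
];

sortByName($categories, 'en'); // no reassignment needed

$sorted = array_map(function ($category) {
    return $category->names['en'];
}, $categories);

echo implode(', ', $sorted), PHP_EOL; // Events, Home, News
```

The original $categories variable is reordered directly, which is why the method in the article returns nothing.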

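Our comparison function relies on strcasecmp(), a binary-safe, case-insensitive string comparison. On its own it compares bytes, so in a UTF-8 string every accented letter would sort after every unaccented one.

Within the comparison, we therefore first replace the accented characters with their unaccented equivalents (e.g. é becomes e, â becomes a).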
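The effect of the replacement step can be sketched with a much smaller table. The simplify() function and the reduced $table below are illustrative stand-ins for transliterateString() and its full table, and the file is assumed to be saved as UTF-8.

```php
<?php
// Reduced, hypothetical table covering only a few French characters.
$table = ['é' => 'e', 'É' => 'E', 'è' => 'e', 'à' => 'a', 'â' => 'a', 'ç' => 'c'];

function simplify(string $value, array $table): string
{
    // Swap each accented character for its plain equivalent, then normalize.
    $ascii = str_replace(array_keys($table), array_values($table), $value);

    return trim(strtolower($ascii));
}

$a = simplify('Événements', $table); // "evenements"
$b = simplify('Sports', $table);     // "sports"

// A negative result means $a sorts before $b.
var_dump(strcasecmp($a, $b) < 0); // bool(true)
```

Once both sides are plain ASCII, the byte comparison and the alphabetical order agree.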

Our comparison will then be successful. 💪

The previous example only works with an array of objects, but you could easily adapt it to an array of arrays by changing the strcasecmp() line to:

return strcasecmp(self::transliterateString($a[$element][$key]), self::transliterateString($b[$element][$key]));
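Put together, the array-of-arrays variant might look like the sketch below. Both function names are hypothetical, and transliterate() uses a deliberately tiny table in place of the full one from StringHelper.

```php
<?php
// Hypothetical stand-in for StringHelper::transliterateString(), reduced table.
function transliterate(string $value): string
{
    $table = ['é' => 'e', 'É' => 'E'];

    return trim(strtolower(str_replace(array_keys($table), array_values($table), $value)));
}

// Same idea as alphabeticalCompareArrayByKey(), but for nested arrays.
function sortNestedArrays(array &$rows, string $element, string $key): void
{
    usort($rows, function (array $a, array $b) use ($element, $key) {
        return strcasecmp(
            transliterate($a[$element][$key]),
            transliterate($b[$element][$key])
        );
    });
}

$categories = [
    ['names' => ['fr' => 'Sports', 'en' => 'Sports']],
    ['names' => ['fr' => 'Événements', 'en' => 'Events']],
    ['names' => ['fr' => 'Accueil', 'en' => 'Home']],
    ['names' => ['fr' => 'Spécial', 'en' => 'Special']],
];

sortNestedArrays($categories, 'names', 'fr');

// Collect the French names in their new order.
$order = array_column(array_column($categories, 'names'), 'fr');

echo implode(', ', $order), PHP_EOL; // Accueil, Événements, Spécial, Sports
```

Only the way each value is read changes; the comparison itself stays the same.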

Let me know if this article helped you to sort your sorting problems!


Cover Image : Edu Grande (@edgr) from Unsplash

Comments 4 total

  • James Robb, Jan 24, 2021

    Nice, one thing I’d change is to get the characters by char code instead of manually writing them out because then you can’t miss any by accident if you use char code ranges for each language.

    • William L'Archeveque, Jan 25, 2021

      Thanks, that is a good idea. The characters array was found somewhere on the Internet and fit my needs for French accented characters, although it also contains characters for several other languages. Getting those by char code would be optimal!

  • Jon Randy 🎖️, Jan 24, 2021

    I think you mean 'accented'

    • William L'Archeveque, Jan 25, 2021

      Thanks! I was convinced I was writing it the correct way. In French, accented is written "accentué".
