67 Bricks blog

Experienced technology and product experts working with data, content and media businesses to grow their value

Comparing characters without accents – making é and e the same

For a recent project, we needed to compare words while ignoring their accents. In Java, you can do it by decomposing the text into base letters plus combining marks, then stripping the marks:

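import java.text.Normalizer;
// NFD splits "é" into "e" plus a combining acute accent; the regex then removes the combining marks
String normalizedText = Normalizer.normalize(text, Normalizer.Form.NFD)
                                  .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");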
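For repeated comparisons, the same idea can be wrapped in a small helper. The following is a minimal sketch rather than the code from the project: the class name `AccentFolding` and the methods `stripAccents` and `equalsIgnoringAccents` are hypothetical names chosen for illustration. It relies only on `java.text.Normalizer` and `java.util.regex.Pattern` from the standard library, and uses Unicode escapes so the source compiles regardless of file encoding.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the original post.
final class AccentFolding {
    // Precompiled so repeated calls do not recompile the regex.
    private static final Pattern COMBINING_MARKS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    // Decomposes the text (NFD) and removes the combining diacritical marks.
    static String stripAccents(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return COMBINING_MARKS.matcher(decomposed).replaceAll("");
    }

    // Compares two strings ignoring accents; case still matters.
    static boolean equalsIgnoringAccents(String a, String b) {
        return stripAccents(a).equals(stripAccents(b));
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "café"
        System.out.println(stripAccents("caf\u00e9"));                           // prints "cafe"
        // "r\u00e9sum\u00e9" is "résumé"
        System.out.println(equalsIgnoringAccents("r\u00e9sum\u00e9", "resume")); // prints "true"
    }
}
```

Note that this only affects characters that decompose into a base letter plus a combining mark. Letters such as ø (U+00F8) have no such decomposition and pass through unchanged, so they would need separate handling if they should match "o".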
Author: Inigo. Posted on June 14, 2010 (updated December 16, 2021). Categories: Uncategorized
