Python string comparison is the process of comparing two strings to determine their relative order or equality. This guide will discuss three main methods to do string comparison in Python and introduce various other modules that can be used for the same purpose.
What is a string in Python?
A string is a sequence of characters: letters, digits, symbols, or whitespace. Comparing strings is one of the most common operations in programming. Although the way you want to compare strings depends on the use case at hand, the most common method is to compare the characters of each string from left to right. While languages like C and C++ typically compare the numeric byte values of characters (ASCII codes for plain English text), Python compares characters by their Unicode code points.
How does string comparison work in Python and when to use it?
There are many occasions when we need to perform string comparisons. The following situations are some common examples.
Equality check - Sometimes, you need to verify whether two strings represent the same text. For example, when you want to authenticate a user by comparing the password they entered in the login form with the password stored in the database.
Sorting and Ordering - You can use Python string comparison when you need to sort strings. For example, when arranging a list of names alphabetically, you need to determine which name comes earlier and which comes later by comparing the names character by character.
Searching and Matching - When you want to search for specific substrings within a larger string, you can use string comparison. It allows you to locate and identify matching patterns or sequences.
Conditional Branching - Developers use string comparisons in conditional statements such as if-elif-else and, from Python 3.10 onward, match statements (Python has no switch statement). By comparing strings, you can control the program's flow and execute different code blocks based on specific conditions.
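The four use cases above can be sketched in a few lines. This is only an illustration: the names list and the command value are hypothetical data invented for the example.

```python
# Hypothetical sample data, used only for illustration.
names = ["bob", "Alice", "carol", "Dave"]

# Equality check: True only when both strings contain exactly the same characters.
print("admin" == "admin")   # True
print("admin" == "Admin")   # False (comparison is case-sensitive)

# Sorting: sorted() orders strings with the same rules as the < operator,
# so uppercase names come before lowercase ones by default.
print(sorted(names))                 # ['Alice', 'Dave', 'bob', 'carol']
print(sorted(names, key=str.lower))  # ['Alice', 'bob', 'carol', 'Dave']

# Searching: the in operator reports whether a substring occurs in a string.
print("world" in "hello world")      # True

# Conditional branching on a string value.
command = "stop"
if command == "start":
    action = "starting"
elif command == "stop":
    action = "stopping"
else:
    action = "unknown"
print(action)                        # stopping
```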
Python string comparison: different methods
Now that you know when and why Python string comparison is useful, let's see what methods are available.
Method 1: Using operators for Python string comparison
You can compare strings using Python comparison operators. Those are:
- "==" (equal to)
- "!=" (not equal to)
- "<" (less than)
- ">" (greater than)
- "<=" (less than or equal to)
- ">=" (greater than or equal to)
These operators compare characters of each string one by one, left to right, and return a boolean value (True or False) as the result of the comparison.
Character comparison is done based on their Unicode values, which makes the comparison operation case-sensitive. For instance, "apple" and "Apple" are not considered the same due to their different Unicode values.
Let’s see the Unicode values of lowercase letters, uppercase letters, and numbers commonly used in strings.
- Numbers (0 to 9) - Unicode values: U+0030 to U+0039
- Uppercase letters (A to Z) - Unicode values: U+0041 to U+005A
- Lowercase letters (a to z) - Unicode values: U+0061 to U+007A
As you can see, the numbers have the lowest Unicode values among the three types. When it comes to letters, uppercase letters have lower values than lowercase letters.
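To check these values yourself, the built-in ord() function returns the Unicode code point of a single character. The short sketch below prints the boundaries of the three ranges listed above; the code_points dictionary is just a helper name chosen for this example.

```python
# Map each boundary character to its Unicode code point.
code_points = {ch: ord(ch) for ch in "09AZaz"}

for ch, value in code_points.items():
    # Show the decimal value and the usual U+XXXX hexadecimal notation.
    print(ch, value, f"U+{value:04X}")

# Digits sort before uppercase letters, which sort before lowercase letters.
print("9" < "A" < "a")  # True
```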
Let's see how the comparison operators work with an example.
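string1 = "apple"
string2 = "Apple"
print(string1 == string2)  # equal to
print(string1 != string2)  # not equal to
print(string1 < string2)   # less than
print(string1 > string2)   # greater than
print(string1 <= string2)  # less than or equal to
print(string1 >= string2)  # greater than or equal to
Output
False
True
False
True
False
True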
In this example, "A" has a lower Unicode value than "a." Therefore, string2 is smaller than string1, which is why you get the above result.
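The left-to-right rule has a few consequences that are easy to overlook. The sketch below shows three of them: the first differing character decides the result, a string that is a prefix of another counts as smaller, and strings of digits are not compared as numbers. The versions list is hypothetical sample data.

```python
# The first differing character decides: 'p' comes before 'r' at index 2.
print("apple" < "apricot")   # True

# When one string is a prefix of the other, the shorter string is smaller.
print("apple" < "apples")    # True

# Every uppercase letter sorts before every lowercase letter.
print("Zebra" < "apple")     # True

# Digit strings are compared character by character, not numerically.
print("10" < "9")            # True, because '1' comes before '9'
print(int("10") < int("9"))  # False, numeric comparison

# Hypothetical sample data: sort numerically by converting with key=int.
versions = ["10", "9", "2"]
print(sorted(versions))           # ['10', '2', '9']
print(sorted(versions, key=int))  # ['2', '9', '10']
```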
Case-insensitive string comparison in Python
The comparisons in the examples above are case-sensitive. If you want the string comparison to be case-insensitive, call the lower() method on both strings before comparing them. Here is an example.
string1 = "hello"
string2 = "Hello"
# Both sides become "hello" before the comparison.
print(string1.lower() == string2.lower())
print(string1.lower() < string2.lower())
Output
True
False
In this example, the lower() method converts the complete strings to their lowercase form before the comparison. That's why you get True with the == operator and False with the < operator.
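For text that goes beyond plain English letters, lower() is not always enough. Python strings also have a casefold() method, which applies a more aggressive form of case folding intended for caseless matching. The sketch below compares the two approaches; equal_ignoring_case is a hypothetical helper name chosen for this example.

```python
def equal_ignoring_case(a, b):
    """Return True when a and b match after Unicode case folding."""
    return a.casefold() == b.casefold()

# lower() keeps the German sharp s, so the two spellings do not match.
print("Straße".lower() == "STRASSE".lower())   # False

# casefold() converts the sharp s to "ss", so the strings compare equal.
print(equal_ignoring_case("Straße", "STRASSE"))  # True

# For plain ASCII text both approaches give the same answer.
print(equal_ignoring_case("hello", "Hello"))     # True
```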
Method 2: Python user-defined function
In the above examples, we compared strings based on their Unicode values. However, if you want to compare strings based on a different criterion, you will need to write your own function.
Here, we will create a function that compares strings based on the number of digits they contain.
def string_comparison(string1, string2):
    # Count the digits (0-9) in each string.
    count1 = 0
    count2 = 0
    for char in string1:
        if "0" <= char <= "9":
            count1 += 1
    for char in string2:
        if "0" <= char <= "9":
            count2 += 1
    # The strings count as equal when they contain the same number of digits.
    return count1 == count2

print(string_comparison("hello", "world"))
print(string_comparison("apple", "1234"))
print(string_comparison("hello123", "world456"))
Output
True
False
True
Here, both the "hello" and "world" strings have zero digits. Therefore, the first comparison returns True.
"apple" has no digits, and "1234" has four digits. Therefore, the second comparison returns False.
Similarly, when comparing "hello123" and "world456", the function returns True as they have the same number of digits.
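The same criterion can also be used for ordering rather than only for equality. One possible approach, sketched below, is to move the counting into a small helper and pass it as the key argument to sorted() or max(). The digit_count name and the words list are assumptions made for this example, not part of the function above.

```python
def digit_count(text):
    """Return how many characters in text are digits from 0 to 9."""
    return sum(1 for ch in text if "0" <= ch <= "9")

# Hypothetical sample data.
words = ["a1b2c3", "hello", "x9", "2024"]

# Order the strings by how many digits they contain, fewest first.
print(sorted(words, key=digit_count))  # ['hello', 'x9', 'a1b2c3', '2024']

# Pick the string with the most digits.
print(max(words, key=digit_count))     # 2024

# The equality check from the function above becomes a one-liner.
print(digit_count("hello123") == digit_count("world456"))  # True
```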
Method 3: String comparison using Python str methods
You can also compare strings with the startswith() and endswith() methods. Here are some examples of using Python str methods.
Example 1
string = 'Hello World'
# Check whether the string begins with the given prefix.
print(string.startswith('Hello'))
Output
True
Example 2
string = 'Hello World'
# Check whether the string ends with the given suffix.
print(string.endswith('World'))
Output
True
In the above code samples, the startswith() method checks whether the value of the variable string starts with the prefix 'Hello'. It returns True because 'Hello World' does indeed start with 'Hello'.
Similarly, the endswith() method checks whether string ends with the suffix 'World'. It also returns True since 'Hello World' ends with 'World'. The variable is named string rather than str so that it does not shadow Python's built-in str type.
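Both methods accept a few variations that are worth knowing. The sketch below shows that they can take a tuple of candidates, that they are case-sensitive, and that startswith() accepts an optional start position. The url value is a hypothetical example string.

```python
# Hypothetical example value.
url = "https://example.com/index.html"

# A tuple checks several prefixes or suffixes in one call.
print(url.startswith(("http://", "https://")))  # True
print(url.endswith((".html", ".htm")))          # True

# Both methods are case-sensitive.
print(url.startswith("HTTPS"))                  # False
print(url.lower().startswith("https"))          # True

# The optional second argument is the position where the check begins.
print(url.startswith("example", 8))             # True
```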
Method 4: Other methods to compare strings in Python
Python is popular with data scientists and ML developers, in part because of its large collection of libraries. The same applies to string comparison: the standard library and several external packages compare strings using more advanced algorithms. While discussing them in great detail is beyond the scope of this article, it is worth knowing that they exist.
difflib - The Python standard library includes the difflib module. Its SequenceMatcher class compares sequences with an algorithm similar to Ratcliff/Obershelp ("gestalt pattern matching"), which works by repeatedly finding the longest contiguous matching block.
Fuzzywuzzy - The Fuzzywuzzy library is a widely used third-party Python package that builds upon the difflib module. It specializes in matching strings and computing their likeness with one another, mainly through Levenshtein distance.
Python-Levenshtein - The Python-Levenshtein module offers quick access to the Levenshtein distance algorithm. This particular algorithm measures the differences between two strings, calculated by the smallest number of edits (substitutions, deletions, or insertions) required to transform one string into the other.
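To give a feel for these approaches without installing anything, the sketch below uses difflib from the standard library and a small pure-Python Levenshtein distance. The similarity and levenshtein functions are illustrative helpers written for this article; they are not the API of Fuzzywuzzy or Python-Levenshtein, which provide their own, faster implementations.

```python
from difflib import SequenceMatcher, get_close_matches

def similarity(a, b):
    """Return a difflib similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()

def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed part of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]

print(similarity("apple", "apple"))      # 1.0 for identical strings
print(similarity("apple", "aple"))       # a value close to, but below, 1.0
print(levenshtein("kitten", "sitting"))  # 3

# get_close_matches returns the best candidates, most similar first.
print(get_close_matches("appel", ["ape", "apple", "peach", "puppy"]))
```

A ratio near 1.0 or a small edit distance suggests the strings are close; where to draw the line between "similar" and "different" depends on the application.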
Final thoughts
Developers need to understand Python string comparison techniques to create reliable and efficient code. Practicing with diverse techniques allows coders to optimize their code and guarantee consistent and accurate string comparisons, improve their Python programming skills, and create more powerful and effective programs.