Python Raw Strings

Summary: in this tutorial, you will learn about Python raw strings and how to use them to treat backslashes as literal characters.

Introduction to Python raw strings

In Python, when you prefix a string literal with the letter r or R, such as r'...' or R'...', that string becomes a raw string. Unlike a regular string, a raw string treats backslashes (\) as literal characters.

Raw strings are useful when you deal with strings that have many backslashes, for example, regular expressions or directory paths on Windows.

To represent special characters such as tabs and newlines, Python uses the backslash (\) to signify the start of an escape sequence. For example:

s = 'lang\tver\nPython\t3'
print(s)

Output:

lang    ver
Python  3

However, raw strings treat the backslash (\) as a literal character. For example:

s = r'lang\tver\nPython\t3'
print(s)

Output:

lang\tver\nPython\t3

A raw string is equivalent to a regular string in which every backslash (\) is written as a double backslash (\\):

s1 = r'lang\tver\nPython\t3'
s2 = 'lang\\tver\\nPython\\t3'

print(s1 == s2) # True
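
Regular expressions, mentioned earlier as a common use for raw strings, rely on the same equivalence. The following is a minimal sketch, assuming a made-up sample text and pattern (neither comes from this tutorial), that writes one pattern both ways with the standard re module:

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Files: main.py, utils.py, notes.txt"

# Raw pattern: \b, \w and \. reach the regex engine unchanged.
raw_pattern = r'\b\w+\.py\b'

# The same pattern as a regular string: every backslash is doubled.
regular_pattern = '\\b\\w+\\.py\\b'

matches = re.findall(raw_pattern, text)
print(matches)                         # ['main.py', 'utils.py']
print(raw_pattern == regular_pattern)  # True
```

Both literals produce the same pattern text; the raw form is simply easier to read.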

In a regular string, Python counts an escape sequence as a single character:

s = '\n'
print(len(s)) # 1

However, in a raw string, Python counts the backslash (\) as a character of its own, so r'\n' contains two characters, the backslash and the letter n:

s = r'\n'
print(len(s)) # 2
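
To see where the extra character comes from, it can help to list the characters of each string. A small sketch (the variable names are illustrative only):

```python
regular = '\n'   # one character: a newline
raw = r'\n'      # two characters: a backslash and the letter n

print(list(regular))  # ['\n']
print(list(raw))      # ['\\', 'n']

# The raw literal equals a regular literal with the backslash doubled.
print(raw == '\\n')   # True
```

Note that Python displays the single backslash as '\\' inside the list, because that is how a regular string literal spells one backslash.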

Even in a raw string, a backslash placed before a single quote (') or double quote (") keeps that quote from ending the literal. As a result, a raw string cannot end with an odd number of backslashes. For example:

s = r'\'

Error (Python 3.10 and later report "unterminated string literal" instead):

SyntaxError: EOL while scanning string literal

Or

s = r'\\\'

Error:

SyntaxError: EOL while scanning string literal
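
If a string really must end with a backslash, a few workarounds are possible. The sketch below shows some options; the variable names and the sample path are illustrative, and this is not the only way to do it:

```python
# Option 1: keep the raw string and add the final backslash as a separate
# regular string literal (Python joins adjacent literals).
folder = r'c:\user\tasks' '\\'
print(folder)        # prints c:\user\tasks followed by one backslash

# Option 2: a regular string with an escaped backslash.
single = '\\'
print(len(single))   # 1

# An even number of trailing backslashes is accepted in a raw string,
# but every backslash is kept.
double = r'\\'
print(len(double))   # 2

# A backslash before a quote is allowed in a raw string,
# and both characters stay in the result.
quoted = r'\''
print(len(quoted))   # 2
```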

Use raw strings to handle file paths on Windows

Windows uses backslashes to separate the components of a path. For example:

c:\user\tasks\new

If you use this path as a regular string, Python raises a SyntaxError:

dir_path = 'c:\user\tasks\new'

Error:

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \uXXXX escape

Python treats \u in the path as the start of a Unicode escape (\uXXXX) but cannot decode it, because the four hexadecimal digits that must follow are missing.

Now, if you escape the first backslash, you’ll have other issues:

dir_path = 'c:\\user\tasks\new'
print(dir_path)

Output:

c:\user asks
ew

In this example, Python interprets \t as a tab and \n as a newline.

To avoid escaping every backslash, you can write the path as a raw string like this:

dir_path = r'c:\user\tasks\new'
print(dir_path)
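
To check that the raw literal holds the intended path, it can be compared with the fully escaped form. The sketch below also passes it to pathlib.PureWindowsPath from the standard library; that class is an addition for illustration rather than part of the example above, and it parses Windows-style paths on any operating system:

```python
from pathlib import PureWindowsPath

raw_path = r'c:\user\tasks\new'
escaped_path = 'c:\\user\\tasks\\new'

# Both literals describe exactly the same string.
print(raw_path == escaped_path)  # True

# Split the path into its components without touching the file system.
p = PureWindowsPath(raw_path)
print(p.parts)  # ('c:\\', 'user', 'tasks', 'new')
print(p.name)   # new
```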

Convert a regular string into a raw string

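The r prefix only changes how Python reads a string literal, so there is no separate raw string type to convert to. To get a string whose characters spell out the escape sequences of a regular string, you can use the built-in repr() function. For example: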

s = '\n'
raw_string = repr(s)

print(raw_string)

Output:

'\n'

Note that the resulting string has a quote at the beginning and at the end. To remove them, you can use slices:

s = '\n'
raw_string = repr(s)[1:-1]
print(raw_string)
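
The slicing approach has limits worth knowing. The sketch below, using illustrative sample strings, suggests that the result is a new, longer string, and that repr() may also add a backslash before a quote character:

```python
s = 'tab\there'            # contains a real tab character
escaped = repr(s)[1:-1]    # the tab becomes a backslash followed by t

print(escaped)                  # tab\there
print(len(s), len(escaped))     # 8 9

# The result matches the raw literal with the same text.
print(escaped == r'tab\there')  # True

# When a string contains both quote types, repr() escapes the single quote,
# so the sliced result gains a backslash that was not in the original.
mixed = 'it\'s "ok"'
print(repr(mixed)[1:-1])        # it\'s "ok"
```

For this reason, the technique is best treated as a convenience for displaying or debugging strings rather than as a lossless conversion.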

Summary

  • Prefix a string literal with the letter r or R to turn it into a raw string.
  • Raw strings treat the backslash as a literal character.
