Home Python string formatting: % vs. .format
Reply: 12

Python string formatting: % vs. .format

NorthIsUp
1#
NorthIsUp Published in 2011-02-22 18:46:42Z

Python 2.6 introduced the str.format() method with a slightly different syntax from the existing % operator. Which is better and for what situations?

  1. The following uses each method and has the same outcome, so what is the difference?

    #!/usr/bin/python
    sub1 = "python string!"
    sub2 = "an arg"
    
    a = "i am a %s" % sub1
    b = "i am a {0}".format(sub1)
    
    c = "with %(kwarg)s!" % {'kwarg':sub2}
    d = "with {kwarg}!".format(kwarg=sub2)
    
    print a    # "i am a python string!"
    print b    # "i am a python string!"
    print c    # "with an arg!"
    print d    # "with an arg!"
    
  2. Furthermore when does string formatting occur in Python? For example, if my logging level is set to HIGH will I still take a hit for performing the following % operation? And if so, is there a way to avoid this?

    log.debug("some debug info: %s" % some_info)
    
Dave
2#
Dave Reply to 2015-05-01 12:12:38Z

To answer your first question... .format just seems more sophisticated in many ways. An annoying thing about % is also how it can either take a variable or a tuple. You'd think the following would always work:

"hi there %s" % name

yet, if name happens to be (1, 2, 3), it will throw a TypeError. To guarantee that it always prints, you'd need to do

"hi there %s" % (name,)   # supply the single argument as a single-item tuple

which is just ugly. .format doesn't have those issues. Also in the second example you gave, the .format example is much cleaner looking.

Why would you not use it?

  • not knowing about it (me before reading this)
  • having to be compatible with Python 2.5

To answer your second question, string formatting happens at the same time as any other operation - when the string formatting expression is evaluated. And Python, not being a lazy language, evaluates expressions before calling functions, so in your log.debug example, the expression "some debug info: %s"%some_infowill first evaluate to, e.g. "some debug info: roflcopters are active", then that string will be passed to log.debug().

the Tin Man
3#
the Tin Man Reply to 2015-01-14 19:28:08Z

Assuming you're using Python's logging module, you can pass the string formatting arguments as arguments to the .debug() method rather than doing the formatting yourself:

log.debug("some debug info: %s", some_info)

which avoids doing the formatting unless the logger actually logs something.

lcltj
4#
lcltj Reply to 2017-10-26 18:11:52Z

% gives better performance than format from my test.

Test code:

Python 2.7.2:

import timeit
print 'format:', timeit.timeit("'{}{}{}'.format(1, 1.23, 'hello')")
print '%:', timeit.timeit("'%s%s%s' % (1, 1.23, 'hello')")

Result:

> format: 0.470329046249
> %: 0.357107877731

Python 3.5.2

import timeit
print('format:', timeit.timeit("'{}{}{}'.format(1, 1.23, 'hello')"))
print('%:', timeit.timeit("'%s%s%s' % (1, 1.23, 'hello')"))

Result

> format: 0.5864730989560485
> %: 0.013593495357781649

It looks in Python2, the difference is small whereas in Python3, % is much faster than format.

Thanks @Chris Cogdon for the sample code.

the Tin Man
5#
the Tin Man Reply to 2015-01-14 19:24:14Z

Something that the modulo operator ( % ) can't do, afaik:

tu = (12,45,22222,103,6)
print '{0} {2} {1} {2} {3} {2} {4} {2}'.format(*tu)

result

12 22222 45 22222 103 22222 6 22222

Very useful.

Another point: format(), being a function, can be used as an argument in other functions:

li = [12,45,78,784,2,69,1254,4785,984]
print map('the number is {}'.format,li)   

print

from datetime import datetime,timedelta

once_upon_a_time = datetime(2010, 7, 1, 12, 0, 0)
delta = timedelta(days=13, hours=8,  minutes=20)

gen =(once_upon_a_time +x*delta for x in xrange(20))

print '\n'.join(map('{:%Y-%m-%d %H:%M:%S}'.format, gen))

Results in:

['the number is 12', 'the number is 45', 'the number is 78', 'the number is 784', 'the number is 2', 'the number is 69', 'the number is 1254', 'the number is 4785', 'the number is 984']

2010-07-01 12:00:00
2010-07-14 20:20:00
2010-07-28 04:40:00
2010-08-10 13:00:00
2010-08-23 21:20:00
2010-09-06 05:40:00
2010-09-19 14:00:00
2010-10-02 22:20:00
2010-10-16 06:40:00
2010-10-29 15:00:00
2010-11-11 23:20:00
2010-11-25 07:40:00
2010-12-08 16:00:00
2010-12-22 00:20:00
2011-01-04 08:40:00
2011-01-17 17:00:00
2011-01-31 01:20:00
2011-02-13 09:40:00
2011-02-26 18:00:00
2011-03-12 02:20:00
Jean-François Corbett
6#
Jean-François Corbett Reply to 2017-02-13 15:51:53Z

PEP 3101 proposes the replacement of the % operator with the new, advanced string formatting in Python 3, where it would be the default.

the Tin Man
7#
the Tin Man Reply to 2015-01-14 19:27:33Z

But please be careful, just now I've discovered one issue when trying to replace all % with .format in existing code: '{}'.format(unicode_string) will try to encode unicode_string and will probably fail.

Just look at this Python interactive session log:

Python 2.7.2 (default, Aug 27 2012, 19:52:55) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
; s='й'
; u=u'й'
; s
'\xd0\xb9'
; u
u'\u0439'

s is just a string (called 'byte array' in Python3) and u is a Unicode string (called 'string' in Python3):

; '%s' % s
'\xd0\xb9'
; '%s' % u
u'\u0439'

When you give a Unicode object as a parameter to % operator it will produce a Unicode string even if the original string wasn't Unicode:

; '{}'.format(s)
'\xd0\xb9'
; '{}'.format(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0439' in position 0: ordinal not in range(256)

but the .format function will raise "UnicodeEncodeError":

; u'{}'.format(s)
u'\xd0\xb9'
; u'{}'.format(u)
u'\u0439'

and it will work with a Unicode argument fine only if the original string was Unicode.

; '{}'.format(u'i')
'i'

or if argument string can be converted to a string (so called 'byte array')

the Tin Man
8#
the Tin Man Reply to 2015-01-14 19:24:41Z

As I discovered today, the old way of formatting strings via % doesn't support Decimal, Python's module for decimal fixed point and floating point arithmetic, out of the box.

Example (using Python 3.3.5):

#!/usr/bin/env python3

from decimal import *

getcontext().prec = 50
d = Decimal('3.12375239e-24') # no magic number, I rather produced it by banging my head on my keyboard

print('%.50f' % d)
print('{0:.50f}'.format(d))

Output:

0.00000000000000000000000312375239000000009907464850 0.00000000000000000000000312375239000000000000000000

There surely might be work-arounds but you still might consider using the format() method right away.

David Sanders
9#
David Sanders Reply to 2014-10-21 19:55:49Z

As a side note, you don't have to take a performance hit to use new style formatting with logging. You can pass any object to logging.debug, logging.info, etc. that implements the __str__ magic method. When the logging module has decided that it must emit your message object (whatever it is), it calls str(message_object) before doing so. So you could do something like this:

import logging


class NewStyleLogMessage(object):
    def __init__(self, message, *args, **kwargs):
        self.message = message
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        args = (i() if callable(i) else i for i in self.args)
        kwargs = dict((k, v() if callable(v) else v) for k, v in self.kwargs.items())

        return self.message.format(*args, **kwargs)

N = NewStyleLogMessage

# Neither one of these messages are formatted (or calculated) until they're
# needed

# Emits "Lazily formatted log entry: 123 foo" in log
logging.debug(N('Lazily formatted log entry: {0} {keyword}', 123, keyword='foo'))


def expensive_func():
    # Do something that takes a long time...
    return 'foo'

# Emits "Expensive log entry: foo" in log
logging.debug(N('Expensive log entry: {keyword}', keyword=expensive_func))

This is all described in the Python 3 documentation (https://docs.python.org/3/howto/logging-cookbook.html#formatting-styles). However, it will work with Python 2.6 as well (https://docs.python.org/2.6/library/logging.html#using-arbitrary-objects-as-messages).

One of the advantages of using this technique, other than the fact that it's formatting-style agnostic, is that it allows for lazy values e.g. the function expensive_func above. This provides a more elegant alternative to the advice being given in the Python docs here: https://docs.python.org/2.6/library/logging.html#optimization.

matiasg
10#
matiasg Reply to 2015-06-19 16:42:39Z

Yet another advantage of .format (which I don't see in the answers): it can take object properties.

In [12]: class A(object):
   ....:     def __init__(self, x, y):
   ....:         self.x = x
   ....:         self.y = y
   ....:         

In [13]: a = A(2,3)

In [14]: 'x is {0.x}, y is {0.y}'.format(a)
Out[14]: 'x is 2, y is 3'

Or, as a keyword argument:

In [15]: 'x is {a.x}, y is {a.y}'.format(a=a)
Out[15]: 'x is 2, y is 3'

This is not possible with % as far as I can tell.

Jorge Leitão
11#
Jorge Leitão Reply to 2015-04-09 20:41:40Z

One situation where % may help is when you are formatting regex expressions. For example,

'{type_names} [a-z]{2}'.format(type_names='triangle|square')

raises IndexError. In this situation, you can use:

'%(type_names)s [a-z]{2}' % {'type_names': 'triangle|square'}

This avoids writing the regex as '{type_names} [a-z]{{2}}'. This can be useful when you have two regexes, where one is used alone without format, but the concatenation of both is formatted.

user1767754
12#
user1767754 Reply to 2017-12-20 06:22:25Z

As of Python 3.6 you can use f-strings to substitute variables into strings by name:

>>> origin = "London"
>>> destination = "Paris"
>>> f"from {origin} to {destination}"
'from London to Paris'

Note the f" prefix. If you try this in Python 3.5 or earlier, you'll get a SyntaxError.

See https://docs.python.org/3.6/reference/lexical_analysis.html#f-strings.

Roushan45
13#
Roushan45 Reply to 2018-02-14 22:42:32Z

Prior to >=python3.6

s1='albha'
s2='beta'

f'{s1}{s2:>10}'

#output
'albha      beta'
You need to login account before you can post.

About| Privacy statement| Terms of Service| Advertising| Contact us| Help| Sitemap|
Processed in 0.351009 second(s) , Gzip On .

© 2016 Powered by mzan.com design MATCHINFO