
Are the days of passing const std::string & as a parameter over?

Benj
1#
Benj Published in 2012-04-19 15:20:57Z

I heard a recent talk by Herb Sutter who suggested that the reasons to pass std::vector and std::string by const & are largely gone. He suggested that writing a function such as the following is now preferable:

std::string do_something ( std::string inval )
{
   std::string return_val;
   // ... do stuff ...
   return return_val;
}

I understand that the return_val will be an rvalue at the point the function returns and can therefore be returned using move semantics, which are very cheap. However, inval is still much larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea.

Can anyone explain why Herb might have said this?

Community
2#
Community Reply to 2017-05-23 12:18:24Z

The reason Herb said what he said is because of cases like this.

Let's say I have function A which calls function B, which calls function C. And A passes a string through B and into C. A does not know or care about C; all A knows about is B. That is, C is an implementation detail of B.

Let's say that A is defined as follows:

void A()
{
  B("value");
}

If B and C take the string by const&, then it looks something like this:

void B(const std::string &str)
{
  C(str);
}

void C(const std::string &str)
{
  //Do something with `str`. Does not store it.
}

All well and good. You're just passing pointers around, no copying, no moving, everyone's happy. C takes a const& because it doesn't store the string. It simply uses it.

Now, I want to make one simple change: C needs to store the string somewhere.

void C(const std::string &str)
{
  //Do something with `str`.
  m_str = str;
}

Hello, copy constructor and potential memory allocation (ignore the Short String Optimization (SSO)). C++11's move semantics are supposed to make it possible to remove needless copy-constructing, right? And A passes a temporary; there's no reason why C should have to copy the data. It should just abscond with what was given to it.

Except it can't. Because it takes a const&.

If I change C to take its parameter by value, that just causes B to do the copy into that parameter; I gain nothing.

So if I had just passed str by value through all of the functions, relying on std::move to shuffle the data around, we wouldn't have this problem. If someone wants to hold on to it, they can. If they don't, oh well.

Is it more expensive? Yes; moving into a value is more expensive than using references. Is it less expensive than the copy? Not for small strings with SSO. Is it worth doing?

It depends on your use case. How much do you hate memory allocations?

Community
3#
Community Reply to 2017-05-23 11:55:03Z

std::string is not Plain Old Data (POD), and its raw size is not the most relevant factor. For example, if you pass in a string that is longer than the SSO capacity and therefore allocated on the heap, I would expect the copy constructor not to copy the SSO storage.

This is recommended because inval is constructed from the argument expression and is thus always moved or copied as appropriate; there is no performance loss, assuming that you need ownership of the argument. If you don't, a const reference could still be the better way to go.

bames53
4#
bames53 Reply to 2012-04-19 15:50:31Z

Unless you actually need a copy it's still reasonable to take const &. For example:

#include <algorithm>
#include <cctype>
#include <string>
bool isprint(std::string const &s) {
    return std::all_of(s.begin(), s.end(), [](unsigned char c) { return std::isprint(c) != 0; });
}

If you change this to take the string by value then you'll end up moving or copying the parameter, and there's no need for that. Not only is copy/move likely more expensive, but it also introduces a new potential failure; the copy/move could throw an exception (e.g., allocation during copy could fail) whereas taking a reference to an existing value can't.

If you do need a copy then passing and returning by value is usually (always?) the best option. In fact I generally wouldn't worry about it in C++03 unless you find that extra copies actually cause a performance problem. Copy elision seems pretty reliable on modern compilers. I think people's skepticism and insistence that you have to check your table of compiler support for RVO is mostly obsolete nowadays.


In short, C++11 doesn't really change anything in this regard except for people that didn't trust copy elision.

BЈовић
5#
BЈовић Reply to 2012-04-19 16:05:05Z

This highly depends on the compiler's implementation.

However, it also depends on what you use.

Let's consider the following functions:

bool foo1( const std::string v )
{
  return v.empty();
}
bool foo2( const std::string & v )
{
  return v.empty();
}

These functions are implemented in a separate compilation unit in order to avoid inlining. Then:
1. If you pass a literal to these two functions, you will not see much difference in performance. In both cases, a string object has to be created.
2. If you pass another std::string object, foo2 will outperform foo1, because foo1 will do a deep copy.

On my PC, using g++ 4.6.1, I got these results :

  • variable by reference: 1000000000 iterations -> time elapsed: 2.25912 sec
  • variable by value: 1000000000 iterations -> time elapsed: 27.2259 sec
  • literal by reference: 100000000 iterations -> time elapsed: 9.10319 sec
  • literal by value: 100000000 iterations -> time elapsed: 8.62659 sec
Community
6#
Community Reply to 2017-05-23 12:02:48Z

I've copy/pasted the answer from this question here, and changed the names and spelling to fit this question.

Here is code to measure what is being asked:

#include <iostream>
#include <utility>  // std::move

struct string
{
    string() {}
    string(const string&) {std::cout << "string(const string&)\n";}
    string& operator=(const string&) {std::cout << "string& operator=(const string&)\n";return *this;}
#if (__has_feature(cxx_rvalue_references))
    string(string&&) {std::cout << "string(string&&)\n";}
    string& operator=(string&&) {std::cout << "string& operator=(string&&)\n";return *this;}
#endif

};

#if PROCESS == 1

string
do_something(string inval)
{
    // do stuff
    return inval;
}

#elif PROCESS == 2

string
do_something(const string& inval)
{
    string return_val = inval;
    // do stuff
    return return_val; 
}

#if (__has_feature(cxx_rvalue_references))

string
do_something(string&& inval)
{
    // do stuff
    return std::move(inval);
}

#endif

#endif

string source() {return string();}

int main()
{
    std::cout << "do_something with lvalue:\n\n";
    string x;
    string t = do_something(x);
#if (__has_feature(cxx_rvalue_references))
    std::cout << "\ndo_something with xvalue:\n\n";
    string u = do_something(std::move(x));
#endif
    std::cout << "\ndo_something with prvalue:\n\n";
    string v = do_something(source());
}

For me this outputs:

$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=1 test.cpp
$ a.out
do_something with lvalue:

string(const string&)
string(string&&)

do_something with xvalue:

string(string&&)
string(string&&)

do_something with prvalue:

string(string&&)
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=2 test.cpp
$ a.out
do_something with lvalue:

string(const string&)

do_something with xvalue:

string(string&&)

do_something with prvalue:

string(string&&)

The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:

+----+--------+--------+---------+
|    | lvalue | xvalue | prvalue |
+----+--------+--------+---------+
| p1 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+
| p2 |  1/0   |  0/1   |   0/1   |
+----+--------+--------+---------+

The pass-by-value solution requires only one overload but costs an extra move construction when passing lvalues and xvalues. This may or may not be acceptable for any given situation. Both solutions have advantages and disadvantages.

justin
7#
justin Reply to 2012-04-20 00:45:20Z

Are the days of passing const std::string & as a parameter over?

No. Many people take this advice (including Dave Abrahams's) beyond the domain it applies to and simplify it to cover all std::string parameters. Always passing std::string by value is not a "best practice" for any and all arbitrary parameters and applications, because the optimizations these talks and articles focus on apply only to a restricted set of cases.

If you're returning a value, mutating the parameter, or taking the value, then passing by value could save expensive copying and offer syntactical convenience.

As ever, passing by const reference saves much copying when you don't need a copy.

Now to the specific example:

However inval is still quite a lot larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea. Can anyone explain why Herb might have said this?

If stack size is a concern (and assuming this is not inlined/optimized), return_val + inval > return_val -- IOW, peak stack usage can be reduced by passing by value here (note: oversimplification of ABIs). Meanwhile, passing by const reference can disable the optimizations. The primary reason here is not to avoid stack growth, but to ensure the optimization can be performed where it is applicable.

The days of passing by const reference aren't over -- the rules are just more complicated than they once were. If performance is important, you'll be wise to consider how you pass these types, based on the details you use in your implementations.

digital_infinity
8#
digital_infinity Reply to 2012-04-25 12:03:06Z

IMO, passing std::string by reference is a quick, local optimization, while passing by value could (or could not) be a better global optimization.

So the answer is: it depends on circumstances:

  1. If you write all the code, from the outermost functions to the innermost, you know what the code does and can use the reference const std::string &.
  2. If you write library code, or make heavy use of library code where strings are passed, you likely gain more in a global sense by trusting std::string's copy constructor behavior.

LearnCocos2D
9#
LearnCocos2D Reply to 2015-08-27 13:59:05Z

Short answer: NO! Long answer:

  • If you won't modify the string (treat it as read-only), pass it as const ref&.
    (the const ref& obviously needs to stay within scope while the function that uses it executes)
  • If you plan to modify it or you know it will go out of scope (threads), pass it by value; don't copy the const ref& inside your function body.

There was a post on cpp-next.com called "Want speed, pass by value!". The TL;DR:

Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.

TRANSLATION of ^

Don’t copy your function arguments --- means: if you plan to modify the argument value by copying it to an internal variable, just use a value argument instead.

So, don't do this:

std::string function(const std::string& aString){
    auto vString(aString);
    vString.clear();
    return vString;
}

do this:

std::string function(std::string aString){
    aString.clear();
    return aString;
}

When you need to modify the argument value in your function body.

You just need to be aware of how you plan to use the argument in the function body: read-only or not, and whether it stays within scope.

Yakk
10#
Yakk Reply to 2015-01-26 03:09:35Z

Almost.

There is a TS for basic_string_view<?> which, if approved and folded into C++17, will bring us down to basically one narrow use case for std::string const& parameters.

The existence of move semantics has eliminated one use case for std::string const& -- if you are planning on storing the parameter, taking a std::string by value is more optimal, as you can move out of the parameter.

If someone called your function with a raw C "string" this means only one std::string buffer is ever allocated, as opposed to two in the std::string const& case.

However, if you don't intend to make a copy, taking by std::string const& is still useful in C++14.

With std::string_view, so long as you aren't passing said string to an API that expects C-style '\0'-terminated character buffers, you can more efficiently get std::string-like functionality without risking any allocation. A raw C string can even be turned into a std::string_view without any allocation or character copying.

At that point, the use for std::string const& is when you aren't copying the data wholesale, and are going to pass it on to a C-style API that expects a null terminated buffer, and you need the higher level string functions that std::string provides. In practice, this is a rare set of requirements.

Erik Aronesty
11#
Erik Aronesty Reply to 2015-02-20 14:36:49Z

The problem is that "const" is a non-granular qualifier. What is usually meant by "const string ref" is "don't modify this string", not "don't modify the reference count". There is simply no way, in C++, to say which members are "const". They either all are, or none of them are.

In order to hack around this language issue, STL could allow "C()" in your example to make a move-semantic copy anyway, and dutifully ignore the "const" with regard to the reference count (and therefore assuming it wasn't declared const because it was mem-mapped or nano-thready or whatever). As long as it was well-specified, this would be fine.

Since STL doesn't, I have a version of a string that const_casts<> away the reference counter, and - lo and behold - you can freely pass cmstring's as const references, and make copies of them in deep functions, all day long, with no leaks or issues.

Since C++ offers no const granularity here, writing up a good specification and making a shiny new "const movable string" (cmstring) object is the best solution I've seen.

Community
12#
Community Reply to 2017-05-23 11:47:32Z

Herb Sutter is still on record, along with Bjarne Stroustrup, in recommending const std::string& as a parameter type; see https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rf-in .

There is a pitfall not mentioned in any of the other answers here: if you pass a string literal to a const std::string& parameter, it will pass a reference to a temporary string, created on-the-fly to hold the characters of the literal. If you then save that reference, it will be invalid once the temporary string is destroyed. To be safe, you must save a copy, not the reference. The problem stems from the fact that string literals are of type const char[N] and require conversion to std::string.

The code below illustrates the pitfall and the workaround, along with a minor efficiency option -- overloading with a const char* method, as described at Is there a way to pass a string literal as reference in C++.

(Note: Sutter & Stroustrup advise that if you keep a copy of the string, also provide an overloaded function with a && parameter and std::move() it.)

#include <string>
#include <iostream>
class WidgetBadRef {
public:
    WidgetBadRef(const std::string& s) : myStrRef(s)  // copy the reference...
    {}

    const std::string& myStrRef;    // might be a reference to a temporary (oops!)
};

class WidgetSafeCopy {
public:
    WidgetSafeCopy(const std::string& s) : myStrCopy(s)
            // constructor for string references; copy the string
    {std::cout << "const std::string& constructor\n";}

    WidgetSafeCopy(const char* cs) : myStrCopy(cs)
            // constructor for string literals (and char arrays);
            // for minor efficiency only;
            // create the std::string directly from the chars
    {std::cout << "const char * constructor\n";}

    const std::string myStrCopy;    // save a copy, not a reference!
};

int main() {
    WidgetBadRef w1("First string");
    WidgetSafeCopy w2("Second string"); // uses the const char* constructor, no temp string
    WidgetSafeCopy w3(w2.myStrCopy);    // uses the String reference constructor
    std::cout << w1.myStrRef << "\n";   // undefined behavior: dangling reference (garbage out)
    std::cout << w2.myStrCopy << "\n";  // OK
    std::cout << w3.myStrCopy << "\n";  // OK
}

OUTPUT:

const char * constructor
const std::string& constructor

Second string
Second string
Ton van den Heuvel
13#
Ton van den Heuvel Reply to 2017-10-12 15:41:52Z

There is no silver bullet. Like always, it depends on your use case.

In my case, I tend to use value parameters where I have a function that takes so-called sink arguments. The value of a sink argument is copied in the function body. You pass by value in this case so that you can move construct or move assign from the passed argument. See also: Should I always move on `sink` constructor or setter arguments?.

In other cases, you can always come up with a scenario where having a const reference parameter is more efficient than having a value parameter, in particular when the argument to the function is an lvalue with expensive copy semantics. Passing an rvalue to a const reference is never bad; the temporary simply lives until the end of the full expression containing the call, with the drawback that you cannot safely assume that the const reference is still valid after the function call (so do not copy the reference!).
