From: cline@parashift.com
Sender: cline@parashift.com
Newsgroups: comp.lang.c++,comp.answers,news.answers,alt.comp.lang.learn.c-c++
Subject: C++ FAQ (part 10 of 12)
Summary: Please read this before posting to comp.lang.c++
Followup-To: comp.lang.c++
Reply-To: cline@parashift.com (Marshall Cline)
Distribution: world
Approved: news-answers-request@mit.edu
Expires: +1 month
Lines: 1049
Message-ID: <uKfB8.11127$5P.1715488907@newssvr12.news.prodigy.com>
NNTP-Posting-Host: 66.140.56.188
X-Complaints-To: abuse@prodigy.net
X-Trace: newssvr12.news.prodigy.com 1020626970 ST000 66.140.56.188 (Sun, 05 May 2002 15:29:30 EDT)
NNTP-Posting-Date: Sun, 05 May 2002 15:29:30 EDT
Organization: Prodigy Internet http://www.prodigy.com
Date: Sun, 05 May 2002 19:29:30 GMT
Xref: senator-bedfellow.mit.edu comp.lang.c++:643629 comp.answers:49835 news.answers:229580 alt.comp.lang.learn.c-c++:121856

Archive-name: C++-faq/part10
Posting-Frequency: monthly
Last-modified: May 3, 2002
URL: http://www.parashift.com/c++-faq-lite/

AUTHOR: Marshall Cline / cline@parashift.com / 972-931-9470

COPYRIGHT: This posting is part of "C++ FAQ Lite."  The entire "C++ FAQ Lite"
document is Copyright(C)1991-2002 Marshall Cline, Ph.D., cline@parashift.com.
All rights reserved.  Copying is permitted only under designated situations.
For details, see section [1].

NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS.  THE AUTHOR PROVIDES NO
WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING
WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR
PURPOSE.

C++-FAQ-Lite != C++-FAQ-Book: This document, C++ FAQ Lite, is not the same as
the C++ FAQ Book.  The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is 500%
larger than this document, and is available in bookstores.  For details, see
section [3].

==============================================================================

SECTION [26]: Coding standards


[26.1] What are some good C++ coding standards?

Thank you for reading this answer rather than just trying to set your own
coding standards.

But beware that some people on comp.lang.c++ are very sensitive on this issue.
Nearly every software engineer has, at some point, been exploited by someone
who used coding standards as a "power play." Furthermore, some attempts to set
C++ coding standards have been made by those who didn't know what they were
talking about, so the standards end up being based on what was the
state-of-the-art when the standards setters were writing code.  Such
impositions generate an attitude of mistrust for coding standards.

Obviously anyone who asks this question wants to be trained so they don't run
off on their own ignorance, but nonetheless posting a question such as this one
to comp.lang.c++ tends to generate more heat than light.

==============================================================================

[26.2] Are coding standards necessary? Are they sufficient?

Coding standards do not make non-OO programmers into OO programmers; only
training and experience do that.  If coding standards have merit, it is that
they discourage the petty fragmentation that occurs when large organizations
coordinate the activities of diverse groups of programmers.

But you really want more than a coding standard.  The structure provided by
coding standards gives neophytes one less degree of freedom to worry about,
which is good.  However, pragmatic guidelines should go well beyond
pretty-printing standards.  Organizations need a consistent philosophy of
design and implementation.  E.g., strong or weak typing? references or pointers
in interfaces? stream I/O or stdio? should C++ code call C code? vice versa?
how should ABCs[22.3] be used? should inheritance be used as an implementation
technique or as a specification technique? what testing strategy should be
employed? inspection strategy? should interfaces uniformly have a get() and/or
set() member function for each data member? should interfaces be designed from
the outside-in or the inside-out? should errors be handled by try/catch/throw
or by return codes? etc.

What is needed is a "pseudo standard" for detailed design. I recommend a
three-pronged approach to achieving this standardization: training,
mentoring[27.1], and libraries.  Training provides "intense instruction,"
mentoring allows OO to be caught rather than just taught, and high quality C++
class libraries provide "long term instruction." There is a thriving commercial
market for all three kinds of "training." Advice by organizations who have been
through the mill is consistent: Buy, Don't Build. Buy libraries, buy training,
buy tools, buy consulting.  Companies who have attempted to become a
self-taught tool-shop as well as an application/system shop have found success
difficult.

Few argue that coding standards are "ideal," or even "good"; however, they are
necessary in the kind of organizations/situations described above.

The following FAQs provide some basic guidance in conventions and styles.

==============================================================================

[26.3] Should our organization determine coding standards from our C
       experience?

No!

No matter how vast your C experience, no matter how advanced your C expertise,
being a good C programmer does not make you a good C++ programmer.  Converting
from C to C++ is more than just learning the syntax and semantics of the ++
part of C++.  Organizations who want the promise of OO, but who fail to put the
"OO" into "OO programming", are fooling themselves; the balance sheet will show
their folly.

C++ coding standards should be tempered by C++ experts.  Asking comp.lang.c++
is a start.  Seek out experts who can help guide you away from pitfalls.  Get
training.  Buy libraries and see if "good" libraries pass your coding
standards.  Do not set standards by yourself unless you have considerable
experience in C++.  Having no standard is better than having a bad standard,
since improper "official" positions "harden" bad brain traces.  There is a
thriving market for both C++ training and libraries from which to pull
expertise.

One more thing: whenever something is in demand, the potential for charlatans
increases.  Look before you leap.  Also ask for student-reviews from past
companies, since not even expertise makes someone a good communicator.
Finally, select a practitioner who can teach, not a full time teacher who has a
passing knowledge of the language/paradigm.

==============================================================================

[26.4] What's the difference between <xxx> and <xxx.h> headers?

The headers in ISO Standard C++ don't have a .h suffix.  This is something the
standards committee changed from former practice.  The details are different
between headers that existed in C and those that are specific to C++.

The C++ standard library is guaranteed to have 18 standard headers from the C
language.  These headers come in two standard flavors, <cxxx> and <xxx.h>
(where xxx is the basename of the header, such as stdio, stdlib, etc).  These
two flavors are identical except the <cxxx> versions provide their declarations
in the std namespace only, and the <xxx.h> versions make them available both in
the std namespace and in the global namespace.  The committee did it this way
so that existing C code could continue to be compiled in C++.  However, the
<xxx.h> versions are deprecated, meaning they are standard now but might not be
part of the standard in future revisions.  (See clause D.5 of the ISO C++
standard[6.12].)

The C++ standard library is also guaranteed to have 32 additional standard
headers that have no direct counterparts in C, such as <iostream>, <string>,
and <new>.  You may see things like #include <iostream.h> and so on in old
code, and some compiler vendors offer .h versions for that reason.  But be
careful: the .h versions, if available, may differ from the standard versions.
And if you compile some units of a program with, for example, <iostream> and
others with <iostream.h>, the program may not work.

For new projects, use only the <xxx> headers, not the <xxx.h> headers.
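
As a small illustration of the <cxxx> flavor described above, the sketch below
includes two C-library headers under their standard C++ names and qualifies
every C function with std::.  The function name describeLength is made up for
this example, and std::snprintf assumes a C++11 or later library.

```cpp
#include <cstdio>    // C library header, standard C++ spelling
#include <cstring>   // likewise; its names are declared in namespace std
#include <string>    // C++-only header (not the same thing as the C header <string.h>)

// Hypothetical helper for illustration: reports the length of a C string.
std::string describeLength(const char* text)
{
  char digits[32];
  // std::snprintf requires C++11 or later.
  std::snprintf(digits, sizeof digits, "%lu",
                static_cast<unsigned long>(std::strlen(text)));
  return std::string(text) + " has " + digits + " characters";
}
```

Calling describeLength("abc") yields the string "abc has 3 characters".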

When modifying or extending existing code that uses the old header names, you
should probably follow the practice in that code unless there's some important
reason to switch to the standard headers (such as a facility available in
standard <iostream> that was not available in the vendor's <iostream.h>).  If
you need to standardize existing code, make sure to change all C++ headers in
all program units including external libraries that get linked in to the final
executable.

All of this affects the standard headers only.  You're free to name your own
headers anything you like; see [26.8].

==============================================================================

[26.5] Is the ?: operator evil since it can be used to create unreadable code?

No, but as always, remember that readability is one of the most important
things.

Some people feel the ?: ternary operator should be avoided because they find it
confusing at times compared to the good old if statement.  In many cases ?:
tends to make your code more difficult to read (and therefore you should
replace those usages of ?: with if statements), but there are times when the ?:
operator is clearer since it can emphasize what's really happening, rather than
the fact that there's an if in there somewhere.

Let's start with a really simple case.  Suppose you need to print the result of
a function call.  In that case you should put the real goal (printing) at the
beginning of the line, and bury the function call within the line since it's
relatively incidental (this left-right thing is based on the intuitive notion
that most developers think the first thing on a line is the most important
thing):

 // Preferred (emphasizes the major goal -- printing):
 std::cout << funct();

 // Not as good (emphasizes the minor goal -- a function call):
 functAndPrintOn(std::cout);

Now let's extend this idea to the ?: operator.  Suppose your real goal is to
print something, but you need to do some incidental decision logic to figure
out what should be printed.  Since the printing is the most important thing
conceptually, we prefer to put it first on the line, and we prefer to bury the
incidental decision logic.  In the example code below, variable n represents
the number of senders of a message; the message itself is being printed to
std::cout:

 int n = /*...*/;   // number of senders

 // Preferred (emphasizes the major goal -- printing):
 std::cout << "Please get back to " << (n==1 ? "me" : "us") << " soon!\n";

 // Not as good (emphasizes the minor goal -- a decision):
 std::cout << "Please get back to ";
 if (n == 1)
   std::cout << "me";
 else
   std::cout << "us";
 std::cout << " soon!\n";
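
For readers who want to try the preferred form in isolation, here is a minimal
sketch that wraps it in a function writing to a std::ostringstream rather than
std::cout, so the result can be inspected.  The function name replyRequest is
an assumption made for this example only.

```cpp
#include <sstream>
#include <string>

// Hypothetical wrapper around the "preferred" form shown above.
std::string replyRequest(int n)   // n is the number of senders
{
  std::ostringstream out;
  // The output statement comes first; the decision is buried inside it.
  out << "Please get back to " << (n == 1 ? "me" : "us") << " soon!\n";
  return out.str();
}
```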

All that being said, you can get pretty outrageous and unreadable code ("write
only code") using various combinations of ?:, &&, ||, etc.  For example,

 // Preferred (obvious meaning):
 if (f())
   g();

 // Not as good (harder to understand):
 f() && g();

Personally I think the explicit if example is clearer since it emphasizes the
major thing that's going on (a decision based on the result of calling f())
rather than the minor thing (calling f()).  In other words, the use of if here
is good for precisely the same reason that it was bad above: we want to major
on the majors and minor on the minors.

In any event, don't forget that readability is the goal (at least it's one of
the goals).  Your goal should not be to avoid certain syntactic constructs such
as ?: or && or || or if -- or even goto.  If you sink to the level of a
"Standards Bigot," you'll ultimately embarrass yourself since there are always
counterexamples to any syntax-based rule.  If on the other hand you emphasize
broad goals and guidelines (e.g., "major on the majors," or "put the most
important thing first on the line," or even "make sure your code is obvious and
readable"), you're usually much better off.

Code must be written to be read, not by the compiler, but by another human
being.

==============================================================================

[26.6] Should I declare locals in the middle of a function or at the top?

Declare near first use.

An object is initialized (constructed) the moment it is declared.  If you don't
have enough information to initialize an object until half way down the
function, you should create it half way down the function when it can be
initialized correctly.  Don't initialize it to an "empty" value at the top then
"assign" it later.  The reason for this is runtime performance.  Building an
object correctly is faster than building it incorrectly and remodeling it
later.  Simple examples show a 350% speed hit for simple classes like String.
Your mileage may vary; surely the overall system degradation will be less than
350%, but there will be degradation.  Unnecessary degradation.
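
The difference can be made visible by counting operations.  The sketch below is
an illustration only: the class Name, the Tally counters, and both function
names are hypothetical, and it counts constructor and assignment calls rather
than measuring time.

```cpp
#include <string>

// Hypothetical instrumentation: counts how Name objects get built.
struct Tally { int defaultCtor; int valueCtor; int assign; };
Tally tally = {0, 0, 0};

class Name {
public:
  Name() : text_() { ++tally.defaultCtor; }
  explicit Name(const std::string& text) : text_(text) { ++tally.valueCtor; }
  Name& operator= (const Name& rhs)
  {
    text_ = rhs.text_;
    ++tally.assign;
    return *this;
  }
  const std::string& text() const { return text_; }
private:
  std::string text_;
};

// C habit: declare at the top with an "empty" value, assign later.
std::string declaredAtTop(const std::string& input)
{
  Name n;                             // built "empty"
  std::string decorated = input + "!";
  n = Name(decorated);                // a temporary is built, then assigned
  return n.text();
}

// Preferred: declare where the initial value is known.
std::string declaredNearFirstUse(const std::string& input)
{
  std::string decorated = input + "!";
  Name n(decorated);                  // built correctly, once
  return n.text();
}
```

Both functions return the same string, but the first performs a default
construction, a second construction, and an assignment, where the second
performs a single construction.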

A common retort to the above is: "we'll provide set() member functions for
every datum in our objects so the cost of construction will be spread out."
This is worse than the performance overhead, since now you're introducing a
maintenance nightmare.  Providing a set() member function for every datum is
tantamount to public data: you've exposed your implementation technique to the
world.  The only thing you've hidden is the physical names of your member
objects, but the fact that you're using a List and a String and a float, for
example, is open for all to see.

Bottom line: Locals should be declared near their first use.  Sorry that this
isn't familiar to C experts, but new doesn't necessarily mean bad.

==============================================================================

[26.7] What source-file-name convention is best? foo.cpp? foo.C? foo.cc?

If you already have a convention, use it.  If not, consult your compiler to see
what the compiler expects.  Typical answers are: .cpp, .C, .cc, or .cxx
(naturally the .C extension assumes a case-sensitive file system to distinguish
.C from .c).

We've often used .cpp for our C++ source files, and we have also used .C.  In
the latter case, when porting to case-insensitive file systems you need to tell
the compiler to treat .c files as if they were C++ source files (e.g., -Tdp for
IBM CSet++, -cpp for Zortech C++, -P for Borland C++, etc.).

The point is that none of these filename extensions are uniformly superior to
the others.  We generally use whichever technique is preferred by our customer
(again, these issues are dominated by business considerations, not by technical
considerations).

==============================================================================

[26.8] What header-file-name convention is best? foo.H? foo.hh? foo.hpp?

If you already have a convention, use it.  If not, and if you don't need your
editor to distinguish between C and C++ files, simply use .h.  Otherwise use
whatever the editor wants, such as .H, .hh, or .hpp.

We've tended to use either .hpp or .h for our C++ header files.

==============================================================================

[26.9] Are there any lint-like guidelines for C++?

Yes, there are some practices which are generally considered dangerous.
However, none of these are universally "bad," since situations arise when even
the worst of these is needed:
 * A class Fred's assignment operator should return *this as a Fred& (allows
   chaining of assignments)
 * A class with any virtual[20] functions ought to have a virtual
   destructor[20.5]
 * A class with any of {destructor, assignment operator, copy constructor}
   generally needs all 3
 * A class Fred's copy constructor and assignment operator should have const in
   the parameter: respectively Fred::Fred(const Fred&) and
   Fred& Fred::operator= (const Fred&)
 * When initializing an object's member objects in the constructor, always use
   initialization lists rather than assignment.  The performance difference for
   user-defined classes can be substantial (3x!)
 * Assignment operators should make sure that self assignment[12.1] does
   nothing, otherwise you may have a disaster[12.2].  In some cases, this may
   require you to add an explicit test to your assignment operators[12.3].
 * In classes that define both += and +, a += b and a = a + b should generally
   do the same thing; ditto for the other identities of built-in/intrinsic
   types (e.g., a += 1 and ++a; p[i] and *(p+i); etc).  This can be enforced by
   writing the binary operations using the op= forms.  E.g.,

    Fred operator+ (const Fred& a, const Fred& b)
    {
      Fred ans = a;
      ans += b;
      return ans;
    }

   This way the "constructive" binary operators don't even need to be
   friends[14].  But it is sometimes possible to more efficiently implement
   common operations (e.g., if class Fred is actually std::string, and += has
   to reallocate/copy string memory, it may be better to know the eventual
   length from the beginning).
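
Several of the guidelines above can be seen together in one small class.  The
following is a minimal sketch, not a recommended string class: this Fred owns
a heap buffer purely so that the copy constructor, assignment operator, and
destructor all have real work to do, and the member names len_, data_, and
c_str() are assumptions made for the example.  It has no virtual functions, so
the virtual-destructor guideline doesn't come into play here.

```cpp
#include <cstddef>
#include <cstring>

class Fred {
public:
  explicit Fred(const char* s = "")
    : len_(std::strlen(s)), data_(new char[len_ + 1])   // initialization list
  { std::memcpy(data_, s, len_ + 1); }

  Fred(const Fred& f)                                   // const in the parameter
    : len_(f.len_), data_(new char[f.len_ + 1])
  { std::memcpy(data_, f.data_, len_ + 1); }

  Fred& operator= (const Fred& f)                       // const in the parameter
  {
    if (this != &f) {                                   // self assignment does nothing
      char* copy = new char[f.len_ + 1];
      std::memcpy(copy, f.data_, f.len_ + 1);
      delete[] data_;
      data_ = copy;
      len_ = f.len_;
    }
    return *this;                                       // allows chaining
  }

  ~Fred() { delete[] data_; }

  Fred& operator+= (const Fred& f)
  {
    // Build the new buffer before releasing the old one, so a += a is safe.
    char* joined = new char[len_ + f.len_ + 1];
    std::memcpy(joined, data_, len_);
    std::memcpy(joined + len_, f.data_, f.len_ + 1);
    delete[] data_;
    data_ = joined;
    len_ += f.len_;
    return *this;
  }

  const char* c_str() const { return data_; }

private:
  std::size_t len_;   // declared before data_, so it is initialized first
  char* data_;
};

// Non-friend operator+ written in terms of +=, as in the guideline above.
Fred operator+ (const Fred& a, const Fred& b)
{
  Fred ans = a;
  ans += b;
  return ans;
}
```

With this sketch, Fred("ab") + Fred("cd") holds "abcd", and a = b = c chains
as it would for built-in types.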

==============================================================================

[26.10] Why do people worry so much about pointer casts and/or reference casts?

Because they're evil! (Use them sparingly and with great care.)

For some reason, programmers are sloppy in their use of pointer casts.  They
cast this to that all over the place, then they wonder why things don't quite
work right.  Here's the worst thing: when the compiler gives them an error
message, they add a cast to "shut the compiler up," then they "test it" to see
if it seems to work.  If you have a lot of pointer casts or reference casts,
read on.

The compiler will often be silent when you're doing pointer-casts and/or
reference casts.  Pointer-casts (and reference-casts) tend to shut the compiler
up.  I think of them as a filter on error messages: the compiler wants to
complain because it sees you're doing something stupid, but it also sees that
it's not allowed to complain due to your pointer-cast, so it drops the error
message into the bit-bucket.  It's like putting duct tape on the compiler's
mouth: it's trying to tell you something important, but you've intentionally
shut it up.

A pointer-cast says to the compiler, "Stop thinking and start generating code;
I'm smart, you're dumb; I'm big, you're little; I know what I'm doing so just
pretend this is assembly language and generate the code." The compiler pretty
much blindly generates code when you start casting -- you are taking control
(and responsibility!) for the outcome.  The compiler and the language reduce
(and in some cases eliminate!) the guarantees you get as to what will happen.
You're on your own.

By way of analogy, even if it's legal to juggle chainsaws, it's stupid.  If
something goes wrong, don't bother complaining to the chainsaw manufacturer --
you did something they didn't guarantee would work.  You're on your own.

(To be completely fair, the language does give you some guarantees when you
cast, at least in a limited subset of casts.  For example, it's guaranteed to
work as you'd expect if the cast happens to be from an object-pointer (a
pointer to a piece of data, as opposed to a pointer-to-function or
pointer-to-member) to type void* and back to the same type of object-pointer.
But in a lot of cases you're on your own.)
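
The guaranteed round trip mentioned above looks roughly like this.  It is a
sketch under stated assumptions: Sensor, readId, and roundTrip are hypothetical
names, and readId stands in for the kind of C-style callback that receives its
context as a void*.

```cpp
struct Sensor { int id; };   // hypothetical type, for illustration only

// Hypothetical callback that receives its context as void*.
int readId(void* context)
{
  // Guaranteed to work: the pointer began as a Sensor*, was converted to
  // void*, and is converted back to the same object-pointer type.
  Sensor* s = static_cast<Sensor*>(context);
  return s->id;
}

int roundTrip(int id)
{
  Sensor s = { id };
  void* opaque = &s;         // object-pointer to void*: implicit, no cast needed
  return readId(opaque);
}
```

Casting that same void* to any type other than Sensor* would compile just as
quietly, which is exactly the "you're on your own" situation described above.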

==============================================================================

[26.11] Which is better: identifier names that_look_like_this or identifier
        names thatLookLikeThis?

It's a precedent thing.  If you have a Pascal or Smalltalk background,
youProbablySquashNamesTogether like this.  If you have an Ada background,
You_Probably_Use_A_Large_Number_Of_Underscores like this.  If you have a
Microsoft Windows background, you probably prefer the "Hungarian" style which
means you jkuidsPrefix vndskaIdentifiers ncqWith ksldjfTheir nmdsadType.  And
then there are the folks with a Unix C background, who abbr evthng n use vry
srt idntfr nms.  (AND THE FORTRN PRGMRS LIMIT EVRYTH TO SIX LETTRS.)

So there is no universal standard.  If your project team has a particular
coding standard for identifier names, use it.  But starting another Jihad over
this will create a lot more heat than light.  From a business perspective,
there are only two things that matter: The code should be generally readable,
and everyone on the team should use the same style.

Other than that, th difs r minr.

One more thing: don't import a coding style onto platform-specific code where
it is foreign.  For example, a coding style that seems natural while using a
Microsoft library might look bizarre and random while using a UNIX library.
Don't do it.  Allow different styles for different platforms.  (Just in case
someone out there isn't reading carefully, don't send me email about the case
of common code that is designed to be used/ported to several platforms, since
that code wouldn't be platform-specific, so the above "allow different styles"
guideline doesn't even apply.)

Okay, one more.  Really.  Don't fight the coding styles used by automatically
generated code (e.g., by tools that generate code).  Some people treat coding
standards with religious zeal, and they try to get tools to generate code in
their local style.  Forget it: if a tool generates code in a different style,
don't worry about it.  Remember money and time?!? This whole coding standard
thing was supposed to save money and time; don't turn it into a "money pit."

==============================================================================

[26.12] Are there any other sources of coding standards? [UPDATED!]

[Recently added several new links and updated numerous changed URLs thanks to
John Vorwald (in 4/02).]

Yep, there are several.

Here are a few sources that you might be able to use as starting points for
developing your organization's coding standards (in random order):
 * cdfsga.fnal.gov/computing/coding_guidelines/CodingGuidelines.html
 * www.nfra.nl/~seg/cppStdDoc.html
 * www.cs.umd.edu/users/cml/resources/cstyle
 * www.cs.rice.edu/~dwallach/CPlusPlusStyle.html
 * cpptips.hyperformix.com/conventions/cppconventions_1.html
 * www.objectmentor.com/resources/articles/naming.htm
 * www.arcticlabs.com/codingstandards/
 * www.possibility.com/cpp/CppCodingStandard.html
 * www.cs.umd.edu/users/cml/cstyle/Wildfire-C++Style.html
 * The Ellemtel coding guidelines are available at
   - www.cs.umd.edu/users/cml/cstyle/Ellemtel-rules.html
   - www.doc.ic.ac.uk/lab/cplus/c++.rules/
   - www.mgl.co.uk/people/kirit/cpprules.html

Note: I do NOT warrant or endorse these URLs and/or their contents.  They are
listed as a public service only.  I haven't checked their details, so I don't
know if they'll help you or hurt you.  Caveat emptor.

==============================================================================

[26.13] Should I use "unusual" syntax? [NEW!]

[Recently created (in 4/02).]

Only when there is a compelling reason to do so.  In other words, only when
there is no "normal" syntax that will produce the same end-result.

Software decisions should be made based on money.  Unless you're in an ivory
tower somewhere, when you do something that increases costs, increases risks,
increases time, or, in a constrained environment, increases the product's
space/speed costs, you've done something "bad." In your mind you should
translate all these things into money.

Because of this pragmatic, money-oriented view of software, programmers should
avoid non-mainstream syntax whenever there exists a "normal" syntax that would
be equivalent.  If a programmer does something obscure, other programmers are
confused; that costs money.  These other programmers will probably introduce
bugs (costs money), take longer to maintain the thing (money), have a harder
time changing it (missing market windows = money), have a harder time
optimizing it (in a constrained environment, somebody will have to spend money
for more memory, a faster CPU, and/or a bigger battery), and perhaps have angry
customers (money).  It's a risk-reward thing: using abnormal syntax carries a
risk, but when an equivalent, "normal" syntax would do the same thing, there is
no "reward" to ameliorate that risk.

For example, the techniques used in the Obfuscated C Code Contest
<http://www.ioccc.org/> are, to be polite, non-normal.  Yes, many of them are
legal, but not everything that is legal is moral.  Using strange techniques
will confuse other programmers.  Some programmers love to "show off" how far
they can push the envelope, but that puts their ego above money, and that's
unprofessional.  Frankly anybody who does that ought to be fired.  (And if you
think I'm being "mean" or "cruel," I suggest you get an attitude adjustment.
Remember this: your company hired you to help it, not to hurt it, and anybody
who puts their own personal ego-trips above their company's best interest
simply ought to be shown the door.)

As an example of non-mainstream syntax, it's not "normal" to use the ?:
operator as a statement.  (Some people don't even like it as an expression, but
everyone must admit that there are a lot of uses of ?: out there, so it is
"normal" (as an expression) whether people like it or not[26.5].) Here is an
example of using ?: as a statement:

 blah();
 blah();
 xyz() ? foo() : bar();  // should replace with if/else
 blah();
 blah();

Same goes with using || and && as if they are "if-not" and "if" statements,
respectively.  Yes, those are idioms in Perl, but C++ is not Perl and using
these as replacements for if statements (as opposed to using them as
expressions) is just not "normal" in C++.  Example:

 foo() || bar();  // should replace with if (!foo()) bar();
 foo() && bar();  // should replace with if (foo()) bar();

Here's another example that may seem to work on some compilers, but it is not
valid in standard C++ and it's certainly not normal:

 void f(const& MyClass x)  // use const MyClass& x instead
 {
   ...
 }

==============================================================================

SECTION [27]: Learning OO/C++


[27.1] What is mentoring?

It's the most important tool in learning OO.

Object-oriented thinking is caught, not just taught.  Get cozy with someone who
really knows what they're talking about, and try to get inside their head and
watch them solve problems.  Listen.  Learn by emulating.

If you're working for a company, get them to bring someone in who can act as a
mentor and guide.  We've seen gobs and gobs of money wasted by companies who
"saved money" by simply buying their employees a book ("Here's a book; read it
over the weekend; on Monday you'll be an OO developer").

==============================================================================

[27.2] Should I learn C before I learn OO/C++?

Don't bother.

If your ultimate goal is to learn OO/C++ and you don't already know C, reading
books or taking courses in C will not only waste your time, but it will teach
you a bunch of things that you'll explicitly have to un-learn when you finally
get back on track and learn OO/C++ (e.g., malloc()[16.3], printf()[15.1],
unnecessary use of switch statements[20], error-code exception handling[17],
unnecessary use of #define macros[9.3], etc.).
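
As a rough illustration of what replaces some of those C habits, the sketch
below uses a const and an inline function where C would use #define, a
std::vector where C would use malloc(), and a stream where C would use
printf().  All names in it (maxItems, square, squaresAsText) are made up for
this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

const int maxItems = 3;                      // rather than #define MAX_ITEMS 3

inline int square(int x) { return x * x; }   // rather than #define SQUARE(x) ((x)*(x))

std::string squaresAsText()
{
  std::vector<int> values;                   // rather than malloc()/free()
  for (int i = 1; i <= maxItems; ++i)
    values.push_back(square(i));

  std::ostringstream out;                    // rather than printf()/sprintf()
  for (std::vector<int>::size_type i = 0; i < values.size(); ++i)
    out << values[i] << ' ';
  return out.str();
}
```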

If you want to learn OO/C++, learn OO/C++.  Taking time out to learn C will
waste your time and confuse you.

==============================================================================

[27.3] Should I learn Smalltalk before I learn OO/C++?

Don't bother.

If your ultimate goal is to learn OO/C++ and you don't already know Smalltalk,
reading books or taking courses in Smalltalk will not only waste your time, but
it will teach you a bunch of things that you'll explicitly have to un-learn
when you finally get back on track and learn OO/C++ (e.g., dynamic
typing[28.3], non-subtyping inheritance[28.5], error-code exception
handling[17], etc.).

Knowing a "pure" OO language doesn't make the transition to OO/C++ any easier.
This is not a theory; we have trained and mentored literally thousands of
software professionals in OO.  In fact, Smalltalk experience can make it harder
for some people: they need to unlearn some rather deep notions about typing and
inheritance in addition to needing to learn new syntax and idioms.  This
unlearning process is especially painful and slow for those who cling to
Smalltalk with religious zeal ("C++ is not like Smalltalk, therefore C++ is
evil").

If you want to learn OO/C++, learn OO/C++.  Taking time out to learn Smalltalk
will waste your time and confuse you.

Note: I sit on both the ANSI C++ (X3J16) and ANSI Smalltalk (X3J20)
standardization committees[6.11].  I am not a language bigot[6.4].  I'm not
saying C++ is better or worse than Smalltalk; I'm simply saying that they are
different[28.1].

==============================================================================

[27.4] Should I buy one book, or several?

At least three.

There are three categories of insight and knowledge in OO programming using
C++.  You should get a great book from each category, not an okay book that
tries to do an okay job at everything.  The three OO/C++ programming categories
are:
 * C++ legality guides -- what you can and can't do in C++[27.6].
 * C++ morality guides -- what you should and shouldn't do in C++[27.5].
 * Programming-by-example guides -- show lots of examples, normally making
   liberal use of the C++ standard library[27.7].

Legality guides describe all language features with roughly the same level of
emphasis; morality guides focus on those language features that you will use
most often in typical programming tasks.  Legality guides tell you how to get a
given feature past the compiler; morality guides tell you whether or not to use
that feature in the first place.

Meta comments:
 * Don't trade off these categories against each other.  You shouldn't argue in
   favor of one category over the other.  They dove-tail.
 * The "legality" and "morality" categories are both required.  You must have a
   good grasp of both what can be done and what should be done.

There is a fourth category you should consider in addition to the above three:
OO Design books[27.8].  These are books that focus on how to think and design
with objects.

==============================================================================

[27.5] What are some best-of-breed C++ morality guides?

Here's my personal (subjective and selective) short-list of must-read C++
morality guides, alphabetically by author:
 * Cline, Lomow, and Girou, C++ FAQs, Second Edition, 587 pgs, Addison-Wesley,
   1999, ISBN 0-201-30983-1.  Covers around 500 topics in a FAQ-like Q&A
   format.
 * Meyers, Effective C++, Second Edition, 224 pgs, Addison-Wesley, 1998, ISBN
   0-201-92488-9.  Covers 50 topics in a short essay format.
 * Meyers, More Effective C++, 336 pgs, Addison-Wesley, 1996, ISBN
   0-201-63371-X.  Covers 35 topics in a short essay format.

Similarities: All three books are extensively illustrated with code examples.
All three are excellent, insightful, useful, gold plated books.  All three have
excellent sales records.

Differences: Cline/Lomow/Girou's examples are complete, working programs rather
than code fragments or standalone classes.  Meyers contains numerous
line-drawings that illustrate the points.

==============================================================================

[27.6] What are some best-of-breed C++ legality guides?

Here's my personal (subjective and selective) short-list of must-read C++
legality guides, alphabetically by author:
 * Lippman and Lajoie, C++ Primer, Third Edition, 1237 pgs, Addison-Wesley,
   1998, ISBN 0-201-82470-1.  Very readable/approachable.
 * Stroustrup, The C++ Programming Language, Third Edition, 911 pgs,
   Addison-Wesley, 1998, ISBN 0-201-88954-4.  Covers a lot of ground.

Similarities: Both books are excellent overviews of almost every language
feature.  I reviewed them for back-to-back issues of C++ Report, and I said
that they are both top notch, gold plated, excellent books.  Both have
excellent sales records.

Differences: If you don't know C, Lippman's book is better for you.  If you
know C and you want to cover a lot of ground quickly, Stroustrup's book is
better for you.

==============================================================================

[27.7] What are some best-of-breed C++ programming-by-example guides?

Here's my personal (subjective and selective) short-list of must-read C++
programming-by-example guides:
 * Koenig and Moo, Accelerated C++, 336 pgs, Addison-Wesley, 2000, ISBN
   0-201-70353-X.  Lots of examples using the standard C++ library.  Truly a
   programming-by-example book.
 * Musser, STL Tutorial and Reference Guide, Addison-Wesley, 2001.  Lots of
   examples showing how to use the STL portion of the standard C++ library,
   plus lots of nitty gritty detail.

==============================================================================

[27.8] Are there other OO books that are relevant to OO/C++?

Yes! Tons!

The morality[27.5], legality[27.6], and by-example[27.7] categories listed
above were for OO programming.  The areas of OO analysis and OO design are also
relevant, and have their own best-of-breed books.

There are tons and tons of good books in these other areas.  The seminal book
on OO design patterns is (in my personal, subjective and selective, opinion) a
must-read book: Gamma et al., Design Patterns, 395 pgs, Addison-Wesley, 1995,
ISBN 0-201-63361-2.  Describes "patterns" that commonly show up in good OO
designs.  You must read this book if you intend to do OO design work.

==============================================================================

SECTION [28]: Learning C++ if you already know Smalltalk


[28.1] What's the difference between C++ and Smalltalk?

Both fully support the OO paradigm.  Neither is categorically and universally
"better" than the other[6.4].  But there are differences.  The most important
differences are:
 * Static typing vs. dynamic typing[28.2]
 * Whether inheritance must be used only for subtyping[28.5]
 * Value vs. reference semantics[29]

Note: Many new C++ programmers come from a Smalltalk background.  If that's
you, this section tells you the most important things you need to know to make
your transition.  Please don't get the notion that either language is somehow
"inferior" or "bad"[6.4], or that this section is promoting one language over
the other (I am not a language bigot; I serve on both the ANSI C++ and ANSI
Smalltalk standardization committees[6.11]).  Instead, this section is designed
to help you understand (and embrace!) the differences.

==============================================================================

[28.2] What is "static typing," and how is it similar/dissimilar to Smalltalk?

Static typing means the compiler checks the type safety of every operation
statically (at compile-time), rather than generating code that checks things
at run-time.  For example, with static typing, the signature matching for
function arguments is checked at compile time, not at run-time.  An improper
match is flagged as an error by the compiler, not by the run-time system.

In OO code, the most common "typing mismatch" is invoking a member function
against an object which isn't prepared to handle the operation.  E.g., if class
Fred has member function f() but not g(), and fred is an instance of class
Fred, then fred.f() is legal and fred.g() is illegal.  C++ (statically typed)
catches the error at compile time, and Smalltalk (dynamically typed) catches
the error at run-time.  (Technically speaking, C++ is like Pascal --pseudo
statically typed-- since pointer casts and unions can be used to violate the
typing system; which reminds me: use pointer casts[26.10] and unions only as
often as you use gotos).
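
Here's a minimal sketch of the fred.f() / fred.g() example above.  Class Fred
and its members f() and g() come from the text; the return value 42 and the
helpers callF() and checkFred() are made up for illustration.

```cpp
#include <cassert>

// Class Fred from the text: it has member function f() but no g().
class Fred {
public:
  int f() const { return 42; }   // the return value 42 is arbitrary
};

// Hypothetical helper: exercises the one member function Fred really has.
int callF()
{
  Fred fred;
  // fred.g();      // illegal: uncommenting this is a compile-time error,
  //                // because class Fred declares no member named g
  return fred.f();  // legal: Fred declares f()
}

// Hypothetical self-check for this sketch.
void checkFred()
{
  assert(callF() == 42);
}
```

The point is where the mistake surfaces: the bad call never makes it into an
executable, so there is nothing left to check at run-time.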

==============================================================================

[28.3] Which is a better fit for C++: "static typing" or "dynamic typing"?

[For context, please read the previous FAQ[28.2]].

If you want to use C++ most effectively, use it as a statically typed language.

C++ is flexible enough that you can (via pointer casts, unions, and #define
macros) make it "look" like Smalltalk.  But don't.  Which reminds me: try to
avoid #define: it is evil in 4 different ways: evil#1[9.3], evil#2[35.2],
evil#3[35.3], and evil#4[35.4].

There are places where pointer casts and unions are necessary and even
wholesome, but they should be used carefully and sparingly.  A pointer cast
tells the compiler to believe you.  An incorrect pointer cast might corrupt
your heap, scribble into memory owned by other objects, call nonexistent member
functions, and cause general failures.  It's not a pretty sight[26.10].  If you
avoid these and related constructs, you can make your C++ code both safer and
faster, since anything that can be checked at compile time is something that
doesn't have to be done at run-time.

If you're interested in using a pointer cast, use the new-style pointer casts.
The most common example of these is to change old-style pointer casts such as
(X*)p into new-style dynamic casts such as dynamic_cast<X*>(p), where p is a
pointer and X is a type.  In addition to dynamic_cast, there are static_cast,
const_cast, and reinterpret_cast, but dynamic_cast is the one that simulates
most of the advantages of dynamic typing (the other is the typeid() construct;
for example, typeid(*p).name() will return an implementation-defined name for
the type of *p).
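
A small sketch of dynamic_cast and typeid() in action.  The classes Base, X,
and Y and the helper functions are hypothetical; the only requirement taken
from the language itself is that the base class must have at least one virtual
function for dynamic_cast to work on it.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical hierarchy; the virtual destructor makes Base polymorphic.
class Base {
public:
  virtual ~Base() { }
};

class X : public Base {
public:
  int value() const { return 7; }   // arbitrary value for the demo
};

class Y : public Base { };

// Returns X::value() if p really points at an X (or a class derived from X),
// otherwise returns -1.
int valueOrMinusOne(Base* p)
{
  X* x = dynamic_cast<X*>(p);   // checked at run-time; null on a mismatch
  return x ? x->value() : -1;
}

// True if the exact (most-derived) class of *p is X.  p must not be null.
bool isExactlyX(Base* p)
{
  return typeid(*p) == typeid(X);
}

// Hypothetical self-check for this sketch.
void checkCasts()
{
  X x;
  Y y;
  assert(valueOrMinusOne(&x) == 7);
  assert(valueOrMinusOne(&y) == -1);
  assert(isExactlyX(&x) && !isExactlyX(&y));
}
```

Unlike the old-style (X*)p, the dynamic_cast version gives you a null pointer
you can test, rather than a pointer that merely claims to be an X*.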

==============================================================================

[28.4] How do you use inheritance in C++, and is that different from Smalltalk?

Some people believe that the purpose of inheritance is code reuse.  In C++,
this is wrong.  Stated plainly, "inheritance is not for code reuse."

The purpose of inheritance in C++ is to express interface compliance
(subtyping), not to get code reuse.  In C++, code reuse usually comes via
composition rather than via inheritance.  In other words, inheritance is mainly
a specification technique rather than an implementation technique.

This is a major difference from Smalltalk, where there is only one form of
inheritance (C++ provides private inheritance to mean "share the code but don't
conform to the interface", and public inheritance to mean "kind-of").  The
Smalltalk language proper (as opposed to coding practice) allows you to have
the effect of "hiding" an inherited method by providing an override that calls
the "does not understand" method.  Furthermore Smalltalk allows a conceptual
"is-a" relationship to exist apart from the inheritance hierarchy (subtypes
don't have to be derived classes; e.g., you can make something that is-a Stack
yet doesn't inherit from class Stack).

In contrast, C++ is more restrictive about inheritance: there's no way to make
a "conceptual is-a" relationship without using inheritance (the C++ work-around
is to separate interface from implementation via ABCs[22.3]).  The C++ compiler
exploits the added semantic information associated with public inheritance to
provide static typing.
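
To make the two forms of C++ inheritance concrete, here's a sketch in which
one class uses both.  IntList, Stack, ListStack, and pushThenPop() are all
hypothetical names; a plain IntList member object (composition) would serve
just as well as the private base class shown here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper class that owns some reusable code.
class IntList {
public:
  void append(int x) { data_.push_back(x); }
  void removeLast()  { data_.pop_back(); }
  int  last() const  { return data_.back(); }
  bool empty() const { return data_.empty(); }
private:
  std::vector<int> data_;
};

// Interface only (an ABC).  Public inheritance from this means "kind-of".
class Stack {
public:
  virtual ~Stack() { }
  virtual void push(int x) = 0;
  virtual int  pop() = 0;
  virtual bool empty() const = 0;
};

// Public base:  ListStack is a kind-of Stack (interface compliance).
// Private base: ListStack shares IntList's code but does not conform to
//               IntList's interface.
class ListStack : public Stack, private IntList {
public:
  virtual void push(int x)   { append(x); }
  virtual int  pop()         { int x = last(); removeLast(); return x; }
  virtual bool empty() const { return IntList::empty(); }
};

// Accepts anything that is a kind-of Stack.
int pushThenPop(Stack& s, int x)
{
  s.push(x);
  return s.pop();
}

// Hypothetical self-check for this sketch.
void checkListStack()
{
  ListStack s;
  assert(s.empty());
  assert(pushThenPop(s, 5) == 5);
  // IntList& list = s;   // would not compile: IntList is a private base
}
```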

==============================================================================

[28.5] What are the practical consequences of differences in Smalltalk/C++
       inheritance?

[For context, please read the previous FAQ[28.4]].

Smalltalk lets you make a subtype that isn't a derived class, and allows you to
make a derived class that isn't a subtype.  This allows Smalltalk programmers
to be very carefree in putting data (bits, representation, data structure) into
a class (e.g., you might put a linked list into class Stack).  After all, if
someone wants an array-based-Stack, they don't have to inherit from Stack; they
could inherit such a class from Array if desired, even though an
ArrayBasedStack is not a kind-of Array!

In C++, you can't be nearly as carefree.  Only mechanism (member function
code), not representation (data bits), can be overridden in derived classes.
Therefore you're usually better off not putting the data structure in a class.
This leads to a stronger reliance on abstract base classes[22.3].
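
A sketch of what that reliance on ABCs looks like, assuming a hypothetical
Stack of ints.  The base class carries no data bits at all, so each derived
class is free to pick its own representation (the names LinkedStack,
ArrayBasedStack, and sumAndDrain() are made up for illustration):

```cpp
#include <cassert>
#include <list>
#include <vector>

// ABC: declares mechanism only and holds no representation.
class Stack {
public:
  virtual ~Stack() { }
  virtual void push(int x) = 0;
  virtual int  pop() = 0;
  virtual bool empty() const = 0;
};

// One representation: a linked list.
class LinkedStack : public Stack {
public:
  virtual void push(int x)   { items_.push_front(x); }
  virtual int  pop()         { int x = items_.front(); items_.pop_front(); return x; }
  virtual bool empty() const { return items_.empty(); }
private:
  std::list<int> items_;
};

// Another representation: a contiguous array.
class ArrayBasedStack : public Stack {
public:
  virtual void push(int x)   { items_.push_back(x); }
  virtual int  pop()         { int x = items_.back(); items_.pop_back(); return x; }
  virtual bool empty() const { return items_.empty(); }
private:
  std::vector<int> items_;
};

// Works with either representation, since it sees only the ABC.
int sumAndDrain(Stack& s)
{
  int total = 0;
  while (!s.empty())
    total += s.pop();
  return total;
}

// Hypothetical self-check for this sketch.
void checkStacks()
{
  LinkedStack a;
  ArrayBasedStack b;
  for (int i = 1; i <= 3; ++i) { a.push(i); b.push(i); }
  assert(sumAndDrain(a) == 6);
  assert(sumAndDrain(b) == 6);
}
```

Had Stack itself contained the linked list, ArrayBasedStack would have been
stuck carrying a list it never uses.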

I like to think of the difference between an ATV and a Maserati.  An ATV (all
terrain vehicle) is more fun, since you can "play around" by driving through
fields, streams, sidewalks, and the like.  A Maserati, on the other hand, gets
you there faster, but it forces you to stay on the road.  My advice to C++
programmers is simple: stay on the road.  Even if you're one of those people
who like the "expressive freedom" to drive through the bushes, don't do it in
C++; it's not a good fit.

==============================================================================

SECTION [29]: Reference and value semantics


[29.1] What is value and/or reference semantics, and which is best in C++?

With reference semantics, assignment is a pointer-copy (i.e., a reference).
Value (or "copy") semantics mean assignment copies the value, not just the
pointer.  C++ gives you the choice: use the assignment operator to copy the
value (copy/value semantics), or use a pointer-copy to copy a pointer
(reference semantics).  C++ allows you to overload the assignment operator to
do anything your heart desires; however, the default (and most common) choice
is to copy the value.
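
The two kinds of assignment can be sketched side by side.  Point and the
helper functions are hypothetical, and the default (compiler-supplied)
assignment operator is assumed:

```cpp
#include <cassert>

struct Point { int x; int y; };   // hypothetical value type

// Value semantics: assignment copies the object, so the copy is independent.
int originalAfterValueCopy()
{
  Point a = { 1, 2 };
  Point b = { 0, 0 };
  b = a;          // copies the value
  b.x = 99;       // changes b only
  return a.x;     // a.x is still 1
}

// Reference semantics: assigning a pointer copies only the pointer, so both
// pointers refer to the same object.
int originalAfterPointerCopy()
{
  Point a = { 1, 2 };
  Point* p = &a;
  Point* q = 0;
  q = p;          // copies the pointer, not the Point
  q->x = 99;      // changes a, the one shared object
  return a.x;     // a.x is now 99
}

// Hypothetical self-check for this sketch.
void checkSemantics()
{
  assert(originalAfterValueCopy() == 1);
  assert(originalAfterPointerCopy() == 99);
}
```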

Pros of reference semantics: flexibility and dynamic binding (you get dynamic
binding in C++ only when you pass by pointer or pass by reference, not when you
pass by value).

Pros of value semantics: speed.  "Speed" seems like an odd benefit for a
feature that requires an object (vs. a pointer) to be copied, but the fact of
the matter is that one usually accesses an object more than one copies the
object, so the cost of the occasional copies is (usually) more than offset by
the benefit of having an actual object rather than a pointer to an object.

There are three cases when you have an actual object as opposed to a pointer to
an object: local objects, global/static objects, and fully contained member
objects in a class.  The most important of these is the last ("composition").

More info about copy-vs-reference semantics is given in the next FAQs.  Please
read them all to get a balanced perspective.  The first few have intentionally
been slanted toward value semantics, so if you only read the first few of the
following FAQs, you'll get a warped perspective.

Assignment has other issues (e.g., shallow vs. deep copy) which are not covered
here.

==============================================================================

[29.2] What is "virtual data," and how-can / why-would I use it in C++?

virtual data allows a derived class to change the exact class of a base class's
member object.  virtual data isn't strictly "supported" by C++, however it can
be simulated in C++.  It ain't pretty, but it works.

To simulate virtual data in C++, the base class must have a pointer to the
member object, and the derived class must provide a new object to be pointed to
by the base class's pointer.  The base class would also have one or more normal
constructors that provide their own referent (again via new), and the base
class's destructor would delete the referent.

For example, class Stack might have an Array member object (using a pointer),
and derived class StretchableStack might override the base class member data
from Array to StretchableArray.  For this to work, StretchableArray would have
to inherit from Array, so Stack would have an Array*.  Stack's normal
constructors would initialize this Array* with a new Array, but Stack would
also have a (possibly protected) constructor that would accept an Array* from a
derived class.  StretchableStack's constructor would provide a
new StretchableArray to this special constructor.
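
Here's one way the arrangement just described might look.  The class names
come from the text; everything else (the int elements, the initial capacity
of 2, the store()/load() members, the bool return from push()) is an
assumption made to keep the sketch small.

```cpp
#include <cassert>
#include <vector>

class Array {
public:
  explicit Array(int capacity) : data_(capacity) { }
  virtual ~Array() { }
  int capacity() const { return static_cast<int>(data_.size()); }
  // Fixed size: refuses (returns false) when index is out of range.
  virtual bool store(int index, int value)
  {
    if (index < 0 || index >= capacity()) return false;
    data_[index] = value;
    return true;
  }
  int load(int index) const { return data_[index]; }
protected:
  std::vector<int> data_;
};

class StretchableArray : public Array {
public:
  explicit StretchableArray(int capacity) : Array(capacity) { }
  // Grows on demand instead of refusing.
  virtual bool store(int index, int value)
  {
    if (index < 0) return false;
    if (index >= capacity()) data_.resize(index + 1);
    data_[index] = value;
    return true;
  }
};

class Stack {
public:
  Stack() : array_(new Array(2)), size_(0) { }   // provides its own referent
  virtual ~Stack() { delete array_; }            // base class deletes it
  bool push(int x)
  {
    if (!array_->store(size_, x)) return false;  // extra indirection + virtual call
    ++size_;
    return true;
  }
  int pop()        { return array_->load(--size_); }
  int size() const { return size_; }
protected:
  // Special constructor: a derived class supplies the object to point at.
  explicit Stack(Array* array) : array_(array), size_(0) { }
private:
  Array* array_;
  int    size_;
  Stack(const Stack&);              // copying disabled to keep the sketch short
  Stack& operator=(const Stack&);
};

class StretchableStack : public Stack {
public:
  StretchableStack() : Stack(new StretchableArray(2)) { }
};

// Hypothetical self-check for this sketch.
void checkVirtualData()
{
  Stack s;
  assert(s.push(1) && s.push(2) && !s.push(3));  // fixed Array is full
  StretchableStack t;
  assert(t.push(1) && t.push(2) && t.push(3));   // StretchableArray grows
  assert(t.pop() == 3);
}
```

Notice that StretchableStack contains almost no code of its own; all of the
behavioral difference lives in the object the base class points to.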

Pros:
 * Easier implementation of StretchableStack (most of the code is inherited)
 * Users can pass a StretchableStack as a kind-of Stack

Cons:
 * Adds an extra layer of indirection to access the Array
 * Adds some extra freestore allocation overhead (both new and delete)
 * Adds some extra dynamic binding overhead (reason given in next FAQ)

In other words, we succeeded at making our job easier as the implementer of
StretchableStack, but all our users pay for it[29.5].  Unfortunately the extra
overhead was imposed on both users of StretchableStack and on users of Stack.

Please read the rest of this section.  (You will not get a balanced perspective
without the others.)

==============================================================================

[29.3] What's the difference between virtual data and dynamic data?

The easiest way to see the distinction is by an analogy with virtual
functions[20]: A virtual member function means the declaration (signature) must
stay the same in derived classes, but the definition (body) can be overridden.
The overriddenness of an inherited member function is a static property of the
derived class; it doesn't change dynamically throughout the life of any
particular object, nor is it possible for distinct objects of the derived class
to have distinct definitions of the member function.

Now go back and re-read the previous paragraph, but make these substitutions:
 * "member function" --> "member object"
 * "signature" --> "type"
 * "body" --> "exact class"

After this, you'll have a working definition of virtual data.

Another way to look at this is to distinguish "per-object" member functions
from "dynamic" member functions.  A "per-object" member function is a member
function that is potentially different in any given instance of an object, and
could be implemented by burying a function pointer in the object; this pointer
could be const, since the pointer will never be changed throughout the object's
life.  A "dynamic" member function is a member function that will change
dynamically over time; this could also be implemented by a function pointer,
but the function pointer would not be const.

Extending the analogy, this gives us three distinct concepts for data members:
 * virtual data: the definition (class) of the member object is overridable in
   derived classes provided its declaration ("type") remains the same, and this
   overriddenness is a static property of the derived class
 * per-object-data: any given object of a class can instantiate a different
   conformal (same type) member object upon initialization (usually a "wrapper"
   object), and the exact class of the member object is a static property of
   the object that wraps it
 * dynamic-data: the member object's exact class can change dynamically over
   time

The reason they all look so much the same is that none of this is "supported"
in C++.  It's all merely "allowed," and in this case, the mechanism for faking
each of these is the same: a pointer to a (probably abstract) base class.  In a
language that made these "first class" abstraction mechanisms, the difference
would be more striking, since they'd each have a different syntactic variant.
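
As a rough illustration of "the mechanism is the same," the sketch below fakes
per-object data and dynamic data with the same base-class pointer; the only
difference is whether the pointer is const.  Shape, Triangle, Square, and the
two holder classes are hypothetical.

```cpp
#include <cassert>

class Shape {
public:
  virtual ~Shape() { }
  virtual int sides() const = 0;
};

class Triangle : public Shape {
public:
  virtual int sides() const { return 3; }
};

class Square : public Shape {
public:
  virtual int sides() const { return 4; }
};

// Per-object data: each holder picks the member's exact class when it is
// constructed, and the const pointer is never re-seated after that.
class FixedHolder {
public:
  explicit FixedHolder(Shape* s) : shape_(s) { }   // takes ownership
  ~FixedHolder() { delete shape_; }
  int sides() const { return shape_->sides(); }
private:
  Shape* const shape_;
  FixedHolder(const FixedHolder&);             // copying disabled for brevity
  FixedHolder& operator=(const FixedHolder&);
};

// Dynamic data: the member's exact class can change over the holder's life.
class ChangeableHolder {
public:
  explicit ChangeableHolder(Shape* s) : shape_(s) { }   // takes ownership
  ~ChangeableHolder() { delete shape_; }
  void replace(Shape* s) { delete shape_; shape_ = s; }
  int sides() const { return shape_->sides(); }
private:
  Shape* shape_;
  ChangeableHolder(const ChangeableHolder&);   // copying disabled for brevity
  ChangeableHolder& operator=(const ChangeableHolder&);
};

// Hypothetical self-check for this sketch.
void checkHolders()
{
  FixedHolder f(new Triangle);
  assert(f.sides() == 3);
  ChangeableHolder c(new Triangle);
  c.replace(new Square);
  assert(c.sides() == 4);
}
```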

==============================================================================

[29.4] Should I normally use pointers to freestore allocated objects for my
       data members, or should I use "composition"?

Composition.

Your member objects should normally be "contained" in the composite object (but
not always; "wrapper" objects are a good example of where you want a
pointer/reference; also the N-to-1-uses-a relationship needs something like a
pointer/reference).

There are three reasons why fully contained member objects ("composition") have
better performance than pointers to freestore-allocated member objects:
 * Extra layer of indirection every time you need to access the member object
 * Extra freestore allocations (new in constructor, delete in destructor)
 * Extra dynamic binding (reason given below)
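
The two layouts being compared look roughly like this.  Engine, CarByValue,
and CarByPointer are hypothetical names, and the idle speed of 800 is
arbitrary.  Engine has no virtual functions here, so only the first two costs
show up in this sketch.

```cpp
#include <cassert>

class Engine {
public:
  Engine() : rpm_(0) { }
  void start()    { rpm_ = 800; }
  int  rpm() const { return rpm_; }
private:
  int rpm_;
};

// Composition: the Engine lives inside the Car object itself.
class CarByValue {
public:
  void start()     { engine_.start(); }       // direct access, no indirection
  int  rpm() const { return engine_.rpm(); }
private:
  Engine engine_;
};

// Pointer to a freestore-allocated member object.
class CarByPointer {
public:
  CarByPointer() : engine_(new Engine) { }     // extra freestore allocation
  ~CarByPointer() { delete engine_; }          // and the matching delete
  void start()     { engine_->start(); }       // extra layer of indirection
  int  rpm() const { return engine_->rpm(); }
private:
  Engine* engine_;
  CarByPointer(const CarByPointer&);           // copying disabled for brevity
  CarByPointer& operator=(const CarByPointer&);
};

// Hypothetical self-check for this sketch.
void checkCars()
{
  CarByValue a;
  CarByPointer b;
  a.start();
  b.start();
  assert(a.rpm() == 800 && b.rpm() == 800);
}
```

Both classes behave identically from the outside; the difference is purely in
what each access and each construction costs.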

==============================================================================

[29.5] What are relative costs of the 3 performance hits associated with
       allocating member objects from the freestore?

The three performance hits are enumerated in the previous FAQ:
 * By itself, an extra layer of indirection is small potatoes
 * Freestore allocations can be a performance issue (the performance of the
   typical implementation of malloc() degrades when there are many allocations;
   OO software can easily become "freestore bound" unless you're careful)
 * The extra dynamic binding comes from having a pointer rather than an object.
   Whenever the C++ compiler can know an object's exact class, virtual[20]
   function calls can be statically bound, which allows inlining.  Inlining
   allows zillions (would you believe half a dozen :-) optimization
   opportunities such as procedural integration, register lifetime issues, etc.
   The C++ compiler can know an object's exact class in three circumstances:
   local variables, global/static variables, and fully-contained member objects

Thus fully-contained member objects allow significant optimizations that
wouldn't be possible under the "member objects-by-pointer" approach.  This is
the main reason that languages which enforce reference-semantics have
"inherent" performance challenges.

Note: Please read the next three FAQs to get a balanced perspective!

==============================================================================

[29.6] Are "inline virtual" member functions ever actually "inlined"?

Occasionally...

When the object is referenced via a pointer or a reference, a call to a
virtual[20] function cannot be inlined, since the call must be resolved
dynamically.  Reason: the compiler can't know which actual code to call until
run-time (i.e., dynamically), since the code may be from a derived class that
was created after the caller was compiled.

Therefore the only time an inline virtual call can be inlined is when the
compiler knows the "exact class" of the object which is the target of the
virtual function call.  This can happen only when the compiler has an actual
object rather than a pointer or reference to an object.  I.e., either with a
local object, a global/static object, or a fully contained object inside a
composite.
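
The two situations can be sketched as follows (Base, Derived, and the helper
functions are hypothetical).  Whether a particular compiler actually inlines
either call is up to that compiler; the sketch only shows where the exact
class is and isn't evident from the code.

```cpp
#include <cassert>

class Base {
public:
  virtual ~Base() { }
  // Defined inside the class body, so implicitly inline as well as virtual.
  virtual int id() const { return 1; }
};

class Derived : public Base {
public:
  virtual int id() const { return 2; }
};

// Exact class known: d is an actual local object of class Derived, so the
// call can be bound statically to Derived::id() and is a candidate for
// inline expansion.
int callOnLocalObject()
{
  Derived d;
  return d.id();
}

// Exact class not evident: b could refer to a Base, a Derived, or some class
// written long after this function was compiled, so the call is normally
// resolved at run-time.
int callThroughReference(const Base& b)
{
  return b.id();
}

// Hypothetical self-check for this sketch.
void checkCalls()
{
  Base b;
  Derived d;
  assert(callOnLocalObject() == 2);
  assert(callThroughReference(b) == 1);
  assert(callThroughReference(d) == 2);
}
```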

Note that the difference between inlining and non-inlining is normally much
more significant than the difference between a regular function call and a
virtual function call.  For example, the difference between a regular function
call and a virtual function call is often just two extra memory references, but
the difference between an inline function and a non-inline function can be as
much as an order of magnitude (for zillions of calls to insignificant member
functions, loss of inlining virtual functions can result in 25X speed
degradation! [Doug Lea, "Customization in C++," proc Usenix C++ 1990]).

A practical consequence of this insight: don't get bogged down in the endless
debates (or sales tactics!) of compiler/language vendors who compare the cost
of a virtual function call on their language/compiler with the same on another
language/compiler.  Such comparisons are largely meaningless when compared with
the ability of the language/compiler to "inline expand" member function calls.
I.e., many language implementation vendors make a big stink about how good
their dispatch strategy is, but if these implementations don't inline member
function calls, the overall system performance would be poor, since it is
inlining --not dispatching-- that has the greatest performance impact.

Note: Please read the next two FAQs to see the other side of this coin!

==============================================================================

[29.7] Sounds like I should never use reference semantics, right?

Wrong.

Reference semantics are A Good Thing.  We can't live without pointers.  We just
don't want our software to be One Gigantic Rat's Nest Of Pointers.  In C++, you
can pick and choose where you want reference semantics (pointers/references)
and where you'd like value semantics (where objects physically contain other
objects etc).  In a large system, there should be a balance.  However if you
implement absolutely everything as a pointer, you'll get enormous speed hits.

Objects near the problem skin are larger than higher level objects.  The
identity of these "problem space" abstractions is usually more important than
their "value." Thus reference semantics should be used for problem-space
objects.

Note that these problem space objects are normally at a higher level of
abstraction than the solution space objects, so the problem space objects
normally have a relatively lower frequency of interaction.  Therefore C++ gives
us an ideal situation: we choose reference semantics for objects that need
unique identity or that are too large to copy, and we can choose value
semantics for the others.  Thus the highest frequency objects will end up with
value semantics, since we install flexibility where it doesn't hurt us (only),
and we install performance where we need it most!

These are some of the many issues that come into play with real OO design.
OO/C++ mastery takes time and high quality training.  If you want a powerful
tool, you've got to invest.

Don't stop now! Read the next FAQ too!!

==============================================================================

[29.8] Does the poor performance of reference semantics mean I should
       pass-by-value?

Nope.

The previous FAQs were talking about member objects, not parameters.
Generally, objects that are part of an inheritance hierarchy should be passed
by reference or by pointer, not by value, since only then do you get the
(desired) dynamic binding (pass-by-value doesn't mix with inheritance, since
larger derived class objects get sliced[20.6] when passed by value as a base
class object).
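
A short sketch of slicing, using a hypothetical Animal/Dog pair.  The same Dog
object is passed to two functions that differ only in how they take their
parameter:

```cpp
#include <cassert>
#include <string>

class Animal {
public:
  virtual ~Animal() { }
  virtual std::string sound() const { return "..."; }
};

class Dog : public Animal {
public:
  virtual std::string sound() const { return "Woof"; }
};

// Pass-by-value: the parameter is a brand new Animal copied from the
// argument's Animal part.  The Dog part is sliced off.
std::string soundByValue(Animal a)
{
  return a.sound();     // always Animal::sound()
}

// Pass-by-reference: the parameter refers to the caller's actual object, so
// the virtual call is bound dynamically.
std::string soundByReference(const Animal& a)
{
  return a.sound();     // Dog::sound() when a refers to a Dog
}

// Hypothetical self-check for this sketch.
void checkSlicing()
{
  Dog d;
  assert(soundByValue(d) == "...");
  assert(soundByReference(d) == "Woof");
}
```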

Unless compelling reasons are given to the contrary, member objects should be
by value and parameters should be by reference.  The discussion in the previous
few FAQs indicates some of the "compelling reasons" for when member objects
should be by reference.

==============================================================================

