DRAFT

Permission is granted to any individual or institution to use, copy, modify, and distribute this document, provided that this complete copyright and permission notice is maintained intact in all copies.

Ellemtel Telecommunication Systems Laboratories makes no representations about the suitability of this document or the examples described herein for any purpose. It is provided "as is" without any expressed or implied warranty.

Cornell University and the CLEO collaboration make no representations about the suitability of this document or the examples described herein for any purpose. It is provided "as is" without any expressed or implied warranty.

$Id: c++std.html,v 1.10 1996/04/23 19:49:02 dsr Exp $

Introduction
Rules
Recommendations
Portability Recommendations
Style Recommendations

Introduction

The base document for the CLEO III C++ coding standard is assumed to be the ISO/ANSI C++ standard. Since that standard does not exist yet, this draft is based on the April 1995 X3J16 working paper. Not all features referenced are currently available in all (or, in some cases, any) C++ compiler; a future version of this document may indicate appropriate workarounds (where possible).

Rules

Every instance in which a rule is broken must be clearly documented. No exceptions.

"This module violates rule 1" is not allowed.

Every file that contains source code must begin with a CLEO standard header with all fields filled in.

Suitable source file skeletons can be found in /cleo/util/skeletons. A completed header should look something like this (not quite identical to what is in /cleo/util/templates--fix):

// -*- C++ -*-
//
// Package:     Tracker
// Module:	Helix
//
// Description: Encapsulates a 5-parameter helix.
//
// Usage:
//      [class and public member function
//       documentation goes here]
//
// Author:      A. Trackfinder
// Created:     Thu Jan  1 00:00:01 EST 1996
// $Id$
//
// Revision history
//
// $Log$

Every include file must be protected against multiple inclusion with a #define of the form <package>_<module>_H, e.g.
```
#if !defined(TRACKER_HELIX_H)
#define TRACKER_HELIX_H
// ...
#endif
```
All comments and names are to be written in English.
Never use ".." in #include directives.
The identifier of every globally visible class, enumeration type, type definition, function, constant, and variable in a class library should be encapsulated in the unique namespace for that library.
Do not use names that differ only by the use of uppercase and lowercase letters.
Do not use identifiers which begin with two underscores (`__') or one underscore (`_') followed by a capital letter. Do not use identifiers with global or file scope which begin with one underscore (`_').

These are all reserved to the compiler by the ANSI standard.

All classes should be declared following the CLEO standard class skeleton.

Public, protected, and private are to be declared explicitly in that order; do not use default access control. By placing the public section first, everything that is of interest to a user is gathered at the beginning of the class definition. The protected section may be of interest to designers when deriving from the class. The private section contains details that should have the least general interest.

class Helix {
   // friend classes and functions
   public:
   // constants, enums and typedefs
   // Constructors and destructor
      Helix();
      virtual ~Helix();

   // member functions
   // const member functions
   // static member functions
   protected:
   // protected member functions
   // protected const member functions
   private:
   // Constructors and destructor
      Helix(const Helix&);
   // assignment operator(s)
      const Helix& operator=( const Helix& );
   // private member functions
   // private const member functions
   // data members
   // static data members
};

Suitable skeletons can be found in /cleo/util/skeletons.

No non-empty member functions are to be defined within the class definition.

Class definitions are less compact and more difficult to read when they include definitions of member functions. It is also easier to change an inline member function to an ordinary member function if the definition of the inline function is placed outside of the class definition.

The only exception is for "trivial" functions which are guaranteed to not change, such as accessor functions to data whose representation is fixed, or forwarding functions.
All class member data should be private.

A public variable represents a violation of one of the basic principles of object-oriented programming, namely, encapsulation of data. For example, if there is a class of the type BankAccount, in which account_balance is a public variable, the value of this variable may be changed by any user of the class. However, if the variable has been declared private, its value may be changed only by the member functions of the class.

If public data is avoided, its internal representation may be changed without users of the class having to modify their code. A principle of class design is to maintain the stability of the public interface of the class. The implementation of a class should not be a concern for its users.

The use of protected variables in a class is not recommended, since these variables become visible to any derived classes. The names or types of variables in a base class may then not be changed since the derived classes may depend on them. If a derived class must access data in its base class, make a special protected interface in the base class containing functions which return private data. This solution would not imply any degradation of performance if the functions are defined inline.
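
The following is a minimal sketch of the private-data rule, using the BankAccount example above. All member names (m_balance, deposit, rawBalance) and the derived class SavingsAccount are hypothetical and chosen only for illustration.

```cpp
// Hypothetical sketch; every name except BankAccount is illustrative.
class BankAccount {
   public:
      BankAccount();
      virtual ~BankAccount();

      void deposit(double amount);
      double balance() const;

   protected:
      // Protected interface: derived classes read the data through a
      // function, so the representation can change without breaking them.
      double rawBalance() const;

   private:
      BankAccount(const BankAccount&);                   // not defined
      const BankAccount& operator=(const BankAccount&);  // not defined

      double m_balance;
};

// Defined inline, outside the class definition.
inline double BankAccount::rawBalance() const { return m_balance; }

BankAccount::BankAccount() : m_balance(0.0) {}
BankAccount::~BankAccount() {}

void BankAccount::deposit(double amount)
{
   if (amount > 0.0) {
      m_balance += amount;
   }
}

double BankAccount::balance() const { return m_balance; }

class SavingsAccount : public BankAccount {
   public:
      SavingsAccount();
      virtual ~SavingsAccount();

      double interest(double rate) const;

   private:
      SavingsAccount(const SavingsAccount&);                   // not defined
      const SavingsAccount& operator=(const SavingsAccount&);  // not defined
};

SavingsAccount::SavingsAccount() : BankAccount() {}
SavingsAccount::~SavingsAccount() {}

double SavingsAccount::interest(double rate) const
{
   return rawBalance() * rate;   // no direct access to m_balance
}
```

Users can change the balance only through deposit(), and SavingsAccount depends only on rawBalance(), not on the name or type of the data member.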
A member function that does not affect the state of an object is to be declared const.

Member functions declared as const may not modify non-mutable member data and are the only functions which may be invoked on a const object. A const declaration is excellent insurance that objects will not be modified when they should not be.
const member functions may not alter any data that changes the behavior of the object.

It is possible for an object to have member data that can be changed without changing the significant behavior of the object, such as use counters used for profiling, or caches. Such member data should be declared with the mutable keyword.
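
As an illustration, a const member function can fill a cache held in mutable data. The class Vector3 and all of its members are hypothetical; this is a sketch of the idea, not a CLEO class.

```cpp
#include <cmath>

// Hypothetical sketch: a const accessor that caches an expensive result.
// Copy constructor, assignment and destructor: the defaults are appropriate.
class Vector3 {
   public:
      Vector3(double x, double y, double z);

      double magnitude() const;
      unsigned int computations() const;

   private:
      double m_x;
      double m_y;
      double m_z;
      // The cache does not change the visible behavior of the object,
      // so it is declared mutable.
      mutable bool m_cacheValid;
      mutable double m_magnitude;
      mutable unsigned int m_computations;
};

Vector3::Vector3(double x, double y, double z)
   : m_x(x), m_y(y), m_z(z),
     m_cacheValid(false), m_magnitude(0.0), m_computations(0)
{
}

double Vector3::magnitude() const
{
   if (!m_cacheValid) {
      m_magnitude = std::sqrt(m_x * m_x + m_y * m_y + m_z * m_z);
      m_cacheValid = true;
      ++m_computations;
   }
   return m_magnitude;
}

unsigned int Vector3::computations() const { return m_computations; }
```

Calling magnitude() twice on a const Vector3 computes the square root only once; no const_cast is needed.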
Constructors and destructors must not be inline.

Since a constructor or destructor always invokes the constructors or destructors of its base classes, inlining these functions can cause hidden code bloat.
All classes should declare a copy constructor, an assignment operator from its own type, a default constructor and a destructor. Any that are not used should be declared private and not defined.

If any of these is not supplied, the compiler will generate a default version. Since the default is not always appropriate, explicit functions should be supplied, or a comment should be made indicating that the default is appropriate.

In particular, a copy constructor is recommended to avoid surprises when an object is initialized using an object of the same type. If an object manages the allocation and deallocation of an object on the heap (the managing object has a pointer to the object to be created by the class' constructor), only the value of the pointer will be copied. This can lead to two invocations of the destructor for the same object (on the heap), probably resulting in a run-time error. The same applies to the assignment operator.

Any of these that are not used should be declared private and not defined, so that the compiler or linker will complain if one of them is inadvertently used or an implicit use is generated by the compiler.
All classes which might possibly be used as base classes must define a virtual destructor.

If a class, having virtual functions but without virtual destructors, is used as a base class, there may be a surprise if pointers to the class are used. If such a pointer is assigned to an instance of a derived class and delete is then used on this pointer, only the base class' destructor will be invoked. If the program depends on the derived class' destructor being invoked, the program will fail.

In general, all classes should define a virtual destructor unless the added size of the virtual function table pointer would be unacceptable.
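
The effect can be sketched as follows. Shape, Circle and derivedDestructorRuns are hypothetical names; the flag pointer exists only so that the destructor call can be observed.

```cpp
// Hypothetical sketch: deleting a derived object through a base pointer.
class Shape {
   public:
      Shape();
      virtual ~Shape();   // virtual: delete through Shape* reaches ~Circle()
};

class Circle : public Shape {
   public:
      explicit Circle(bool* destroyedFlag);
      virtual ~Circle();

   private:
      bool* m_destroyedFlag;   // not owned; set to true by the destructor
};

Shape::Shape() {}
Shape::~Shape() {}

Circle::Circle(bool* destroyedFlag)
   : Shape(), m_destroyedFlag(destroyedFlag)
{
}

Circle::~Circle() { *m_destroyedFlag = true; }

// Returns true if ~Circle() ran when the object was deleted via Shape*.
bool derivedDestructorRuns()
{
   bool destroyed = false;
   Shape* shape = new Circle(&destroyed);
   delete shape;
   return destroyed;
}
```

If ~Shape() were not virtual, the delete expression would have undefined behavior, and in practice ~Circle() would typically not be invoked.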
An assignment operator which performs a destructive action must be protected from performing this action on the object it is operating on.

A common error is assigning an object to itself (a = a, but usually not that obviously). Normally, destructors for instances which are allocated dynamically are invoked before assignment takes place. If an object is assigned to itself, the instance variables will be destroyed before they are assigned. This may lead to strange run-time errors. If a = a is detected, the assigned object should not be changed, e.g.
```
const T&
T::operator=(const T& rhs)
{
   if (this != &rhs) {
      delete m_t;
      m_t = new M(*rhs.m_t);  // deep copy; M is a placeholder for the type m_t points to
      // rest of assignment...
   }
   return *this;
}
```
A public member function must never return a non-const reference or pointer to private member data or to data outside an object.

This rule may be violated by member functions which relinquish ownership of an object, e.g. from container classes. In all cases, it must be clear where ownership of every object resides.
Do not use unspecified function arguments (ellipsis notation, varargs).

Unspecified function arguments break type safety. Use overloaded functions or default arguments instead.
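
A small sketch of the overloading alternative follows. The function name describe and its output format are hypothetical.

```cpp
#include <sstream>
#include <string>

// Hypothetical sketch: one overload per supported type instead of "...".
// A call with an unsupported type fails at compile time.
std::string describe(int value)
{
   std::ostringstream out;
   out << "int " << value;
   return out.str();
}

std::string describe(double value)
{
   std::ostringstream out;
   out << "double " << value;
   return out.str();
}

std::string describe(const std::string& value)
{
   return "string " + value;
}
```

Each call is checked against a real signature, which a varargs function cannot offer.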
The names of formal arguments to functions are to be specified and are to be the same both in the function declaration and in the function definition.

Function argument names should be considered to be part of the documentation of the class interface. Function argument names should be chosen accordingly, and should appear in the declaration as that is where the class interface should be documented.

Argument names may be omitted in the declaration if the usage is trivially obvious. However, this practice is discouraged where it may lead to ambiguity, for example
```
    void f(const T); // is this 'const int T'
                     //      or 'const T x'?
```
Always specify the return type of a function explicitly.

Implicit typing should always be avoided. Similarly, unsigned int x; is preferred to unsigned x;
A function must never return a reference or a pointer to a local variable.

If a function returns a reference or a pointer to a local variable, the memory to which it refers will already have been deallocated when the reference or pointer is used.
Avoid using #define wherever possible.

Use inline functions (where necessary) for short functions; use const or enum to define constants. Acceptable uses for #define are to protect header files from multiple inclusion and for "feature-test" defines.
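
For example, the following sketch replaces two constant macros and one function-like macro. All names and the layer count are hypothetical.

```cpp
// Hypothetical sketch: typed, scoped replacements for #define.
const double speedOfLight = 299792458.0;   // metres per second
enum { driftLayerCount = 47 };             // illustrative value only

// Replaces a macro such as "#define SQUARE(x) x*x", for which
// SQUARE(3 + 1) would expand to 3 + 1*3 + 1, that is 7.
inline int square(int x)
{
   return x * x;
}

inline double lightTravelTime(double metres)
{
   return metres / speedOfLight;
}
```

Unlike the macro, square() evaluates its argument exactly once and obeys the usual type and scope rules.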
Avoid the use of "magic" numeric values in code; use symbolic values instead.

Numerical values in code ("Magic Numbers") should be viewed with suspicion. They can be the cause of difficult problems when it becomes necessary to change a value. A large amount of code can be dependent on such a value never changing, the value can be used at a number of places in the code (it may be difficult to locate all of them), and values as such are rather anonymous (it may be that every `2' in the code should not be changed to a `3').

Magic numbers also provide no information about why that number is used there. Well chosen symbolic names are much more useful for the reader and maintainer.
Use initialization instead of assignment wherever possible.

By always initializing variables, instead of assigning values to them before they are first used, the code is made more efficient since no temporary objects are created for the initialization. For objects having large amounts of data, this can result in significantly faster code.
List members in a constructor initialization list in the order in which they are declared.

Initialization lists are executed in the order the members were declared in, not the order they are listed in. Listing initializers in declaration order avoids any confusion.
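
A minimal sketch, with a hypothetical class TimeWindow, in which one member is initialized from another:

```cpp
// Hypothetical sketch: the initializer list follows declaration order.
// Copy constructor, assignment and destructor: the defaults are appropriate.
class TimeWindow {
   public:
      TimeWindow(int start, int length);

      int start() const;
      int end() const;

   private:
      int m_start;   // declared first, so initialized first
      int m_end;     // initialized second; may safely use m_start
};

TimeWindow::TimeWindow(int start, int length)
   : m_start(start),
     m_end(m_start + length)
{
}

int TimeWindow::start() const { return m_start; }
int TimeWindow::end() const { return m_end; }
```

If m_end were declared before m_start, its initializer would read m_start before it had been initialized, whatever order the list was written in.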
Avoid explicit casts between classes.

C style casts should be avoided at all times. Where casts are necessary, dynamic_cast is preferred since it is type safe. static_cast and const_cast may be used sparingly (but appropriate use of mutable is preferred to const_cast). reinterpret_cast should be avoided whenever possible.
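
Where a downcast cannot be avoided, a checked dynamic_cast might look like the sketch below. The classes Detector, DriftChamber and Calorimeter, and the layer count, are hypothetical.

```cpp
// Hypothetical sketch: a checked downcast with dynamic_cast.
class Detector {
   public:
      virtual ~Detector();
};

class DriftChamber : public Detector {
   public:
      int layers() const;
};

class Calorimeter : public Detector {
};

const int chamberLayerCount = 47;   // illustrative value only

Detector::~Detector() {}

int DriftChamber::layers() const { return chamberLayerCount; }

// Returns the number of layers, or 0 if the detector is not a DriftChamber.
int layersOf(const Detector& detector)
{
   const DriftChamber* chamber =
      dynamic_cast<const DriftChamber*>(&detector);
   if (0 != chamber) {
      return chamber->layers();
   }
   return 0;
}
```

The pointer form of dynamic_cast yields a null pointer when the object is not of the requested type, so the failure can be tested explicitly. A virtual function in Detector would avoid the cast altogether and is usually the better design.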
Do not rely on implicit type conversion functions in function argument lists.

C++ is lenient concerning the variables that may be used as arguments to functions. If there is no function which exactly matches the types of the arguments, the compiler attempts to convert types to find a match. If more than one matching function is found, a compilation error may result. Existing code which the compiler had allowed may suddenly contain errors after a new implicit type conversion is added to the code.

Another unpredictable effect of implicit type conversions is that temporary objects may be created during the conversion. The temporary object is then the argument to the function, not the original object.

Be careful with constructors that use only one argument, since this introduces a new type conversion which the compiler can unexpectedly use. Single argument constructors that should not be used for implicit type conversions should be declared with the explicit keyword.
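
A sketch of an explicit single-argument constructor follows; Momentum and doubled are hypothetical names.

```cpp
// Hypothetical sketch: explicit blocks the implicit double -> Momentum
// conversion. Copy, assignment and destructor defaults are appropriate.
class Momentum {
   public:
      explicit Momentum(double gev);

      double gev() const;

   private:
      double m_gev;
};

Momentum::Momentum(double gev) : m_gev(gev) {}

double Momentum::gev() const { return m_gev; }

double doubled(const Momentum& p)
{
   return 2.0 * p.gev();
}
```

With this declaration, doubled(1.5) does not compile; the caller must write doubled(Momentum(1.5)), which makes the conversion visible.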
Never convert a const to a non-const.

Data members that should be modifiable even in const objects should be declared mutable.
The code following a case label or group of case labels must be terminated by a break statement.

If the code which follows a case label is not terminated by break, the execution continues after the next case label. This program flow is confusing and hence should be avoided.

When several case labels are followed by the same block of code, only one break statement is needed.
A switch statement must contain a default case.
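
The two switch rules above can be sketched together. The enumeration Particle and the function isLepton are hypothetical.

```cpp
// Hypothetical sketch: grouped case labels, one break per block,
// and a default case.
enum Particle { kElectron, kMuon, kPion, kKaon };

bool isLepton(Particle particle)
{
   bool lepton = false;
   switch (particle) {
      case kElectron:   // grouped labels share one block and one break
      case kMuon:
         lepton = true;
         break;
      case kPion:
      case kKaon:
         lepton = false;
         break;
      default:          // reached only for a value outside the enumeration
         lepton = false;
         break;
   }
   return lepton;
}
```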
Never use goto.

Goto breaks the control flow and can lead to code that is difficult to comprehend.
Do not use malloc, realloc or free.

In C, malloc, realloc and free are used to allocate memory dynamically on the heap. This may lead to conflicts with the use of the new and delete operators in C++, and can limit the effectiveness of overloading global new with a custom allocator.
Do not allocate memory and assume that it will be deallocated for you.

However, in some places, it may be documented that an object will be freed e.g. at the end of the event or end of run.

Recommendations

Give a file a name that is unique in as large a context as possible.

[specify rule for constructing filename from library prefix]

[how does this interact with the library standard?]
An include file for a class declaration should have a file name of the form class name + extension. Use uppercase and lowercase letters in the same way as in the source code.
When possible, do not #include the definitions of classes that are only accessed via pointers (*) or references (&).

Using a forward declaration instead can reduce the number of include files that need to be parsed and reduce the impact of changes.
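
For instance, a header whose class only holds a pointer to Helix needs no #include for it. The class Track and its members are hypothetical.

```cpp
// Hypothetical sketch of a header: Helix is used only through a pointer,
// so a forward declaration suffices and its header is not included here.
class Helix;

class Track {
   public:
      explicit Track(const Helix* helix);

      const Helix* helix() const;

   private:
      const Helix* m_helix;   // not owned
};

Track::Track(const Helix* helix) : m_helix(helix) {}

const Helix* Track::helix() const { return m_helix; }
```

Only code that actually calls member functions of Helix needs to include its definition.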
Write some descriptive comments before every function. [describe, provide examples]
Use // for comments.
Every implementation file should declare a local constant string to contain the revision control system ID.

[should be in skeleton, class should have a member function to dump the RCS ID to an ostream]
Variables are to be declared with the smallest possible scope.

Variables should be declared as needed and with the smallest feasible scope. This avoids cluttering the variable namespace unnecessarily, helps keep declaration and use close together, and ensures that objects are destroyed (and their resources freed) as soon as possible.
A variable with a large scope should have a fully descriptive name. Choose variable names that suggest the usage.
The flow control primitives if, else, while, for and do should be followed by a block, even if it is an empty block.

At times, everything that is to be done in a loop may be easily written on one line in the loop statement itself. It may then be tempting to conclude the statement with a semicolon at the end of the line. This may lead to misunderstanding since, when reading the code, it is easy to miss such a semicolon, or forget to add braces when adding statements to the block. It is better to place an empty block after the statement to make completely clear what the code is doing.
```
   // No block at all - No!
   while ( /* Something */ );

   // Single statement - No!
   while ( /* something */ )
      statement;

   // ok
   while ( /* Something */ ) {}

   // ok
   while ( /* something */ ) {
      statement;
   }
```
Think about alternatives before declaring friends of a class.

Friends can violate encapsulation. While there are situations where friends are appropriate, their use should be limited as much as possible. The use of many friends may indicate that the modularity of the system is poor. When friends are used, it is best to limit the scope by making particular member functions friends instead of the entire class.
Avoid the use of external static objects in constructors and destructors.

The order of initialization of static objects defined in different compilation units is not fixed by the language, so a static object may not depend on the previous initialization of a static object in a different compilation unit.
An assignment operator ought to return a const reference to the assigning object (usually *this).

Returning a const reference is most compatible with the behavior of built-in types.
Use operator overloading sparingly and in a uniform manner.

Operator overloading has both advantages and disadvantages. One advantage is that code which uses a class with overloaded operators can be written more compactly (and possibly more readably). Another advantage is that the semantics can be both simple and natural. One disadvantage in overloading operators is that it is easy to misunderstand the meaning of an overloaded operator if the programmer has not used natural semantics.
When an operator with an opposite (such as == and !=) is defined, the opposite should also be defined. When an operator with associated operators (such as +, -, +=, -=) is defined, all the associated operators should be defined.

Designing a class library is like designing a language! If you use operator overloading, use it in a uniform manner; do not use it if it can easily give rise to misunderstanding.
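
A sketch of defining an operator together with its opposite follows; the class RunNumber is hypothetical.

```cpp
// Hypothetical sketch: == and != are defined together, and != is written
// in terms of == so that the two can never disagree.
class RunNumber {
   public:
      explicit RunNumber(unsigned int number);

      unsigned int value() const;

   private:
      unsigned int m_value;
};

RunNumber::RunNumber(unsigned int number) : m_value(number) {}

unsigned int RunNumber::value() const { return m_value; }

bool operator==(const RunNumber& lhs, const RunNumber& rhs)
{
   return lhs.value() == rhs.value();
}

bool operator!=(const RunNumber& lhs, const RunNumber& rhs)
{
   return !(lhs == rhs);
}
```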
Avoid functions with many arguments.

Functions having long lists of arguments look complicated, are difficult to read, and can indicate poor design. In addition, they are difficult to use and to maintain.
An object in a function argument list should normally be passed as a reference, unless the function stores the value, in which case it should be passed as a pointer. Use constant references (const & in the formal argument list) when possible.

By default, C++ passes arguments by value. Function arguments are copied to the stack via invocations of copy constructors, which can be expensive for large objects, and destructors are invoked when exiting the function. const & arguments mean that only a reference to the object in question is placed on the stack (call-by-reference) and that the object's state (its instance variables) cannot be modified.

By using references instead of pointers as function arguments, code can be made more readable, especially within the function. A disadvantage is that it is not easy to see which functions change the values of their arguments. Member functions which store pointers which have been provided as arguments should document this clearly by declaring the argument as a pointer instead of as a reference.
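
The convention can be sketched as follows. Histogram, Filler and mean are hypothetical names.

```cpp
// Hypothetical sketch: const reference for read-only arguments,
// pointer for arguments whose address is stored.
class Histogram {
   public:
      Histogram();

      void fill(double value);
      unsigned int entries() const;
      double sum() const;

   private:
      unsigned int m_entries;
      double m_sum;
};

Histogram::Histogram() : m_entries(0), m_sum(0.0) {}

void Histogram::fill(double value)
{
   ++m_entries;
   m_sum += value;
}

unsigned int Histogram::entries() const { return m_entries; }
double Histogram::sum() const { return m_sum; }

// Only reads its argument: const reference, no copy is made.
double mean(const Histogram& histogram)
{
   if (0u == histogram.entries()) {
      return 0.0;
   }
   return histogram.sum() / histogram.entries();
}

// Keeps the address beyond the call: the pointer argument makes this
// visible at the call site, where the caller must write &histogram.
class Filler {
   public:
      explicit Filler(Histogram* target);

      void record(double value);

   private:
      Histogram* m_target;   // not owned
};

Filler::Filler(Histogram* target) : m_target(target) {}

void Filler::record(double value) { m_target->fill(value); }
```

A caller writes Filler filler(&histogram); the explicit & signals that the Histogram must outlive the Filler.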
Wherever possible, allocated memory should be stored in an object and deallocated in the destructor.

The ANSI auto_ptr template class can often be used to make this automatic:
```
void f()
{
   auto_ptr<T> p1(new T);
   // ...
}
```
The T pointed to by p1 will be automatically deleted when p1 goes out of scope (when f() exits). This style of "declaration as allocation" is very useful when exceptions are used.
Use Assert to test class invariants in the InvariantTest member function.
- InvariantTest should be in the CLEO standard class skeleton
- Assert should be a template that throws an exception instead of aborting. Should be incorporated into the CLEO exception hierarchy.
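
One possible shape for such a template is sketched below. This is an assumption, not the CLEO design: std::logic_error stands in for the CLEO exception hierarchy, which is not described here, and the class Interval is hypothetical.

```cpp
#include <stdexcept>

// Hypothetical sketch: Assert throws the requested exception type
// instead of aborting the program.
template <class Exception>
inline void Assert(bool condition, const char* message)
{
   if (!condition) {
      throw Exception(message);
   }
}

class Interval {
   public:
      Interval(int low, int high);

      void InvariantTest() const;

   private:
      int m_low;
      int m_high;
};

Interval::Interval(int low, int high)
   : m_low(low), m_high(high)
{
   InvariantTest();
}

// Class invariant: the lower limit never exceeds the upper limit.
void Interval::InvariantTest() const
{
   Assert<std::logic_error>(m_low <= m_high, "Interval: low > high");
}

// Returns true if constructing Interval(low, high) violates the invariant.
bool constructionFails(int low, int high)
{
   bool failed = false;
   try {
      Interval interval(low, high);
   }
   catch (const std::logic_error&) {
      failed = true;
   }
   return failed;
}
```

Because the failure is an exception, the caller can decide how to recover instead of losing the whole job.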
When overloading functions, all variations should have the same semantics.

Overloading of functions can be a powerful tool for creating a family of related functions that only differ as to the type of data provided as arguments. If not used properly (such as using functions with the same name for different purposes), they can cause considerable confusion.
Use inline functions only when necessary.

Access and forwarding functions may be inline if they are unlikely to ever change. Excessive use of inlining will force excessive recompilation when the implementation of those inline functions changes.
Prefer polymorphism to switch statements in class member functions.
Use unsigned for variables which cannot have negative values.
Use inclusive lower limits and exclusive upper limits.

Instead of saying that x is in the interval x>=23 and x<=42, use the limits x>=23 and x<43. The following useful properties follow:
- The size of the interval between the limits is the difference between the limits.
- The limits are equal if the interval is empty.
- The upper limit is never less than the lower limit.
By being consistent in this regard, many difficult errors will be avoided.
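
The properties listed above can be checked with a short sketch. The function names inRange, rangeSize and rangeEmpty are hypothetical.

```cpp
// Hypothetical sketch: half-open interval [lower, upper).
inline bool inRange(int x, int lower, int upper)
{
   return (x >= lower) && (x < upper);
}

// The size is simply the difference between the limits.
inline int rangeSize(int lower, int upper)
{
   return upper - lower;
}

// The interval is empty exactly when the limits are equal.
inline bool rangeEmpty(int lower, int upper)
{
   return lower == upper;
}
```

For the example limits, the interval 23 <= x < 43 contains 42 but not 43, and its size is 43 - 23 = 20.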
Do not use a bare pointer in the condition of an if statement.

Use if (0 == p), not if (!p).
Use parentheses to clarify the order of evaluation for operators in expressions.

There are a number of common pitfalls having to do with the order of evaluation for operators in an expression. In doubtful cases, always use parentheses to clarify the order of evaluation.
Avoid global data.
Use standard library functions.

Portability Recommendations

Code that requires a type of a particular size should typedef a local type from a system typedef of the appropriate size. Do not assume that the built-in types have particular sizes. [cleotypes.inc?]
Use explicit type conversions for arithmetic using signed and unsigned values.
Do not assume that you know how an instance of a data type is represented in memory.
Do not assume that long, float, double or long double types may begin at arbitrary addresses.
Do not depend on underflow or overflow functioning in any special way. [what can one depend on?]
Do not assume that the operands in an expression are evaluated in a definite order. [value modified twice, functions with side effects]
Do not assume that you know how the invocation mechanism for a function is implemented.
Do not assume that static objects in different files are initialized in any special order.
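
The first recommendation in this list, on types of a particular size, might be followed as in this sketch. It assumes a compiler that supplies <cstdint>, which postdates the working paper this draft is based on; the type name RawWord is hypothetical.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical sketch: a local type defined from a system typedef of
// known size, instead of assuming the size of int or long.
typedef std::uint32_t RawWord;

// Number of bits in a RawWord on this platform.
inline unsigned int rawWordBits()
{
   return static_cast<unsigned int>(sizeof(RawWord) * CHAR_BIT);
}
```

Code that unpacks raw data then uses RawWord throughout, and only the typedef needs attention when porting.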

Style Recommendations

Programming style is to some extent a matter of individual taste that should not be dictated by inflexible rules. However, there are advantages to a common style for a large collaborative group, or a recommended style for inexperienced programmers. Accordingly, CLEO members are encouraged to consider these recommendations.