The syntax of C++ is the set of rules defining how a C++ program is written and compiled.
C++ syntax is largely inherited from the syntax of its ancestor language C, and has influenced the syntax of several later languages including but not limited to Java, C#, and Rust.
Much of C++'s syntax aligns with C syntax, as C++ provides backwards compatibility with C.
The C++ "Hello, World!" program is as follows:
Prior to C++23, the "Hello, world!" program used iostreams.
An identifier is the name of an element in the code. There are certain standard naming conventions to follow when selecting names for elements. Identifiers in C++ are case-sensitive.
An identifier can contain:
An identifier cannot:
The word <code>nullptr</code> is a keyword that denotes the null pointer literal. Similarly, the keywords <code>true</code> and <code>false</code> denote the Boolean values true and false respectively.
The following 80 words are reserved and may not be used as identifier names or redefined.
The keyword <code>restrict</code>, although present in C, is not standard in C++; some compilers support it as an extension. The keyword <code>fortran</code>, a conditionally supported keyword in C which denotes linkage for the Fortran programming language, is conditionally supported in C++.
The following 3 words refer to literal values used by the language.
C++ also defines 8 global objects (all residing in namespace <code>std</code> and defined in <code><iostream></code>):
The following 11 words are reserved keywords that serve as alternative spellings for operators and tokens that use non-ISO646 characters.
The following 4 words may be used as identifier names, but bear special meanings in certain contexts.
The following 20 tokens are recognised by the preprocessor in the context of preprocessor directives.
The following macros are defined in the C/C++ standard library:
The following keywords are keywords in some C++ technical specifications, but not the main language itself.
The former reflection technical specification proposed the keyword <code>reflexpr</code>; however, C++26's finalised reflection uses a new operator, <code>^^</code>, instead.
The transactional memory technical specification introduces the following 4 keywords.
The following 2 identifiers with special meaning are also introduced.
The C++ standard reserves the namespaces <code>std</code> (for C++ standard library symbols) and <code>posix</code> (unused, but presumably for POSIX-related symbols). It is undefined behaviour to add declarations or definitions to these namespaces, with the exception of adding template specialisations to symbols in namespace <code>std</code>. The C++ standard further reserves the module names matching <code>std</code> and <code>std.*</code>.
The separators <code>{</code> and <code>}</code> signify a code block and a new scope. Class members and the body of a method are examples of what can live inside these braces in various contexts.
Inside of method bodies, braces may be used to create new scopes, as follows:
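A minimal sketch of nested scopes follows; the function name <code>scoped_sum</code> is made up for illustration. Each pair of braces introduces its own scope, so the two variables named <code>inner</code> are unrelated and neither is visible outside its block.

```cpp
int scoped_sum() {
    int total = 0;
    {
        int inner = 5;   // visible only inside this block
        total += inner;
    }                    // `inner` ends its lifetime here
    {
        int inner = 7;   // a different variable that happens to share the name
        total += inner;
    }
    // `inner` cannot be named here; only `total` is still in scope.
    return total;
}
```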
C++ has two kinds of comments: traditional comments and end-of-line comments.
Traditional comments, also known as block comments, start with <code>/*</code> and end with <code>*/</code>; they may span multiple lines.
End-of-line comments start with <code>//</code> and extend to the end of the current line.
Documentation comments in the source files are processed by the external Doxygen tool to generate documentation. This type of comment is identical to traditional comments, except it starts with <code>/**</code> and follows conventions defined by the Doxygen tool. Technically, these comments are a special kind of traditional comment and they are not specifically defined in the language specification.
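The three comment styles can be seen together in the following sketch. The function <code>to_fahrenheit</code> and the Doxygen tags chosen are illustrative assumptions, not taken from any particular codebase.

```cpp
/*
 * A traditional (block) comment: it may span
 * several lines.
 */

/**
 * A documentation comment in the style read by Doxygen.
 * @param celsius temperature in degrees Celsius
 * @return the same temperature in degrees Fahrenheit
 */
double to_fahrenheit(double celsius) {
    // An end-of-line comment: it stops at the end of this line.
    return celsius * 9.0 / 5.0 + 32.0;
}
```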
Much like in C, the parameters given on a command line are passed to a C++ program through two parameters of <code>main</code>: the count of the command-line arguments in <code>argc</code> and the individual arguments as character strings in the pointer array <code>argv</code>. So the command:
myFilt p1 p2 p3
results in something like:
While individual strings are arrays of contiguous characters, there is no guarantee that the strings are stored as a contiguous group.
The name of the program, <code>argv[0]</code>, may be useful when printing diagnostic messages or for making one binary serve multiple purposes. The individual values of the parameters may be accessed with <code>argv[1]</code>, <code>argv[2]</code>, and <code>argv[3]</code>, as shown in the following program:
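A minimal sketch of walking the argument array is given below. The helper <code>describe_args</code> is hypothetical; it takes the same two values that <code>main</code> receives and formats one line per argument.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: formats each command-line argument as "argv[i] = value".
std::vector<std::string> describe_args(int argc, char* argv[]) {
    std::vector<std::string> lines;
    for (int i = 0; i < argc; ++i) {
        lines.push_back("argv[" + std::to_string(i) + "] = " + argv[i]);
    }
    return lines;
}

// In a complete program:
//   int main(int argc, char* argv[]) {
//       for (const auto& line : describe_args(argc, argv)) std::cout << line << '\n';
//   }
```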
C++ introduces object-oriented programming (OOP) features to C. It offers classes, which provide the four features commonly present in OOP (and some non-OOP) languages: abstraction, encapsulation, inheritance, and polymorphism. One distinguishing feature of C++ classes compared to classes in other programming languages is support for deterministic destructors, which in turn provide support for the Resource Acquisition is Initialization (RAII) concept.
As in C, C++ supports four types of memory management: static storage duration objects, thread storage duration objects, automatic storage duration objects, and dynamic storage duration objects.
Static storage duration objects are created before <code>main()</code> is entered (see exceptions below) and destroyed in reverse order of creation after <code>main()</code> exits. The exact order of creation is not specified by the standard (though there are some rules defined below), which gives implementations some freedom in how they organize initialization. More formally, objects of this type have a lifespan that "shall last for the duration of the program".
Static storage duration objects are initialized in two phases. First, "static initialization" is performed, and only after all static initialization is performed, "dynamic initialization" is performed. In static initialization, all objects are first initialized with zeros; after that, all objects that have a constant initialization phase are initialized with the constant expression (i.e. variables initialized with a literal or <code>constexpr</code>). Though it is not specified in the standard, the static initialization phase can be completed at compile time and saved in the data partition of the executable. Dynamic initialization involves all object initialization done via a constructor or function call (unless the function is marked with <code>constexpr</code>, in C++11). The dynamic initialization order is defined as the order of declaration within the compilation unit (i.e. the same file). No guarantees are provided about the order of initialization between compilation units.
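The two phases can be sketched with a few namespace-scope variables. All names below are hypothetical; the point is only that the <code>constexpr</code> value is constant-initialized, while the other two are (at least formally) dynamically initialized in their order of declaration within this translation unit.

```cpp
// Constant initialization: the initializer is a constant expression.
constexpr int kBase = 10;

int compute_first() { return kBase + 1; }

// Dynamic initialization: these run in declaration order within this
// translation unit, so `first_value` is ready before `second_value` uses it.
// (A compiler is permitted to perform them statically if the result is the same.)
int first_value = compute_first();
int second_value = first_value * 2;
```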
Thread storage duration objects are very similar to static storage duration objects. The main difference is that the creation time is just before thread creation, and destruction is done after the thread has been joined.
The most common variable types in C++ are local variables inside a function or block, and temporary variables. The common feature of automatic variables is that they have a lifetime that is limited to the scope of the variable. They are created and potentially initialized at the point of declaration (see below for details) and destroyed in the reverse order of creation when the scope is left. This is implemented by allocation on the stack.
Local variables are created as the point of execution passes the declaration point. If the variable has a constructor or initializer this is used to define the initial state of the object. Local variables are destroyed when the local block or function that they are declared in is closed. C++ destructors for local variables are called at the end of the object lifetime, allowing a discipline for automatic resource management termed RAII, which is widely used in C++.
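The creation and destruction order of local variables can be observed with a small tracing type. <code>Tracer</code> and <code>trace_scope</code> are names assumed here for illustration; the sketch records an event in each constructor and destructor.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical RAII-style type: logs when it is created and destroyed.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;

    Tracer(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {
        log.push_back("create " + name);
    }
    ~Tracer() { log.push_back("destroy " + name); }
};

std::vector<std::string> trace_scope() {
    std::vector<std::string> log;
    {
        Tracer a(log, "a");
        Tracer b(log, "b");
    }  // scope left: b is destroyed first, then a (reverse order of creation)
    return log;
}
```

The same mechanism is what makes RAII work: a destructor that releases a resource runs automatically when the owning variable goes out of scope.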
Member variables are created when the parent object is created. Array members are initialized in order, from element 0 to the last element of the array. Member variables are destroyed, in the reverse order of creation, when the parent object is destroyed. For example, if the parent is an "automatic object", it is destroyed when it goes out of scope, which triggers the destruction of all its members.
Temporary variables are created as the result of expression evaluation and are destroyed when the statement containing the expression has been fully evaluated (usually at the <code>;</code> at the end of a statement).
Dynamic storage duration objects have a dynamic lifespan and can be created directly with a call to <code>new</code> and destroyed explicitly with a call to <code>delete</code>. C++ also supports <code>malloc</code> and <code>free</code>, from C, but these are not compatible with <code>new</code> and <code>delete</code>. Use of <code>new</code> returns an address to the allocated memory. The C++ Core Guidelines advise against using <code>new</code> directly for creating dynamic objects in favor of smart pointers, which were introduced in C++11, created through <code>std::make_unique</code> for single ownership and <code>std::make_shared</code> for reference-counted multiple ownership.
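A brief sketch of the two ownership models follows, using <code>std::make_unique</code> (available since C++14) and <code>std::make_shared</code>. The function names are illustrative only.

```cpp
#include <memory>

// Single ownership: the int is freed automatically when `value` goes out of scope.
int unique_value() {
    std::unique_ptr<int> value = std::make_unique<int>(7);
    return *value;
}

// Shared ownership: copying the pointer increments a reference count.
long shared_owner_count() {
    std::shared_ptr<int> first = std::make_shared<int>(42);
    std::shared_ptr<int> second = first;  // both now own the same int
    return first.use_count();
}
```

No explicit <code>delete</code> appears anywhere; the smart pointers' destructors release the memory.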
C++ is often considered to be a superset of C but this is not strictly true. Most C code can easily be made to compile correctly in C++ but there are a few differences that cause some valid C code to be invalid or behave differently in C++. For example, C allows implicit conversion from <code>void*</code> to other pointer types but C++ does not (for type safety reasons). Also, C++ defines many new keywords, such as <code>new</code> and <code>class</code>, which may be used as identifiers (for example, variable names) in a C program.
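The <code>void*</code> difference can be illustrated as follows; <code>allocate_ints</code> is a made-up helper. The commented-out line is what a C programmer might write, and it does not compile as C++.

```cpp
#include <cstddef>
#include <cstdlib>

int* allocate_ints(std::size_t count) {
    void* raw = std::malloc(count * sizeof(int));
    // int* p = raw;               // valid C, but ill-formed in C++
    return static_cast<int*>(raw); // C++ requires an explicit conversion
}
```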
Some incompatibilities have been removed by the 1999 revision of the C standard (C99), which now supports C++ features such as line comments (<code>//</code>) and declarations mixed with code. On the other hand, C99 introduced a number of new features that C++ did not support that were incompatible or redundant in C++, such as variable-length arrays, native complex-number types (however, the <code>std::complex</code> class in the C++ standard library provides similar functionality, although not code-compatible), designated initializers, compound literals, and the <code>restrict</code> keyword. Some of the C99-introduced features were included in the subsequent version of the C++ standard, C++11 (out of those which were not redundant). However, the C++11 standard introduces new incompatibilities, such as disallowing assignment of a string literal to a character pointer, which remains valid C.
To intermix C and C++ code, any function declaration or definition that is to be called from or used in both C and C++ must be declared with C linkage by placing it within an <code>extern "C"</code> block. Such a function may not rely on features depending on name mangling (i.e., function overloading).
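A minimal sketch of a function given C linkage is shown below. The name <code>c_add</code> is hypothetical; in practice the declaration would usually live in a header shared by the C and C++ sides.

```cpp
extern "C" {
    // Declared with C linkage: the symbol name is not mangled, so a C
    // translation unit can call it. Two functions with C linkage cannot
    // share this name, so it cannot be overloaded across the C boundary.
    int c_add(int a, int b);
}

// The definition inherits C linkage from the declaration above.
int c_add(int a, int b) { return a + b; }
```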
Programs developed in C or C++ often use inline assembly to take advantage of its low-level functionality, greater speed, and finer control when optimizing for performance is essential. C++ provides support for embedding assembly language using <code>asm</code> declarations, but the compatibility of inline assembly varies significantly between compilers and architectures. Unlike code written in high-level languages such as Python or Java, assembly code is highly dependent on the underlying processor and compiler implementation.
Different C++ compilers implement inline assembly in distinct ways.
C++ provides two primary methods of integrating ASM code.
1. Standalone assembly files: assembly code is written separately and linked with C++ code.
2. Inline assembly: assembly code is embedded within C++ code using compiler-specific extensions.
Example Code for ASM Compatibility
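As one possible illustration, the sketch below uses the GCC/Clang extended <code>asm</code> syntax on x86 and falls back to plain C++ elsewhere. Both the function name <code>add_with_asm</code> and the choice of compiler and architecture are assumptions; other compilers (for example MSVC) use a different syntax or do not support inline assembly on some targets.

```cpp
// Adds two integers, using one x86 instruction where the toolchain allows it.
int add_with_asm(int a, int b) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    int result = a;
    // AT&T syntax: "addl src, dst" computes dst += src.
    // %0 is `result` (read and written), %1 is `b` (read only).
    __asm__("addl %1, %0" : "+r"(result) : "r"(b) : "cc");
    return result;
#else
    // Portable fallback for other compilers and architectures.
    return a + b;
#endif
}
```

The preprocessor guard reflects the point made above: the assembly text is tied to a specific instruction set and compiler dialect, so portable code needs an alternative path.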
Encapsulation is the hiding of information to ensure that data structures and operators are used as intended and to make the usage model more obvious to the developer. C++ provides the ability to define classes and functions as its primary encapsulation mechanisms. Within a class, members can be declared as either public, protected, or private to explicitly enforce encapsulation. A public member of the class is accessible to any function. A private member is accessible only to functions that are members of that class and to functions and classes explicitly granted access permission by the class ("friends"). A protected member is accessible to members of classes that inherit from the class in addition to the class itself and any friends.
The object-oriented principle ensures the encapsulation of all and only the functions that access the internal representation of a type. C++ supports this principle via member functions and friend functions, but it does not enforce it. Programmers can declare parts or all of the representation of a type to be public, and they are allowed to make public entities not part of the representation of a type. Therefore, C++ supports not just object-oriented programming, but other decomposition paradigms such as modular programming.
It is generally considered good practice to make all data private or protected, and to make public only those functions that are part of a minimal interface for users of the class. This can hide the details of data implementation, allowing the designer to later fundamentally change the implementation without changing the interface in any way.
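The practice described above might look like the following sketch. The <code>Account</code> class and the <code>same_balance</code> friend function are invented for illustration: the data member is private, the public interface is small, and one non-member function is explicitly granted access.

```cpp
class Account {
public:
    explicit Account(long initial_cents) : balance_cents_(initial_cents) {}

    // Public interface: the only way for ordinary code to change the balance.
    void deposit(long cents) {
        if (cents > 0) balance_cents_ += cents;
    }
    long balance() const { return balance_cents_; }

private:
    long balance_cents_;  // hidden representation; could change later

    // A friend is granted access to private members without being a member.
    friend bool same_balance(const Account& a, const Account& b);
};

bool same_balance(const Account& a, const Account& b) {
    return a.balance_cents_ == b.balance_cents_;
}
```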
Inheritance allows one data type to acquire properties of other data types. Inheritance from a base class may be declared as public, protected, or private. This access specifier determines whether unrelated and derived classes can access the inherited public and protected members of the base class. Only public inheritance corresponds to what is usually meant by "inheritance". The other two forms are much less frequently used. If the access specifier is omitted, a "class" inherits privately, while a "struct" inherits publicly. Base classes may be declared as virtual; this is called virtual inheritance. Virtual inheritance ensures that only one instance of a base class exists in the inheritance graph, avoiding some of the ambiguity problems of multiple inheritance.
Multiple inheritance is a C++ feature allowing a class to be derived from more than one base class; this allows for more elaborate inheritance relationships. For example, a "Flying Cat" class can inherit from both "Cat" and "Flying Mammal". Some other languages, such as C# or Java, accomplish something similar (although more limited) by allowing inheritance of multiple interfaces while restricting the number of base classes to one (interfaces, unlike classes, provide only declarations of member functions, no implementation or member data). An interface as in C# and Java can be defined in C++ as a class containing only pure virtual functions, often known as an abstract base class or "ABC". The member functions of such an abstract base class are normally explicitly defined in the derived class, not inherited implicitly. C++ virtual inheritance exhibits an ambiguity resolution feature called dominance.
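The "Flying Cat" example can be sketched with virtual inheritance so that only one <code>Animal</code> base subobject exists. The class layout and the function <code>has_single_animal</code> are assumptions made for illustration.

```cpp
struct Animal {
    int age = 0;
    virtual ~Animal() = default;
};

// Both intermediate classes inherit virtually from Animal...
struct Cat : virtual Animal {};
struct FlyingMammal : virtual Animal {};

// ...so FlyingCat contains a single shared Animal subobject.
struct FlyingCat : Cat, FlyingMammal {};

bool has_single_animal() {
    FlyingCat pet;
    Cat& as_cat = pet;
    FlyingMammal& as_flyer = pet;

    as_cat.age = 3;  // written through the Cat view

    const Animal* from_cat = &as_cat;
    const Animal* from_flyer = &as_flyer;

    // The FlyingMammal view sees the same member and the same base object.
    return as_flyer.age == 3 && from_cat == from_flyer;
}
```

Without the <code>virtual</code> keyword on the two intermediate bases, <code>FlyingCat</code> would contain two separate <code>Animal</code> subobjects and an unqualified access to <code>age</code> would be ambiguous.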
C++ provides more than 35 operators, covering basic arithmetic, bit manipulation, indirection, comparisons, logical operations and others. Almost all operators can be overloaded for user-defined types, with a few notable exceptions such as member access (<code>.</code> and <code>.*</code>) and the conditional operator. The rich set of overloadable operators is central to making user-defined types in C++ seem like built-in types.
Overloadable operators are also an essential part of many advanced C++ programming techniques, such as smart pointers. Overloading an operator does not change the precedence of calculations involving the operator, nor does it change the number of operands that the operator uses (any operand may however be ignored by the operator, though it will be evaluated prior to execution). Overloaded "<code>&&</code>" and "<code>||</code>" operators lose their short-circuit evaluation property.
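As a small illustration, the hypothetical <code>Vec2</code> type below overloads <code>+</code> and <code>==</code> so that it can be used with the same notation as a built-in arithmetic type.

```cpp
struct Vec2 {
    int x;
    int y;
};

// Binary + keeps its usual precedence and still takes exactly two operands.
Vec2 operator+(const Vec2& a, const Vec2& b) {
    return Vec2{a.x + b.x, a.y + b.y};
}

bool operator==(const Vec2& a, const Vec2& b) {
    return a.x == b.x && a.y == b.y;
}
```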
Polymorphism enables one common interface for many implementations, and for objects to act differently under different circumstances.
C++ supports several kinds of static (resolved at compile-time) and dynamic (resolved at run-time) polymorphisms, supported by the language features described above. Compile-time polymorphism does not allow for certain run-time decisions, while runtime polymorphism typically incurs a performance penalty.
Variable pointers and references to a base class type in C++ can also refer to objects of any derived classes of that type. This allows arrays and other kinds of containers to hold pointers to objects of differing types (references cannot be directly held in containers). This enables dynamic (run-time) polymorphism, where the referred objects can behave differently, depending on their (actual, derived) types.
C++ also provides the <code>dynamic_cast</code> operator, which allows code to safely attempt conversion of an object, via a base reference/pointer, to a more derived type: downcasting. The attempt is necessary as often one does not know which derived type is referenced. (Upcasting, conversion to a more general type, can always be checked/performed at compile-time via <code>static_cast</code>, as ancestral classes are specified in the derived class's interface, visible to all callers.) <code>dynamic_cast</code> relies on run-time type information (RTTI), metadata in the program that enables differentiating types and their relationships. If a <code>dynamic_cast</code> to a pointer fails, the result is the <code>nullptr</code> constant, whereas if the destination is a reference (which cannot be null), the cast throws an exception. Objects known to be of a certain derived type can be cast to that with <code>static_cast</code>, bypassing RTTI and the safe runtime type-checking of <code>dynamic_cast</code>, so this should be used only if the programmer is very confident the cast is, and will always be, valid.
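The two failure modes of <code>dynamic_cast</code> can be sketched as follows. The class hierarchy and function names are illustrative; the exception thrown by the reference form is <code>std::bad_cast</code>.

```cpp
#include <typeinfo>  // std::bad_cast

struct Shape {
    virtual ~Shape() = default;  // a virtual function makes the type polymorphic
};
struct Circle : Shape {};
struct Square : Shape {};

// Pointer form: yields nullptr when the object is not actually a Circle.
bool is_circle(const Shape* shape) {
    return dynamic_cast<const Circle*>(shape) != nullptr;
}

// Reference form: a reference cannot be null, so failure throws instead.
bool reference_cast_fails(const Shape& shape) {
    try {
        (void)dynamic_cast<const Circle&>(shape);
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```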
Ordinarily, when a function in a derived class overrides a function in a base class, the function to call is determined by the type of the object. A given function is overridden when there exists no difference in the number or type of parameters between two or more definitions of that function. Hence, at compile time, it may not be possible to determine the type of the object and therefore the correct function to call, given only a base class pointer; the decision is therefore put off until runtime. This is called dynamic dispatch. Virtual member functions or methods allow the most specific implementation of the function to be called, according to the actual run-time type of the object. In C++ implementations, this is commonly done using virtual function tables. If the object type is known, this may be bypassed by prepending a fully qualified class name before the function call, but in general calls to virtual functions are resolved at run time.
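A short sketch of dynamic dispatch and of bypassing it with a qualified name is given below; <code>Base</code>, <code>Derived</code> and the two free functions are hypothetical.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Resolved at run time according to the actual type of `object`.
std::string dynamic_name(const Base& object) { return object.name(); }

// The qualified name suppresses virtual dispatch and calls Base's version.
std::string qualified_name(const Base& object) { return object.Base::name(); }
```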
In addition to standard member functions, operator overloads and destructors can be virtual. An inexact rule based on practical experience states that if any function in the class is virtual, the destructor should be as well. As the type of an object at its creation is known at compile time, constructors, and by extension copy constructors, cannot be virtual. Nonetheless, a situation may arise where a copy of an object needs to be created when a pointer to a derived object is passed as a pointer to a base object. In such a case, a common solution is to create a <code>clone()</code> (or similar) virtual function that creates and returns a copy of the derived class when called.
A member function can also be made "pure virtual" by appending <code>= 0</code> after the closing parenthesis and before the semicolon. A class containing a pure virtual function is called an abstract class. Objects cannot be created from an abstract class; it can only be derived from. Any derived class inherits the virtual function as pure and must provide a non-pure definition of it (and all other pure virtual functions) before objects of the derived class can be created. A program that attempts to create an object of a class with a pure virtual member function or inherited pure virtual member function is ill-formed.
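Both ideas, an abstract class and a virtual copy function, can be combined in one sketch. The names <code>Shape</code>, <code>Rectangle</code> and <code>cloned_area</code> are assumptions, and returning <code>std::unique_ptr</code> from <code>clone()</code> is one common choice rather than the only one.

```cpp
#include <memory>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;                   // pure virtual
    virtual std::unique_ptr<Shape> clone() const = 0;  // "virtual copy"
};

struct Rectangle : Shape {
    double width;
    double height;

    Rectangle(double w, double h) : width(w), height(h) {}

    double area() const override { return width * height; }

    // Copies the full Rectangle even when called through a Shape reference.
    std::unique_ptr<Shape> clone() const override {
        return std::make_unique<Rectangle>(*this);
    }
};

// Shape s;  // ill-formed: Shape is abstract

double cloned_area(const Shape& original) {
    std::unique_ptr<Shape> copy = original.clone();
    return copy->area();
}
```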
Function overloading allows programs to declare multiple functions having the same name but with different arguments (i.e. ad hoc polymorphism). The functions are distinguished by the number or types of their formal parameters. Thus, the same function name can refer to different functions depending on the context in which it is used. The return type is not used to distinguish overloaded functions; declaring two functions that differ only in their return type results in a compile-time error.
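For instance, the hypothetical overload set below is distinguished purely by the number and types of the parameters.

```cpp
#include <string>

std::string describe(int) { return "int"; }
std::string describe(double) { return "double"; }
std::string describe(const std::string&) { return "string"; }
std::string describe(int, int) { return "two ints"; }

// double describe(int);  // error: differs from the first overload only in return type
```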
When declaring a function, a programmer can specify for one or more parameters a default value. Doing so allows the parameters with defaults to optionally be omitted when the function is called, in which case the default arguments will be used. When a function is called with fewer arguments than there are declared parameters, explicit arguments are matched to parameters in left-to-right order, with any unmatched parameters at the end of the parameter list being assigned their default arguments. In many cases, specifying default arguments in a single function declaration is preferable to providing overloaded function definitions with different numbers of parameters.
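A minimal sketch, using a made-up <code>volume</code> function, shows how trailing parameters fall back to their defaults when arguments are omitted.

```cpp
// `width` and `height` default to 1 when the caller leaves them out.
int volume(int length, int width = 1, int height = 1) {
    return length * width * height;
}

// volume(2)       -> length = 2, width = 1, height = 1
// volume(2, 3)    -> length = 2, width = 3, height = 1
// volume(2, 3, 4) -> all three arguments given explicitly
```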
C++ templates enable generic programming. C++ supports function, class, alias, and variable templates. Templates may be parameterized by types, compile-time constants, and other templates. Templates are implemented by instantiation at compile-time. To instantiate a template, compilers substitute specific arguments for a template's parameters to generate a concrete function or class instance. Some substitutions are not possible; these are eliminated by an overload resolution policy described by the phrase "Substitution failure is not an error" (SFINAE). Templates are a powerful tool that can be used for generic programming, template metaprogramming, and code optimization, but this power implies a cost. Template use may increase object code size, because each template instantiation produces a copy of the template code, one for each set of template arguments; however, this is the same amount of code as, or less than, would be generated if the code were written by hand. This is in contrast to run-time generics seen in other languages (e.g., Java) where at compile-time the type is erased and a single template body is preserved.
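A function template and a class template parameterized by both a type and a compile-time constant might look like the sketch below. <code>largest</code> and <code>FixedStack</code> are illustrative names, not standard library facilities.

```cpp
// Function template: instantiated separately for each T it is used with.
template <typename T>
T largest(T a, T b) {
    return (a < b) ? b : a;
}

// Class template with a type parameter and a compile-time constant.
template <typename T, int Capacity>
struct FixedStack {
    T items[Capacity];
    int size = 0;

    // Returns false instead of overflowing when the stack is full.
    bool push(const T& value) {
        if (size == Capacity) return false;
        items[size++] = value;
        return true;
    }
};
```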
Templates are different from macros: while both of these compile-time language features enable conditional compilation, templates are not restricted to lexical substitution. Templates are aware of the semantics and type system of their companion language, as well as all compile-time type definitions, and can perform high-level operations including programmatic flow control based on evaluation of strictly type-checked parameters. Macros are capable of conditional control over compilation based on predetermined criteria, but cannot instantiate new types, recurse, or perform type evaluation and in effect are limited to pre-compilation text-substitution and text-inclusion/exclusion. In other words, macros can control compilation flow based on pre-defined symbols but cannot, unlike templates, independently instantiate new symbols. Templates are a tool for static polymorphism (see below) and generic programming.
In addition, templates are a compile-time mechanism in C++ that is Turing-complete, meaning that any computation expressible by a computer program can be computed, in some form, by a template metaprogram before runtime.
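The classic illustration of compile-time computation is a recursive factorial template, sketched here with a hypothetical <code>Factorial</code> struct; the <code>static_assert</code> is checked by the compiler, before the program ever runs.

```cpp
// Recursive case: Factorial<N> is defined in terms of Factorial<N - 1>.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

// Base case: an explicit specialization stops the recursion.
template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

static_assert(Factorial<5>::value == 120, "evaluated at compile time");
```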
In summary, a template is a compile-time parameterized function or class written without knowledge of the specific arguments used to instantiate it. After instantiation, the resulting code is equivalent to code written specifically for the passed arguments. In this manner, templates provide a way to decouple generic, broadly applicable aspects of functions and classes (encoded in templates) from specific aspects (encoded in template parameters) without sacrificing performance due to abstraction.
Templates in C++ provide a sophisticated mechanism for writing generic, polymorphic code (i.e. parametric polymorphism). In particular, through the curiously recurring template pattern, it is possible to implement a form of static polymorphism that closely mimics the syntax for overriding virtual functions. Because C++ templates are type-aware and Turing-complete, they can also be used to let the compiler resolve recursive conditionals and generate substantial programs through template metaprogramming. Contrary to some opinion, template code does not generate bulky code after compilation with the proper compiler settings.
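The curiously recurring template pattern can be sketched as follows. The names are hypothetical: the base class template calls into the derived class through a <code>static_cast</code>, so the call is resolved at compile time and no virtual function table is involved.

```cpp
// The base is parameterized by the class that derives from it.
template <typename Derived>
struct Shape {
    double area() const {
        // Static dispatch: the target is known at compile time.
        return static_cast<const Derived&>(*this).area_impl();
    }
};

struct Square : Shape<Square> {
    double side;
    explicit Square(double s) : side(s) {}
    double area_impl() const { return side * side; }
};

// Works for any type that derives from Shape<T> in this way.
template <typename T>
double twice_area(const Shape<T>& shape) {
    return 2.0 * shape.area();
}
```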
C++ provides support for anonymous functions, also known as lambda expressions, with the following form:
Since C++20, a lambda expression may also take an explicit template parameter list, written without the <code>template</code> keyword:
If the lambda takes no parameters, and no return type (returns <code>void</code>) or other specifiers are used, the () can be omitted; that is,
The return type of a lambda expression can be automatically inferred, if possible; e.g.:
The <code>[capture]</code> list supports the definition of closures. Such lambda expressions are defined in the standard as syntactic sugar for an unnamed function object.
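A few capture forms are sketched below with invented function names: capture by value, capture by reference, and a <code>mutable</code> lambda whose captured copy acts as private state of the closure.

```cpp
#include <functional>

// Capture by value with an explicit return type.
int add_offset(int value, int offset) {
    auto add = [offset](int x) -> int { return x + offset; };
    return add(value);
}

// Capture by reference: the lambda modifies the enclosing variable.
int sum_by_reference() {
    int total = 0;
    auto accumulate = [&total](int x) { total += x; };
    accumulate(2);
    accumulate(3);
    return total;
}

// A closure that outlives its creating function: `start` is copied into the
// function object, and `mutable` allows that copy to be modified per call.
std::function<int()> make_counter(int start) {
    return [start]() mutable { return start++; };
}
```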
Exception handling is used to communicate the existence of a runtime problem or error from where it was detected to where the issue can be handled. It permits this to be done in a uniform manner and separately from the main code, while detecting all errors. Should an error occur, an exception is thrown (raised), which is then caught by the nearest suitable exception handler. The exception causes the current scope to be exited, and also each outer scope (propagation) until a suitable handler is found, calling in turn the destructors of any objects in these exited scopes. At the same time, an exception is presented as an object carrying the data about the detected problem.
Some C++ style guides, such as Google's, LLVM's, and Qt's, forbid the usage of exceptions.
The exception-causing code is placed inside a <code>try</code> block. The exceptions are handled in separate <code>catch</code> blocks (the handlers); each <code>try</code> block can have multiple exception handlers, as shown in the example below.
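One possible sketch of a block with several handlers uses <code>std::stoi</code>, which throws <code>std::invalid_argument</code> or <code>std::out_of_range</code>. The wrapper function <code>classify</code> is hypothetical.

```cpp
#include <stdexcept>
#include <string>

std::string classify(const std::string& text) {
    try {
        int value = std::stoi(text);  // may throw one of two exception types
        return "parsed " + std::to_string(value);
    } catch (const std::invalid_argument&) {
        return "not a number";
    } catch (const std::out_of_range&) {
        return "too large";
    } catch (...) {
        // Catch-all handler for any other exception type.
        return "unknown error";
    }
}
```

Handlers are tried in order, so more specific types are listed before more general ones.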
It is also possible to raise exceptions purposefully, using the <code>throw</code> keyword; these exceptions are handled in the usual way. In some cases, exceptions cannot be used due to technical reasons. One such example is a critical component of an embedded system, where every operation must be guaranteed to complete within a specified amount of time. This cannot be determined with exceptions as no tools exist to determine the maximum time required for an exception to be handled. Unlike languages such as Java, C# and D, which only allow objects derived from a designated base class to be thrown (<code>Throwable</code> in Java and D, whose subclasses are <code>Error</code> and <code>Exception</code>; <code>System.Exception</code> in C#), C++ allows anything, both primitive types and objects, to be thrown and caught. C++ does not have an Error class like Java and D, but has an Exception class (<code>std::exception</code>). In Java and D, the distinction between Error and Exception is made in that Errors usually represent irrecoverable states, while Exceptions are more acceptable to catch and represent circumstances that are normal to occur throughout the execution of a program.
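The fact that any type can be thrown is illustrated in the sketch below, where the hypothetical function <code>thrown_kind</code> throws an <code>int</code>, a <code>std::string</code>, or a <code>std::runtime_error</code> and reports which handler caught it.

```cpp
#include <stdexcept>
#include <string>

std::string thrown_kind(int selector) {
    try {
        if (selector == 0) throw 42;                   // a primitive value
        if (selector == 1) throw std::string("text");  // an arbitrary object
        throw std::runtime_error("failure");           // derives from std::exception
    } catch (int) {
        return "int";
    } catch (const std::string&) {
        return "std::string";
    } catch (const std::exception& e) {
        return std::string("std::exception: ") + e.what();
    }
}
```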
Unlike signal handling, in which the handling function is called from the point of failure, exception handling exits the current scope before the catch block is entered, which may be located in the current function or any of the previous function calls currently on the stack.
Concepts are an extension to the templates feature provided by the C++ programming language. Concepts are named Boolean predicates on template parameters, evaluated at compile time. A concept may be associated with a template (class template, function template, member function of a class template, variable template, or alias template), in which case it serves as a constraint: it limits the set of arguments that are accepted as template parameters.
The main uses of concepts are:
There are five different places in a function template signature where a constraint can be used (labeled below from 1 through 5):
The constraint forms <code>Concept1</code> and <code>Concept2</code> can be used in all kinds of templates.
Traditionally (prior to C++20), code inclusion in C++ followed the ways of C, in which code was imported into another file using the preprocessor directive <code>#include</code>, which would copy the contents of the file into the other file.
Traditionally, C++ code would be divided between a header file (typically with extension <code>.h</code>, <code>.hpp</code>, or <code>.hh</code>) and a source file (typically with extension <code>.cpp</code> or <code>.cc</code>). The header file usually contained declarations of symbols while the source file contained the actual implementation, such as function implementations. This separation was often enforced because <code>#include</code>ing code into another file would result in it being reprocessed for each file it was included by, resulting in increased compilation times if the compiler had to reprocess the same source repeatedly.
Headers often also forced the usage of <code>#include</code> guards or <code>#pragma once</code> to prevent a header from potentially being included into a file multiple times.
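A header protected by an include guard might look like the following sketch. The file name <code>my_math.h</code>, the macro <code>MY_MATH_H</code> and the function <code>twice</code> are all made up for illustration.

```cpp
// my_math.h (hypothetical header)
#ifndef MY_MATH_H
#define MY_MATH_H

// Everything between the guard lines is seen at most once per translation
// unit, even if the header is included several times.
inline int twice(int value) { return 2 * value; }

#endif  // MY_MATH_H

// Widely supported non-standard alternative: a single "#pragma once" line
// at the top of the header instead of the three guard lines.
```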
The C++ standard library remains accessible through headers; since C++23, however, it has been made accessible using modules as well. Even with the introduction of modules, headers continue to play a role in modern C++, as existing codebases have not completely migrated to modules.
Headers are traditionally included via textual inclusion by the preprocessor using <code>#include</code>, while modules are included during compilation through <code>import</code>. However, headers may also be imported using <code>import</code>, even if they are not declared as modules; these are called "header units", and they are designed to allow existing codebases to migrate from headers to modules more gradually. The syntax is similar to including a header, with the difference being that <code>#include</code> is replaced with <code>import</code> and a semicolon is placed at the end of the statement. Header units automatically export all symbols, and differ from proper modules in that they allow the export of macros, meaning all who import the header unit will obtain its contained macros. This minimises breakage during migration to modules. The semantics of searching for the file depending on whether quotation marks or angle brackets are used apply here as well. For instance, one may write <code>import <string>;</code> to import the <code><string></code> header, or <code>import "MyHeader.h";</code> to import the file <code>"MyHeader.h"</code> as a header unit. Most build systems, such as CMake, do not support this feature yet.
C++26 adds the <code>#embed</code> preprocessor directive, for binary resource inclusion. The <code>#embed</code> directive can be used to embed binary content into a file, even if it is not valid C++ code.
Modules do not use the C preprocessor at all, and are instead handled directly by the compiler. A module is declared using <code>export module</code>, and the global module fragment, where present, begins with <code>module;</code>. Exported symbols which will be made accessible to importing translation units are marked <code>export</code>, and a module is imported into the translation unit using <code>import</code>. Modules do not export macros, due to being handled after the preprocessing step.
Modules may also have partitions, which cannot be imported individually but are owned by a larger module.
Since C++11, C++ has supported attribute specifier sequences. Attributes can be applied to any symbol that supports them, including classes, functions/methods, and variables, and any symbol marked with an attribute will be specifically treated by the compiler as necessary. These can be thought of as similar to Java annotations for providing additional information to the compiler; however, they differ in that attributes in C++ are not metadata that is meant to be accessed using reflection. C++26 adds support for annotations for reflection. Furthermore, one cannot create custom attributes in C++, unlike in Java where one may define custom annotations in addition to the standard ones. However, C++ does have implementation/vendor-specific attributes which are non-standard. These typically have a namespace associated with them. For instance, GCC and Clang have attributes under the <code>gnu::</code> namespace, and all such attributes are of the form <code>[[gnu::name]]</code>, where <code>name</code> is the name of the attribute.
One may apply multiple attributes as a list, for instance <code>[[A, B, C]]</code> (where <code>A</code>, <code>B</code>, and <code>C</code> are attributes). Furthermore, attributes may also accept arguments, like <code>[[deprecated("reason")]]</code>. The following is an example of using some attributes in C++.
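A sketch using several standard attributes is given below; all function names are hypothetical. <code>[[noreturn]]</code> dates from C++11, <code>[[deprecated]]</code> from C++14, and <code>[[nodiscard]]</code>, <code>[[maybe_unused]]</code> and <code>[[fallthrough]]</code> from C++17.

```cpp
#include <stdexcept>

// The compiler is encouraged to warn if the result is ignored.
[[nodiscard]] int compute_checksum(int a, int b) { return a ^ b; }

// An attribute with an argument: calls produce a deprecation warning.
[[deprecated("use compute_checksum instead")]]
int old_checksum(int a, int b) { return compute_checksum(a, b); }

// Tells the compiler that this function never returns normally.
[[noreturn]] void fail(const char* message) {
    throw std::runtime_error(message);
}

int handle(int code) {
    [[maybe_unused]] int original = code;  // no "unused variable" warning
    switch (code) {
        case 0:
            code += 1;
            [[fallthrough]];  // falling into the next case is intentional
        case 1:
            return code + 10;
        default:
            return -1;
    }
}
```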
The C++ standard defines the following attributes:
As mentioned previously, GCC and Clang have scoped (namespaced) attributes, such as <code>[[gnu::always_inline]]</code>, <code>[[gnu::hot]]</code>, and <code>[[gnu::const]]</code>. To apply multiple scoped attributes, one may write:
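As a hedged sketch, two GNU-namespaced attributes can be listed in one specifier as shown below. The function <code>square</code> is illustrative, and the attributes are vendor extensions: compilers that do not recognise them are expected to ignore them, typically with a warning.

```cpp
// Each attribute in the list carries its own namespace prefix.
[[gnu::always_inline, gnu::hot]]
inline int square(int x) {
    return x * x;
}

// Since C++17 the prefix can be stated once for the whole list:
//   [[using gnu: always_inline, hot]] inline int square(int x);
```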
In addition to basic metaprogramming provided in header <code><type_traits></code>, C++26 introduces compile-time reflection. Compile-time reflection capabilities can be accessed in header <code><meta></code> and declarations are stored in namespace <code>std::meta</code>.
Most declarations can have annotations attached, which are just values associated with that declaration. Like Java annotations, annotations can be accessed using reflection. Annotations are different from attributes as attributes are primarily a means to communicate information to the compiler, while annotations are a feature of reflection and allow arbitrary constants and metadata to be attached, making them customisable to programs, unlike attributes. This allows for bridging the communication between the library API and the user.
Annotations have no inherent meaning of their own; they take effect only when an implementation or library inspects them to identify particular characteristics and features.
Creating an annotation to generate a specialisation for <code>std::formatter<T></code> is as follows: