9.6. Management of MySQL ClusterManaging a MySQL Cluster involves a number of tasks, the first of which is to configure and start MySQL Cluster. This is covered in Section 9.4, "MySQL Cluster Configuration," and Section 9.5, "Process Management in MySQL Cluster." The following sections cover the management of a running MySQL Cluster. There are essentially two methods of actively managing a running MySQL Cluster. The first of these is through the use of commands entered into the management client whereby cluster status can be checked, log levels changed, backups started and stopped, and nodes stopped and started. The second method involves studying the contents of the cluster log ndb_node_id_cluster.log in the management server's DataDir directory. (Recall that node_id represents the unique identifier of the node whose activity is being logged.) The cluster log contains event reports generated by ndbd. It is also possible to send cluster log entries to a Unix system log. 9.6.1. MySQL Cluster Startup PhasesThis section describes the steps involved when the cluster is started. There are several different startup types and modes, as shown here:
Prior to startup, each data node (ndbd process) must be initialized. Initialization consists of the following steps:
After each data node has been initialized, the cluster startup process can proceed. The stages which the cluster goes through during this process are listed here:
After this process is completed for an initial start or system restart, transaction handling is enabled. For a node restart or initial node restart, completion of the startup process means that the node may now act as a transaction coordinator. 9.6.2. Commands in the Management ClientIn addition to the central configuration file, a cluster may also be controlled through a command-line interface available through the management client ndb_mgm. This is the primary administrative interface to a running cluster. Commands for the event logs are given in Section 9.6.3, "Event Reports Generated in MySQL Cluster." Commands for creating backups and restoring from backup are provided in Section 9.6.5, "Online Backup of MySQL Cluster." The management client has the following basic commands. In the listing that follows, node_id denotes either a database node ID or the keyword ALL, which indicates that the command should be applied to all of the cluster's data nodes.
9.6.3. Event Reports Generated in MySQL ClusterIn this section, we discuss the types of event logs provided by MySQL Cluster, and the types of events that are logged. MySQL Cluster provides two types of event log. These are the cluster log, which includes events generated by all cluster nodes, and node logs, which are local to each data node. Output generated by cluster event logging can have multiple destinations including a file, the management server console window, or syslog. Output generated by node event logging is written to the data node's console window. Both types of event logs can be set to log different subsets of events. Note: The cluster log is the log recommended for most uses because it provides logging information for an entire cluster in a single file. Node logs are intended to be used only during application development, or for debugging application code. Each reportable event can be distinguished according to three different criteria:
Both the cluster log and the node log can be filtered on these properties. 9.6.3.1. Logging Management CommandsThe following management commands are related to the cluster log:
The following table describes the default setting (for all data nodes) of the cluster log category threshold. If an event has a priority with a value lower than or equal to the priority threshold, it is reported in the cluster log. Note that events are reported per data node, and that the threshold can be set to different values on different nodes.
Thresholds are used to filter events within each category. For example, a STARTUP event with a priority of 3 is not logged unless the threshold for STARTUP is changed to 3 or lower. Only events with priority 3 or lower are sent if the threshold is 3. The following table shows the event severity levels. (Note: These correspond to Unix syslog levels, except for LOG_EMERG and LOG_NOTICE, which are not used or mapped.)
Event severity levels can be turned on or off (using CLUSTERLOG FILTERsee above). If a severity level is turned on, all events with a priority less than or equal to the category thresholds are logged. If the severity level is turned off, no events belonging to that severity level are logged. 9.6.3.2. Log EventsAn event report reported in the event logs has the following format: datetime [string] severity -- message For example: 09:19:30 2005-07-24 [NDB] INFO -- Node 4 Start phase 4 completed This section discusses all reportable events, ordered by category and severity level within each category. In the event descriptions, GCP and LCP mean "Global Checkpoint" and "Local Checkpoint," respectively. Connection EventsThese events are associated with connections between Cluster nodes.
Checkpoint EventsThe logging messages shown here are associated with checkpoints.
Startup EventsThe following events are generated in response to the startup of a node or of the cluster and of its success or failure. They also provide information relating to the progress of the startup process, including information concerning logging activities.
Noderestart EventsThe following events are generated when restarting a node and relate to the success or failure of the node restart process.
Statistics EventsThe following events are of a statistical nature. They provide information such as numbers of transactions and other operations, amount of data sent or received by individual nodes, and memory usage.
Error EventsThese events relate to Cluster errors and warnings. The presence of one or more of these generally indicates that a major malfunction or failure has occurred.
INFO EventsThese events provide general information about the state of the cluster and activities associated with Cluster maintenance, such as logging and heartbeat transmission.
9.6.4. Single-User ModeSingle-user mode allows the database administrator to restrict access to the database system to a single MySQL server (SQL node). When entering single-user mode, all connections to all other MySQL servers are closed gracefully and all running transactions are aborted. No new transactions are allowed to be started. Once the cluster has entered single-user mode, only the designated SQL node is granted access to the database. You can use the ALL STATUS command to see when the cluster has entered single-user mode. Example: NDB> ENTER SINGLE USER MODE 5
After this command has executed and the cluster has entered single-user mode, the SQL node whose node ID is 5 becomes the cluster's only permitted user. The node specified in the preceding command must be a MySQL Server node; An attempt to specify any other type of node will be rejected. Note: When the preceding command is invoked, all transactions running on the designated node are aborted, the connection is closed, and the server must be restarted. The command EXIT SINGLE USER MODE changes the state of the cluster's data nodes from single-user mode to normal mode. MySQL Servers waiting for a connection (that is, for the cluster to become ready and available), are again permitted to connect. The MySQL Server denoted as the single-user SQL node continues to run (if still connected) during and after the state change. Example: NDB> EXIT SINGLE USER MODE
There are two recommended ways to handle a node failure when running in single-user mode:
9.6.5. Online Backup of MySQL ClusterThis section describes how to create a backup and how to restore the database from a backup at a later time. 9.6.5.1. Cluster Backup ConceptsA backup is a snapshot of the database at a given time. The backup consists of three main parts:
Each of these parts is saved on all nodes participating in the backup. During backup, each node saves these three parts into three files on disk:
In the listing above, backup_id stands for the backup identifier and node_id is the unique identifier for the node creating the file. 9.6.5.2. Using the Management Server to Create a BackupBefore starting a backup, make sure that the cluster is properly configured for performing one. (See Section 9.6.5.4, "Configuration for Cluster Backup.") Creating a backup using the management server involves the following steps:
To abort a backup that is already in progress:
It is also possible to abort a backup in progress from the system shell using this command: shell> ndb_mgm -e "ABORT BACKUP backup_id" Note: If there is no backup with ID backup_id running when it is aborted, the management server makes no explicit response. However, the fact that an invalid abort command was sent is indicated in the cluster log. 9.6.5.3. How to Restore a Cluster BackupThe cluster restoration program is implemented as a separate command-line utility ndb_restore, which reads the files created by the backup and inserts the stored information into the database. The restore program must be executed once for each set of backup files. That is, as many times as there were database nodes running when the backup was created. The first time you run the ndb_restore restoration program, you also need to restore the metadata. In other words, you must re-create the database tables. (Note that the cluster should have an empty database when starting to restore a backup.) The restore program acts as an API to the cluster and therefore requires a free connection to connect to the cluster. This can be verified with the ndb_mgm command SHOW (you can accomplish this from a system shell using ndb_mgm -e SHOW). The -c connectstring option may be used to locate the MGM node (see Section 9.4.4.2, "The MySQL Cluster connectstring." for information on connectstrings). The backup files must be present in the directory given as an argument to the restoration program. It is possible to restore a backup to a database with a different configuration than it was created from. For example, suppose that a backup with backup ID 12, created in a cluster with two database nodes having the node IDs 2 and 3, is to be restored to a cluster with four nodes. Then ndb_restore must be run twiceonce for each database node in the cluster where the backup was taken. Note: For rapid restoration, the data may be restored in parallel, provided that there is a sufficient number of cluster connections available. However, the data files must always be applied before the logs. 9.6.5.4. Configuration for Cluster BackupFour configuration parameters are essential for backup:
More detailed information about these parameters can be found in Section 9.4, "MySQL Cluster Configuration." 9.6.5.5. Backup TroubleshootingIf an error code is returned when issuing a backup request, the most likely cause is insufficient memory or insufficient disk space. You should check that there is enough memory allocated for the backup. Also check that there is enough space on the hard drive partition of the backup target. NDB does not support repeatable reads, which can cause problems with the restoration process. Although the backup process is "hot," restoring a MySQL Cluster from backup is not a 100% "hot" process. This is due to the fact that, for the duration of the restore process, running transactions get non-repeatable reads from the restored data. This means that the state of the data is inconsistent while the restore is in progress. |