@Beta public class HighAvailabilityManagerImpl extends java.lang.Object implements HighAvailabilityManager
Multiple Brooklyn nodes can be started to form a single management plane, where one node is
designated master and the others are "warm standbys". On termination or failure of the master,
the standbys deterministically decide which standby should become master (see MasterChooser).
That standby then promotes itself.
The management nodes communicate their health/status via the ManagementPlaneSyncRecordPersister.
For example, when using ManagementPlaneSyncRecordPersisterToObjectStore with a shared blobstore or
filesystem/NFS mount, each management node periodically writes its state.
This acts as a heartbeat, which the other management nodes read.
Promotion to master involves calling
RebindManager.rebind(ClassLoader, brooklyn.entity.rebind.RebindExceptionHandler, ManagementNodeState)
to read all persisted entity state, and thus reconstitute the entities.
Future improvements in this area will include Brooklyn managing Brooklyn, so that it can decide on and promote the standby.
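The heartbeat and election behaviour described above can be illustrated with a minimal, self-contained sketch. It is not Brooklyn's implementation: `NodeRecord`, `isHealthy` and `chooseMaster` are hypothetical names, and the rule "highest priority wins, ties go to the smallest node id" is an assumption standing in for whatever the configured MasterChooser actually does. The only point it makes is that each standby, reading the same heartbeat records, arrives at the same answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Illustrative only: none of these types exist in Brooklyn. */
class HeartbeatElectionSketch {

    /** Hypothetical stand-in for one node's entry in the shared sync record. */
    static final class NodeRecord {
        final String nodeId;
        final long priority;
        final long lastHeartbeatMillis;

        NodeRecord(String nodeId, long priority, long lastHeartbeatMillis) {
            this.nodeId = nodeId;
            this.priority = priority;
            this.lastHeartbeatMillis = lastHeartbeatMillis;
        }
    }

    /** A node counts as healthy while its last heartbeat is within the timeout. */
    static boolean isHealthy(NodeRecord node, long nowMillis, long heartbeatTimeoutMillis) {
        return nowMillis - node.lastHeartbeatMillis <= heartbeatTimeoutMillis;
    }

    /**
     * Assumed rule: highest priority wins, ties go to the smallest node id.
     * Any rule works as long as every standby computes the same answer
     * from the same records.
     */
    static Optional<String> chooseMaster(List<NodeRecord> records, long nowMillis,
                                         long heartbeatTimeoutMillis) {
        NodeRecord best = null;
        for (NodeRecord candidate : records) {
            if (!isHealthy(candidate, nowMillis, heartbeatTimeoutMillis)) {
                continue; // stale heartbeat: treat the node as failed
            }
            boolean better = best == null
                    || candidate.priority > best.priority
                    || (candidate.priority == best.priority
                        && candidate.nodeId.compareTo(best.nodeId) < 0);
            if (better) {
                best = candidate;
            }
        }
        return best == null ? Optional.<String>empty() : Optional.of(best.nodeId);
    }

    public static void main(String[] args) {
        List<NodeRecord> records = Arrays.asList(
                new NodeRecord("node-a", 0, 1_000),   // former master, heartbeat is stale
                new NodeRecord("node-b", 0, 29_000),
                new NodeRecord("node-c", 1, 29_500));
        // At t=40s with a 30s timeout, node-a is excluded and the others compete.
        System.out.println(chooseMaster(records, 40_000, 30_000));
    }
}
```

In this model the timeout plays the role of HEARTBEAT_TIMEOUT: a very large value means a silent master is never considered failed, which corresponds to the documented use of Duration.PRACTICALLY_FOREVER to prevent failover.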
Modifier and Type | Class and Description |
---|---|
static interface | HighAvailabilityManagerImpl.PromotionListener |
Modifier and Type | Field and Description |
---|---|
ConfigKey<Duration> | HEARTBEAT_TIMEOUT |
ConfigKey<Duration> | POLL_PERIOD |
Constructor and Description |
---|
HighAvailabilityManagerImpl(ManagementContextInternal managementContext) |
Modifier and Type | Method and Description |
---|---|
void | changeMode(HighAvailabilityMode startMode) - Changes the mode that this HA server is running in. |
void | changeMode(HighAvailabilityMode startMode, boolean preventElectionOnExplicitStandbyMode, boolean failOnExplicitModesIfUnusual) |
void | disabled() - Indicates that HA is disabled: this node will act as the only management node in this management plane, and will not persist HA meta-information (meaning other nodes cannot join). |
Duration | getHeartbeatTimeout() |
ManagementPlaneSyncRecord | getLastManagementPlaneSyncRecord() - Returns a snapshot of the management plane's current / most-recently-known status, as last read from HighAvailabilityManager.loadManagementPlaneSyncRecord(boolean), or null if none read. |
long | getLastStateChange() - The time in milliseconds when the state was last changed. |
ManagementPlaneSyncRecord | getManagementPlaneSyncState() |
java.util.Map<java.lang.String,java.lang.Object> | getMetrics() - Returns a collection of metrics. |
ManagementNodeState | getNodeState() |
ManagementPlaneSyncRecordPersister | getPersister() |
long | getPriority() |
ManagementNodeState | getTransitionTargetNodeState() - Returns the node state this node is trying to be in. |
boolean | isRunning() - Whether HA mode is operational. |
ManagementPlaneSyncRecord | loadManagementPlaneSyncRecord(boolean useLocalKnowledgeForThisNode) |
void | publishAndCheck(boolean initializing) - Invoked manually when initializing, and periodically thereafter. |
void | publishClearNonMaster() - Deletes non-master node records; active nodes (including this one) will republish, so this provides a simple way to clean out the cache of dead Brooklyn nodes. |
HighAvailabilityManagerImpl | setHeartbeatTimeout(Duration val) - Overrides HEARTBEAT_TIMEOUT from Brooklyn config, including e.g. Duration.PRACTICALLY_FOREVER to prevent failover due to heartbeat absence; or null to clear a local override. |
HighAvailabilityManagerImpl | setLocalTicker(com.google.common.base.Ticker val) - A ticker that reads in milliseconds, for populating local timestamps. |
HighAvailabilityManagerImpl | setMasterChooser(MasterChooser val) |
HighAvailabilityManagerImpl | setPersister(ManagementPlaneSyncRecordPersister persister) |
HighAvailabilityManagerImpl | setPollPeriod(Duration val) - Overrides POLL_PERIOD from Brooklyn config, including e.g. Duration.PRACTICALLY_FOREVER to disable polling; or null to clear a local override. |
void | setPriority(long priority) - Sets the priority, and publishes it synchronously so it is canonical. |
HighAvailabilityManagerImpl | setPromotionListener(HighAvailabilityManagerImpl.PromotionListener val) |
HighAvailabilityManagerImpl | setRemoteTicker(com.google.common.base.Ticker val) - A ticker that reads in milliseconds, for overriding remote timestamps. |
void | start(HighAvailabilityMode startMode) - Starts the monitoring of other nodes (and thus potential promotion of this node from standby to master). |
void | stop() - Stops this node, then publishes its own status (via ManagementPlaneSyncRecordPersister) of ManagementNodeState.TERMINATED. |
java.lang.String | toString() |
public HighAvailabilityManagerImpl(ManagementContextInternal managementContext)
public HighAvailabilityManagerImpl setPersister(ManagementPlaneSyncRecordPersister persister)
Specified by: setPersister in interface HighAvailabilityManager
public ManagementPlaneSyncRecordPersister getPersister()
Specified by: getPersister in interface HighAvailabilityManager
public HighAvailabilityManagerImpl setPollPeriod(Duration val)
Overrides POLL_PERIOD from Brooklyn config, including e.g. Duration.PRACTICALLY_FOREVER to disable polling;
or null to clear a local override.
public HighAvailabilityManagerImpl setMasterChooser(MasterChooser val)
public Duration getHeartbeatTimeout()
public HighAvailabilityManagerImpl setHeartbeatTimeout(Duration val)
Overrides HEARTBEAT_TIMEOUT from Brooklyn config, including e.g. Duration.PRACTICALLY_FOREVER to prevent failover due to heartbeat absence;
or null to clear a local override.
public HighAvailabilityManagerImpl setLocalTicker(com.google.common.base.Ticker val)
public HighAvailabilityManagerImpl setRemoteTicker(com.google.common.base.Ticker val)
A ticker that reads in milliseconds, for overriding remote timestamps.
If this is supplied, one must also set ManagementPlaneSyncRecordPersisterToObjectStore#useRemoteTimestampInMemento().
public HighAvailabilityManagerImpl setPromotionListener(HighAvailabilityManagerImpl.PromotionListener val)
public boolean isRunning()
Description copied from interface: HighAvailabilityManager
Whether HA mode is operational.
Specified by: isRunning in interface HighAvailabilityManager
public void disabled()
Description copied from interface: HighAvailabilityManager
Indicates that HA is disabled: this node will act as the only management node in this management plane,
and will not persist HA meta-information (meaning other nodes cannot join).
Subsequently one can expect HighAvailabilityManager.getNodeState() to be ManagementNodeState.MASTER
and HighAvailabilityManager.loadManagementPlaneSyncRecord(boolean) to show just this one node
(as if it were running HA with just one node),
but HighAvailabilityManager.isRunning() will return false.
Currently this method is intended to be called early in the lifecycle,
instead of HighAvailabilityManager.start(HighAvailabilityMode). It may be an error if
this is called after this HA Manager is started.
Specified by: disabled in interface HighAvailabilityManager
public void start(HighAvailabilityMode startMode)
Description copied from interface: HighAvailabilityManager
Starts the monitoring of other nodes (and thus potential promotion of this node from standby to master).
When this method returns, the status of this node will be set,
either ManagementNodeState.MASTER if appropriate
or ManagementNodeState.STANDBY / ManagementNodeState.HOT_STANDBY / ManagementNodeState.HOT_BACKUP.
Specified by: start in interface HighAvailabilityManager
Parameters: startMode - mode to start with
public void changeMode(HighAvailabilityMode startMode)
Description copied from interface: HighAvailabilityManager
Changes the mode that this HA server is running in.
Note that it will be an error to call HighAvailabilityManager.changeMode(HighAvailabilityMode)
to change to ManagementNodeState.MASTER when there is already a master.
To promote a node explicitly, set its priority higher than the others
and invoke HighAvailabilityManager.changeMode(HighAvailabilityMode)
to a standby mode on the existing master.
Specified by: changeMode in interface HighAvailabilityManager
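The explicit-promotion procedure described for changeMode (raise the target's priority, then move the existing master to a standby mode) can be sketched with a small in-memory model. Everything here is hypothetical: `Node`, `Plane`, the two-value `Mode` enum and the "highest priority wins" election are assumptions made for illustration, and only the method names `setPriority` and `changeMode` echo the documented API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a toy model, not Brooklyn's HA implementation. */
class ExplicitPromotionSketch {

    enum Mode { MASTER, STANDBY }

    /** Hypothetical in-memory management node. */
    static final class Node {
        final String id;
        long priority;
        Mode mode;

        Node(String id, long priority, Mode mode) {
            this.id = id;
            this.priority = priority;
            this.mode = mode;
        }

        void setPriority(long priority) {
            this.priority = priority;
        }
    }

    /** Hypothetical plane that re-elects when the master steps down. */
    static final class Plane {
        final List<Node> nodes = new ArrayList<>();

        void changeMode(Node node, Mode target) {
            if (target == Mode.MASTER) {
                // Mirrors the documented rule: asking for MASTER while another
                // master exists is an error.
                for (Node other : nodes) {
                    if (other != node && other.mode == Mode.MASTER) {
                        throw new IllegalStateException("master already exists: " + other.id);
                    }
                }
                node.mode = Mode.MASTER;
                return;
            }
            boolean wasMaster = node.mode == Mode.MASTER;
            node.mode = target;
            if (wasMaster) {
                electHighestPriority();
            }
        }

        /** Assumed election rule: the node with the highest priority becomes master. */
        private void electHighestPriority() {
            Node best = null;
            for (Node n : nodes) {
                if (best == null || n.priority > best.priority) {
                    best = n;
                }
            }
            if (best != null) {
                best.mode = Mode.MASTER;
            }
        }
    }

    /** Step 1: raise the target's priority. Step 2: demote the current master. */
    static void promote(Plane plane, Node currentMaster, Node target) {
        long highest = Long.MIN_VALUE;
        for (Node n : plane.nodes) {
            highest = Math.max(highest, n.priority);
        }
        target.setPriority(highest + 1); // assumes priorities are far from Long.MAX_VALUE
        plane.changeMode(currentMaster, Mode.STANDBY);
    }
}
```

The ordering matters in this model: the priority is raised before the master steps down, so the election triggered by the demotion already sees the intended winner.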
@Beta public void changeMode(HighAvailabilityMode startMode, boolean preventElectionOnExplicitStandbyMode, boolean failOnExplicitModesIfUnusual)
public void setPriority(long priority)
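Description copied from interface: HighAvailabilityManager
Sets the priority, and publishes it synchronously so it is canonical.
Specified by: setPriority in interface HighAvailabilityManager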
public long getPriority()
Specified by: getPriority in interface HighAvailabilityManager
public void stop()
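Description copied from interface: HighAvailabilityManager
Stops this node, then publishes its own status (via ManagementPlaneSyncRecordPersister) of ManagementNodeState.TERMINATED.
Specified by: stop in interface HighAvailabilityManager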
public ManagementNodeState getTransitionTargetNodeState()
public ManagementNodeState getNodeState()
Specified by: getNodeState in interface HighAvailabilityManager
public ManagementPlaneSyncRecord getLastManagementPlaneSyncRecord()
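Description copied from interface: HighAvailabilityManager
Returns a snapshot of the management plane's current / most-recently-known status,
as last read from HighAvailabilityManager.loadManagementPlaneSyncRecord(boolean), or null if none read.
Specified by: getLastManagementPlaneSyncRecord in interface HighAvailabilityManager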
public ManagementPlaneSyncRecord getManagementPlaneSyncState()
Specified by: getManagementPlaneSyncState in interface HighAvailabilityManager
public void publishAndCheck(boolean initializing)
public void publishClearNonMaster()
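Description copied from interface: HighAvailabilityManager
Deletes non-master node records; active nodes (including this one) will republish,
so this provides a simple way to clean out the cache of dead Brooklyn nodes.
Specified by: publishClearNonMaster in interface HighAvailabilityManager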
public ManagementPlaneSyncRecord loadManagementPlaneSyncRecord(boolean useLocalKnowledgeForThisNode)
Specified by: loadManagementPlaneSyncRecord in interface HighAvailabilityManager
Parameters: useLocalKnowledgeForThisNode - if true, the record for this management node will be replaced with the
actual current status known in this JVM (may be more recent than what is persisted);
for most purposes there is little difference, but in some cases the local node being updated
may be explicitly wanted or not wanted
public java.lang.String toString()
Overrides: toString in class java.lang.Object
public java.util.Map<java.lang.String,java.lang.Object> getMetrics()
Description copied from interface: HighAvailabilityManager
Returns a collection of metrics.
Specified by: getMetrics in interface HighAvailabilityManager
public long getLastStateChange()
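Description copied from interface: HighAvailabilityManager
The time in milliseconds when the state was last changed.
Specified by: getLastStateChange in interface HighAvailabilityManager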