We have an actor service with two partitions. All of its idle secondary replicas are in a Warning state, with the following message shown in Service Fabric Explorer:
Unhealthy event: SourceId='System.RA', Property='ReplicaOpenStatus', HealthState='Warning', ConsiderWarningAsError=false.
Replica had multiple failures during open on _cp_3. API call: IStatefulServiceReplica.Open(); Error = System.ArgumentException (-2147024809)
An item with the same key has already been added.
at System.RuntimeTypeHandle.GetTypeByName(String name, Boolean throwOnError, Boolean ignoreCase, Boolean reflectionOnly, StackCrawlMarkHandle stackMark, IntPtr pPrivHostBinder, Boolean loadTypeFromPartialName, ObjectHandleOnStack type)
at System.RuntimeTypeHandle.GetTypeByName(String name, Boolean throwOnError, Boolean ignoreCase, Boolean reflectionOnly, StackCrawlMark& stackMark, IntPtr pPrivHostBinder, Boolean loadTypeFromPartialName)
at System.RuntimeType.GetType(String typeName, Boolean throwOnError, Boolean ignoreCase, Boolean reflectionOnly, StackCrawlMark& stackMark)
at System.Type.GetType(String typeName, Boolean throwOnError)
at System.Type.GetType(String typeName, Boolean throwOnError)
at System.Fabric.BackupRestore.BackupRestoreManagerFactory.GetBackupRestoreManager(IBackupRestoreReplica replica)
at System.Fabric.BackupRestore.BackupRestoreManagerFactory.GetBackupRestoreManager(IBackupRestoreReplica replica)
at System.Fabric.KeyValueStoreReplica..ctor(String storeName, LocalStoreSettings localStoreSettings, ReplicatorSettings replicatorSettings, KeyValueStoreReplicaSettings kvsSettings)
at System.Fabric.KeyValueStoreReplica..ctor(String storeName, LocalStoreSettings localStoreSettings, ReplicatorSettings replicatorSettings, KeyValueStoreReplicaSettings kvsSettings)
at Microsoft.ServiceFabric.Actors.Runtime.KvsActorStateProvider.OnCreateAndInitializeReplica(StatefulServiceInitializationParameters initParams, Action`1 copyHandler, Action`1 replicationHandler, Func`2 onDataLossHandler, Func`2 restoreCompletedHandler)
at Microsoft.ServiceFabric.Actors.Runtime.KvsActorStateProviderBase.Microsoft.ServiceFabric.Data.IStateProviderReplica.Initialize(StatefulServiceInitializationParameters initializationParameters)
at System.Fabric.ServiceFactoryBroker.CreateHelper[TFactory,TReturnValue](IntPtr nativeServiceType, IntPtr nativeServiceName, UInt32 initializationDataLength, IntPtr nativeInitializationData, Guid partitionId, Func`3 creationFunc, Action`2 initializationFunc, ServiceInitializationParameters initializationParameters)
For more information see: http://aka.ms/sfhealth
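We first hit this error when moving from one partition to two. Deactivating the node and deleting its data is only a temporary fix, because the problem reappears on the next deployment.
The problem causes Service Fabric to keep trying to start the service, which is stuck in a permanent loop of start, terminate, repeat.
As I understand it, a replica in the IdleSecondary state is currently receiving data from other nodes so that it can be promoted to ActiveSecondary. It looks to me like the problem is related to fetching the actors' data from other nodes.
What causes this problem, and how can I prevent it in the future?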