SAP Knowledge Base Article - Preview

2712064 - SAP HANA System Replication Error port 4#### already in use

Symptom

  • You are setting up SAP HANA System Replication.
  • Registering the secondary site fails with the following error:
    nameserver <secondary_site_hostname>:30001 not responding.
    collecting information ...
    unable to contact primary site host xx.xxx.xx.xx (<primary_site_hostname>):40006. internal error,location=xx.xxx.xx.xx:40002. Trying old-style port (port offset +100)...xx.xxx.xx.xx (<primary_site_hostname>):40006
    error: unable to contact primary site; to xx.xxx.xx.xx (<primary_site_hostname>):30106; original error: internal error,location=xx.xxx.xx.xx:30102;
    failed. trace file nameserver_<secondary_site_hostname>.00000.000.trc may contain more error details.
  • The trace file nameserver_<hostname>.00000.000.trc contains entries similar to the following:
    [119501]{-1}[-1/-1] 2020-06-21 08:09:56.980152 e commlib commlibImpl.cpp(00969) : ERROR: comm::connect to Host: 127.0.0.1, port: 30001, Error: exception 1: no.2110017 (Basis/IO/Stream/impl/NetworkChannel.cpp:3038)
    System error: SO_ERROR has pending error for socket. rc=111: Connection refused. channel={<NetworkChannelSSLFilter>={<NetworkChannelBase>={this=140527165233176, fd=4, refCnt=1, local=127.0.0.1/52257_tcp, remote=127.0.0.1/30001_tcp, state=ConnectWait, pending=[----]}}}; $Context$=[e6249f376cbd0016,127.0.0.1:52257,127.0.0.1:30001,TRN,0]; $Context$=[e6249f376cbd0016,127.0.0.1:52257,127.0.0.1:30001,TRN,0]; $channel$={<NetworkChannelSSLFilter>={<NetworkChannelBase>={this=140527165233176, fd=INVALID, refCnt=1, local=127.0.0.1/52257_tcp, remote=127.0.0.1/30001_tcp, state=Closed, pending=[----]}}}
    exception throw location:
    1: 0x00007fcf3126b279 in .LTHUNK27.lto_priv.2295+0x4c5 at NetworkChannel.cpp:3038 (libhdbbasis.so)
    2: 0x00007fcf31265e5f in Stream::NetworkChannelSSLFilter::initiateConnection()+0x5b at NetworkChannelSSLFilter.cpp:204 (libhdbbasis.so)
    3: 0x00007fcf31265eb9 in Stream::NetworkChannelSSLFilter::initClientChannel()+0x5 at NetworkChannelSSLFilter.cpp:75 (libhdbbasis.so)
    4: 0x00007fcf3129680e in Stream::NetworkChannelManager::connect(NetworkAccess::NetworkAddress const*, NetworkAccess::NetworkAddress const&, Stream::NetworkChannelParameters const&, bool, ltt::smartptr_handle<Stream::ChannelCallback>*, int, Stream::NetworkChannelComponent)+0x18a at NetworkChannelManager.cpp:180 (libhdbbasis.so)
    5: 0x00007fcf31296f86 in Stream::NetworkChannelManager::connect(NetworkAccess::NetworkAddress const*, NetworkAccess::NetworkAddress const&, Stream::NetworkChannelParameters const&, ltt::smartptr_handle<Stream::ChannelCallback>*, int, Stream::NetworkChannelComponent)+0x12 at NetworkChannelManager.cpp:127 (libhdbbasis.so)
    6: 0x00007fcf33632cf6 in comm::connect(void*, char const*, unsigned short, int, Crypto::Configuration*, Stream::NetworkChannelComponent)+0x252 at commlibImpl.cpp:957 (libhdbbasement.so)
    7: 0x00007fcf338513a1 in TrexNet::Channel::open(char const*, ltt::smartptr_handle<Crypto::Configuration>&)+0x2e0 at Channel.cpp:298 (libhdbbasement.so)
    8: 0x00007fcf33852370 in TrexNet::ServerRep::openNewChannel(char const*, ltt::smartptr_handle<Crypto::Configuration>&)+0x60 at EndPoint.cpp:292 (libhdbbasement.so)
    9: 0x00007fcf336db9bc in TrexNet::Requestor::getChannel(char const*, unsigned short, char, ltt::smartptr_handle<Crypto::Configuration>&)+0x128 at Requestor.cpp:174 (libhdbbasement.so)
    10: 0x00007fcf3384de4c in TrexNet::Request::Request(char const*, TRexUtils::HostAndPort const&, char, ltt::smartptr_handle<Crypto::Configuration>)+0x2c8 at Request.cpp:550 (libhdbbasement.so)
    11: 0x00007fcf3384e728 in TrexNet::Request::Request(char const*, char const*, unsigned short, char)+0xd4 at Request.cpp:502 (libhdbbasement.so)
    12: 0x00007fcf33992adc in NameServer::TNSInfo::sendRequestTo(NameServer::Request const&, NameServer::Response&, ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> > const&, unsigned short, NameServer::TrexNetRequestHolder*, bool, char)+0xe8 at TNSInfo.cpp:472 (libhdbbasement.so)
    13: 0x00007fcf33993bc4 in NameServer::TNSInfo::processRequest(NameServer::Request const&, NameServer::Response&)+0x290 at TNSInfo.cpp:381 (libhdbbasement.so)
    14: 0x00007fcf33998230 in NameServer::TNSClient::processRequest(NameServer::Request const&, NameServer::Response&)+0x20 at TNSClient.cpp:519 (libhdbbasement.so)
    15: 0x00007fcf339b6a7e in NameServer::TNSClient::storeTrees(ltt_adp::vector<NameServer::TNode, ltt::integral_constant<bool, true> > const&)+0x10a at TNSClient.cpp:625 (libhdbbasement.so)
    16: 0x000055bb05004177 in NameServerCmd::TopologyCmdAction::isNsActive(ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >&, ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> > const&, bool) [clone .constprop.185]+0x93 at TopologyCmdAction.cpp:86 (hdbnsutil)
    17: 0x000055bb04f71cc5 in registerNewDatacenter(ltt_adp::map<ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >, ltt_adp::vector<ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >, ltt::integral_constant<bool, true> >, NameServerCmd::CommandUtil::CaseInsensitiveArgmapCompare, ltt::integral_constant<bool, true> >&)+0x15f1 at TopologyUtil.cpp:1918 (hdbnsutil)
    18: 0x000055bb04f80c24 in main+0x11b0 at TopologyUtil.cpp:3172 (hdbnsutil)
    19: 0x00007fcf3137b516 in System::mainWrapper(int, char**, char**)+0x72 at IsInMain.cpp:333 (libhdbbasis.so)
    20: 0x00007fcf2f702725 in __libc_start_main+0xf1 (libc.so.6)
    21: 0x000055bb04f85fdd in global constructors keyed to 65535_0_TREXNameserverAllocator.cpp.o.212682+0x109 at start.S:103 (hdbnsutil)
    [119501]{-1}[-1/-1] 2020-06-21 08:09:56.998982 e TrexNet EndPoint.cpp(00299) : ERROR: failed to open channel 127.0.0.1:30001! reason: (internal error)
  • The nameserver trace file of the primary site contains messages similar to the following:
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.174749 i sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01180) : checkAndStartListener(): no listener found...
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.174801 i sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01184) : checkAndStartListener(): try start listener...
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.174876 i sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01224) : Start listen to global interface port:40001
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175451 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01259) : Listener cannot be started, because port 40001 is already in use!
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175462 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01260) : A system replication primary uses replication ports in the range of instance number(s) from 00 to 00
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175466 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01261) : Please check, that there is no other system on this machine using instancenr 00! This is just a hint and possibly not the root cause ..
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175469 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01262) : In general the port range 40000-40099 must not be used by any other process when system replication is turned on!
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175472 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01263) : You may need to set ip_local_port_range as Multitenant Database, please check "System Replication with Tenant Databases" section in admin guide and SAP note 2382421, 401162
    [80695]{-1}[-1/-1] 2020-06-21 09:35:54.175484 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(01192) : checkAndStartListener(): listener start failed: exception 1: no.2110008 (Basis/IO/Stream/impl/NetworkChannel.cpp:1261)
    Error address in use: $msg$, rc=98: Address already in use; $[1]$=NetworkChannelBase::bindLocal. bind failed; $Context$=[2514cd6e9cfc7bdc,0.0.0.0:40001,-,UNK,0]; $channel$={<NetworkChannelBase>={this=140174047475464, fd=134, refCnt=1, local=0.0.0.0/40001_tcp, remote=(invalid), state=New, pending=[----]}}
    exception throw location:
    1: 0x00007f8132f9774e in Stream::NetworkChannelBase::bindLocal()+0x13a at NetworkChannel.cpp:1261 (libhdbbasis.so)
    2: 0x00007f8132fa3119 in Stream::NetworkChannelBase::NetworkChannelBase(Stream::NetworkChannelCompletionHandler&, Stream::NetworkChannelParameters const&, NetworkAccess::NetworkAddress const&, Stream::CompletionThreadType)+0x235 at NetworkChannel.cpp:730 (libhdbbasis.so)
    3: 0x00007f8132fa3e6e in Stream::NetworkListener::NetworkListener(Stream::NetworkChannelCompletionHandler&, Stream::NetworkChannelParameters const&, int, NetworkAccess::NetworkAddress const&, ltt::smartptr_handle<Stream::ConnectionCallback>&, Stream::CompletionThreadType)+0x3a at NetworkChannel.cpp:3390 (libhdbbasis.so)
    4: 0x00007f8132fbdd23 in Stream::NetworkChannelManager::listen(NetworkAccess::NetworkAddress const&, Stream::NetworkChannelParameters const&, unsigned int, ltt::smartptr_handle<Stream::ConnectionCallback>&, Stream::CompletionThreadType, Stream::NetworkChannelComponent)+0xf0 at NetworkChannelSSLFilter.hpp:300 (libhdbbasis.so)
    5: 0x00007f8134181e6d in DataAccess::DisasterRecoveryPrimaryHandlerImpl::startListener()+0x549 at DisasterRecoveryPrimaryImpl.cpp:1239 (libhdbdataaccess.so)
    6: 0x00007f81341e92fc in DataAccess::DisasterRecoveryPrimaryHandlerImpl::checkAndStartListener()+0x258 at DisasterRecoveryPrimaryImpl.cpp:1187 (libhdbdataaccess.so)
    7: 0x00007f81341e95a1 in DataAccess::PrimaryTimerCallback::timeoutReached()+0x20 at DisasterRecoveryTimers.cpp:60 (libhdbdataaccess.so)
    8: 0x00007f8132f7f45a in Execution::TimerThread::TimerCallback::execProcessTime()+0x16 at LockedScope.hpp:54 (libhdbbasis.so)
    9: 0x00007f8132f4c4b7 in Execution::JobObjectImpl::run(Execution::JobWorker*)+0x1463 at JobExecutorImpl.cpp:1136 (libhdbbasis.so)
    10: 0x00007f8132f4ea76 in Execution::JobWorker::run(void*&)+0x6e2 at JobExecutorThreads.cpp:327 (libhdbbasis.so)
    11: 0x00007f8132f0e416 in Execution::Thread::staticMainImp(void**)+0x3f2 at Thread.cpp:540 (libhdbbasis.so)
    12: 0x00007f8132f623d6 in Execution::Thread::staticMain(void*)+0x22 at ThreadMain.cpp:31 (libhdbbasis.so)
    13: 0x00007f8132f0c9e9 in Execution::pthreadFunctionWrapper(Execution::PthreadWrapperInfo*)+0x375 at Thread.cpp:1083 (libhdbbasis.so)
    14: 0x00007f8132a14724 in start_thread+0xc0 (libpthread.so.0)
    15: 0x00007f81316a9e8d in __clone+0x69 (libc.so.6)
    , try later
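
The primary-site trace above reports that the bind on port 40001 failed with rc=98 (Address already in use), states that the range 40000-40099 must not be used by any other process, and hints at ip_local_port_range. The following is a minimal Python sketch, not an SAP tool, of how one might check both conditions on a Linux host. The function names and the REPLICATION_PORTS constant are hypothetical and exist only for illustration; the range is the one quoted in the trace for instance number 00, and the /proc path is the standard Linux location of the ephemeral port setting.

```python
import errno
import socket

# Replication port range quoted in the primary trace for instance number 00.
# Assumption: adjust this to the range reported in your own trace.
REPLICATION_PORTS = (40000, 40099)


def parse_port_range(text):
    """Parse the two integers in ip_local_port_range, e.g. '32768 60999'."""
    low, high = (int(part) for part in text.split())
    return low, high


def ranges_overlap(a, b):
    """Return True if the inclusive ranges a and b share at least one port."""
    return a[0] <= b[1] and b[0] <= a[1]


def port_in_use(port, host="0.0.0.0"):
    """Try to bind a TCP socket; EADDRINUSE means the port is already taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # any other error (e.g. permissions) is not "in use"
    return False


if __name__ == "__main__":
    # Linux-only: read the ephemeral (local) port range from procfs.
    with open("/proc/sys/net/ipv4/ip_local_port_range") as handle:
        ephemeral = parse_port_range(handle.read())
    print("ephemeral port range:", ephemeral)
    print("overlaps replication ports:",
          ranges_overlap(ephemeral, REPLICATION_PORTS))
    print("port 40001 in use:", port_in_use(40001))
```

If the ephemeral range overlaps the replication range, the operating system may hand out a replication port as the local port of an outgoing connection, which would be consistent with the bind failure in the trace. Note that the probe briefly binds the port itself, so it is best run while the listener is not being started; identifying the process that holds the port still requires operating system tools.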



Environment

  • SAP HANA, platform edition
  • SAP HANA Database

Product

SAP HANA, platform edition all versions

Keywords

secondary, primary, hsr, hana_system_replication, replication, site, registration, fails, failing, err, can't register, registry, syncmem, sync, async, log_replay, logreplay, KBA, HAN-DB-HA, SAP HANA High Availability (System Replication, DR, etc.), Problem
