Author: Eric Voskuil
Date:  
To: 'Police Terror', libbitcoin
Subject: Re: [Libbitcoin] Large amount of failed connections
Note that a sync that fails to resize due to insufficient disk space or otherwise shuts down hard should not be restarted. There is currently no hard shutdown detection and in these cases results are unpredictable.

Also, restart should not be applied to a chain that was partially downloaded using a previous version. The database structure is identical, but for restart to detect gaps properly the block index must be zeroized, which was not the case in previous builds.
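
To illustrate why a zeroized index matters for restart, here is a minimal sketch in Python. It assumes a hypothetical index in which each block height owns a fixed-width slot and an all-zero slot means "block not yet stored". The slot width, the `find_gaps` helper, and the layout are illustrative assumptions, not libbitcoin's actual database format.

```python
SLOT_SIZE = 8  # hypothetical: one 8-byte file offset per block height


def find_gaps(index, slot_size=SLOT_SIZE):
    """Return the heights whose slot is all zero bytes (block not yet stored)."""
    empty = bytes(slot_size)
    count = len(index) // slot_size
    return [
        height
        for height in range(count)
        if index[height * slot_size:(height + 1) * slot_size] == empty
    ]


# A zeroized index with slots for heights 0..4; blocks 0, 1 and 4 were written.
index = bytearray(5 * SLOT_SIZE)
for height, offset in [(0, 100), (1, 385), (4, 901)]:
    start = height * SLOT_SIZE
    index[start:start + SLOT_SIZE] = offset.to_bytes(SLOT_SIZE, "little")

print(find_gaps(bytes(index)))  # heights that still need to be downloaded
```

If the slots had not been zeroed before the download began, leftover bytes in an unwritten slot would be indistinguishable from a stored block in this model, and the gap would go unnoticed.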

e

-----Original Message-----
From: Libbitcoin [mailto:libbitcoin-bounces@lists.dyne.org] On Behalf Of Police Terror
Sent: Thursday, July 28, 2016 10:12 AM
To: libbitcoin@???
Subject: Re: [Libbitcoin] Large amount of failed connections

ok, trying now. Thanks.

Eric Voskuil:
> Restartable parallel initial block download is complete once this clears:
>
> https://github.com/libbitcoin/libbitcoin-node/pull/135
>
> e
>
> -----Original Message-----
> From: Libbitcoin [mailto:libbitcoin-bounces@lists.dyne.org] On Behalf
> Of Police Terror
> Sent: Thursday, July 28, 2016 4:14 AM
> To: libbitcoin@???
> Subject: Re: [Libbitcoin] Large amount of failed connections
>
> Well the main problem is that blockchain sync stops and doesn't resume.
>
> Eric Voskuil:
>> If you have a variety of host IP addresses getting selected, and none of them are connecting, and this doesn't start until after almost 300k blocks, I don't have any explanation except that the network is failing. This is not the same as what I described below, which is connections being dropped - as opposed to connections failing to establish.
>>
>> e
>>
>> -----Original Message-----
>> From: Libbitcoin [mailto:libbitcoin-bounces@lists.dyne.org] On Behalf
>> Of Police Terror
>> Sent: Thursday, July 28, 2016 3:46 AM
>> To: libbitcoin@???
>> Subject: Re: [Libbitcoin] Large amount of failed connections
>>
>> I've restarted the sync, and it's reached ~299k blocks, and again it's cycling through connections but failing to connect. Blockchain sync has stopped which is preventing me from testing the subscription, and other new features.
>>
>> debug.log just looks like this continually scrolling along:
>>
>> 10:42:00.165745 DEBUG [node] Failure connecting slot (5) operation failed
>> 10:42:00.165753 DEBUG [network] Connecting to [9.80.215.236:8333]
>> 10:42:00.165782 DEBUG [network] Connecting to [60.241.21.52:8333]
>> 10:42:00.165800 DEBUG [node] Starting slot (5).
>> 10:42:00.165156 DEBUG [network] Connecting to [114.198.84.220:8333]
>> 10:42:00.165739 DEBUG [network] Failure connecting to [92.193.51.52:8333] operation failed
>> 10:42:00.165883 DEBUG [network] Connecting to [72.205.178.163:8333]
>> 10:42:00.165963 DEBUG [network] Connecting to [77.57.26.110:8333]
>> 10:42:00.166037 DEBUG [network] Connecting to [95.141.29.45:8333]
>> 10:42:00.166095 DEBUG [network] Failure connecting to [77.57.26.110:8333] operation failed
>> 10:42:00.166137 DEBUG [network] Connecting to [178.203.234.25:8333]
>> 10:42:00.166180 DEBUG [network] Connecting to [213.211.135.7:8333]
>> 10:42:00.166196 DEBUG [network] Connecting to [129.27.153.152:8333]
>> 10:42:00.166207 DEBUG [network] Connecting to [140.112.29.201:17737]
>> 10:42:00.166114 DEBUG [network] Failure connecting to [76.6
>>
>>
>> Eric Voskuil:
>>> I should point out that there is another situation that you may be
>>> noticing. During parallel initial block download the node performs
>>> load balancing across the active nodes and from a reserve of any
>>> unallocated hashes. When load is split from the channel with the
>>> most work to a channel with no work, the latter is restarted.
>>> Otherwise we would receive the previously-requested blocks on both
>>> channels. But this eventually results in out of order requests to
>>> the faster peer (the one that was empty and therefore picked up half the load of the slowest).
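
The split described above, and the out-of-order requests it produces, can be modeled roughly as follows. This is an illustrative Python sketch, not libbitcoin-node's code: the `split_load` helper, the channel names, and the choice to hand over the later half of the queue are assumptions made for the example, and the channel restart itself is not modeled.

```python
def split_load(pending_by_channel):
    """Move half of the busiest channel's pending heights to an idle channel.

    Returns (donor, receiver), or None if no split is possible.
    """
    idle = [name for name, work in pending_by_channel.items() if not work]
    donor = max(pending_by_channel, key=lambda name: len(pending_by_channel[name]))
    work = pending_by_channel[donor]
    if not idle or len(work) < 2:
        return None
    receiver = idle[0]
    half = len(work) // 2
    pending_by_channel[donor] = work[:half]
    pending_by_channel[receiver] = work[half:]
    return donor, receiver


# "fast" has already requested heights 200-299; "slow" still owes 100-199.
pending = {"slow": list(range(100, 200)), "fast": []}
requested_by_fast = list(range(200, 300))

result = split_load(pending)
history = requested_by_fast + pending["fast"]
print(result)
print(history == sorted(history))  # is the fast peer's request history still in order?
```

In this model the fast channel is handed heights below the ones it has already requested, so its overall request sequence is no longer ascending, which is the ordering problem the message refers to.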
>>>
>>> There is no way to fix the ordering because the preceding requests
>>> have already been made. I believe this results in typical clients
>>> dropping the faster channel. In other words both channels get dropped.
>>> This starts to churn pretty heavily toward the end of the sync. From
>>> a performance standpoint it's not a significant issue, maybe a 10%
>>> hit or so. The peers really shouldn't drop the channel simply
>>> because blocks are being requested out of order, but that's my best
>>> guess as to what's happening - since the behavior is very predictable.
>>>
>>> I haven't had a chance to track it down in Core, etc. If this is
>>> treated as bad behavior I would issue a pull request to Core to fix
>>> it, and if it wasn't accepted I would just leave it. If someone
>>> wants to take on this task that would be great :).
>>>
>>> e
>>>
>>> On 07/27/2016 05:46 AM, Police Terror wrote:
>>>> ok, it was master branch btw.
>>>>
>>>> Once resume sync is done, then things should be much better.
>>>>
>>>> Eric Voskuil:
>>>>> Version?
>>>>>
>>>>> FYI the odds of a successful connection are about 1 in 5. This is why version3 uses batch connection when generating connections from the pool.
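
As a rough worked example of why batching helps: if each attempt succeeds independently with probability 1/5 (the figure quoted above), the chance that at least one of n simultaneous attempts succeeds is 1 - (4/5)^n. The independence assumption and the batch sizes below are illustrative and are not libbitcoin settings.

```python
def batch_success_probability(batch_size, p_single=0.2):
    """Probability that at least one of `batch_size` independent attempts connects."""
    return 1.0 - (1.0 - p_single) ** batch_size


for n in (1, 5, 10):
    print(n, round(batch_success_probability(n), 3))
```

Under these assumptions a single attempt succeeds 20% of the time, while a batch of five succeeds roughly two times in three.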
>>>>>
>>>>> It is however possible that the pool can become starved of good connections, as the connections are supplied by peers (not just at startup) and there is no DoS protection. I have found that actual problems here are rare and given other more significant necessary work I haven't prioritized making the host pool more robust. The first step is to actually manage the pool, aging connections out, requesting more when needed, falling back to seed nodes at some point. I wouldn't restart to do this as it's very disruptive and there's really no reason to do so.
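
The pool management steps listed above (aging entries out, requesting more when needed, falling back to seed nodes) might look something like the following toy model. Everything here is hypothetical: the `HostPool` class, its thresholds, and the seed address are invented for illustration and do not reflect libbitcoin's host pool.

```python
import time


class HostPool:
    """Toy host pool: ages out stale entries and falls back to seeds when empty."""

    def __init__(self, seeds, max_age=3600.0, low_water=2):
        self.seeds = list(seeds)
        self.max_age = max_age      # seconds before an entry is aged out
        self.low_water = low_water  # below this, ask peers for more addresses
        self.hosts = {}             # address -> time last seen

    def add(self, address, now=None):
        self.hosts[address] = time.monotonic() if now is None else now

    def expire(self, now=None):
        now = time.monotonic() if now is None else now
        self.hosts = {a: t for a, t in self.hosts.items() if now - t <= self.max_age}

    def needs_more(self):
        return len(self.hosts) < self.low_water

    def candidates(self, now=None):
        """Return live hosts, or the seed list if the pool has run dry."""
        self.expire(now)
        return list(self.hosts) or list(self.seeds)


pool = HostPool(["seed.example:8333"], max_age=10.0)
pool.add("198.51.100.7:8333", now=0.0)
print(pool.candidates(now=5.0))   # entry is still fresh
print(pool.candidates(now=20.0))  # entry aged out, so the seeds are returned
```

The fallback happens inside the running process, which matches the point made above that a restart should not be needed to recover a starved pool.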
>>>>>
>>>>> e
>>>>>
>>>>> -----Original Message-----
>>>>> From: Libbitcoin [mailto:libbitcoin-bounces@lists.dyne.org] On
>>>>> Behalf Of Police Terror
>>>>> Sent: Wednesday, July 27, 2016 12:55 AM
>>>>> To: libbitcoin@???
>>>>> Subject: Re: [Libbitcoin] Large amount of failed connections
>>>>>
>>>>> Police Terror:
>>>>>> I notice there's no longer a hosts.cache file. Has it been
>>>>>> disabled because it was getting filled with bad hosts?
>>>>>
>>>>> This is wrong. Of course it was created when I stopped the node.
>>>>>
>>>>> However the node getting stuck cycling through hosts that don't work makes initial sync difficult.
>>>>>
>>>>> Maybe there can be a piece of code whereby if there isn't a connection after X attempts, then it will restart the bootstrap process and clear the hosts cache. How does that sound?
>>>>> _______________________________________________
>>>>> Libbitcoin mailing list
>>>>> Libbitcoin@???
>>>>> https://mailinglists.dyne.org/cgi-bin/mailman/listinfo/libbitcoin
>>>>>