I've documented the stall issue in detail here:
https://github.com/libbitcoin/libbitcoin-server/issues/87
It's a node issue, but I created a tracking issue in server.

I'm about to push my logging and other refactoring to master. The
resolution to the problem at this point should not be terribly
difficult, but I may not have time to complete the work this month as
I'll be travelling. Most of my work has been in node, so I'm going to
update server to derive from node. Otherwise we will continue to have
duplication of effort and drift.

The above changes are fairly deep, and I've added partial handling for
additional messages. Also, logging is fairly intensive, which
hinders performance. The debug log grows very quickly. On the other
hand, the quality of the information is dramatically improved. I'm
planning to pare this back once the issues are resolved and we can get
a 2.1 release out. For version3 we should be making some deeper
modifications to logging.

Aside from the stall there are issues of getting the protocol up to
snuff. We don't handle some messages, and handle others incorrectly.
This doesn't appear to be getting us blocked by DoS protection, but
it's still problematic. Neill is presently working on a thorough
protocol review.

Regarding the issue below, it was a cut/paste error. A line of code
got lost. I rolled back version2 to the release and later found the issue.

e

On 06/06/2015 04:14 PM, Eric Voskuil wrote:
> Because of a regression in the mempool:
>
> https://github.com/libbitcoin/libbitcoin-server/issues/82
>
> ...which I can't reproduce until getting fully-synced, I've just
> rolled back the server stack to the version2 release.
>
> I'm planning to continue to advance master until the server stall
> issue is verified (at which point I'll push to version2 and issue a
> release update). There is a lot of refactoring for readability and
> improved log context. This has helped us locate the cause of the
> stall.
>
> The issue is that under certain conditions, we stop asking for
> blocks. This includes conditions where we encounter certain network
> faults or even bad payloads, but also normal conditions. So there
> are no visible errors or obvious incorrect behavior until one looks
> at the interactions holistically. Thanks to pmienk for deducing the
> issue during our bug-bash yesterday.
>
> The CPU consumption issue has previously been resolved and it
> appears that more recent changes have resolved the memory
> consumption issue as well. Logging is much improved as well. It
> will take some time to resolve the stall issue more elegantly, but
> I hope to have a patch up in the next few days.
>
> e
>