Author: Kobi Gurkan
Date:  
To: Eric Voskuil
CC: libbitcoin
Subject: Re: [Libbitcoin] libbitcoin server update [was: Syncing a version 2 server]
Hi Eric,

I appreciate the detailed answer a lot - it clarifies things for sure!
I saw that you just released 2.2.0. I'm going to test with it and will
report any issues I find.

Kobi

On Sun, Dec 20, 2015 at 2:52 AM, Eric Voskuil <eric@???> wrote:

> TL;DR - version 2.2.0 is close and the sync problems are gone.
>
> ================================================================
>
> The open issues affecting server are always listed here:
>
> https://github.com/libbitcoin/libbitcoin/issues
> https://github.com/libbitcoin/libbitcoin-consensus/issues
> https://github.com/libbitcoin/libbitcoin-blockchain/issues
> https://github.com/libbitcoin/libbitcoin-node/issues
> https://github.com/libbitcoin/libbitcoin-server/issues
>
> Apart from missing features there is only one material issue that
> affects version2:
>
> https://github.com/libbitcoin/libbitcoin-server/issues/100
>
> The master branch and sync feature sub branch have been focused on
> permanently resolving this issue. This has required a significant rework
> of the networking stack, which has been completed in the libbitcoin repo
> of those branches. The bulk of the other work is in libbitcoin-node,
> which is where I was working until Hong Kong. This work in master is not
> considered stable, although it should always build.
>
> Upon returning from Hong Kong the BIP65 enforcement threshold had
> arrived. I decided to take the time to implement both BIP66 (which
> activated over the summer) and BIP65 in version2. Note that these are
> both soft forks, so libbitcoin continues to work despite their
> activation, though it does not enforce the new rules. This work is
> complete, but it took me down a rabbit hole over the past couple of weeks.
>
> First, it was necessary to adopt the latest libsecp256k1 in order to
> update libbitcoin-consensus. This was necessary in order to track core
> 0.12.0 consensus updates. This forced an update to all version2 repos to
> change the dependency on libsecp256k1, an update to the wrapper code in
> libbitcoin, an update to the MSVC libsecp256k1 build, and a new
> libbitcoin/libsecp256k1/version4 branch to build from. In doing this I
> forgot to branch and re-version libbitcoin-consensus, which caused
> issues with master and version2.1 builds. pmienk fixed that recently.
>
> Second, it was necessary to implement the additional consensus rules in
> our native script implementation. This work exposed a problem with our
> softfork activation code. Formerly the only soft forks that we
> implemented were BIP16, BIP30 and BIP34. BIP30 is applied retroactively
> to all blocks except two well-known blocks and BIP16 was activated as of
> a calendar date. But BIP34 defined the method that is used by BIP66 and
> BIP65. This was not implemented in libbitcoin, but instead fixed block
> ids were used as thresholds (similar to BIP16 and BIP30). This is not
> ideal, especially in the case where we intend to cross thresholds with
> running code. So I implemented the BIP34 activation technique, applied
> it retroactively to BIP34 and implemented and tested BIP66 and BIP65.
>
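A minimal sketch of the BIP34-style activation test described above, for readers who have not seen it spelled out. It assumes the mainnet parameters given in BIP34 (a rule is enforced once 750 of the last 1000 blocks carry the new version, and older versions are rejected once 950 of the last 1000 do) and the block versions 2, 3 and 4 for BIP34, BIP66 and BIP65. The function names are hypothetical and are not the libbitcoin API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mainnet parameters from BIP34, reused by BIP66 and BIP65.
constexpr std::size_t sample_size = 1000;
constexpr std::size_t enforce_threshold = 750;
constexpr std::size_t reject_threshold = 950;

// Count how many of the most recent blocks (at most sample_size of them)
// have a version of at least `minimum`. `versions` is ordered oldest first.
std::size_t count_at_least(const std::vector<std::uint32_t>& versions,
    std::uint32_t minimum)
{
    const auto sample = std::min(versions.size(), sample_size);
    const auto first = versions.end() - static_cast<std::ptrdiff_t>(sample);
    return static_cast<std::size_t>(std::count_if(first, versions.end(),
        [minimum](std::uint32_t version) { return version >= minimum; }));
}

// True once the new rule applies to blocks that carry the new version.
bool is_rule_enforced(const std::vector<std::uint32_t>& versions,
    std::uint32_t minimum)
{
    return count_at_least(versions, minimum) >= enforce_threshold;
}

// True once blocks with a version below `minimum` must be rejected.
bool is_old_version_rejected(const std::vector<std::uint32_t>& versions,
    std::uint32_t minimum)
{
    return count_at_least(versions, minimum) >= reject_threshold;
}
```

Because the test looks only at the versions of preceding blocks, the same code handles thresholds that were crossed long ago and thresholds that are crossed while the node is running, which a fixed activation height cannot do.
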
> Testing required that I sync and validate the full chain. This has been
> pretty much impossible in version2. But in my master|version3 work I had
> recently discovered the primary reason for the documented libbitcoin
> sync issue. There is a long-standing bug in the expected use of the
> subscriber template, where resubscription causes loss of messages. This
> dropping of incoming blocks leads to excessive "reorg" processing by the
> node, since it can't connect subsequent blocks to the chain. This causes
> both stalls and the re-requesting of missing blocks by new connections,
> significantly compounding the backlog. The larger the backlog, the worse
> the resubscription issue becomes, resulting in more dropped blocks. This
> spirals out of control, bringing down the server.
>
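To illustrate how resubscription can lose messages, here is a simplified, single-threaded sketch. It is not the libbitcoin subscriber template: both classes and the two helper functions are hypothetical, and the "late" resubscription stands in for a handler that resubscribes only after asynchronous work has finished. The first class removes a handler before invoking it, so anything relayed before the handler resubscribes is dropped. The second lets the handler stay registered by returning true, which is one possible way to close the gap.

```cpp
#include <functional>
#include <utility>
#include <vector>

// One-shot: every handler is removed before it is invoked and must
// call subscribe() again to see the next message.
class one_shot_subscriber
{
public:
    using handler = std::function<void(int)>;

    void subscribe(handler h) { handlers_.push_back(std::move(h)); }

    void relay(int block_height)
    {
        std::vector<handler> current;
        current.swap(handlers_);
        for (const auto& h : current)
            h(block_height);
    }

private:
    std::vector<handler> handlers_;
};

// Persistent: a handler stays subscribed for as long as it returns true.
class persistent_subscriber
{
public:
    using handler = std::function<bool(int)>;

    void subscribe(handler h) { handlers_.push_back(std::move(h)); }

    void relay(int block_height)
    {
        std::vector<handler> current;
        current.swap(handlers_);
        for (auto& h : current)
            if (h(block_height))
                handlers_.push_back(std::move(h));
    }

private:
    std::vector<handler> handlers_;
};

// Block 2 arrives before the handler has resubscribed, so it is lost.
std::vector<int> receive_with_late_resubscribe()
{
    one_shot_subscriber subscriber;
    std::vector<int> seen;
    subscriber.subscribe([&seen](int height) { seen.push_back(height); });
    subscriber.relay(1); // delivered, handler is now unregistered
    subscriber.relay(2); // no handler registered: dropped
    subscriber.subscribe([&seen](int height) { seen.push_back(height); });
    subscriber.relay(3); // delivered
    return seen;
}

// The same three blocks with a handler that never leaves the list.
std::vector<int> receive_persistent()
{
    persistent_subscriber subscriber;
    std::vector<int> seen;
    subscriber.subscribe([&seen](int height)
    {
        seen.push_back(height);
        return true;
    });
    subscriber.relay(1);
    subscriber.relay(2);
    subscriber.relay(3);
    return seen;
}
```

In the one-shot case the node would see block 3 without block 2, which matches the description above of blocks that cannot be connected to the chain.
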
> So I decided to patch this issue in version2 so that I could validate
> the new softfork activation code. In patching this I ended up patching a
> couple of race conditions that I had already resolved in
> master|version3. Once this was resolved I got to the point where we
> never missed a block from peers. This is a bit more of a challenge than
> it may seem, because of the asynchronous architecture of libbitcoin.
> There are still issues that I'm aware of in version2, but they are
> pushed to the margins, generally startup and shutdown. These may still
> cause failures but should be rare. The next major release should resolve
> these issues entirely.
>
> The benefit of the asynchronous architecture, and memory-mapped hash
> table blockchain implementation, is speed. We should be extremely fast
> at sync, but these problems in bc::network and libbitcoin-node have
> prevented that from being realized. Yesterday, as I tested the first
> runs with no dropped blocks, I witnessed outstanding sync performance in
> version2 for the first time:
>
> I was able to sync the first 200,000 blocks in 90 minutes.
>
> One of the problems with version2 is that it does not make a significant
> distinction between sync and post-sync. This is defined by the last
> checkpoint. The major distinction is the level of validation. But there
> are networking considerations as well. When syncing there should be no
> orphan pool accumulation, but because we cannot detect block
> announcements without having first performed header downloads, we do
> start to accumulate orphans. This slows down sync a lot, since the
> orphan pool is tested against each incoming block. Also, we want to
> disable transaction relay by peers during sync. This can only be done
> during the handshake, which again requires a distinction between sync
> and post-sync.
> Connections need to be dropped and reconfigured once sync is complete.
> Furthermore you want only 1-3 peers during sync, but 8 or so post-sync.
> 1 fast peer would be ideal, but 2 peers guard against 1 peer
> under-performing, and 3 peers start to create excessive work.
> Version2 has no mechanism to make these changes automatically, so you
> need to do it manually.
>
> So I've defaulted the configuration file for sync... 1 orphan, no relay,
> and ~2 outgoing peers, 0 incoming peers. You will need to set a high
> checkpoint yourself. Once the last sync is complete you will need to
> switch the config... ~20 orphans, enable relay, ~8/~8 peers. I'll
> probably provide a second post-sync config file that can be manually
> swapped.
>
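The two configurations described above can be summarized as a pair of profiles. This is only a sketch of the values quoted in the message: the struct, its field names and the selection function are hypothetical and do not correspond to actual libbitcoin-server setting names, and the approximate counts ("~2", "~20", "~8") are taken at face value. It also assumes that the node counts as syncing until its height reaches the last configured checkpoint.

```cpp
#include <cstddef>

// Hypothetical field names, not libbitcoin-server configuration keys.
struct node_profile
{
    std::size_t orphan_capacity;
    bool relay_transactions;
    std::size_t outbound_peers;
    std::size_t inbound_peers;
};

// Values quoted for initial sync: 1 orphan, no relay, ~2 out, 0 in.
constexpr node_profile sync_profile{ 1, false, 2, 0 };

// Values quoted for post-sync: ~20 orphans, relay enabled, ~8/~8 peers.
constexpr node_profile post_sync_profile{ 20, true, 8, 8 };

// Assumption: below the last checkpoint the node is still syncing.
constexpr node_profile select_profile(std::size_t height,
    std::size_t last_checkpoint_height)
{
    return height < last_checkpoint_height ? sync_profile : post_sync_profile;
}
```

In version2 this switch is a manual edit of the config file followed by a restart, since the relay setting only takes effect during the handshake.
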
> Also, there is a new configuration setting to enable mempool consistency
> enforcement. This was a new feature that was always enabled in version2
> post v2.1.0. This may cause significant mempool delays, so it's now
> configurable and disabled by default.
>
> I have a couple of things to finish up on version2, which will be tagged
> as v2.2.0. But I encourage people to start testing on the head of
> version2 presently. I hope to release in the next couple of days, then
> back to version3 work :).
>
> e
>
> On 12/19/2015 10:52 AM, Kobi Gurkan wrote:
> > Mlmikael,
> >
> > Thanks - I eventually gave up on version2 and currently I'm syncing with
> > master with checkpoints in the config which sped it up enormously.
> > Apart from Merkle Root Mismatch on block 322670, on which I just started
> > version2 again to get through, and then master again, it works pretty
> > great!
> >
> > Kobi
> >
> > On 19/12/15 20:33, mlmikael wrote:
> >> Kobi,
> >>
> >> Short answer: Wait until January
> >>
> >> Long answer: Version2 has an "off-by-1" bug that will eat all your
> >> memory and crash the backend. Therefore the current best practice is
> >> to put an "ulimit" on total RAM (e.g. to 800MB) as well as do restarts
> >> continuously - I don't know - every 1 minute?? 10 seconds?
> >>
> >> For the interim, would you mind trying the workaround behavior
> >> mentioned here and let us know your results?
> >>
> >> Cheers,
> >> Mlmikael
> >>
> >> On 2015-12-19 04:36, Kobi Gurkan wrote:
> >>> Hi,
> >>>
> >>> I'm using the latest stable release (version 2) of libbitcoin-server,
> >>> and I find that it takes about 30-60 seconds per block since around
> >>> block 273038.
> >>> Is it realistic to sync version 2 right now? Maybe I should improve
> >>> the specs of my machine?
> >>>
> >>> I know that Eric has been working on headers-first sync, but I would
> >>> like to use a stable release.
> >>>
> >>> Kobi
>
>