:: [unSYSTEM] Fwd: libbitcoin update: …
Author: Amir Taaki
Date:  
To: unsystem
Subject: [unSYSTEM] Fwd: libbitcoin update: infrastructure
more stuff

-------- Original Message --------
Subject: libbitcoin update: infrastructure
Date: Tue, 18 Jun 2013 21:02:14 +0200
From: Amir Taaki <genjix@???>
To: caedes@???

I was talking with the Tor devs and hearing skilled people talk about
what they need with Bitcoin was enlightening.

Hearing how they conceptualised the Bitcoin payment system they'd
like helped me work out which concepts are logical and how I need to
cater to them.

Liad needs balances and histories for addresses. Using the basic
blockchain interface is not enough, and I kept telling them to make a
caching layer on top. I've got my design for a master with slaves,
using ZeroMQ to build a highly scalable infrastructure, but we need more.

A lot of the work in Bitcoin is finding what people's use cases are,
what essentials they need, and optimising around that. I have
the basic primitives in libbitcoin. Having the basics gives the
maximum power and the most stability (data is always in one place, never
duplicated). Unfortunately that makes the API difficult to use, and it
might not even be basic enough (for performance reasons)!

Two new methods were added to libbitcoin:

  /**
   * Fetches the output points and corresponding input point spends
   * associated with a Bitcoin address. The length of the fetched
   * outputs should always match the length of the inputs.
   *
   * Output points and input points are matched by their
   * corresponding index.
   * The spend of outpoint[i] is inpoint[i].
   *
   * If an output is unspent then the corresponding input spend hash
   * will be equivalent to null_hash.
   *
   * @code
   *  if (inpoints[i].hash == null_hash)
   *    // The ith output point is unspent.
   * @endcode
   *
   * @param[in]   chain           Blockchain service
   * @param[in]   address         Bitcoin address to fetch history for.
   * @param[in]   handle_fetch    Completion handler for fetch operation.
   * @code
   *  void handle_fetch(
   *      const std::error_code& ec,          // Status of operation
   *      const output_point_list& outpoints, // Output points (deposits)
   *      const input_point_list& inpoint     // Input points (spends)
   *  );
   * @endcode
   */
  void fetch_history(blockchain& chain, const payment_address& address,
      blockchain_fetch_handler_history handle_fetch);


  typedef std::vector<uint64_t> output_value_list;
  typedef std::function<
      void (const std::error_code&, const output_value_list&)>
          blockchain_fetch_handler_output_values;


  /**
   * Fetches the output values given a list of output points. These can
   * be summed to give the balance for a list of outputs.
   *
   * @param[in]   chain           Blockchain service
   * @param[in]   outpoints       Output points to fetch values for.
   * @param[in]   handle_fetch    Completion handler for fetch operation.
   * @code
   *  void handle_fetch(
   *      const std::error_code& ec,          // Status of operation
   *      const output_value_list& values     // Values for outputs.
   *  );
   * @endcode
   */
  void fetch_output_values(blockchain& chain,
      const output_point_list& outpoints,
      blockchain_fetch_handler_output_values handle_fetch);


See more on composed operations:
http://libbitcoin.dyne.org/doc/blockchain.html#fetch-block-and-composed-operations

Also, the problem with the current API is that we have one generic
interface (blockchain) and, depending on the internal representation,
the interface strongly reflects it. If you want some piece of data,
you need to make several calls, knowing about the internal model
underneath, to get what you want.

I realised that the way the Django ORM works and the way I made that
Python based query system can also be translated to asynchronous C++!

https://github.com/genjix/query/blob/master/tutorial.py

You build the query first to signal what you want, then you execute it
passing a handler to get the result. The data structure figures out
exactly what you need then fetches it without doing unnecessary
operations.

blocks[110].transactions[10].hash

-> view blocks [give us a blocks view to start]
-> select transactions 10 [select the 10th transaction]
-> select hash [give us the transaction hash]

The above operation should then never fetch the whole transaction (just
the hash of the 10th transaction in block 110). We should also support
many different indexing types like hashes, etc.

outputs["1Afgy..."].spend.(hash, index)

Fetch all the (hash, index) from the outputs spends.

That way you can have different blockchain implementations (bdb, SQL,
LevelDB, memory store, Redis, ...) each with different internal model
representations. You make adapters for processing these sequences.

From the user's perspective, they are just telling the fetch engine what
they want and getting it returned in a single operation. In this
way we can even offer atomic operations (by comparing the block hash at
the start and finish of the operation and, if it has changed, restarting
the query).

####################
LIAD
####################

Liad needs a way to get the unspent outputs (to create transactions),
the balance for outputs and some rudimentary history of credits and
debits.

This is easily doable with libbitcoin (especially since I added
fetch_history() and fetch_output_values() composed operations).
However with lots of outputs (like Satoshi Dice) this is a huge
operation. Fetching outputs is very quick. Fetching spends is
moderately OK. Fetching the output values (fetch an entire
transaction, look up just the output, then return the uint64_t - very
wasteful) is quite slow (nearly a minute on my fast server).

My first reaction is to optimise libbitcoin - OK, we need a new
fetch_output() method in blockchain and should deprecate
fetch_transaction(). However we still need to fetch entire
transactions a lot for various things, so fetch_transaction() is
needed for the time being.

Then I imagined a big flat file where blocks are appended. New block
entries in the database just point to an offset in the file, and the
API can read off a block header, a whole block, a transaction, an
output, ... whatever they need. However now the API would need to
change to accommodate these changes. We looked above at my thoughts on
this and how we can make a generic API that isn't tied to any one data
model (new blockchain API).

However for Liad, he needs the data there in a simple format with the
future ability to scale up easily.

He needs:

struct row
{
    payment_address addr;
    output_point outpoint;
    uint64_t value;
    bool confirmed_credit;
    input_point spend{null_hash, std::numeric_limits<uint32_t>::max()};
    bool confirmed_debit = false;
};


The ability to add new credits (confirmed or not), and their relevant
spends (confirmed or not).

We also need a process queue for new addresses. When a new address
comes into the program, it is processed and the current history loaded
so the Fullnode can monitor for changes involved with that address
(new credits or debits).

PostgreSQL is perfect for this. That's what I'm going to make for them.
You push a new address, which synchronises the history in the database
and updates the last_update_time for the pushed item in the queue.
Once they see that it has been updated, they can simply SELECT the new
history changes they want and do whatever they need.
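
A sketch of the client side of that flow against the schema below (the
address is a placeholder bytea literal):

```sql
-- 1. Push a new address for the fullnode to process.
INSERT INTO queue (address) VALUES ('\x0011223344');

-- 2. Poll until the fullnode stamps the item as processed.
SELECT last_update_time FROM queue WHERE address = '\x0011223344';

-- 3. Once stamped, read the synchronised history, e.g. the unspent rows.
SELECT output_point_hash, output_point_index, value
    FROM history
    WHERE address = '\x0011223344'
      AND spend_point_hash IS NULL;
```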

With that basic data they have most of everything needed. We can add
more specific functions to examine the transaction in more depth later
as needed. In general I think there's very little of a transaction
needed for a wallet site though (blockchain.info only shows
input/output addresses in a transaction which we could easily cache).

So the fullnode is running, polling for new unprocessed addresses
added to its queue. Once they come in, it fetches the history and
updates their processed timestamp. When a new block comes in, its
inputs/outputs are checked against the history of all addresses for
changes.

SQL Below:
- ------------------------

DROP DOMAIN IF EXISTS amount_type CASCADE;
CREATE DOMAIN amount_type AS NUMERIC(16, 8)
    CHECK (VALUE < 21000000 AND VALUE >= 0);
DROP DOMAIN IF EXISTS hash_type CASCADE;
CREATE DOMAIN hash_type AS bytea; -- raw 32 byte hash

DROP TABLE IF EXISTS history;
DROP SEQUENCE IF EXISTS history_id_sequence;

CREATE SEQUENCE history_id_sequence;
CREATE TABLE history (
    id INT NOT NULL DEFAULT NEXTVAL('history_id_sequence') PRIMARY KEY,
    address hash_type NOT NULL,
    output_point_hash hash_type NOT NULL,
    output_point_index INT NOT NULL,
    value amount_type NOT NULL,
    confirmed_credit BOOL NOT NULL,
    spend_point_hash hash_type,
    spend_point_index INT,
    confirmed_debit BOOL
);


CREATE INDEX ON history (address);
CREATE INDEX ON history (output_point_hash, output_point_index);

DROP TABLE IF EXISTS queue;

CREATE TABLE queue (
    address hash_type NOT NULL,
    last_update_time TIMESTAMP
);


CREATE INDEX ON queue (address);

DROP TABLE IF EXISTS blocks;

CREATE TABLE blocks (
    depth INT NOT NULL UNIQUE,
    hash hash_type NOT NULL UNIQUE
);


CREATE INDEX ON blocks (depth);