This repository has been archived by the owner on Jan 8, 2024. It is now read-only.

GraphSense Transformation Output Tables

Bernhard Haslhofer edited this page Mar 18, 2019 · 11 revisions

The GraphSense transformation pipeline reads the raw block data that the graphsense-blocksci component ingests into Cassandra, computes denormalized views from it, and writes those views back to Cassandra.

This page documents all generated output tables and their fields.

summary_statistics

Provides summary statistics for a computed currency-specific dataset.

  • no_blocks: number of processed blocks
  • no_address_relations: number of address relations
  • no_addresses: total number of distinct addresses
  • no_clusters: number of computed clusters (each containing more than one address)
  • no_transactions: number of transactions
  • timestamp: timestamp of the most recent block included in the dataset

address

Provides summary statistics for a given cryptocurrency address.

  • address_prefix: first five characters of the address; used for internal dataset partitioning and lookup only (e.g., t1XBa)
  • address: cryptocurrency address (e.g., t1XBa17NfzHrCN8Kn6NtnVdXkGxpjoyZKPr)
  • first_tx: the first transaction the address has been involved in (e.g., "{height: 481827, tx_hash: 0xbae701eb8e4a552fff840ece1d84ffbcdfc6be650fcb351c921344c0041da4cd, timestamp: 1550216321}")
  • last_tx: the last transaction the address has been involved in (e.g., "{height: 481827, tx_hash: 0xbae701eb8e4a552fff840ece1d84ffbcdfc6be650fcb351c921344c0041da4cd, timestamp: 1550216321}")
  • no_incoming_txs: the number of transactions using this address as output, i.e., transactions in which it receives funds (e.g., 1)
  • no_outgoing_txs: the number of transactions using this address as input, i.e., transactions in which it spends funds (e.g., 0)
  • in_degree: the number of incoming address graph edges
  • out_degree: the number of outgoing address graph edges
  • total_received: total amount of currency units received (e.g., "{satoshi: 73450000, eur: 2341.41, usd: 2644.98}")
  • total_spent: total amount of currency units spent (e.g., "{satoshi: 0, eur: 0, usd: 0}")

Note: GraphSense stores amounts in cryptocurrency subunits to maintain precision in computations. The field name "satoshi" is retained for legacy reasons.
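
The two conventions above (five-character partitioning prefix, amounts kept in subunits) can be illustrated with a minimal Python sketch. This is not GraphSense code: the helper names `address_prefix` and `subunits_to_coins` are hypothetical, and the factor of 10^8 subunits per coin is an assumption that holds for Bitcoin-style currencies such as Zcash.

```python
from decimal import Decimal

# Per the field description: the prefix is the first five characters.
PREFIX_LENGTH = 5

# Assumption: 10^8 subunits per coin, as in Bitcoin-style currencies.
SUBUNITS_PER_COIN = Decimal(10) ** 8


def address_prefix(address: str) -> str:
    """Return the partitioning prefix of an address (hypothetical helper)."""
    return address[:PREFIX_LENGTH]


def subunits_to_coins(subunits: int) -> Decimal:
    """Convert an integer subunit amount to whole coins without float rounding."""
    return Decimal(subunits) / SUBUNITS_PER_COIN
```

For the example address above, `address_prefix("t1XBa17NfzHrCN8Kn6NtnVdXkGxpjoyZKPr")` yields `t1XBa`. Using `Decimal` rather than `float` keeps the conversion exact, which mirrors the reason for storing subunits in the first place.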

address_cluster

The assignment of an address to a cluster, computed via the multiple-input heuristic.

  • address_prefix: first five characters of the address; used for internal dataset partitioning and lookup only (e.g., t1XBa)
  • address: cryptocurrency address (e.g., t1XBa17NfzHrCN8Kn6NtnVdXkGxpjoyZKPr)
  • cluster: GraphSense specific cluster id (e.g., 2993355)
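
The multiple-input heuristic treats all input addresses of one transaction as belonging to the same cluster. The sketch below shows the general idea with a union-find structure. It is an illustration under stated assumptions, not the GraphSense implementation: the function name `cluster_addresses`, the input shape (a list of input-address lists, one per transaction), and the sequential cluster ids are all hypothetical.

```python
def cluster_addresses(tx_inputs):
    """Map each address to a cluster id using the multiple-input heuristic.

    tx_inputs: iterable of lists, each holding the input addresses of one
    transaction (assumed input shape). Cluster ids are assigned
    sequentially here and are not GraphSense's ids.
    """
    parent = {}

    def find(addr):
        # Register unseen addresses as their own root.
        parent.setdefault(addr, addr)
        root = addr
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[addr] != root:
            parent[addr], addr = root, parent[addr]
        return root

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in tx_inputs:
        if not inputs:
            continue
        first = inputs[0]
        find(first)  # make sure single-input addresses are registered
        for other in inputs[1:]:
            union(first, other)

    # Assign one sequential id per root, in first-seen order.
    cluster_ids = {}
    assignment = {}
    for addr in list(parent):
        root = find(addr)
        if root not in cluster_ids:
            cluster_ids[root] = len(cluster_ids)
        assignment[addr] = cluster_ids[root]
    return assignment
```

Note that this sketch also assigns an id to addresses that never co-occur with another input, whereas no_clusters in summary_statistics counts only clusters with more than one address.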

address_incoming_relations

The set of weighted, directed edges between two addresses in the address graph.

  • dst_address_prefix: first five characters of the destination address; used for internal dataset partitioning and lookup only (e.g., t1QZX)
  • dst_address: the destination node (address) of an edge (e.g., t1QZX18FLxsSTqzEuUNeApmcfrqVo3sBVjn)
  • estimated_value: the estimated flow of currency units from the source to the destination address (e.g., "{satoshi: 4137938, eur: 300.97, usd: 358.97}")
  • src_address: the source node (address) of an edge (e.g., t1L872tHAgBEzn4a26i6trKf5Dr3RyvBdBV)
  • no_transactions: the number of transactions from src_address to dst_address
  • src_properties: a selection of statistical properties of the src_address (e.g., "{total_received: 191116171031804, total_spent: 182349215096398}")

address_outgoing_relations

Opposite direction of address_incoming_relations
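
The in_degree and out_degree fields of the address table count edges of the address graph that these two relation tables describe. A minimal sketch of that relationship follows. It assumes that each distinct (src_address, dst_address) pair counts as one edge, regardless of how many transactions it aggregates; the function name `degrees` and the tuple-based input are hypothetical.

```python
from collections import Counter


def degrees(edges):
    """Compute in- and out-degree per address from (src, dst) pairs.

    Assumption: duplicate (src, dst) pairs describe the same edge and
    are counted once.
    """
    unique_edges = set(edges)
    in_degree = Counter(dst for _, dst in unique_edges)
    out_degree = Counter(src for src, _ in unique_edges)
    return in_degree, out_degree
```

Because `Counter` returns 0 for missing keys, addresses without incoming or outgoing edges can be looked up without special handling.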

address_tags

  • address: tagged cryptocurrency address (e.g., t1ZmpK4QFcvyQZ3ghTgSboBW8b4HgiZHQF9)
  • tag: the human-readable tag name (e.g., Internet Archive)
  • source: tag source (e.g., Internet Archive Web Site)
  • source_uri: tag source URI (e.g., https://archive.org/donate/cryptocurrency/)
  • actor_category: a field for categorizing the real-world actor behind an address (e.g., organization, exchange, miner, etc.)
  • description: a human-readable description (e.g., "Internet Archive Zcash address")
  • tag_uri: tag URI (e.g., https://archive.org/donate/cryptocurrency/)
  • timestamp: UNIX timestamp indicating when a tag has been created (e.g., 1552912648)
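
For illustration, a tag row with the fields listed above could be modeled as follows. This is a sketch, not part of GraphSense: the class name `AddressTag` and the `created_at` helper are hypothetical, the field values are the examples from the list, and the actor_category value is picked from the listed categories for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AddressTag:
    """One row of address_tags (hypothetical in-memory representation)."""
    address: str
    tag: str
    source: str
    source_uri: str
    actor_category: str
    description: str
    tag_uri: str
    timestamp: int  # UNIX timestamp in seconds

    def created_at(self) -> datetime:
        """Return the creation time as a timezone-aware UTC datetime."""
        return datetime.fromtimestamp(self.timestamp, tz=timezone.utc)


example = AddressTag(
    address="t1ZmpK4QFcvyQZ3ghTgSboBW8b4HgiZHQF9",
    tag="Internet Archive",
    source="Internet Archive Web Site",
    source_uri="https://archive.org/donate/cryptocurrency/",
    actor_category="organization",  # illustrative choice
    description="Internet Archive Zcash address",
    tag_uri="https://archive.org/donate/cryptocurrency/",
    timestamp=1552912648,
)
```

Calling `example.created_at().isoformat()` gives `2019-03-18T12:37:28+00:00`. Passing `tz=timezone.utc` explicitly avoids interpreting the timestamp in the local time zone of the machine running the code.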

address_transactions

cluster

cluster_addresses

cluster_incoming_relations

cluster_outgoing_relations

Opposite direction of cluster_incoming_relations

cluster_tags
