GraphSense Transformation Output Tables
The GraphSense transformation pipeline reads raw block data, which is ingested into Cassandra by the graphsense-blocksci
component, and computes de-normalized views, which are again stored in Cassandra.
This document describes all generated output tables and their fields.
Provides summary statistics for a computed currency-specific dataset.
- no_blocks: number of processed blocks
- no_address_relations: number of address relations
- no_addresses: total number of distinct addresses
- no_clusters: number of computed clusters (with > 1 addresses)
- no_transactions: number of transactions
- timestamp: timestamp of the most recent block considered in the dataset
Provides summary statistics for a given cryptocurrency address.
- address_prefix: first five characters of the address; used for internal dataset partitioning and lookup only (e.g., t1XBa)
- address: cryptocurrency address (e.g., t1XBa17NfzHrCN8Kn6NtnVdXkGxpjoyZKPr)
- first_tx: the first transaction the address has been involved in (e.g., "{height: 481827, tx_hash: 0xbae701eb8e4a552fff840ece1d84ffbcdfc6be650fcb351c921344c0041da4cd, timestamp: 1550216321}")
- last_tx: the last transaction the address has been involved in (e.g., "{height: 481827, tx_hash: 0xbae701eb8e4a552fff840ece1d84ffbcdfc6be650fcb351c921344c0041da4cd, timestamp: 1550216321}")
- no_incoming_txs: the number of transactions using this address as input (e.g., 1)
- no_outgoing_txs: the number of transactions using this address as output (e.g., 0)
- in_degree: the number of incoming address graph edges
- out_degree: the number of outgoing address graph edges
- total_received: total amount of currency units received (e.g., "{satoshi: 73450000, eur: 2341.41, usd: 2644.98}")
- total_spent: total amount of currency units spent (e.g., "{satoshi: 0, eur: 0, usd: 0}")
Note: GraphSense stores amounts in cryptocurrency subunits to maintain precision in computations. The field name "satoshi" is kept for legacy reasons.
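The note above can be illustrated with a minimal sketch that converts a subunit amount into whole currency units. It assumes 10^8 subunits per coin, which holds for Bitcoin and Zcash but may differ for other currencies; the function name `subunits_to_coins` is hypothetical and not part of GraphSense.

```python
from decimal import Decimal

# Assumption: 10**8 subunits per coin (true for Bitcoin and Zcash;
# other currencies may use a different factor).
SUBUNITS_PER_COIN = 10**8


def subunits_to_coins(subunits: int) -> Decimal:
    """Convert an integer subunit amount to whole currency units.

    Decimal is used instead of float so the conversion stays exact.
    """
    return Decimal(subunits) / Decimal(SUBUNITS_PER_COIN)


# Example value taken from the total_received field documented above.
total_received = {"satoshi": 73450000, "eur": 2341.41, "usd": 2644.98}
coins = subunits_to_coins(total_received["satoshi"])
print(coins)  # 0.7345
```

Keeping amounts as integers until display time avoids the rounding drift that repeated floating-point arithmetic can introduce.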
The assignment of an address to a cluster, computed via the multiple-input heuristic.
- address_prefix: first five characters of the address; used for internal dataset partitioning and lookup only (e.g., t1XBa)
- address: cryptocurrency address (e.g., t1XBa17NfzHrCN8Kn6NtnVdXkGxpjoyZKPr)
- cluster: GraphSense specific cluster id (e.g., 2993355)
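The address_prefix field described above is simply the first five characters of the address. The following sketch shows how a client might derive it and build a lookup key; the helper names `address_prefix` and `lookup_key` are hypothetical, and the pairing of prefix and address as a key is an assumption based on the "partitioning and lookup" remark.

```python
PREFIX_LENGTH = 5  # "first five characters of the address"


def address_prefix(address: str) -> str:
    """Return the partitioning prefix for an address."""
    return address[:PREFIX_LENGTH]


def lookup_key(address: str) -> tuple:
    """Build a (prefix, address) pair, assumed here to identify a row."""
    return (address_prefix(address), address)


address = "t1XBa17NfzHrCN8Kn6NtnVdXkGxpjoyZKPr"
print(address_prefix(address))  # t1XBa
print(lookup_key(address))
```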
The set of weighted, directed edges between two addresses in the address graph.
- dst_address_prefix: first five characters of the address; used for internal dataset partitioning and lookup only (e.g., t1QZX)
- dst_address: the destination node (address) of an edge (e.g., t1QZX18FLxsSTqzEuUNeApmcfrqVo3sBVjn)
- estimated_value: the estimated flow of currency units from the source to the destination address (e.g., "{satoshi: 4137938, eur: 300.97, usd: 358.97}")
- src_address: the source node (address) of an edge (e.g., t1L872tHAgBEzn4a26i6trKf5Dr3RyvBdBV)
- no_transactions: the number of transactions from src_address to dst_address
- src_properties: a selection of statistical properties of the source address (e.g., "{total_received: 191116171031804, total_spent: 182349215096398}")
Same as address_incoming_relations but in the opposite direction (src_address and dst_address switched).
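The relationship between the incoming and outgoing views can be sketched in memory as two indexes over the same edge list: one keyed by destination, one keyed by source. This is only an illustration, not the pipeline's implementation; the function `build_views` is hypothetical and the no_transactions value in the sample edge is made up.

```python
from collections import defaultdict

# Illustrative edge row; addresses come from the examples above,
# the no_transactions value is invented for this sketch.
edges = [
    {
        "src_address": "t1L872tHAgBEzn4a26i6trKf5Dr3RyvBdBV",
        "dst_address": "t1QZX18FLxsSTqzEuUNeApmcfrqVo3sBVjn",
        "no_transactions": 1,
    },
]


def build_views(edges):
    """Index the same edges by destination (incoming) and by source (outgoing)."""
    incoming = defaultdict(list)
    outgoing = defaultdict(list)
    for edge in edges:
        incoming[edge["dst_address"]].append(edge)
        outgoing[edge["src_address"]].append(edge)
    return dict(incoming), dict(outgoing)


incoming, outgoing = build_views(edges)
dst = "t1QZX18FLxsSTqzEuUNeApmcfrqVo3sBVjn"

# in_degree / out_degree count the edges in the address graph.
in_degree = len(incoming.get(dst, []))
out_degree = len(outgoing.get(dst, []))
print(in_degree, out_degree)  # 1 0
```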
- address: tagged cryptocurrency address (e.g., t1ZmpK4QFcvyQZ3ghTgSboBW8b4HgiZHQF9)
- tag: the human-readable tag name (e.g., Internet Archive)
- source: tag source (e.g., Internet Archive Web Site)
- source_uri: tag source URI (e.g., https://archive.org/donate/cryptocurrency/)
- actor_category: a field for categorizing the real-world actor behind an address (e.g., organization, exchange, miner, etc.)
- description: a human-readable description (e.g., "Internet Archive Zcash address")
- tag_uri: tag URI (e.g., https://archive.org/donate/cryptocurrency/)
- timestamp: UNIX timestamp indicating when a tag has been created (e.g., 1552912648)
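Timestamps in these tables are UNIX timestamps, that is, seconds since 1970-01-01 UTC. A small sketch of converting the example value above to a readable date, using only the Python standard library:

```python
from datetime import datetime, timezone

# Example tag timestamp from the field description above.
tag_timestamp = 1552912648

# Interpret the value as seconds since the UNIX epoch, in UTC.
created = datetime.fromtimestamp(tag_timestamp, tz=timezone.utc)
print(created.isoformat())  # 2019-03-18T12:37:28+00:00
```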
The transactions an address was involved in, either as input or as output.
- address_prefix: first five characters of the address; used for internal dataset partitioning and lookup only (e.g., t1fB3)
- address: the address (e.g., t1fB36H7W9f6aHqFwR2NdKYrzqo1dKK7GWf)
- height: height of the block the transaction belongs to (e.g., 128680)
- tx_hash: the transaction hash (e.g., 0x42229e12cdec6b13d704799533dc140784cf4d220780d32c93c2b302f24e7b1e)
- timestamp: the transaction timestamp (e.g., 1496983505)
- tx_index: GraphSense internal transaction index
- value: value (in cryptocurrency subunits) assigned to an address (negative if the address was used as input; positive if the address was used as output)
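The sign convention of the value field can be illustrated as follows. The sketch sums positive values as received amounts and negative values as spent amounts; whether the pipeline derives total_received and total_spent exactly this way is an assumption. The rows and the function name `totals_from_rows` are hypothetical.

```python
# Hypothetical rows for a single address; values are in subunits.
rows = [
    {"tx_hash": "tx1", "value": 500},   # address used as output
    {"tx_hash": "tx2", "value": -200},  # address used as input
    {"tx_hash": "tx3", "value": 300},   # address used as output
]


def totals_from_rows(rows):
    """Split signed values into (received, spent), both non-negative."""
    received = sum(r["value"] for r in rows if r["value"] > 0)
    spent = -sum(r["value"] for r in rows if r["value"] < 0)
    return received, spent


received, spent = totals_from_rows(rows)
print(received, spent)  # 800 200
```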
Provides summary statistics for a given cryptocurrency address cluster.
- cluster: GraphSense internal cluster identifier
- first_tx: the first transaction an address of this cluster has been involved in
- last_tx: the most recent transaction an address of this cluster has been involved in
- no_addresses: the number of addresses in this cluster
- no_incoming_txs: the number of transactions using cluster addresses as input
- no_outgoing_txs: the number of transactions using cluster addresses as output
- in_degree: the number of incoming cluster graph edges
- out_degree: the number of outgoing cluster graph edges
- total_received: total amount of currency units received by the cluster
- total_spent: total amount of currency units spent by the cluster
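As an illustration of the no_addresses field, and of the no_clusters summary statistic that counts only clusters with more than one address, the following sketch aggregates address-to-cluster assignments. The assignments are hypothetical sample data, and this is not the pipeline's actual computation.

```python
from collections import Counter

# Hypothetical (address, cluster) assignments.
assignments = [
    ("addrA", 1),
    ("addrB", 1),
    ("addrC", 2),
]

# no_addresses per cluster: how many addresses map to each cluster id.
no_addresses = Counter(cluster for _, cluster in assignments)

# no_clusters: clusters with more than one address.
no_clusters = sum(1 for size in no_addresses.values() if size > 1)

print(dict(no_addresses))  # {1: 2, 2: 1}
print(no_clusters)         # 1
```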
Statistical summary of addresses contained in a cluster.
- cluster: GraphSense internal cluster identifier
- address: cryptocurrency address contained in a cluster
- first_tx: the first transaction the address has been involved in
- last_tx: the most recent transaction the address has been involved in
- no_incoming_txs: the number of transactions using this address as input
- no_outgoing_txs: the number of transactions using this address as output
- in_degree: the number of incoming address graph edges
- out_degree: the number of outgoing address graph edges
- total_received: total amount of currency units received by the address
- total_spent: total amount of currency units spent by the address
This table follows the same structure as address_incoming_relations, with src and dst nodes being cluster nodes instead of addresses.
Same as cluster_incoming_relations but in the opposite direction (src_cluster and dst_cluster switched).
Same structure as address_tags, with additional cluster identifiers for addresses.