Go snowflake #1

Open
wants to merge 13 commits into
base: empty_for_pr
from
9 changes: 9 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
/.project
ruby_snowflake_client.h
ruby_snowflake_client.so
/.rakeTasks
.idea/*

# ruby gems
*.gem
/.DS_Store
6 changes: 6 additions & 0 deletions Gemfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# frozen_string_literal: true

source 'https://rubygems.org'

# Specify your gem's dependencies in the gemspec
gemspec
22 changes: 22 additions & 0 deletions Gemfile.lock
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
PATH
  remote: .
  specs:
    ruby_snowflake_client (0.2.4-x86_64-darwin-18)
      ffi

GEM
  remote: https://rubygems.org/
  specs:
    ffi (1.11.3)
    rake (13.0.1)

PLATFORMS
  ruby

DEPENDENCIES
  bundler
  rake
  ruby_snowflake_client!

BUNDLED WITH
   1.17.3
21 changes: 21 additions & 0 deletions LICENSE.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2015 Dotan Nahum

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
91 changes: 91 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
# Snowflake Connector for Ruby

Uses [gosnowflake](https://github.com/snowflakedb/gosnowflake/) to query Snowflake more efficiently than ODBC. We found
at least two significant problems with ODBC that this gem resolves:
1. For large result sets, ODBC got progressively slower per row because it retrieved all the preceding
   pages in order to work out the offset. This gem uses a streaming interface, which removes the need for
   offset and limit when paging through result sets.
2. ODBC mangled timezone information.

In addition, this gem is a lot faster for all but the most trivial queries.
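
To make the first problem concrete, here is a rough cost model, not a measurement. It assumes that fetching page `k` with offset paging re-reads every preceding page, as described above; both function names are hypothetical and exist only for this illustration.

```ruby
# Rows walked when every page fetch re-reads all preceding pages
# (an assumed model of the ODBC behaviour described above).
def rows_touched_with_offset_paging(total_rows, page_size)
  pages = (total_rows.to_f / page_size).ceil
  (1..pages).sum { |page| [page * page_size, total_rows].min }
end

# A streaming cursor reads each row exactly once.
def rows_touched_with_streaming(total_rows)
  total_rows
end

rows_touched_with_offset_paging(1_000_000, 10_000) # => 50_500_000
rows_touched_with_streaming(1_000_000)             # => 1_000_000
```

Under this model the work for offset paging grows roughly quadratically with the result size, while streaming stays linear.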

## Tech Overview

This gem works by deserializing each row into an array of strings in Go. It then converts that to an array
of C strings (`**C.char`), which it passes back through the FFI (foreign function interface) to Ruby.
There's a slight penalty for the four type conversions (from the db type to Go string, from Go string
to C string, from C string to Ruby string, and then from Ruby string to your intended type).

## How to use

See the [examples](https://github.com/dmitchell/go-ruby-snowflake-connector/blob/master/examples).

1. Add the gem to your project (`gem 'ruby_snowflake_client', '~> 0.2.2'`).
2. Put `require 'go_snowflake_client'` at the top of each file that uses it.
3. Following the pattern of the [example connect](https://github.com/dmitchell/go-ruby-snowflake-connector/blob/master/examples/table_crud.rb),
   call `GoSnowflakeClient.connect` with your database information and credentials.
4. Use `GoSnowflakeClient.exec` to execute create, update, delete, and insert queries. If it
   returns `nil`, call `GoSnowflakeClient.last_error` to get the error. Otherwise, it returns
   the number of affected rows.
5. Use `GoSnowflakeClient.select` with a block, which is called for each row, to query the database. It
   returns either `nil` or an error string.
6. Finally, call `GoSnowflakeClient.close(db_pointer)` to close the database connection.
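
The steps above can be sketched as a small wrapper that always closes the connection. This is only an illustration: `with_snowflake_connection` and `exec!` are hypothetical helpers, not part of the gem, and the `client` parameter stands in for `GoSnowflakeClient` so the pattern can be tried without a live database. It assumes only the behaviour described in the list (`connect` giving a falsy value on failure, `exec` returning `nil` on error, `last_error`, and `close`).

```ruby
# Hypothetical helper (not part of the gem): opens a connection, yields it,
# and closes it even if the block raises. `client` is expected to respond to
# connect, last_error, and close the way GoSnowflakeClient is described above.
def with_snowflake_connection(client, *connect_args)
  db = client.connect(*connect_args)
  raise "connect failed: #{client.last_error}" unless db

  begin
    yield db
  ensure
    client.close(db)
  end
end

# Hypothetical helper: exec returns nil on failure, so turn that into an
# exception carrying last_error; otherwise return the affected-row count.
def exec!(client, db, sql)
  result = client.exec(db, sql)
  raise "exec failed: #{client.last_error}" if result.nil?

  result
end
```

With the real client, the connect arguments would follow the order used in the examples: account, warehouse, database, schema, user, password, role.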

### Our use pattern

In our application, we've wrapped this library with query generators and model definitions, somewhat à la
Rails but with less dynamic introspection. We could add that introspection with something like
``` ruby
GoSnowflakeClient.select(db, 'describe table my_table') do |col_name, col_type, _, nullable, *_|
  my_table.add_column_description(col_name, col_type, nullable)
end
```

Each snowflake model class inherits from an abstract class which instantiates model instances
from each query by a pattern like
``` ruby
GoSnowflakeClient.select(db, query) do |row|
  entry = self.new(fields.zip(row).map { |field, value| cast(field, value) }.to_h)
  yield entry
end

# Returns a [field_name, cast_value] pair; nil values pass through uncast.
# String#to_time, String#to_date and with_indifferent_access come from ActiveSupport;
# :to_bool is assumed to be a String extension defined by the application.
def cast(field_name, value)
  if value.nil?
    [field_name, value]
  elsif column_name_to_cast.include?(field_name)
    cast_method = column_name_to_cast[field_name]
    if cast_method == :to_time
      [field_name, value.to_time(:local)]
    elsif cast_method == :to_utc
      [field_name, value.to_time(:utc)]
    elsif cast_method == :to_date
      [field_name, value.to_date]
    elsif cast_method == :to_utc_date
      [field_name, value.to_time(:utc).to_date]
    else
      [field_name, value.public_send(cast_method)]
    end
  else
    [field_name, value]
  end
end

# where each model declares COLUMN_NAME_TO_CAST, à la
COLUMN_NAME_TO_CAST = {
  id: :to_i,
  ad_text_id: :to_i,
  is_mobile: :to_bool,
  is_full_site: :to_bool,
  action_element_count: :to_i,
  created_at: :to_time,
  session_idx: :to_i,
  log_idx: :to_i,
  log_date: :to_utc_date}.with_indifferent_access.freeze

def self.column_name_to_cast
  COLUMN_NAME_TO_CAST
end
```

Of course, instantiating an object for each row adds overhead and GC pressure, so it may not always
be a good approach.
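
When that per-row cost matters, one lighter-weight option is to cast values positionally and yield plain arrays instead of model instances. The sketch below is an illustration under assumptions, not code from this gem: `build_row_caster` and the `CASTERS` table are hypothetical names, and it relies only on every non-NULL value arriving as a string, as described in the Tech Overview.

```ruby
require 'time'

# Hypothetical cast table: column name => callable applied to the raw string.
CASTERS = {
  'id'         => ->(v) { Integer(v) },
  'is_mobile'  => ->(v) { v == 'true' }, # assumes booleans arrive as 'true'/'false'
  'created_at' => ->(v) { Time.parse(v) }
}.freeze

# Looks up the caster for each column once per result set, then returns a
# lambda that casts a row in place of building a model object.
# Columns without a caster, and nil values, pass through unchanged.
def build_row_caster(fields, casters = CASTERS)
  per_column = fields.map { |field| casters[field] }
  lambda do |row|
    row.each_with_index.map do |value, i|
      caster = per_column[i]
      value.nil? || caster.nil? ? value : caster.call(value)
    end
  end
end

caster = build_row_caster(%w[id is_mobile note])
caster.call(['42', 'true', nil]) # => [42, true, nil]
```

The returned lambda could be called from inside the `GoSnowflakeClient.select` block, so the only per-row allocation is the cast array itself.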
3 changes: 3 additions & 0 deletions Rakefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# frozen_string_literal: true

require 'bundler/gem_tasks'
34 changes: 34 additions & 0 deletions examples/common_sample_interface.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
# frozen_string_literal: true

$LOAD_PATH << File.expand_path('..', __dir__)
require 'lib/go_snowflake_client'
require 'logger'

class CommonSampleInterface
  attr_reader :db_pointer

  def initialize(database)
    @logger = Logger.new(STDERR)

    @db_pointer = GoSnowflakeClient.connect(
      ENV['SNOWFLAKE_TEST_ACCOUNT'],
      ENV['SNOWFLAKE_TEST_WAREHOUSE'],
      database,
      ENV['SNOWFLAKE_TEST_SCHEMA'] || 'TPCDS_SF10TCL',
      ENV['SNOWFLAKE_TEST_USER'],
      ENV['SNOWFLAKE_TEST_PASSWORD'],
      ENV['SNOWFLAKE_TEST_ROLE'] || 'PUBLIC'
    )

    log_error unless @db_pointer
  end

  def close_db
    GoSnowflakeClient.close(@db_pointer) if @db_pointer
  end

  def log_error
    @logger ||= Logger.new(STDERR)
    @logger.error(GoSnowflakeClient.last_error)
  end
end
37 changes: 37 additions & 0 deletions examples/snowflake_sample_data.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
# frozen_string_literal: true

require_relative 'common_sample_interface.rb' # Runs read-only queries against the sample data; creates no tables
# Assumes you have access to snowflake_sample_data https://docs.snowflake.net/manuals/user-guide/sample-data.html
# Set env vars: SNOWFLAKE_TEST_ACCOUNT, SNOWFLAKE_TEST_USER, SNOWFLAKE_TEST_PASSWORD, SNOWFLAKE_TEST_WAREHOUSE
# optionally set SNOWFLAKE_TEST_SCHEMA, SNOWFLAKE_TEST_ROLE

class SnowflakeSampleData < CommonSampleInterface
  def initialize
    super('SNOWFLAKE_SAMPLE_DATA')
  end

  def get_customer_names(where = "c_last_name = 'Flowers'")
    raise('db not connected') unless @db_pointer

    query = 'select c_first_name, c_last_name from "CUSTOMER"'
    query += " where #{where}" if where

    GoSnowflakeClient.select(@db_pointer, query) { |row| @logger.info("#{row[0]} #{row[1]}") }
  end

  # @example process_unshipped_web_sales {|row| check_shipping_queue(row)}
  def process_unshipped_web_sales(limit = 1_000, &block)
    raise('db not connected') unless @db_pointer

    query = <<~QUERY
      select c_first_name, c_last_name, ws_sold_date_sk, ws_list_price
      from "CUSTOMER"
      inner join "WEB_SALES"
        ON c_customer_sk = ws_bill_customer_sk
      where ws_ship_date_sk is null
      #{"limit #{limit}" if limit}
    QUERY

    GoSnowflakeClient.select(@db_pointer, query, &block)
  end
end
47 changes: 47 additions & 0 deletions examples/table_crud.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# frozen_string_literal: true

require_relative 'common_sample_interface.rb' # Creates/uses the TEST_TABLE temp table in the db you point to
# Set env vars: SNOWFLAKE_TEST_ACCOUNT, SNOWFLAKE_TEST_USER, SNOWFLAKE_TEST_PASSWORD, SNOWFLAKE_TEST_WAREHOUSE, SNOWFLAKE_TEST_DATABASE
# optionally set SNOWFLAKE_TEST_SCHEMA, SNOWFLAKE_TEST_ROLE
# use GoSnowflakeClient.select(c.db_pointer, 'select * from test_table', field_count: 3).to_a to see the db contents
class TableCRUD < CommonSampleInterface
  TEST_TABLE_NAME = 'TEST_TABLE'

  def initialize
    super(ENV['SNOWFLAKE_TEST_DATABASE'])
  end

  def create_test_table
    command = <<~COMMAND
      CREATE TEMP TABLE IF NOT EXISTS #{TEST_TABLE_NAME}
      (id int AUTOINCREMENT NOT NULL,
      some_timestamp TIMESTAMP_TZ DEFAULT CURRENT_TIMESTAMP(),
      a_string string(20))
    COMMAND
    result = GoSnowflakeClient.exec(@db_pointer, command)
    result || log_error
  end

  # @example insert_test_table([['2019-07-04 04:12:31 +0000', 'foo'],['2019-07-04 04:12:31 -0600', 'bar'],[Time.now, 'quux']])
  def insert_test_table(time_string_pairs)
    command = <<~COMMAND
      INSERT INTO #{TEST_TABLE_NAME} (some_timestamp, a_string)
      VALUES #{time_string_pairs.map { |time, text| "('#{time}', '#{text}')" }.join(', ')}
    COMMAND
    result = GoSnowflakeClient.exec(@db_pointer, command)
    result || log_error
  end

  # @example update_test_table([[1, 'foo'],[99, 'bar'],[31, 'quux']])
  def update_test_table(id_string_pairs)
    id_string_pairs.map do |id, text|
      command = <<~COMMAND
        UPDATE #{TEST_TABLE_NAME}
        SET a_string = '#{text}'
        WHERE id = #{id}
      COMMAND
      result = GoSnowflakeClient.exec(@db_pointer, command)
      result || log_error
    end
  end
end
6 changes: 6 additions & 0 deletions ext/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
build:
	go build -buildmode=c-shared -o ruby_snowflake_client.so ruby_snowflake.go
clean:
install:

.PHONY: build clean install
Empty file added ext/extconf.rb
Empty file.