Skip to content

Conversation

@luandy64
Copy link

Description of change

This PR fixes what appears to be a bug from #192.

Here is the stacktrace

CRITICAL 'NoneType' object has no attribute 'utcoffset'
Traceback (most recent call last):
  File "/usr/local/share/virtualenvs/tap-hubspot/bin/tap-hubspot", line 33, in <module>
    sys.exit(load_entry_point('tap-hubspot', 'console_scripts', 'tap-hubspot')())
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 1164, in main
    raise exc
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 1161, in main
    main_impl()
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 1155, in main_impl
    do_sync(STATE, args.properties)
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 1040, in do_sync
    STATE = stream.sync(STATE, ctx) # pylint: disable=not-callable
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 505, in sync_contacts
    bookmark_values[row['vid']] = utils.strftime(modified_time)
  File "/usr/local/share/virtualenvs/tap-hubspot/lib/python3.9/site-packages/singer/utils.py", line 41, in strftime
    if dtime.utcoffset() != datetime.timedelta(0):
AttributeError: 'NoneType' object has no attribute 'utcoffset'

Diagnosis

The sync code flow seems to be using sync_contact_vids in order to collect a page of vids from the /contacts/v1/lists/all/contacts/all endpoint.

Those get passed off to _sync_contact_vids in order to be sent to the /contacts/v1/contact/vids/batch endpoint to get more details for specific vids.

The bug seems to be that we assume every object in the first response has a versionTimestamp key. So when we initialize modified_time = None on Line 494, nothing gives it a new value before we try bookmark_values[row['vid']] = utils.strftime(modified_time), leading to our stack trace

Rationale for change

I saw this line in _sync_contact_vids which makes me think that we don't always expect bookmark_values to have the vid. So the change in this PR is to not convert the a value to a datetime when we don't have a value.

Manual QA steps

  • I wrote a unit test that reproduces the stack trace on master
  • The test mocks the response of the first endpoint to return records that have and don't have a versionTimestamp. Then it mocks the second response to be super minimal. The actual test is that records in the second response are supplemented with the versionTimestamp of the first response, if we have it.

Risks

  • Low: The first API response seems to get paginated in full, so we shouldn't miss data from it. Now if you have a versionTimestamp, it gets attached to the objects in the second response instead of killing the tap.
    • It makes you wonder if this versionTimestamp is the right choice as a replication key though since it can be null.

Rollback steps

  • revert this branch

@techieshark
Copy link

If this PR was approved 2 years ago, is there a reason it hasn't been merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants