
Docs fail to document thread-safety of genai client, and it fails irrecoverably on multi-threaded use #211

Closed
@jpdaigle


Description of the bug:

The Generative Service Client and GenerativeModel classes don't document their thread-safety assumptions, and don't appear to be usable in a multithreaded environment for making concurrent API requests.

I'd suggest either:

  • documenting thread safety assumptions and guarantees, or
  • investigating behaviour when a client is shared between threads

Behaviour observed: when making concurrent calls to the generative text API, most calls failed with a 60s read timeout. The client never recovered: every subsequent call attempt also froze for 60s and then timed out with an error.

Sample error output:

 10%|▉         | 199/2047.0 [29:31<5:46:51, 11.26s/it]
HTTPConnectionPool(host='localhost', port=46423): Read timed out. (read timeout=60.0)
 10%|▉         | 204/2047.0 [30:22<4:59:27,  9.75s/it]
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
 10%|█         | 209/2047.0 [31:10<6:08:26, 12.03s/it]
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
 11%|█         | 216/2047.0 [31:43<3:43:00,  7.31s/it]
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
 11%|█         | 225/2047.0 [32:48<3:52:42,  7.66s/it]
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
 11%|█▏        | 231/2047.0 [33:38<4:22:00,  8.66s/it]
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
 12%|█▏        | 245/2047.0 [35:55<6:14:28, 12.47s/it]
HTTPConnectionPool(host='localhost', port=46423): Read timed out. (read timeout=60.0)
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
 14%|█▍        | 296/2047.0 [43:38<4:30:46,  9.28s/it]
HTTPConnectionPool(host='localhost', port=46423): Read timed out. (read timeout=60.0)

Example snippet:

# [... regular imports ...]
from concurrent.futures import ThreadPoolExecutor
import tqdm
import google.generativeai as genai

# build_prompt() and totalbatches are defined elsewhere in the script
safety_settings = ...
executor = ThreadPoolExecutor(max_workers=5)

def build_data_batch():
  ## build batches of data to process
  pass


def generate(data_batch):
  model_out = 'error'
  try:
    # This ends up failing whether the model client is created
    # fresh per request or shared across threads.
    model = genai.GenerativeModel('models/gemini-pro', safety_settings=safety_settings)
    model_out = model.generate_content(build_prompt(data_batch)).text
  except Exception as e:
    print(e)
  return model_out

all_outputs = executor.map(generate, build_data_batch())

with open('./outputs.txt', 'w') as f:
  for result in tqdm.tqdm(all_outputs, total=totalbatches):
    f.write(result)
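For comparison, the two patterns I'd expect to isolate the problem are (1) serializing every call through a single lock and (2) giving each worker thread its own client via threading.local, so no client object is ever shared between threads. A minimal sketch of both follows; fake_generate is my stand-in for the real model.generate_content(...).text call (which is network-bound), so this runs standalone. These are diagnostics, not a confirmed fix:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stub standing in for model.generate_content(...).text, so the
# sketch runs without network access.
def fake_generate(data_batch):
    return f'out:{data_batch}'

# Pattern 1: serialize every call through one lock. If this works
# while the unlocked version hangs, the client is not safe for
# concurrent use.
api_lock = threading.Lock()

def generate_locked(data_batch):
    with api_lock:
        return fake_generate(data_batch)

# Pattern 2: one client per worker thread via threading.local, so
# no client instance crosses a thread boundary.
tls = threading.local()

def generate_thread_local(data_batch):
    if not hasattr(tls, 'model'):
        # In the real code this would be something like:
        # tls.model = genai.GenerativeModel('models/gemini-pro', ...)
        tls.model = fake_generate
    return tls.model(data_batch)

with ThreadPoolExecutor(max_workers=5) as executor:
    locked = list(executor.map(generate_locked, range(10)))
    local = list(executor.map(generate_thread_local, range(10)))
```

Note that since the failure reproduces even with a fresh client per request, pattern 2 may not help either; pattern 1 at least distinguishes a thread-safety bug from a connection-pool or transport issue.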


Actual vs expected behavior:

Actual: all calls fail.

Expected: this case should either work, or the client docs should explicitly state that the client is not safe for concurrent use, given how common batch-inference scenarios are likely to be.

Any other information you'd like to share?

No response
