Service mars has timed out #39

Closed
nicrie opened this issue Jan 31, 2022 · 3 comments

@nicrie

nicrie commented Jan 31, 2022

I'm trying to download data from MARS via Metview, but my request always seems to time out before it is processed. Is there a way to increase the timeout period? Would it help to reduce the size of the requested file, i.e. to request smaller batches of data? The file I'm currently requesting is about 6 GB.

Below is the code I use to download the data:

import numpy as np
import pandas as pd
import metview as mv


# Issue dates for ECMWF forecasts
first_forecast = '2015-01-01'
last_forecast = '2021-07-31'
weekmask = 'Mon Thu'
forecast_dates = pd.bdate_range(
    first_forecast,
    last_forecast,
    freq='C',
    weekmask=weekmask
)
steplist = ["0-168", "168-336", "336-504", "504-672"]

# Retrieve data
fs_data = mv.retrieve(
    Class="od",
    date=forecast_dates.strftime('%Y%m%d').values.tolist(),
    time='00',  # forecast_dates.strftime('%H').values.tolist(),
    expver=1,
    levtype="sfc",
    param="228.131",
    quantile=["1:3", "2:3", "3:3", "1:10", "2:10", "3:10", "4:10", "5:10", "6:10", "7:10", "8:10", "9:10", "10:10"],
    step=steplist,
    grid=[1, 1],
    stream="enfo",
    type="pd"
)
mv.write("fcst-prob-quantile.grib", fs_data)
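
On the smaller-batches idea, here is a minimal sketch of how the same date list could be split by calendar year so that each year goes into its own request. It only builds the batches and does not call Metview. The per-year grouping and the output file name pattern in the comments are assumptions for illustration, and whether smaller requests actually avoid the timeout is untested here.

```python
from collections import defaultdict

import pandas as pd

# Same Monday/Thursday issue dates as in the script above
forecast_dates = pd.bdate_range(
    "2015-01-01",
    "2021-07-31",
    freq="C",
    weekmask="Mon Thu",
)

# Group the dates by calendar year (hypothetical batching choice)
batches = defaultdict(list)
for d in forecast_dates:
    batches[d.year].append(d.strftime("%Y%m%d"))

for year, dates in sorted(batches.items()):
    # 13 quantiles x 4 step ranges per date, as in the original request
    print(year, len(dates), "dates,", len(dates) * 13 * 4, "fields")
    # Each batch could then be retrieved on its own, e.g.
    #   fs = mv.retrieve(Class="od", date=dates, ...)   # other keys unchanged
    #   mv.write(f"fcst-prob-quantile-{year}.grib", fs)  # hypothetical file name
```

The full date list has 687 entries, which with 13 quantiles and 4 step ranges gives the 35724 fields reported in the log below.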

Output from Metview:

MetviewInvoker: Invoking Metview
Starting Metview using these command args:
['metview', '-nocreatehome', '-slog', '-python-serve', '/tmp/tmpidvpezc0', '63210']
Metview 5.13.1  (2021.09.16) @ s2s
Installed in /home/nrieger/anaconda3/envs/ecmwf/lib/metview-bundle
event - INFO   - 20220131.143051 - Starting server: port is 34825
event - INFO   - 20220131.143051 - NOTE: $EVENT_HOST is now 'localhost'!
event - INFO   - 20220131.143051 - Got connection
event - INFO   - 20220131.143051 - incoming host is localhost (127.0.0.1)
event - INFO   - 20220131.143051 - Register: PythonServe nrieger s2s 63247 as ref 1
event - INFO   - 20220131.143051 - Maximum value for PythonServe : 1 
event - INFO   - 20220131.143051 - Got connection
event - INFO   - 20220131.143051 - incoming host is localhost (127.0.0.1)
event - INFO   - 20220131.143051 - Register: /Process@63210/PythonScript, line 0:1 nrieger s2s 63210 as ref 3
event - INFO   - 20220131.143051 - Alive value for /Process@63210/PythonScript, line 0:1 : on 
event - INFO   - 20220131.143051 - Starting service: mars
event - INFO   - 20220131.143051 - With command    : " /home/nrieger/anaconda3/envs/ecmwf/lib/metview-bundle/bin/Mars"
event - INFO   - 20220131.143051 - Got connection
event - INFO   - 20220131.143051 - incoming host is localhost (127.0.0.1)
event - INFO   - 20220131.143051 - Register: mars nrieger s2s 63260 as ref 5
event - INFO   - 20220131.143051 - mars : Timeout = 300 Alive = 0, Max = 1
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
event - INFO   - 20220131.143051 - Maximum value for mars : 1 
Mars - INFO   - 20220131.143051 - package mir version: 1.9.6
event - INFO   - 20220131.143051 - Got connection
event - INFO   - 20220131.143051 - incoming host is localhost (127.0.0.1)
event - INFO   - 20220131.143051 - Register: mars@63272 nrieger s2s 63272 as ref 6
event - INFO   - 20220131.143051 - Starting service: pool
event - INFO   - 20220131.143051 - With command    : " /home/nrieger/anaconda3/envs/ecmwf/lib/metview-bundle/bin/pool"
event - INFO   - 20220131.143051 - Got connection
event - INFO   - 20220131.143051 - incoming host is localhost (127.0.0.1)
event - INFO   - 20220131.143051 - Register: pool nrieger s2s 63274 as ref 8
event - INFO   - 20220131.143051 - pool : Timeout = 86400 Alive = 0, Max = 0
pool - INFO   - 20220131.143051 - [mars@63272] - Object /Process@63210/PythonScript, line 0:0 [RETRIEVE] not cached
--> Retrieve::serve in -->
MARS HOME:           
MARS LANGUAGE FILE:  /home/nrieger/anaconda3/envs/ecmwf/lib/metview-bundle/share/metview/etc/MarsDef
MARS RULES FILE:     /home/nrieger/anaconda3/envs/ecmwf/lib/metview-bundle/share/metview/etc/MarsRules

Mars - INFO   - 20220131.143051 - Requesting 35724 fields
Mars - INFO   - 20220131.143051 - ECMWF API is at https://api.ecmwf.int/v1
Mars - INFO   - 20220131.143051 - Using MARS service at services/mars/
Mars - INFO   - 20220131.143052 - ECMWF user id is 'monr'
Mars - INFO   - 20220131.143052 - In case of problems, please check https://confluence.ecmwf.int/display/WEBAPI/Web+API+FAQ or contact servicedesk@ecmwf.int
Mars - INFO   - 20220131.143052 - Request ID is 61f7f29c921bdfd281bb91fa
Mars - INFO   - 20220131.143052 - Request is submitted
Mars - INFO   - 20220131.143053 - Request is queued
event - INFO   - 20220131.143554 - Service mars has timed out
event - INFO   - 20220131.143554 - Closing service mars
event - INFO   - 20220131.143554 - 0 outstanding replies
Mars - FATAL  - 20220131.143554 - Server localhost port 34825 is dead
@iainrussell
Member

Hi @nicrie,
Not sure if this is exactly the same as #41, but what you can try is setting the environment variable
METVIEW_TIMEOUT=120
for example, for a 120-minute timeout instead of the default 5 minutes. I'm trying to reproduce this just now on my system (with the default 5-minute timeout), and although it says it has timed out, it is still running, so I'll check it again later to see if it got anywhere...
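
From a Python script, one way to apply this is sketched below. It assumes that Metview reads METVIEW_TIMEOUT when it is started, which in the Python interface happens at `import metview`, so the variable is set before the import. Treat that ordering as an assumption rather than documented behaviour; exporting the variable in the shell before launching Python should have the same effect.

```python
import os

# Assumption: the value is in minutes and is read when Metview starts,
# so it has to be in the environment before `import metview` runs.
os.environ["METVIEW_TIMEOUT"] = "120"

# import metview as mv  # import only after the variable has been set

print("METVIEW_TIMEOUT =", os.environ["METVIEW_TIMEOUT"])
```

If the setting is picked up, the startup log line `mars : Timeout = 300` (seconds) would be expected to show a larger value.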

@nicrie
Author

nicrie commented Feb 11, 2022

Thanks @iainrussell for pointing to #41.
You're probably right and it is due to the default timeout. The above script finally worked after several tries even without changing the timeout, but I forgot to mention it here. I consider the issue solved. If the issue pops up again I'll try the timeout solution :) Thank you!

@nicrie nicrie closed this as completed Feb 11, 2022
@iainrussell
Member

Cheers @nicrie, thanks for letting me know! Have a good weekend!
