Add some of the helper functions from experiment scripts into the package #48

Open
abigailsnyder opened this issue Mar 30, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@abigailsnyder
Collaborator

e.g.

# helper function to pull the GSAT data of the specific ensemble members in
# a target data frame:
def get_orig_GSAT_data(target_df): 
    ...
# Function to remove any ensemble members from a target data frame that
# stop before 2099, for example, ending in 2014 like some MIROC6 SSP245:
def prep_target_data(target_df):
    ...
@abigailsnyder abigailsnyder added the enhancement New feature or request label Mar 30, 2022
@abigailsnyder
Collaborator Author

Turn the code block that pulls the first 5 (N) ensemble members, in numerical order, from a target set into a package function:

import pandas as pd

# We will target the first 5 numerical ensemble members.
# Not all models start the ensemble count at 1,
# and not all experiments of a given model report the
# same realizations.
# Select 5 ensemble realizations to
# look at if there are more than 5.

# Slice each ensemble label up to (not including) its first 'i', e.g. 'r10i1p1f1' -> 'r10'.
f = lambda x: x.ensemble[:x.idx]

if not target_data.empty:
    ensemble_list = pd.DataFrame({'ensemble': target_data["ensemble"].unique()})
    # Position of the first 'i' in each label (raises if a label has no 'i').
    ensemble_list['idx'] = ensemble_list['ensemble'].str.index('i')
    ensemble_list['ensemble_id'] = ensemble_list.apply(f, axis=1)
    # Drop the leading 'r' and convert the realization number to an integer.
    ensemble_list['ensemble_id'] = ensemble_list['ensemble_id'].str[1:].astype(int)
    ensemble_list = ensemble_list.sort_values('ensemble_id').copy()
    if len(ensemble_list) > 5:
        ensemble_keep = ensemble_list.iloc[0:5].ensemble
    else:
        ensemble_keep = ensemble_list.ensemble

    target_data = target_data[target_data['ensemble'].isin(ensemble_keep)].copy()
    del ensemble_keep
    del ensemble_list
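
One possible shape for that package function is sketched below. The name `select_first_n_ensembles` and the parameter `n` are hypothetical and not part of the package; the sketch also assumes every ensemble label starts with `r<number>i`, as in `r1i1p1f1`.

```python
import pandas as pd


def select_first_n_ensembles(target_data, n=5):
    """Keep only the rows of the n lowest-numbered ensemble realizations.

    Hypothetical helper: assumes an 'ensemble' column with labels that
    start with 'r<number>i', such as 'r1i1p1f1'.
    """
    if target_data.empty:
        return target_data.copy()

    ensembles = pd.Series(target_data["ensemble"].unique())
    # Pull out the realization number between the leading 'r' and the first 'i'.
    realization = ensembles.str.extract(r"^r(\d+)i", expand=False).astype(int)
    # Sort numerically (so 'r10' comes after 'r2') and keep the first n labels.
    keep_idx = realization.sort_values(kind="stable").index[:n]
    keep = ensembles.loc[keep_idx]

    return target_data[target_data["ensemble"].isin(keep)].copy()
```

Usage would then reduce to `target_data = select_first_n_ensembles(target_data, n=5)`, with no temporary variables left to delete.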

@abigailsnyder
Collaborator Author

Filtering one data frame based on another data frame:

# Keep only the entries that appeared in pangeo_good_ensembles:
keys = ['model', 'experiment', 'ensemble']
i1 = data.set_index(keys).index
i2 = pangeo_good_ensembles.set_index(keys).index
data = data[i1.isin(i2)].copy()
del i1
del i2
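
Wrapped as a function, this could look roughly like the sketch below. The name `filter_by_keys` and its signature are hypothetical and not part of the package; it uses the same index-matching approach as the snippet above.

```python
import pandas as pd


def filter_by_keys(data, reference, keys=("model", "experiment", "ensemble")):
    """Keep the rows of `data` whose key columns also appear in `reference`.

    Hypothetical helper: both data frames are assumed to contain every
    column named in `keys`.
    """
    keys = list(keys)
    # Build an index from the key columns of each frame and match them row by row.
    data_index = data.set_index(keys).index
    reference_index = reference.set_index(keys).index
    return data[data_index.isin(reference_index)].copy()
```

With this in place, the block above would become `data = filter_by_keys(data, pangeo_good_ensembles)`.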
