Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

offset >= 100000 not accepted by API but 100001 is !!! #40

Closed
sebdalgarno opened this issue Dec 2, 2020 · 13 comments
Closed

offset >= 100000 not accepted by API but 100001 is !!! #40

sebdalgarno opened this issue Dec 2, 2020 · 13 comments

Comments

@sebdalgarno
Copy link
Member

No description provided.

@joethorley
Copy link
Member

Yes this is from the API - I confirm its still happening with a test

@joethorley joethorley changed the title offset >= 100000 not accepted by API offset >= 100000 not accepted by API but 100001 is !!! Sep 4, 2021
@joethorley
Copy link
Member

test_that("fwa_collection offset errors at 100,000", {
  collectionid <- "whse_basemapping.fwa_stream_networks_sp"

  expect_silent(fwa_collection(collectionid, offset = 99999, limit = 1))
  expect_error(fwa_collection(collectionid, offset = 100000, limit = 1),
               "Invalid value for parameter offset")
})

test_that("fwa_collection offset works at 100,001 !!!!", {
  collectionid <- "whse_basemapping.fwa_stream_networks_sp"
  expect_silent(fwa_collection(collectionid, offset = 100001, limit = 1))
})

@joethorley
Copy link
Member

@smnorris - what do you think of this behaviour

@joethorley
Copy link
Member

For now I've added a check on offset < 100000 because I'm not sure that the result returned with offset 100001 is valid!

@smnorris
Copy link

smnorris commented Sep 7, 2021

Behaviour is weird! I'll see if I can debug.

@smnorris
Copy link

smnorris commented Sep 7, 2021

Well, they provide a response anyway. The requests aren't very useful because they have to be ordered by something for the offset to work reliably.

Ordering by linear_feature_id should fix this... but the server returns a different feature each time below request is made, the orderBy doesn't seem to be working:

https://www.hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_stream_networks_sp/items.html?orderBy=linear_feature_id&limit=1&offset=100000

This is the log output from pg_featureserv

INFO[0270] 127.0.0.1:36858 GET /collections/whse_basemapping.fwa_stream_networks_sp/items.json?orderBy=linear_feature_id&limit=1&offset=100000
DEBU[0270] Features query: SELECT ST_AsGeoJSON( ST_Transform( "geom", 4326)  ) AS _geojson , "linear_feature_id","watershed_group_id","edge_type","blue_line_key","watershed_key","fwa_watershed_code"::text,"local_watershed_code"::text,"watershed_group_code"::text,"downstream_route_measure","length_metre","feature_source"::text,"gnis_id","gnis_name"::text,"left_right_tributary"::text,"stream_order","stream_magnitude","waterbody_key","blue_line_key_50k","watershed_code_50k"::text,"watershed_key_50k","watershed_group_code_50k"::text,"gradient","feature_code"::text,"wscode_ltree"::text,"localcode_ltree"::text,"upstream_route_measure","upstream_area_ha" FROM "whse_basemapping"."fwa_stream_networks_sp"     LIMIT 1 OFFSET 10000;
DEBU[0270] Database query result: 1 rows in 383.705893ms
DEBU[0270] ---- Request complete in 385.779509ms

@smnorris
Copy link

smnorris commented Sep 7, 2021

The documentation I was referring to (https://access.crunchydata.com/documentation/pg_featureserv/latest/usage/query_data/) refers to the wrong parameter name, it should be sortBy, not orderBy - this is getting closer:

https://www.hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_stream_networks_sp/items.html?sortBy=linear_feature_id&limit=1&offset=11000

The issue seems to be that the max offset value input to the query from the server is 10k. This is the log output from this request:

INFO[1020] 127.0.0.1:36904 GET /collections/whse_basemapping.fwa_stream_networks_sp/items.json?sortBy=linear_feature_id&limit=1&offset=11000
DEBU[1020] Features query: SELECT ST_AsGeoJSON( ST_Transform( "geom", 4326)  ) AS _geojson , "linear_feature_id","watershed_group_id","edge_type","blue_line_key","watershed_key","fwa_watershed_code"::text,"local_watershed_code"::text,"watershed_group_code"::text,"downstream_route_measure","length_metre","feature_source"::text,"gnis_id","gnis_name"::text,"left_right_tributary"::text,"stream_order","stream_magnitude","waterbody_key","blue_line_key_50k","watershed_code_50k"::text,"watershed_key_50k","watershed_group_code_50k"::text,"gradient","feature_code"::text,"wscode_ltree"::text,"localcode_ltree"::text,"upstream_route_measure","upstream_area_ha" FROM "whse_basemapping"."fwa_stream_networks_sp"   ORDER BY "linear_feature_id"   LIMIT 1 OFFSET 10000;
DEBU[1020] Database query result: 1 rows in 427.282033ms
DEBU[1020] ---- Request complete in 431.749204ms

I don't think this is a config issue, these config paging params shouldn't affect the offset?

[Paging]
# The default number of features in a response
LimitDefault = 10000
# Maxium number of features in a response
LimitMax = 10000

@joethorley
Copy link
Member

These are the two queries: the first with an offset of 100001 returns an empty data set but the second with a smaller offset of exactly 100000 throws an error about the offset being too big.

I think the first query should also throw an error.

> fwa_query_collection(collection_id, offset = 100001)
Response [https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams]
  Date: 2021-09-17 23:51
  Status: 200
  Content-Type: application/json
  Size: 1.18 kB

Response [https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?sortBy=fwa_stream_networks_label_id&limit=10000&offset=100001]
  Date: 2021-09-17 23:51
  Status: 200
  Content-Type: application/geo+json
  Size: 452 B
<BINARY BODY>
Simple feature collection with 0 features and 0 fields
Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
Geodetic CRS:  WGS 84
# A data frame: 0 × 1
# … with 1 variable: geometry <GEOMETRY [°]>
>  fwa_query_collection(collection_id, offset = 100000)
Response [https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams]
  Date: 2021-09-17 23:51
  Status: 200
  Content-Type: application/json
  Size: 1.18 kB

Response [https://hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?sortBy=fwa_stream_networks_label_id&limit=10000&offset=1e%2B05]
  Date: 2021-09-17 23:51
  Status: 400
  Content-Type: text/plain; charset=utf-8
  Size: 42 B
Invalid value for parameter offset: 1e+05
Error: API request failed [400]: Invalid value for parameter offset: 1e+05
.

@smnorris
Copy link

Not totally sure what the issue is here - both these requests return an empty response:

https://www.hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?limit=1&offset=100000

https://www.hillcrestgeo.ca/fwapg/collections/whse_basemapping.fwa_named_streams/items.json?limit=1&offset=100001

Perhaps error noted above is an encoding issue in the request?
Error: API request failed [400]: Invalid value for parameter offset: 1e+05

Empty responses for these requests are expected, there are only 11,767 records in that table but pg_fs doesn't know that.

To make paging work better I can add a record count meta table that the R client can query (smnorris/fwapg#56), but hopefully it won't be necessary in future versions of pg_fs

@joethorley
Copy link
Member

Hold off on the record count meta table for now - I'll let you know if I think we need it.

@sebdalgarno
Copy link
Member Author

this was an issue on our end caused by conversion of numeric to character for the url string:

d> as.character(100000)
[1] "1e+05"
d> as.character(100001)
[1] "100001"

fixed now in pgfsr with format(x, scientific = FALSE)

@joethorley
Copy link
Member

Hilarious!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants