Commit eb66ef2

mentions rawinventors are gone and expanded CPC example
1 parent 44f14eb commit eb66ef2

File tree

1 file changed

+26
-10
lines changed

vignettes/api-changes.Rmd.orig

+26-10
Original file line numberDiff line numberDiff line change
@@ -87,22 +87,33 @@ Things to note
8787
1. Four of the new endpoints, brf_sum_text, claims, draw_desc_text and detail_desc_text, are marked 'beta' on the Swagger UI page and their data is not fully populated. See this [page](https://search.patentsview.org/docs/docs/Search%20API/TextEndpointStatus) to see what data is currently populated.
8888
1. Currently some endpoints do not return all the attributes listed in the API’s OpenAPI object. Some throw 500 errors when requested (see test-search-pv.R)
8989
1. There are two rel_app_text endpoints, one under patent/ and one under publication/. They return different entities, rel_app_texts (patent) and rel_app_text_publications (publication).
90-
1. attributes went away, some have new names. Most significantly, patent_number is now patent_id. Requesting patent_number will result in an error being thrown. Note also that the CPC related attributes have new names.
9190
1. Some endpoints now return [HATEOAS links](#HATEOAS)
91+
1. Some fields went away, like the rawinventor fields, and some have new names. Most significantly, patent_number is now patent_id; requesting patent_number will result in an error being thrown. Note also that the CPC-related fields have new names; see the next example.
92+
9293

9394
### HATEOAS Links <a name="HATEOAS"></a>
9495
Some of the returned fields are HATEOAS (Hypermedia as the Engine of Application State) links to retrieve more information about that field. Slightly funky is the cpc_current's cpc_group, returned by the patents endpoint. Here the slash in the CPC is turned into a colon. This is a peculiarity of two of the new convenience URLs that shouldn't be noticeable in the R package, unless you are trying to infer the USPC and CPC values from the returned URLs without actually calling back for this data.
9596

96-
Here we'll call the patent endpoint to get cpc_group HATEOAS links:
97+
Here we'll call the patent endpoint to get CPC fields for a particular patent. Some of
98+
the fields, like the cpc_group, are HATEOAS links:
9799

98100
```{r}
99101

100102
library(patentsview)
101103

102-
result <- search_pv('{"patent_id": "11530080"}', fields=c( "cpc_current.cpc_group_id"))
104+
query <- '{"patent_id": "11530080"}'
105+
fields <- c('patent_id', get_fields('patent', groups = 'cpc_current'))
106+
fields
107+
108+
result <- search_pv(query, fields=fields)
103109

104-
# as noted below, the returned attribute is cpc_group, not the requested cpc_group_id
105-
print (result$data$patents$cpc_current[[1]]$cpc_group)
110+
# As noted above, the CPC related fields aren't the same as they were in the
111+
# original version of the API. Also note that not all requested fields were
112+
# returned and that _id-less, HATEOAS fields were returned.
113+
unnested <- unnest_pv_data(result$data)
114+
z <- lapply(names(unnested$cpc_current), function(x) {
115+
print(paste0(x,': ', unnested$cpc_current[[x]][[1]]))
116+
})
106117

107118
```
108119
Note that going to these links in a browser will result in a 403 Forbidden error, as no API key is sent.
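
For readers who do want to infer a CPC value from a returned link without calling back, the slash-to-colon substitution described above can be undone with base R. This is only a sketch: the helper name `cpc_from_link()` and the example URL are assumptions made for illustration and are not part of the patentsview package or confirmed API output.

```r
# Hypothetical helper (not part of the patentsview package): recover the CPC
# group from a HATEOAS link by taking the last path segment and turning the
# colon back into a slash.
cpc_from_link <- function(link) {
  trimmed <- sub("/+$", "", link)            # drop any trailing slash
  last_segment <- sub("^.*/", "", trimmed)   # keep the text after the final slash
  sub(":", "/", last_segment, fixed = TRUE)  # restore the slash in the CPC
}

# Assumed link shape, for illustration only
link <- "https://search.patentsview.org/api/v1/cpc_group/A01B1:00/"
cpc <- cpc_from_link(link)
print(cpc)
```

If the actual links differ from the assumed shape (for example, if they carry query parameters), the pattern would need adjusting.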
@@ -152,6 +163,11 @@ Slight weirdness/sleight of hand where the returned field name loses the _id of
152163
### Throttling <a name="throttling"></a>
153164
The API will now allow 45 requests per minute; making more requests will anger the API. It will send back an error code with a header indicating how many seconds to wait before sending more queries. The R package will take care of this for you: it will sleep for the required number of seconds before resubmitting your query, seamlessly to your script.
154165

166+
This means that queries could take longer to run now. E.g., a query that would
167+
return 100,000 rows now needs at least 100 requests, as each request can return at most 1,000 rows, so the rate limit alone implies a minimum of about 2.2 minutes.
168+
169+
100,000 row result set -> 100 requests x (1 minute/45 requests) = 2.2 minutes
170+
155171
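
The lower bound that the rate limit places on a large query can be estimated with a few lines of R. This is a back-of-the-envelope sketch, not package code: `min_minutes()` is a hypothetical helper, and it assumes only the 45 requests per minute limit and the 1,000 rows per request maximum stated above, ignoring network and server time.

```r
# Hypothetical helper (not part of the patentsview package): minimum run time
# implied by the rate limit alone.
min_minutes <- function(total_rows,
                        rows_per_request = 1000,
                        requests_per_minute = 45) {
  n_requests <- ceiling(total_rows / rows_per_request)  # pages needed
  n_requests / requests_per_minute                      # minutes at the cap
}

# Example: a 250,000 row result set
estimate <- min_minutes(250000)
print(round(estimate, 1))
```

Actual run times will be longer, since each request also takes time to be served.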

156172
### A Note on Paging <a name="a-note-on-paging"></a>
157173
The API team changed how paging works and there is an important subtlety that the R package
@@ -256,12 +272,12 @@ range_query
256272
```
257273
* The matched_subentities_only option went away along with subent_cnts and page
258274
* The R package uses the subdomain and paths for the new version of the API
259-
* A hex logo was created using GuangchuangYu’s [hexSticker](https://github.com/GuangchuangYu/hexSticker)
260275
* The original [ropensci_blog_post](ropensci_blog_post.html) was reworked using the new version of the API
261-
* There's a new [tech note](ropensci_tech_note.html) that could be used when the new version of the R package is ready.
262-
* get_fields and search_pv - throw a specialized error if a plural endpoint is passed?
263-
* have get_endpoints() come back in alphabetical order?
264-
276+
* A hex logo was created using GuangchuangYu’s [hexSticker](https://github.com/GuangchuangYu/hexSticker)
265277

278+
Possible Package Improvements <a name="possible-improvements"></a>
266279

280+
* Have get_fields() and search_pv() throw a specialized error if a plural endpoint is passed
281+
* Add an issue template that warns users not to share their API keys
282+
* There's a new [tech note](ropensci_tech_note.html) that could be used when the new version of the R package is ready.
267283
