Happy Friday everyone! Here's some news from the world of R this week:
- Academic Paper Interfaces: Stephen Turner has written a blog post explaining how to scrape from the bioRxiv API using R. The article explains how to use {httr2} and tidyverse packages to efficiently and cleanly gather metadata and details for over 200,000 preprint publications.
- Reading magic: David Schoch has developed a new package, {paperwizard}, which builds on the Readability.js JavaScript library to make it easy to extract readable content (e.g. news articles) from web pages. It is an add-on to the existing {paperboy} package - check out the article for more, or the source code here.
- CRAN you believe it: Ari Lamstein has written an interesting article discussing the role of CRAN and how it compares to its Python equivalent, PyPI. Many R package maintainers have strong views on the CRAN submission process. Whether you think it is unnecessarily difficult or justifiably robust, I think it is always worth discussing, given how critical CRAN is to the R ecosystem.
- Mock the week: rOpenSci have released v2.0 of the {webmockr} package for stubbing and setting expectations on HTTP requests. The new version contains some useful features, like showing diffs of request bodies and more logical error handling. Check out Scott Chamberlain's release article above, or the package changelog for the full details.
- Did you know? You can use the {statquotes} package to generate random quotes about statistics!
#install.packages("statquotes")
statquotes::statquote()
#> Since all models are wrong the scientist cannot obtain a 'correct' one
#> by excessive elaboration. On the contrary following William of Occam he
#> should seek an economical description of natural phenomena. Just as the
#> ability to devise simple but evocative models is the signature of the
#> great scientist so overelaboration and overparameterization is often
#> the mark of mediocrity.
#> --- George E P Box, Science and Statistics, Journal of the American
#> Statistical Association 71, 1976
- {bushtucker} v0.1.0 - (a self-plug) my new data package is now on CRAN, containing data on the UK TV show 'I'm a Celebrity, Get Me Out of Here'.
- {dials} v1.4.0 - the {tidymodels} infrastructure package gets some bug fixes and improvements.
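For anyone curious what the bioRxiv item above looks like in practice, here is a rough sketch of building a request with {httr2}. It is not the code from Stephen Turner's post. The helper name `biorxiv_request()` is hypothetical, the dates are arbitrary, and the URL pattern (`details/{server}/{from}/{to}/{cursor}`) and the `collection` field are my assumptions about the public bioRxiv API, so check the API documentation before relying on them.

```r
library(httr2)

# Hypothetical helper: build (but do not send) a request for one page of
# preprint metadata. The path layout is an assumption about the bioRxiv API.
biorxiv_request <- function(from, to, cursor = 0, server = "biorxiv") {
  request("https://api.biorxiv.org") |>
    req_url_path_append(
      "details", server, from, to,
      sprintf("%d", as.integer(cursor))  # avoid "1e+05"-style cursors
    ) |>
    req_user_agent("weekly-r-news example") |>
    req_retry(max_tries = 3)             # retry transient failures
}

# Arbitrary example date range
req <- biorxiv_request("2024-11-01", "2024-11-07")
req$url

# Uncomment to actually call the API (needs network access):
# resp <- req_perform(req)
# page <- resp_body_json(resp, simplifyVector = TRUE)
# str(page$collection)  # assumed to hold one row per preprint
```

Building the request separately from performing it makes it easy to inspect the URL first, and to loop over increasing `cursor` values if you want more than one page.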
I post updates like this every week, so if you're interested, feel free to follow. Comment below if there's something interesting you found out this week too!
Senior Researcher | Public and global health | Data management and data visualization | Advanced statistical analysis of both real-world evidence databases, clinical trials and surveys
1 month ago
You have a great weekly R review! In my experience, it could reach more people if you published it not as a LinkedIn "article" but as a "newsletter". That way people can subscribe, and each issue also arrives in their email inboxes. You just need to create a newsletter on LinkedIn (it is free) and select it when you start writing an article. Each issue is then published as an article and also sent to subscribers by email. It will look like https://www.dhirubhai.net/newsletters/%E2%9A%95-health-care-evidence-funds-7111366212063813632/
Thanks for mentioning my blog post!
Professor at York University
1 month ago
Thanks for mentioning {statquotes}