programming.dev
ericjmorey to R Programming · English · 1 year ago

datasauRus: Datasets from the Datasaurus Dozen

cran.r-project.org

The Datasaurus Dozen is a set of datasets that share the same summary statistics despite having radically different distributions. It is a larger and quirkier version of the object lesson typically taught via Anscombe's Quartet (available in the 'datasets' package). Anscombe's Quartet contains four very different distributions with the same summary statistics, and so highlights the value of visualisation, over and above summary statistics, in understanding data. As well as being an engaging variant on the Quartet, the data is generated in a novel way. The simulated annealing process used to derive the datasets from the original Datasaurus is detailed in "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing" <doi:10.1145/3025453.3025912>.
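
The point about Anscombe's Quartet can be checked directly in base R. The following is a minimal sketch rather than anything taken from the datasauRus package itself: it uses only the `anscombe` data frame from the 'datasets' package, and the name `quartet_stats` is made up for illustration.

```r
# Anscombe's Quartet ships with base R in the 'datasets' package.
data("anscombe", package = "datasets")

# Compute the same summary statistics for each of the four x/y pairs.
# 'quartet_stats' is an illustrative name, not part of any package.
quartet_stats <- do.call(rbind, lapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  data.frame(
    set    = i,
    mean_x = mean(x),
    mean_y = mean(y),
    sd_x   = sd(x),
    sd_y   = sd(y),
    cor_xy = cor(x, y)
  )
}))

# Rounded to two decimals, the statistics match across all four sets.
print(round(quartet_stats, 2))

# Plotting is what reveals how different the four sets really are.
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i),
       main = paste("Set", i))
}
par(op)
```

If datasauRus is installed, the same per-group summary followed by a plot should carry over to its `datasaurus_dozen` data frame, which, as far as I know, has `dataset`, `x` and `y` columns. Check the package manual before relying on that.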

R Programming

r_programming


Please use this as a forum to discuss R, and learn more about it. If you have any questions about how to do specific things in R, this is the place to ask.

Getting Started

You can download R here.

You can download RStudio here. The RStudio IDE, which is supported by Posit PBC, is a powerful and well-developed IDE for R. Other development environment options include the Emacs add-on Emacs Speaks Statistics (ESS) and VSCode.

Other Communities

Other communities that may be of interest across the fediverse:

  • https://lemmy.ml/c/rstats
  • https://lemmy.ml/c/dataisbeautiful
  • https://lemmy.world/c/dataisbeautiful
  • https://code4lib.net/c/datascience
  • https://discuss.tchncs.de/c/data_engineering

Please send @a_statistician a message to recommend additional communities to add to this list.

Learning resources:

  • R for Data Science - a good introductory book for learning R. Start here if you’re overwhelmed.
  • Big Book of R - a collection of more than 500 online books and tutorials covering various aspects of R. Some links are to paid books with previews, but most are to free online textbooks.