ggplot2

Summary

ggplot2 is an open-source data visualization package for the statistical programming language R. Created by Hadley Wickham in 2005, ggplot2 is an implementation of Leland Wilkinson's Grammar of Graphics, a general scheme for data visualization that breaks graphs into semantic components such as scales and layers. ggplot2 can serve as a replacement for the base graphics in R and provides a number of defaults for web and print display of common scales. Since 2005, ggplot2 has grown in use to become one of the most popular R packages.[2][3][4]

ggplot2
Original author(s): Hadley Wickham, Winston Chang
Initial release: 10 June 2007
Stable release: 3.5.0[1] / 23 February 2024
Repository: github.com/tidyverse/ggplot2
Written in: R
License: MIT license
Website: ggplot2.tidyverse.org
Figure: ggplot2 and base graphics defaults for a simple scatterplot.

Updates

On 2 March 2012, ggplot2 version 0.9.0 was released with numerous changes to internal organization, scale construction and layers.[5]

On 25 February 2014, Hadley Wickham formally announced that "ggplot2 is shifting to maintenance mode. This means that we are no longer adding new features, but we will continue to fix major bugs, and consider new features submitted as pull requests. In recognition [of] this significant milestone, the next version of ggplot2 will be 1.0.0".[6]

On 21 December 2015, ggplot2 version 2.0.0 was released. The announcement stated that "ggplot2 now has an official extension mechanism. This means that others can now easily create their [own] stats, geoms and positions, and provide them in other packages."[7]

Comparison with base graphics and other packages

In contrast to base R graphics, ggplot2 allows the user to add, remove or alter components in a plot at a high level of abstraction.[8] This abstraction comes at a cost, with ggplot2 being slower than lattice graphics.[9]
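As a minimal sketch of what adding, altering and removing components can look like, the following builds one plot in steps with the `+` operator. The built-in `mtcars` data set, the variable names `p1` to `p4` and the axis labels are illustrative choices, not taken from the sources cited above.

```r
library(ggplot2)

# Start from data plus aesthetic mappings; no layers are drawn yet.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl)))

# Add a component: a layer of points.
p1 <- p + geom_point()

# Add another component: one fitted line per cylinder group.
p2 <- p1 + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Alter components: relabel the axes and legend, and switch the theme.
p3 <- p2 +
  labs(x = "Weight (1000 lb)", y = "Miles per gallon", colour = "Cylinders") +
  theme_minimal()

# Remove a component: suppress the legend.
p4 <- p3 + theme(legend.position = "none")
```

Each intermediate object is a complete plot specification, so printing `p1`, `p2`, `p3` or `p4` renders that stage without affecting the others.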

Creating a different plot for various subsets of the data requires for loops and manual management in base R graphics, whereas ggplot2 simplifies that process with a collection of "facet" functions to choose from.[10]
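The contrast can be sketched with the built-in `mtcars` data set, split by number of cylinders. The data set and the choice of `facet_wrap()` (one of the facet functions; `facet_grid()` is another) are illustrative assumptions rather than an example from the cited source.

```r
library(ggplot2)

# ggplot2: one facet specification produces a panel per value of `cyl`,
# with shared axes and panel labels handled by the package.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl)

# Base graphics: the panel layout, the subsetting loop and the shared
# axis limits are all managed by hand.
op <- par(mfrow = c(1, 3))
for (k in sort(unique(mtcars$cyl))) {
  sub <- mtcars[mtcars$cyl == k, ]
  plot(sub$wt, sub$mpg,
       main = paste("cyl =", k),
       xlim = range(mtcars$wt), ylim = range(mtcars$mpg),
       xlab = "wt", ylab = "mpg")
}
par(op)  # restore the previous layout settings
```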

One potential limitation of base R graphics is the "pen-and-paper model" used to populate the plotting device.[11] Graphical output from the interpreter is added directly to the plotting device or window, rather than separately for each distinct element of a plot.[12] ggplot2 avoids this model and in this respect is similar to the lattice package, though Wickham argues that ggplot2 inherits a more formal model of graphics from Wilkinson.[13] As such, ggplot2 allows for a high degree of modularity; the same underlying data can be transformed by many different scales or layers.[14][15]
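A short sketch of this modularity, assuming the `mpg` data set that ships with ggplot2: a single base specification is reused with different layers and then with a different scale. The object names are hypothetical.

```r
library(ggplot2)

# One specification of data and mappings, reused below.
base <- ggplot(mpg, aes(x = displ, y = hwy))

# Same data, different layers.
scatter <- base + geom_point()
binned  <- base + geom_bin2d(bins = 10)

# Same data and layer, different scale: the y values are
# log10-transformed before they are drawn.
logged <- scatter + scale_y_log10()
```

Because the plot is held as an object rather than drawn immediately, `base` is not modified by any of these additions and can be extended in further directions.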

Plots may be created via the convenience function qplot() where arguments and defaults are meant to be similar to base R's plot() function.[16][17] More complex plotting capacity is available via ggplot() which exposes the user to more explicit elements of the grammar.[18]
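The three calling styles can be compared side by side. This is a sketch using the built-in `mtcars` data set; note that recent ggplot2 releases mark `qplot()` as deprecated (an observation from the package's release notes, not from the sources cited above), so the call is wrapped to silence that warning.

```r
library(ggplot2)

# Base R: positional x and y vectors, drawn immediately.
plot(mtcars$wt, mtcars$mpg)

# qplot(): a similar calling style, but it returns a ggplot object.
# suppressWarnings() hides the deprecation warning in newer releases.
q <- suppressWarnings(qplot(wt, mpg, data = mtcars))

# ggplot(): data, aesthetic mappings and the geometry layer are
# each stated explicitly.
g <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point()
```

Both `q` and `g` describe a scatterplot of the same points; the `ggplot()` form makes each element of the grammar visible and is the one to extend with further layers, scales or facets.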

Related projects

  • ggpy, a ggplot port for Python,[19] which has not been updated since 20 November 2016
  • plotnine[20] started as an effort to improve the scalability of ggplot for Python and is largely compatible with ggplot2 syntax.
  • Plotly, for interactive, online ggplot2 graphs[21]
  • gramm, a plotting class for MATLAB inspired by ggplot2[22]
  • gadfly, a system for plotting and visualization written in Julia, based largely on ggplot2[23]
  • Chart::GGPlot, a ggplot2 port in Perl[24]
  • Lets-Plot for Python, a library that includes a native backend and a Python API based mostly on the ggplot2 package[25]
  • Lets-Plot Kotlin API, an open-source plotting library for statistical data implemented in the Kotlin programming language and built on the principles of layered graphics first described in Leland Wilkinson's The Grammar of Graphics[26]
  • ggplotnim, a plotting library for the Nim programming language inspired by ggplot2[27]

References

  1. ^ "Release 3.5.0". 23 February 2024. Retrieved 22 March 2024.
  2. ^ Wickham, Hadley (July 2010). "ggplot2: Elegant Graphics for Data Analysis". Journal of Statistical Software. 35 (1).
  3. ^ Wilkinson, Leland (June 2011). "ggplot2: Elegant Graphics for Data Analysis by WICKHAM, H". Biometrics. 67 (2): 678–679. doi:10.1111/j.1541-0420.2011.01616.x.
  4. ^ "CRAN - Package ggplot2". 12 October 2023.
  5. ^ ggplot2 Development Team. "Changes and Additions to ggplot2-0.9.0" (PDF). Archived from the original (PDF) on 26 January 2015. Retrieved 31 October 2017.
  6. ^ Wickham, Hadley. "ggplot2 development". ggplot2 Google Group. Retrieved 26 February 2014.
  7. ^ "ggplot 2.0.0". 21 December 2015. Archived from the original on 7 February 2021. Retrieved 21 June 2021.
  8. ^ Smith, David. "Create beautiful statistical graphics with ggplot2". Revolutions. Revolution Analytics. Retrieved 11 July 2011.
  9. ^ "ggplot2 Version of Figures in "Lattice: Multivariate Data Visualization with R" (Final Part)". 25 August 2009.
  10. ^ Yau, Nathan (22 March 2016). "Comparing ggplot2 and R Base Graphics". FlowingData. Retrieved 17 April 2022.
  11. ^ Wickham, Hadley (2009). ggplot2: Elegant Graphics for Data Analysis. Springer. p. 5. ISBN 978-0-387-98140-6.
  12. ^ Murrell, Paul (August 2009). "R Graphics". Wiley Interdisciplinary Reviews: Computational Statistics. 1 (2): 216–220. doi:10.1002/wics.22. S2CID 37743308.
  13. ^ Sarkar, Deepayan (2008). Lattice: multivariate data visualization with R. Springer. pp. xi. ISBN 978-0-387-75968-5.
  14. ^ Teetor, Paul (2011). R Cookbook. O'Reilly. p. 223. ISBN 978-0-596-80915-7.
  15. ^ Wickham, Hadley (March 2010). "A Layered Grammar of Graphics" (PDF). Journal of Computational and Graphical Statistics. 19 (1): 3–28. doi:10.1198/jcgs.2009.07098. S2CID 58971746.
  16. ^ R Development Core Team (2011). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 978-3-900051-07-5.
  17. ^ Ginestet, Cedric (January 2011). "ggplot2: Elegant Graphics for Data Analysis". Journal of the Royal Statistical Society, Series A. 174 (1): 245–246. doi:10.1111/j.1467-985X.2010.00676_9.x.
  18. ^ Muenchen, Robert A.; Hilbe, Joseph M (2010). "Graphics with ggplot2". R for Stata Users. Statistics and Computing. Springer. pp. 385–452. doi:10.1007/978-1-4419-1318-0_16. ISBN 978-1-4419-1317-3.
  19. ^ "yhat/ggpy: ggplot port for python". GitHub. yhat. Retrieved 1 February 2024.
  20. ^ "plotnine". Retrieved 2 August 2023.
  21. ^ "Plotly graphing library for ggplot2 in ggplot2". Plotly Graphing Libraries. Plotly. Retrieved 1 February 2024.
  22. ^ "ggplot for Matlab". GitHub. Pierre Morel (@piermorel). Retrieved 11 December 2015.
  23. ^ "Gadfly.jl". Gadfly.jl. Retrieved 11 September 2018.
  24. ^ "Stephan Loyd/Chart-GGPlot-0.0001". MetaCPAN. Retrieved 30 March 2019.
  25. ^ "JetBrains/lets-plot". GitHub. JetBrains. Retrieved 3 April 2021.
  26. ^ "JetBrains/lets-plot-kotlin". GitHub. JetBrains. Retrieved 4 April 2021.
  27. ^ "ggplotnim". GitHub. Vindaar. Retrieved 1 August 2023.

Further reading

  • Wilkinson, Leland (2005). The Grammar of Graphics. Springer. ISBN 978-0-387-98774-3.
  • Wickham, Hadley (2017). R for Data Science. O'Reilly Media. ISBN 978-1491910399.
  • Wickham, Hadley (6 June 2011). Engineering Data Analysis (with R and ggplot2). Google Tech Talks.

External links

  • Official website: ggplot2.tidyverse.org
  • ggplot2 on GitHub: github.com/tidyverse/ggplot2