mathstodon.xyz is one of the many independent Mastodon servers you can use to participate in the fediverse.
A Mastodon instance for maths people. We have LaTeX rendering in the web interface!


Mean imputation is a straightforward method for handling missing values in numerical data, but it can significantly distort the relationships between variables.
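As a quick, language-neutral illustration of that distortion (a minimal NumPy sketch of my own, not taken from the linked tutorial): imputing the column mean shrinks the variable's variance and weakens its correlation with other variables, since every imputed value sits at a single point.

```python
import numpy as np

# Toy data: y depends linearly on x; some x values are missing
rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 2 * x + rng.normal(scale=0.5, size=200)

x_missing = x.copy()
x_missing[:80] = np.nan  # 40% of x is missing

# Mean imputation: replace every NaN with the observed mean
x_imputed = np.where(np.isnan(x_missing), np.nanmean(x_missing), x_missing)

# The imputed variable has a smaller variance than the complete data
# and a weaker correlation with y
print(np.var(x) > np.var(x_imputed))                 # True
print(abs(np.corrcoef(x, y)[0, 1])
      > abs(np.corrcoef(x_imputed, y)[0, 1]))        # True
```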

For a detailed explanation of mean imputation, its drawbacks, and better alternatives, check out my full tutorial here: statisticsglobe.com/mean-imput

More details are available at this link: eepurl.com/gH6myT

gganimate is a powerful extension for ggplot2 that transforms static visualizations into dynamic animations. By adding a time dimension, it allows you to illustrate trends, changes, and patterns in your data more effectively.

The attached animated visualization, which I created with gganimate, showcases a ranked bar chart of the top 3 countries for each year based on inflation since 1980.

More information: statisticsglobe.com/online-cou

Visualizing gene structures in R? gggenes, an extension of ggplot2, simplifies the process of creating clear and informative gene diagrams, making genomic data easier to interpret and share.

Visualization: cran.r-project.org/web/package

Click this link for detailed information: statisticsglobe.com/online-cou

#ReleaseMonday — One of the recent (already very useful!) new package additions to #ThingUmbrella is:

thi.ng/leaky-bucket

Leaky buckets are commonly used in communication networks for rate limiting, traffic shaping and bandwidth control, but are equally useful in other domains requiring similar constraints.

A leaky bucket is a managed counter with an enforced maximum value (i.e. the bucket capacity). The counter is incremented for each new event to check whether it can/should be processed. If the bucket capacity has already been reached, the bucket reports an overflow, which we can then handle accordingly (e.g. by dropping or queuing events). The bucket also has a configurable time interval at which the counter decreases (aka the "leaking" behavior) until it reaches zero again (i.e. until the bucket is empty). Altogether, this setup can be used to enforce an average rate while still supporting temporary bursts in a controlled fashion...
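A minimal sketch of the general technique in Python (my own illustration; the class and method names here are hypothetical and are not the API of the thi.ng/leaky-bucket package):

```python
import time

class LeakyBucket:
    """Counter with a capacity that leaks one unit per interval."""

    def __init__(self, capacity, leak_interval):
        self.capacity = capacity          # maximum counter value
        self.leak_interval = leak_interval  # seconds per leaked unit
        self.level = 0.0
        self.last = time.monotonic()

    def _leak(self, now):
        # Decrease the counter based on elapsed time, down to zero
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed / self.leak_interval)
        self.last = now

    def try_acquire(self):
        """Increment the counter for a new event.
        Returns False (overflow) when the bucket is full."""
        now = time.monotonic()
        self._leak(now)
        if self.level + 1 > self.capacity:
            return False
        self.level += 1
        return True

bucket = LeakyBucket(capacity=3, leak_interval=0.1)
results = [bucket.try_acquire() for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

After a pause of a few leak intervals, the bucket drains and accepts events again, which is what produces the "average rate plus controlled bursts" behavior described above.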

Related, I've also updated/simplified the rate limiter interceptor in thi.ng/server to utilize this new package...


I used to think that writing sophisticated R code meant using all the advanced features and chaining long functions together...

Fancy code can be fun, but clean code makes collaboration and debugging so much easier.

Stay informed on data science by joining my free newsletter. Check out this link for more details: eepurl.com/gH6myT


When imputing missing data, it is important to compare the distributions of imputed values against the observed data, to judge whether the imputations are plausible.

The visualization below can be generated using the following R code:

library(mice)
my_imp <- mice(boys)
densityplot(my_imp)

Take a look here for more details: statisticsglobe.com/online-wor

Avoiding text overlap in plots is essential for clarity, and R offers a great solution with the ggplot2 and ggrepel packages. By automatically repositioning labels, ggrepel keeps your plot clean and easy to interpret.

Video: youtube.com/watch?v=5lu4h_CPhi0
Website: statisticsglobe.com/avoid-over

Take a look here for more details: statisticsglobe.com/online-cou

Is there a data structure that can sensibly handle multiple hierarchical classification systems?

e.g. an Orange, in terms of phylogeny is
Plantae->Eudicot->...->Citrus->sinensis

and in terms of usefulness, is
Thing->Food->fruit->orange
(and it could have multiple parents in this taxonomy, e.g. cleaning product)

Bonus points for cool visualisations of this kind of information.
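One common way to model the example above (my own sketch, not from any reply to the question): a directed acyclic graph whose edges are tagged with the taxonomy they belong to, so a node can have a different set of parents in each classification system.

```python
from collections import defaultdict

# child -> list of (parent, taxonomy) edges
parents = defaultdict(list)

def add_edge(child, parent, taxonomy):
    parents[child].append((parent, taxonomy))

# Phylogeny chain for the orange example
add_edge("sinensis", "Citrus", "phylogeny")
add_edge("Citrus", "Eudicot", "phylogeny")
add_edge("Eudicot", "Plantae", "phylogeny")

# Usefulness taxonomy, including a second parent
add_edge("orange", "fruit", "usefulness")
add_edge("orange", "cleaning product", "usefulness")
add_edge("fruit", "Food", "usefulness")
add_edge("Food", "Thing", "usefulness")

def ancestors(node, taxonomy):
    """Walk upward within a single taxonomy (assumes no cycles)."""
    out = []
    stack = [node]
    while stack:
        for parent, tax in parents[stack.pop()]:
            if tax == taxonomy:
                out.append(parent)
                stack.append(parent)
    return out

print(ancestors("sinensis", "phylogeny"))  # ['Citrus', 'Eudicot', 'Plantae']
print(ancestors("orange", "usefulness"))
```

Filtering edges by taxonomy turns each classification system back into an ordinary tree (or DAG, with multiple parents), which also makes visualisation straightforward: render one layer per taxonomy.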

In statistics, Frequentist and Bayesian approaches are two major methods of inference. While they aim to solve similar problems, they differ in their interpretation of probability and handling of uncertainty.

Frequentists interpret probability as the long-run frequency of events. Parameters (like the mean) are fixed but unknown, and inference relies on analyzing repeated samples.
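The frequentist interpretation can be made concrete with a simulation (a small sketch of my own, not from the original post): a 95% confidence-interval procedure, applied to many repeated samples, covers the fixed true mean in roughly 95% of them.

```python
import numpy as np

# Repeated sampling: how often does a 95% CI contain the true mean?
rng = np.random.default_rng(0)
true_mean = 5.0
n_reps, n = 2000, 30
covered = 0

for _ in range(n_reps):
    sample = rng.normal(loc=true_mean, scale=2.0, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)
    lo, hi = sample.mean() - 1.96 * se, sample.mean() + 1.96 * se
    covered += lo <= true_mean <= hi

print(covered / n_reps)  # roughly 0.95
```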

Learn more: eepurl.com/gH6myT

Bring your visualizations to life with see, a dynamic R package from the easystats ecosystem that extends ggplot2 to create modern and intuitive graphics. Whether you're visualizing statistical models or exploring data, see simplifies the process and enhances the presentation of your insights.

Visualizations: github.com/easystats/see

Take a look here for more details: statisticsglobe.com/online-cou

Dimensionality reduction simplifies high-dimensional data while retaining its essential features. It’s a powerful tool for improving data analysis, visualization, and machine learning performance.
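As a minimal sketch of the idea (my own NumPy illustration, not from the linked course): principal component analysis reduces dimensionality by projecting centered data onto its directions of greatest variance, obtained here from the SVD.

```python
import numpy as np

# Correlated 3-D data with an underlying 2-D structure
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
mix = np.array([[1.0, 0.5, 0.2],
                [0.0, 1.0, 0.7]])
X = latent @ mix + 0.05 * rng.normal(size=(200, 3))

# PCA via SVD of the centered data matrix
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X2 = Xc @ Vt[:2].T   # project onto the top 2 principal components

# Fraction of total variance retained by the 2-D representation
explained = (S[:2] ** 2).sum() / (S ** 2).sum()
print(X2.shape, round(explained, 3))  # nearly all variance retained
```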

Image credit to Wikipedia: en.wikipedia.org/wiki/Dimensio

I've developed an in-depth course on PCA theory and its application in R programming. Check out this link for more details: statisticsglobe.com/online-cou

The Student's t-test is a crucial statistical method used to determine if there are significant differences between the means of two groups. It is widely applied in various fields to analyze small data sets, providing valuable insights when used correctly.
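A minimal sketch of a two-sample t-test on small groups (my own illustration with simulated data, assuming NumPy and SciPy are available):

```python
import numpy as np
from scipy import stats

# Two small simulated groups with different true means
rng = np.random.default_rng(7)
group_a = rng.normal(loc=10.0, scale=2.0, size=15)
group_b = rng.normal(loc=12.5, scale=2.0, size=15)

# Independent two-sample t-test: is the difference in means significant?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(round(t_stat, 2), round(p_value, 4))
```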

This visualization is based on the images of this Wikipedia article: en.wikipedia.org/wiki/Student%

Further details: statisticsglobe.com/online-cou

In Bayesian inference, a credible interval is a range of values within which a parameter lies with a certain probability, given the observed data and prior beliefs. The image of this post (based on this Wikipedia image: en.wikipedia.org/wiki/Credible) represents a 90% highest-density credible interval of a posterior probability distribution.
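A highest-density interval can be computed from posterior samples as the narrowest interval containing the desired probability mass. A minimal NumPy sketch (my own illustration, assuming a unimodal posterior):

```python
import numpy as np

def hdi(samples, mass=0.90):
    """Highest-density interval: the narrowest interval containing
    `mass` of the samples (assumes a unimodal posterior)."""
    s = np.sort(samples)
    n_in = int(np.ceil(mass * len(s)))
    # Width of every candidate interval holding n_in sorted samples
    widths = s[n_in - 1:] - s[: len(s) - n_in + 1]
    i = int(np.argmin(widths))
    return s[i], s[i + n_in - 1]

# Skewed toy posterior: for asymmetric distributions the HDI
# differs from the equal-tailed credible interval
rng = np.random.default_rng(3)
posterior = rng.gamma(shape=2.0, scale=1.0, size=10_000)

lo, hi = hdi(posterior, 0.90)
print(round(lo, 2), round(hi, 2))
```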

More details: eepurl.com/gH6myT