Python and R Tips

How to Randomly Select Groups in R with dplyr?

July 24, 2019 by cmdlinetips

Sampling, or randomly subsetting, your data is useful in many situations. If you want to sample rows at random without regard to groups, you can use the sample_n() function from dplyr. Sometimes, though, you might want to sample one or more groups and keep all the elements/rows within the selected group(s). However, sampling one or more groups with […]

Dimensionality Reduction with tSNE in Python

July 14, 2019 by cmdlinetips

tSNE, short for t-Distributed Stochastic Neighbor Embedding, is a dimensionality reduction technique that can be very useful for visualizing high-dimensional datasets. tSNE was developed by Laurens van der Maaten and Geoffrey Hinton. Unlike PCA, one of the most commonly used dimensionality reduction techniques, tSNE is a non-linear and probabilistic technique. What this means is that tSNE can capture non-linear […]

How To Slice Rows and Columns of Sparse Matrix in Python?

July 4, 2019 by cmdlinetips

Sometimes, while working with large sparse matrices in Python, you might want to select certain rows or certain columns of a sparse matrix. As we saw earlier, there are many types of sparse matrices available in SciPy. Each sparse matrix type is optimized for specific operations. We will see […]
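
As a quick taste of the idea, here is a minimal sketch of row and column selection on SciPy sparse matrices. The toy matrix and the variable names are made up for illustration and are not taken from the post. It assumes the common rule of thumb that CSR suits row slicing and CSC suits column slicing.

```python
import numpy as np
from scipy import sparse

# Toy dense matrix, made up purely for illustration.
dense = np.array([
    [1, 0, 0, 2],
    [0, 0, 3, 0],
    [4, 0, 0, 0],
    [0, 5, 0, 6],
])

# CSR stores data row by row, CSC column by column.
csr = sparse.csr_matrix(dense)
csc = sparse.csc_matrix(dense)

# Select rows 0 and 2 from the CSR matrix.
rows = csr[[0, 2], :]

# Select columns 1 and 3 from the CSC matrix.
cols = csc[:, [1, 3]]

# Convert the (small) results back to dense arrays for inspection.
rows_dense = rows.toarray()
cols_dense = cols.toarray()
print(rows_dense)
print(cols_dense)
```

The results stay sparse until toarray() is called, so on a large matrix you would normally keep working with rows and cols directly.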

9 Basic Linear Algebra Operations with NumPy

June 27, 2019 by cmdlinetips

Linear algebra is one of the most important mathematical topics, and it is highly useful for doing good data science. Learning the basics of linear algebra adds a valuable tool set to your data science skills. Python’s NumPy has fast, efficient functions for all standard linear algebra/matrix operations. Here we will see 9 important and […]
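
To give a flavor of what this looks like, here is a small sketch of a few standard NumPy linear algebra calls. These are not necessarily the nine operations covered in the post, and the matrix A and vector b are made-up examples.

```python
import numpy as np

# Made-up 2x2 matrix and right-hand side, for illustration only.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

At = A.T                      # transpose
product = A @ A               # matrix-matrix product
det = np.linalg.det(A)        # determinant: 2*3 - 1*1 = 5
inv = np.linalg.inv(A)        # matrix inverse
x = np.linalg.solve(A, b)     # solve the linear system A x = b

print(det)
print(x)
```

For solving A x = b, np.linalg.solve is usually preferable to multiplying by the explicit inverse, since it tends to be both faster and numerically more stable.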