We are in the 21st century, and whatever job we are in, we are all trying to innovate to some extent. But it feels like there is not much left to innovate; all the processes are already there in front of us. Most of the problems we are trying to solve are optimization problems, how to … Continue reading Talent is distributed, there is nothing like a true genius in 21st century, it’s a rarity…
SUM ignores NULL values while calculating the sum of all values in a column - https://www.tutorialspoint.com/How-MySQL-SUM-function-evaluates-if-the-column-having-NULL-values-too What happens when UNION is applied to two different tables? Even though CITY and VENUE are two different tables, the UNION still works because: The number of columns is the same for both select statements (city, venue). The columns, in … Continue reading SQL – Quick Refer
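Both behaviours in this excerpt can be checked quickly. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the column names and rows are hypothetical (only the table names CITY and VENUE come from the post).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schemas and rows; only the table names come from the post.
cur.execute("CREATE TABLE city (name TEXT, seats INTEGER)")
cur.execute("CREATE TABLE venue (name TEXT, seats INTEGER)")
cur.executemany("INSERT INTO city VALUES (?, ?)", [("Pune", 100), ("Delhi", None)])
cur.executemany("INSERT INTO venue VALUES (?, ?)", [("Arena", 250), ("Pune", 100)])

# SUM skips the NULL row instead of turning the whole result into NULL.
seats_total = cur.execute("SELECT SUM(seats) FROM city").fetchone()[0]

# UNION works across two different tables because both SELECTs return
# the same number of columns; duplicate rows are removed.
union_rows = cur.execute(
    "SELECT name, seats FROM city "
    "UNION "
    "SELECT name, seats FROM venue "
    "ORDER BY name"
).fetchall()

conn.close()

print(seats_total)
print(union_rows)
```

Here the row ("Pune", 100) exists in both tables, so it appears only once in the UNION result; UNION ALL would keep both copies.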
Encoding of the CSV files through which data sets are accessed: pandas reads CSVs as UTF-8 by default, but CSVs saved on a Windows PC are often in a different encoding, so some characters in the data cannot be decoded. Instead, if the encoding is set to ISO-8859-1, the pandas dataframe will be read properly without throwing any errors or warnings. This is only … Continue reading Python bugs
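The encoding issue can be reproduced without any files. This is a minimal sketch using only the standard library; the CSV content is made up for illustration. With pandas, the equivalent fix is passing `encoding="ISO-8859-1"` to `pd.read_csv`.

```python
import csv
import io

# Hypothetical CSV content, saved as ISO-8859-1 (Latin-1) bytes.
raw = "name,city\nJosé,Málaga\n".encode("iso-8859-1")

# Decoding the same bytes as UTF-8 fails on the accented characters.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the encoding the file was actually written in succeeds.
text = raw.decode("iso-8859-1")
rows = list(csv.reader(io.StringIO(text)))

print(utf8_ok)
print(rows)
```

Note that ISO-8859-1 maps every possible byte to some character, so it never raises an error; if the file was actually written in another encoding, the text may load silently with wrong characters.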
The Y-axis represents the probability density estimate (PDE), which is defined as probability per unit value of whatever variable is on the X-axis. Its values are used relatively to compare points on the graph. It is also termed the probability differential - the probability of a point occurring between two values x1 and x2, represented by the area under … Continue reading What does Y-axis mean in a KDE plot?
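To see why the Y-axis is a density and not a probability, here is a rough sketch of a Gaussian KDE written by hand. The sample values, the bandwidth, and the helper names `gaussian_kde` and `area` are all assumptions made for illustration; plotting libraries choose the bandwidth for you.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels (illustrative helper)."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def area(density, x1, x2, steps=2000):
    """Approximate the area under the density between x1 and x2 (trapezoid rule)."""
    h = (x2 - x1) / steps
    total = 0.5 * (density(x1) + density(x2))
    for i in range(1, steps):
        total += density(x1 + i * h)
    return total * h

# Hypothetical, tightly clustered data.
samples = [0.48, 0.50, 0.51, 0.52, 0.49]
density = gaussian_kde(samples, bandwidth=0.05)

peak = density(0.50)                 # height of the curve: can be greater than 1
total_area = area(density, 0.0, 1.0) # whole curve: close to 1
prob = area(density, 0.45, 0.55)     # probability of landing between x1 and x2

print(peak, total_area, prob)
```

The height of the curve can go well above 1 when the data is tightly clustered, while the area between two points always stays between 0 and 1.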
A similarity index quantifies the closeness between entities across all the different model types (memory-based, model-based) of recommender systems.
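The post does not name a specific index; cosine similarity is one common choice, sketched below. The rating vectors and user names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no ratings at all: treat as not similar
    return dot / (norm_a * norm_b)

# Hypothetical ratings of the same four items by three users (0 = not rated).
alice = [5, 3, 0, 1]
bob = [4, 2, 0, 1]
carol = [0, 0, 5, 4]

sim_alice_bob = cosine_similarity(alice, bob)
sim_alice_carol = cosine_similarity(alice, carol)

print(sim_alice_bob, sim_alice_carol)
```

Alice and Bob rate the same items highly, so their score is close to 1, while Alice and Carol score much lower.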
You will find this useful when you get a message like this in your terminal: ‘python’ is not recognized as an internal or external command, operable program or batch file. Note: You will have to reopen all command prompt windows in order for changes to the Path variable to take effect. When you reach the System Variables window, click Edit and … Continue reading Changing the path to initiate python in your terminal
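Once Python does start (for example through its full install path), a small check like the one below can confirm whether the Path change worked. This is only a sketch; the helper names `path_entries` and `find_executable` are made up for illustration.

```python
import os
import shutil

def path_entries():
    """Return the folders currently listed in the PATH variable."""
    return [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]

def find_executable(name):
    """Return the full path the terminal would run for `name`, or None."""
    return shutil.which(name)

python_location = find_executable("python")

if python_location is None:
    print("python is not on PATH; the folders searched were:")
    for folder in path_entries():
        print("  ", folder)
else:
    print("python resolves to:", python_location)
```

Run it from a newly opened command prompt, since windows that were already open keep the old Path value.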
1. What is a Covariate? In statistics, a covariate is a variable that is possibly predictive of the outcome under study. A covariate may be of direct interest or it may be a confounding or interacting variable. The alternative terms explanatory variable, independent variable, or predictor, are used in a regression analysis. 2. What is Overfitting? Overfitting occurs when a … Continue reading Statistics and Statistical Analysis -Glossary
It is not possible to determine causality in a data set. For that you need to look outside data science. Correlations can be a helpful tool to intuitively understand a data set. Thinking about what the correlation actually means in reality can help you understand the data better (for example if data science tells me … Continue reading Can causality be determined from a data set or a set of observations?
This is a compilation of the petty bugs I faced in my journey of using and implementing R for data analytics. Of course, these might not be called petty if you are just starting out. R Markdown: Can't run code chunk: This can also be put as "Nothing happens when I run a code chunk". Check if you … Continue reading R bugs
With repo - sql-data-exploration-project: README.md was created in the remote repository, and then I tried pushing local files. As the two don't have any common commits, --allow-unrelated-histories has to be used to synchronize local and remote. git pull origin <branch_name> --allow-unrelated-histories With repo - exploring-hacker-news-posts: (Recommended) README.md was created after a commit and a push into the … Continue reading Different approaches of syncing local and remote repos via github